docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738)
Broad drift audit against origin/main (b52b63396).
Reference pages (most user-visible drift):
- slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer
that were missing; drop non-existent /terminal-setup; fix /q footnote
(resolves to /queue, not /quit); extend CLI-only list with all 24
CLI-only commands in the registry
- cli-commands: add dedicated sections for hermes curator / fallback /
hooks (new subcommands not previously documented); remove stale
hermes honcho standalone section (the plugin registers dynamically
via hermes memory); list curator/fallback/hooks in top-level table;
fix completion to include fish
- toolsets-reference: document the real 52-toolset count; split browser
vs browser-cdp; add discord / discord_admin / spotify / yuanbao;
correct hermes-cli tool count from 36 to 38; fix misleading claim
that hermes-homeassistant adds tools (it's identical to hermes-cli)
- tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao,
2 Discord toolsets; move browser_cdp/browser_dialog to their own
browser-cdp toolset section
- environment-variables: add 40+ user-facing HERMES_* vars that were
undocumented (--yolo, --accept-hooks, --ignore-*, inference model
override, agent/stream/checkpoint timeouts, OAuth trace, per-platform
batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs,
gateway restart/connect timeouts); dedupe the Cron Scheduler section;
replace stale QQ_SANDBOX with QQ_PORTAL_HOST
User-guide (top level):
- cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20)
- configuration.md: display.platforms is the canonical per-platform
override key; tool_progress_overrides is deprecated and auto-migrated
- profiles.md: model.default is the config key, not model.model
- sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8
- checkpoints-and-rollback.md: destructive-command list now matches
_DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd)
- docker.md: the container runs as non-root hermes (UID 10000) via
gosu; fix install command (uv pip); add missing --insecure on the
dashboard compose example (required for non-loopback bind)
- security.md: systemctl danger pattern also matches 'restart'
- index.md: built-in tool count 47 -> 68
- integrations/index.md: 6 STT providers, 8 memory providers
- integrations/providers.md: drop fictional dashscope/qwen aliases
Features:
- overview.md: 9 image models (not 8), 9 TTS providers (not 5),
8 memory providers (Supermemory was missing)
- tool-gateway.md: 9 image models
- tools.md: extend common-toolsets list with search / messaging /
spotify / discord / debugging / safe
- fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY
(lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan,
tencent-tokenhub, azure-foundry)
- plugins.md: Available Hooks table now includes on_session_finalize,
on_session_reset, subagent_stop
- built-in-plugins.md: add the 7 bundled plugins the page didn't
mention (spotify, google_meet, three image_gen providers, two
dashboard examples)
- web-dashboard.md: add --insecure and --tui flags
- cron.md: hermes cron create takes positional schedule/prompt, not
flags
Messaging:
- telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when
TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it
per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch.
- discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default
is 2.0, not 0.1
- dingtalk.md: document DINGTALK_REQUIRE_MENTION /
FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL /
ALLOW_ALL_USERS that the adapter supports
- bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env
var; the setting lives in platforms.bluebubbles.extra only
- qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and
QQ_GROUP_ALLOWED_USERS
- wecom-callback.md: replace 'hermes gateway start' (service-only)
with 'hermes gateway' for first-time setup
Developer-guide:
- architecture.md: refresh tool/toolset counts (61/52), terminal
backend count (7), line counts for run_agent.py (~13.7k), cli.py
(~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py
(~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform
adapter count 18 -> 20
- agent-loop.md: run_agent.py line count 10.7k -> 13.7k
- tools-runtime.md: add vercel_sandbox backend
- adding-tools.md: remove stale 'Discovery import added to
model_tools.py' checklist item (registry auto-discovery)
- adding-platform-adapters.md: mark send_typing / get_chat_info as
concrete base methods; only connect/disconnect/send are abstract
- acp-internals.md: ACP sessions now persist to SessionDB
(~/.hermes/state.db); acp.run_agent call uses
use_unstable_protocol=True
- cron-internals.md: gateway runs scheduler in a dedicated background
thread via _start_cron_ticker, not on a maintenance cycle; locking
is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows)
- gateway-internals.md: gateway/run.py ~12k lines
- provider-runtime.md: cron DOES support fallback (run_job reads
fallback_providers from config)
- session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations
10 and 11 (trigram FTS, inline-mode FTS5 re-index); add
api_call_count column to Sessions DDL; document messages_fts_trigram
and state_meta in the architecture tree
- context-compression-and-caching.md: remove the obsolete 'context
pressure warnings' section (warnings were removed for causing
models to give up early)
- context-engine-plugin.md: compress() signature now includes
focus_topic param
- extending-the-cli.md: _build_tui_layout_children signature now
includes model_picker_widget; add to default layout
Also fixed three pre-existing broken links/anchors the build warned
about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and
tips#background-tasks, nix-setup.md -> #container-aware-cli).
Regenerated per-skill pages via website/scripts/generate-skill-docs.py
so catalog tables and sidebar are consistent with current SKILL.md
frontmatter.
docusaurus build: clean, no broken links or anchors.
This commit is contained in:
parent
51b44b6e3f
commit
289cc47631
@ -27,7 +27,7 @@ hermes acp / hermes-acp / python -m acp_adapter
|
||||
-> load ~/.hermes/.env
|
||||
-> configure stderr logging
|
||||
-> construct HermesACPAgent
|
||||
-> acp.run_agent(agent)
|
||||
-> acp.run_agent(agent, use_unstable_protocol=True)
|
||||
```
|
||||
|
||||
Stdout is reserved for ACP JSON-RPC transport. Human-readable logs go to stderr.
|
||||
@ -170,7 +170,7 @@ ACP temporarily installs an approval callback on the terminal tool during prompt
|
||||
|
||||
## Current limitations
|
||||
|
||||
- ACP sessions are process-local from the ACP server's point of view
|
||||
- ACP sessions are persisted to the shared `~/.hermes/state.db` (SessionDB) and transparently restored across process restarts; they appear in `session_search`
|
||||
- non-text prompt blocks are currently ignored for request text extraction
|
||||
- editor-specific UX varies by ACP client implementation
|
||||
|
||||
|
||||
@ -18,11 +18,11 @@ User ↔ Messaging Platform ↔ Platform Adapter ↔ Gateway Runner ↔ AIAgent
|
||||
|
||||
Every adapter extends `BasePlatformAdapter` from `gateway/platforms/base.py` and implements:
|
||||
|
||||
- **`connect()`** — Establish connection (WebSocket, long-poll, HTTP server, etc.)
|
||||
- **`disconnect()`** — Clean shutdown
|
||||
- **`send()`** — Send a text message to a chat
|
||||
- **`send_typing()`** — Show typing indicator (optional)
|
||||
- **`get_chat_info()`** — Return chat metadata
|
||||
- **`connect()`** — Establish connection (WebSocket, long-poll, HTTP server, etc.) *(abstract)*
|
||||
- **`disconnect()`** — Clean shutdown *(abstract)*
|
||||
- **`send()`** — Send a text message to a chat *(abstract)*
|
||||
- **`send_typing()`** — Show typing indicator (optional override)
|
||||
- **`get_chat_info()`** — Return chat metadata (optional override)
|
||||
|
||||
Inbound messages are received by the adapter and forwarded via `self.handle_message(event)`, which the base class routes to the gateway runner.
|
||||
|
||||
|
||||
@ -192,7 +192,6 @@ OPTIONAL_ENV_VARS = {
|
||||
|
||||
- [ ] Tool file created with handler, schema, check function, and registration
|
||||
- [ ] Added to appropriate toolset in `toolsets.py`
|
||||
- [ ] Discovery import added to `model_tools.py`
|
||||
- [ ] Handler returns JSON strings, errors returned as `{"error": "..."}`
|
||||
- [ ] Optional: API key added to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
|
||||
- [ ] Optional: Added to `toolset_distributions.py` for batch processing
|
||||
|
||||
@ -6,7 +6,7 @@ description: "Detailed walkthrough of AIAgent execution, API modes, tools, callb
|
||||
|
||||
# Agent Loop Internals
|
||||
|
||||
The core orchestration engine is `run_agent.py`'s `AIAgent` class — roughly 10,700 lines that handle everything from prompt assembly to tool dispatch to provider failover.
|
||||
The core orchestration engine is `run_agent.py`'s `AIAgent` class — roughly 13,700 lines that handle everything from prompt assembly to tool dispatch to provider failover.
|
||||
|
||||
## Core Responsibilities
|
||||
|
||||
@ -222,7 +222,7 @@ After each turn:
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `run_agent.py` | AIAgent class — the complete agent loop (~10,700 lines) |
|
||||
| `run_agent.py` | AIAgent class — the complete agent loop (~13,700 lines) |
|
||||
| `agent/prompt_builder.py` | System prompt assembly from memory, skills, context files, personality |
|
||||
| `agent/context_engine.py` | ContextEngine ABC — pluggable context management |
|
||||
| `agent/context_compressor.py` | Default engine — lossy summarization algorithm |
|
||||
|
||||
@ -32,15 +32,15 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours
|
||||
│ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │
|
||||
│ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │
|
||||
│ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │
|
||||
│ │ │ │ codex_resp. │ │ 47 tools │ │
|
||||
│ │ │ │ anthropic │ │ 19 toolsets │ │
|
||||
│ │ │ │ codex_resp. │ │ 61 tools │ │
|
||||
│ │ │ │ anthropic │ │ 52 toolsets │ │
|
||||
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
||||
└─────────┴─────────────────┴─────────────────┴───────────────────────┘
|
||||
│ │
|
||||
▼ ▼
|
||||
┌───────────────────┐ ┌──────────────────────┐
|
||||
│ Session Storage │ │ Tool Backends │
|
||||
│ (SQLite + FTS5) │ │ Terminal (6 backends) │
|
||||
│ (SQLite + FTS5) │ │ Terminal (7 backends) │
|
||||
│ hermes_state.py │ │ Browser (5 backends) │
|
||||
│ gateway/session.py│ │ Web (4 backends) │
|
||||
└───────────────────┘ │ MCP (dynamic) │
|
||||
@ -52,8 +52,8 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours
|
||||
|
||||
```text
|
||||
hermes-agent/
|
||||
├── run_agent.py # AIAgent — core conversation loop (~10,700 lines)
|
||||
├── cli.py # HermesCLI — interactive terminal UI (~10,000 lines)
|
||||
├── run_agent.py # AIAgent — core conversation loop (~13,700 lines)
|
||||
├── cli.py # HermesCLI — interactive terminal UI (~11,500 lines)
|
||||
├── model_tools.py # Tool discovery, schema collection, dispatch
|
||||
├── toolsets.py # Tool groupings and platform presets
|
||||
├── hermes_state.py # SQLite session/state database with FTS5
|
||||
@ -76,14 +76,14 @@ hermes-agent/
|
||||
│ └── trajectory.py # Trajectory saving helpers
|
||||
│
|
||||
├── hermes_cli/ # CLI subcommands and setup
|
||||
│ ├── main.py # Entry point — all `hermes` subcommands (~6,000 lines)
|
||||
│ ├── main.py # Entry point — all `hermes` subcommands (~10,400 lines)
|
||||
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
|
||||
│ ├── commands.py # COMMAND_REGISTRY — central slash command definitions
|
||||
│ ├── auth.py # PROVIDER_REGISTRY, credential resolution
|
||||
│ ├── runtime_provider.py # Provider → api_mode + credentials
|
||||
│ ├── models.py # Model catalog, provider model lists
|
||||
│ ├── model_switch.py # /model command logic (CLI + gateway shared)
|
||||
│ ├── setup.py # Interactive setup wizard (~3,100 lines)
|
||||
│ ├── setup.py # Interactive setup wizard (~3,500 lines)
|
||||
│ ├── skin_engine.py # CLI theming engine
|
||||
│ ├── skills_config.py # hermes skills — enable/disable per platform
|
||||
│ ├── skills_hub.py # /skills slash command
|
||||
@ -102,14 +102,14 @@ hermes-agent/
|
||||
│ ├── browser_tool.py # 10 browser automation tools
|
||||
│ ├── code_execution_tool.py # execute_code sandbox
|
||||
│ ├── delegate_tool.py # Subagent delegation
|
||||
│ ├── mcp_tool.py # MCP client (~2,200 lines)
|
||||
│ ├── mcp_tool.py # MCP client (~3,100 lines)
|
||||
│ ├── credential_files.py # File-based credential passthrough
|
||||
│ ├── env_passthrough.py # Env var passthrough for sandboxes
|
||||
│ ├── ansi_strip.py # ANSI escape stripping
|
||||
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
|
||||
│
|
||||
├── gateway/ # Messaging platform gateway
|
||||
│ ├── run.py # GatewayRunner — message dispatch (~9,000 lines)
|
||||
│ ├── run.py # GatewayRunner — message dispatch (~12,200 lines)
|
||||
│ ├── session.py # SessionStore — conversation persistence
|
||||
│ ├── delivery.py # Outbound message delivery
|
||||
│ ├── pairing.py # DM pairing authorization
|
||||
@ -117,10 +117,11 @@ hermes-agent/
|
||||
│ ├── mirror.py # Cross-session message mirroring
|
||||
│ ├── status.py # Token locks, profile-scoped process tracking
|
||||
│ ├── builtin_hooks/ # Extension point for always-registered hooks (none shipped)
|
||||
│ └── platforms/ # 18 adapters: telegram, discord, slack, whatsapp,
|
||||
│ └── platforms/ # 20 adapters: telegram, discord, slack, whatsapp,
|
||||
│ # signal, matrix, mattermost, email, sms,
|
||||
│ # dingtalk, feishu, wecom, wecom_callback, weixin,
|
||||
│ # bluebubbles, qqbot, homeassistant, webhook, api_server
|
||||
│ # bluebubbles, qqbot, homeassistant, webhook, api_server,
|
||||
│ # yuanbao
|
||||
│
|
||||
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains)
|
||||
├── cron/ # Scheduler (jobs.py, scheduler.py)
|
||||
@ -212,7 +213,7 @@ A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
|
||||
|
||||
### Tool System
|
||||
|
||||
Central tool registry (`tools/registry.py`) with 47 registered tools across 19 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 6 backends (local, Docker, SSH, Daytona, Modal, Singularity).
|
||||
Central tool registry (`tools/registry.py`) with 61 registered tools across 52 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox).
|
||||
|
||||
→ [Tools Runtime](./tools-runtime.md)
|
||||
|
||||
@ -224,7 +225,7 @@ SQLite-based session storage with FTS5 full-text search. Sessions have lineage t
|
||||
|
||||
### Messaging Gateway
|
||||
|
||||
Long-running process with 18 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.
|
||||
Long-running process with 20 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.
|
||||
|
||||
→ [Gateway Internals](./gateway-internals.md)
|
||||
|
||||
|
||||
@ -345,14 +345,4 @@ The CLI shows caching status at startup:
|
||||
|
||||
## Context Pressure Warnings
|
||||
|
||||
The agent emits context pressure warnings at 85% of the compression threshold
|
||||
(not 85% of context — 85% of the threshold which is itself 50% of context):
|
||||
|
||||
```
|
||||
⚠️ Context is 85% to compaction threshold (42,500/50,000 tokens)
|
||||
```
|
||||
|
||||
After compression, if usage drops below 85% of threshold, the warning state
|
||||
is cleared. If compression fails to reduce below the warning level (the
|
||||
conversation is too dense), the warning persists but compression won't
|
||||
re-trigger until the threshold is exceeded again.
|
||||
Intermediate context-pressure warnings have been removed (see the iteration-budget block in `run_agent.py`, which notes: "No intermediate pressure warnings — they caused models to 'give up' prematurely on complex tasks"). Compression fires when prompt tokens reach the configured `compression.threshold` (default 50%) with no prior warning step; gateway session hygiene fires as the secondary safety net at 85% of the model's context window.
|
||||
|
||||
@ -58,10 +58,15 @@ class LCMEngine(ContextEngine):
|
||||
def should_compress(self, prompt_tokens: int = None) -> bool:
|
||||
"""Return True if compaction should fire this turn."""
|
||||
|
||||
def compress(self, messages: list, current_tokens: int = None) -> list:
|
||||
def compress(self, messages: list, current_tokens: int = None,
|
||||
focus_topic: str = None) -> list:
|
||||
"""Compact the message list and return a new (possibly shorter) list.
|
||||
|
||||
The returned list must be a valid OpenAI-format message sequence.
|
||||
|
||||
``focus_topic`` is an optional topic string from manual
|
||||
``/compress <focus>``; engines that support guided compression should
|
||||
prioritise preserving information related to it, others may ignore it.
|
||||
"""
|
||||
```
|
||||
|
||||
|
||||
@ -102,7 +102,7 @@ tick()
|
||||
|
||||
### Gateway Integration
|
||||
|
||||
In gateway mode, the scheduler tick is integrated into the gateway's main event loop. The gateway calls `scheduler.tick()` on its periodic maintenance cycle, which runs alongside message handling.
|
||||
In gateway mode, the scheduler runs in a dedicated background thread (`_start_cron_ticker` in `gateway/run.py`) that calls `scheduler.tick()` every 60 seconds alongside message handling.
|
||||
|
||||
In CLI mode, cron jobs only fire when `hermes cron` commands are run or during active CLI sessions.
|
||||
|
||||
@ -205,7 +205,7 @@ Cron-run sessions have the `cronjob` toolset disabled. This prevents:
|
||||
|
||||
## Locking
|
||||
|
||||
The scheduler uses file-based locking to prevent overlapping ticks from executing the same due-job batch twice. This is important in gateway mode where multiple maintenance cycles could overlap if a previous tick takes longer than the tick interval.
|
||||
The scheduler uses cross-process file-based locking (`fcntl.flock` on Unix, `msvcrt.locking` on Windows) to prevent overlapping ticks from executing the same due-job batch twice — even between the gateway's in-process ticker and a standalone `hermes cron` / manual `tick()` call. If the lock cannot be acquired, `tick()` returns 0 immediately.
|
||||
|
||||
## CLI Interface
|
||||
|
||||
|
||||
@ -141,12 +141,13 @@ Override this only when you need full control over widget ordering. Most extensi
|
||||
|
||||
```python
|
||||
def _build_tui_layout_children(self, *, sudo_widget, secret_widget,
|
||||
approval_widget, clarify_widget, spinner_widget, spacer,
|
||||
status_bar, input_rule_top, image_bar, input_area,
|
||||
input_rule_bot, voice_status_bar, completions_menu) -> list:
|
||||
approval_widget, clarify_widget, model_picker_widget=None,
|
||||
spinner_widget=None, spacer, status_bar, input_rule_top,
|
||||
image_bar, input_area, input_rule_bot, voice_status_bar,
|
||||
completions_menu) -> list:
|
||||
```
|
||||
|
||||
The default implementation returns:
|
||||
The default implementation returns (any `None` widgets are filtered out):
|
||||
|
||||
```python
|
||||
[
|
||||
@ -155,6 +156,7 @@ The default implementation returns:
|
||||
secret_widget, # secret input prompt (conditional)
|
||||
approval_widget, # dangerous command approval (conditional)
|
||||
clarify_widget, # clarify question UI (conditional)
|
||||
model_picker_widget, # model picker overlay (conditional)
|
||||
spinner_widget, # thinking spinner (conditional)
|
||||
spacer, # fills remaining vertical space
|
||||
*self._get_extra_tui_widgets(), # YOUR WIDGETS GO HERE
|
||||
|
||||
@ -12,7 +12,7 @@ The messaging gateway is the long-running process that connects Hermes to 14+ ex
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~9,000 lines) |
|
||||
| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~12,000 lines) |
|
||||
| `gateway/session.py` | `SessionStore` — conversation persistence and session key construction |
|
||||
| `gateway/delivery.py` | Outbound message delivery to target platforms/channels |
|
||||
| `gateway/pairing.py` | DM pairing flow for user authorization |
|
||||
|
||||
@ -179,9 +179,10 @@ Hermes supports a configured fallback model/provider pair, allowing runtime fail
|
||||
### What does NOT support fallback
|
||||
|
||||
- **Subagent delegation** (`tools/delegate_tool.py`): subagents inherit the parent's provider but not the fallback config
|
||||
- **Cron jobs** (`cron/`): run with a fixed provider, no fallback mechanism
|
||||
- **Auxiliary tasks**: use their own independent provider auto-detection chain (see Auxiliary model routing above)
|
||||
|
||||
Cron jobs **do** support fallback: `run_job()` reads `fallback_providers` (or legacy `fallback_model`) from `config.yaml` and passes it to `AIAgent(fallback_model=...)`, matching the gateway's `_load_fallback_model()` pattern. See [Cron Internals](./cron-internals.md).
|
||||
|
||||
### Test coverage
|
||||
|
||||
See `tests/test_fallback_model.py` for comprehensive tests covering all supported providers, one-shot semantics, and edge cases.
|
||||
|
||||
@ -11,10 +11,12 @@ Source file: `hermes_state.py`
|
||||
|
||||
```
|
||||
~/.hermes/state.db (SQLite, WAL mode)
|
||||
├── sessions — Session metadata, token counts, billing
|
||||
├── messages — Full message history per session
|
||||
├── messages_fts — FTS5 virtual table for full-text search
|
||||
└── schema_version — Single-row table tracking migration state
|
||||
├── sessions — Session metadata, token counts, billing
|
||||
├── messages — Full message history per session
|
||||
├── messages_fts — FTS5 virtual table (content + tool_name + tool_calls)
|
||||
├── messages_fts_trigram — FTS5 virtual table with trigram tokenizer (CJK / substring search)
|
||||
├── state_meta — Key/value metadata table
|
||||
└── schema_version — Single-row table tracking migration state
|
||||
```
|
||||
|
||||
Key design decisions:
|
||||
@ -57,6 +59,7 @@ CREATE TABLE IF NOT EXISTS sessions (
|
||||
cost_source TEXT,
|
||||
pricing_version TEXT,
|
||||
title TEXT,
|
||||
api_call_count INTEGER DEFAULT 0,
|
||||
FOREIGN KEY (parent_session_id) REFERENCES sessions(id)
|
||||
);
|
||||
|
||||
@ -130,10 +133,9 @@ END;
|
||||
|
||||
## Schema Version and Migrations
|
||||
|
||||
Current schema version: **9**
|
||||
Current schema version: **11**
|
||||
|
||||
The `schema_version` table stores a single integer. On initialization,
|
||||
`_init_schema()` checks the current version and applies migrations sequentially:
|
||||
The `schema_version` table stores a single integer. Simple column additions are handled declaratively by `_reconcile_columns()` (which diffs live columns against `SCHEMA_SQL` and ADDs any missing ones). The version-gated chain is reserved for data migrations and index/FTS changes that can't be expressed declaratively:
|
||||
|
||||
| Version | Change |
|
||||
|---------|--------|
|
||||
@ -146,10 +148,10 @@ The `schema_version` table stores a single integer. On initialization,
|
||||
| 7 | Add `reasoning_content` column to messages |
|
||||
| 8 | Add `api_call_count` column to sessions |
|
||||
| 9 | Add `codex_message_items` column to messages for Codex Responses message id/phase replay |
|
||||
| 10 | Add `messages_fts_trigram` virtual table (trigram tokenizer for CJK / substring search) and backfill existing rows |
|
||||
| 11 | Re-index `messages_fts` and `messages_fts_trigram` to cover `tool_name` + `tool_calls` and switch from external-content to inline mode; drop old triggers and backfill every message row |
|
||||
|
||||
Each migration uses `ALTER TABLE ADD COLUMN` wrapped in try/except to handle
|
||||
the column-already-exists case (idempotent). The version number is bumped after
|
||||
each successful migration block.
|
||||
Declarative column adds use `ALTER TABLE ADD COLUMN` wrapped in try/except to handle the column-already-exists case (idempotent). The version number is bumped after each successful migration block.
|
||||
|
||||
|
||||
## Write Contention Handling
|
||||
|
||||
@ -213,6 +213,7 @@ The terminal system supports multiple backends:
|
||||
- singularity
|
||||
- modal
|
||||
- daytona
|
||||
- vercel_sandbox
|
||||
|
||||
It also supports:
|
||||
|
||||
|
||||
@ -943,6 +943,6 @@ nix-store --query --roots $(docker exec hermes-agent readlink /data/current-pack
|
||||
| `hermes version` shows old version | Container not restarted | `systemctl restart hermes-agent` |
|
||||
| Permission denied on `/var/lib/hermes` | State dir is `0750 hermes:hermes` | Use `docker exec` or `sudo -u hermes` |
|
||||
| `nix-collect-garbage` removed hermes | GC root missing | Restart the service (preStart recreates the GC root) |
|
||||
| `no container with name or ID "hermes-agent"` (Podman) | Podman rootful container not visible to regular user | Add passwordless sudo for podman (see [Container-aware CLI](#container-aware-cli) section) |
|
||||
| `no container with name or ID "hermes-agent"` (Podman) | Podman rootful container not visible to regular user | Add passwordless sudo for podman (see [Container Mode](#container-mode) section) |
|
||||
| `unable to find user hermes` | Container still starting (entrypoint hasn't created user yet) | Wait a few seconds and retry — the CLI retries automatically |
|
||||
| Tool added via `extraPackages` not found in terminal | Requires `nixos-rebuild switch` to update the per-user profile | Rebuild and restart: `nixos-rebuild switch && systemctl restart hermes-agent` |
|
||||
|
||||
@ -29,7 +29,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
|
||||
| 🗺️ **[Learning Path](/docs/getting-started/learning-path)** | Find the right docs for your experience level |
|
||||
| ⚙️ **[Configuration](/docs/user-guide/configuration)** | Config file, providers, models, and options |
|
||||
| 💬 **[Messaging Gateway](/docs/user-guide/messaging)** | Set up Telegram, Discord, Slack, or WhatsApp |
|
||||
| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 47 built-in tools and how to configure them |
|
||||
| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 68 built-in tools and how to configure them |
|
||||
| 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions |
|
||||
| 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses |
|
||||
| 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to MCP servers, filter their tools, and extend Hermes safely |
|
||||
|
||||
@ -63,7 +63,7 @@ Text-to-speech and speech-to-text across all messaging platforms:
|
||||
|| **MiniMax** | Good | Paid | `MINIMAX_API_KEY` |
|
||||
|| **NeuTTS** | Good | Free | None needed |
|
||||
|
||||
Speech-to-text supports three providers: local Whisper (free, runs on-device), Groq (fast cloud), and OpenAI Whisper API. Voice message transcription works across Telegram, Discord, WhatsApp, and other messaging platforms. See [Voice & TTS](/docs/user-guide/features/tts) and [Voice Mode](/docs/user-guide/features/voice-mode) for details.
|
||||
Speech-to-text supports six providers: local faster-whisper (free, runs on-device), a local command wrapper, Groq, OpenAI Whisper API, Mistral, and xAI. Voice message transcription works across Telegram, Discord, WhatsApp, and other messaging platforms. See [Voice & TTS](/docs/user-guide/features/tts) and [Voice Mode](/docs/user-guide/features/voice-mode) for details.
|
||||
|
||||
## IDE & Editor Integration
|
||||
|
||||
@ -76,7 +76,7 @@ Speech-to-text supports three providers: local Whisper (free, runs on-device), G
|
||||
## Memory & Personalization
|
||||
|
||||
- **[Built-in Memory](/docs/user-guide/features/memory)** — Persistent, curated memory via `MEMORY.md` and `USER.md` files. The agent maintains bounded stores of personal notes and user profile data that survive across sessions.
|
||||
- **[Memory Providers](/docs/user-guide/features/memory-providers)** — Plug in external memory backends for deeper personalization. Seven providers are supported: Honcho (dialectic reasoning), OpenViking (tiered retrieval), Mem0 (cloud extraction), Hindsight (knowledge graphs), Holographic (local SQLite), RetainDB (hybrid search), and ByteRover (CLI-based).
|
||||
- **[Memory Providers](/docs/user-guide/features/memory-providers)** — Plug in external memory backends for deeper personalization. Eight providers are supported: Honcho (dialectic reasoning), OpenViking (tiered retrieval), Mem0 (cloud extraction), Hindsight (knowledge graphs), Holographic (local SQLite), RetainDB (hybrid search), ByteRover (CLI-based), and Supermemory.
|
||||
|
||||
## Messaging Platforms
|
||||
|
||||
|
||||
@ -28,7 +28,7 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
|
||||
| **GMI Cloud** | `GMI_API_KEY` in `~/.hermes/.env` (provider: `gmi`; aliases: `gmi-cloud`, `gmicloud`) |
|
||||
| **MiniMax** | `MINIMAX_API_KEY` in `~/.hermes/.env` (provider: `minimax`) |
|
||||
| **MiniMax China** | `MINIMAX_CN_API_KEY` in `~/.hermes/.env` (provider: `minimax-cn`) |
|
||||
| **Alibaba Cloud** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`, aliases: `dashscope`, `qwen`) |
|
||||
| **Alibaba Cloud** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`) |
|
||||
| **Alibaba Coding Plan** | `DASHSCOPE_API_KEY` (provider: `alibaba-coding-plan`, alias: `alibaba_coding`) — separate billing SKU, different endpoint |
|
||||
| **Kilo Code** | `KILOCODE_API_KEY` in `~/.hermes/.env` (provider: `kilocode`) |
|
||||
| **Xiaomi MiMo** | `XIAOMI_API_KEY` in `~/.hermes/.env` (provider: `xiaomi`, aliases: `mimo`, `xiaomi-mimo`) |
|
||||
|
||||
@ -38,6 +38,7 @@ hermes [global-options] <command> [subcommand/options]
|
||||
|---------|---------|
|
||||
| `hermes chat` | Interactive or one-shot chat with the agent. |
|
||||
| `hermes model` | Interactively choose the default provider and model. |
|
||||
| `hermes fallback` | Manage fallback providers tried when the primary model errors. |
|
||||
| `hermes gateway` | Run or manage the messaging gateway service. |
|
||||
| `hermes setup` | Interactive setup wizard for all or part of the configuration. |
|
||||
| `hermes whatsapp` | Configure and pair the WhatsApp bridge. |
|
||||
@ -47,6 +48,7 @@ hermes [global-options] <command> [subcommand/options]
|
||||
| `hermes status` | Show agent, auth, and platform status. |
|
||||
| `hermes cron` | Inspect and tick the cron scheduler. |
|
||||
| `hermes webhook` | Manage dynamic webhook subscriptions for event-driven activation. |
|
||||
| `hermes hooks` | Inspect, approve, or remove shell-script hooks declared in `config.yaml`. |
|
||||
| `hermes doctor` | Diagnose config and dependency issues. |
|
||||
| `hermes dump` | Copy-pasteable setup summary for support/debugging. |
|
||||
| `hermes debug` | Debug tools — upload logs and system info for support. |
|
||||
@ -56,8 +58,8 @@ hermes [global-options] <command> [subcommand/options]
|
||||
| `hermes config` | Show, edit, migrate, and query configuration files. |
|
||||
| `hermes pairing` | Approve or revoke messaging pairing codes. |
|
||||
| `hermes skills` | Browse, install, publish, audit, and configure skills. |
|
||||
| `hermes honcho` | Manage Honcho cross-session memory integration. |
|
||||
| `hermes memory` | Configure external memory provider. |
|
||||
| `hermes curator` | Background skill maintenance — status, run, pause, pin. See [Curator](../user-guide/features/curator.md). |
|
||||
| `hermes memory` | Configure external memory provider. Plugin-specific subcommands (e.g. `hermes honcho`) register automatically when their provider is active. |
|
||||
| `hermes acp` | Run Hermes as an ACP server for editor integration. |
|
||||
| `hermes mcp` | Manage MCP server configurations and run Hermes as an MCP server. |
|
||||
| `hermes plugins` | Manage Hermes Agent plugins (install, enable, disable, remove). |
|
||||
@ -68,7 +70,7 @@ hermes [global-options] <command> [subcommand/options]
|
||||
| `hermes claw` | OpenClaw migration helpers. |
|
||||
| `hermes dashboard` | Launch the web dashboard for managing config, API keys, and sessions. |
|
||||
| `hermes profile` | Manage profiles — multiple isolated Hermes instances. |
|
||||
| `hermes completion` | Print shell completion scripts (bash/zsh). |
|
||||
| `hermes completion` | Print shell completion scripts (bash/zsh/fish). |
|
||||
| `hermes version` | Show version information. |
|
||||
| `hermes update` | Pull latest code and reinstall dependencies. `--check` prints commit diff without pulling; `--backup` takes a pre-pull `HERMES_HOME` snapshot. |
|
||||
| `hermes uninstall` | Remove Hermes from the system. |
|
||||
@ -671,33 +673,59 @@ Notes:
|
||||
- `--source well-known` lets you point Hermes at a site exposing `/.well-known/skills/index.json`.
|
||||
- Passing an `http(s)://…/*.md` URL installs a single-file SKILL.md directly. When frontmatter has no `name:` and the URL slug isn't a valid identifier, an interactive terminal prompts for a name; non-interactive surfaces (`/skills install` inside the TUI, gateway platforms) require `--name <x>` instead.
|
||||
|
||||
## `hermes honcho`
|
||||
## `hermes curator`
|
||||
|
||||
```bash
|
||||
hermes honcho [--target-profile NAME] <subcommand>
|
||||
hermes curator <subcommand>
|
||||
```
|
||||
|
||||
Manage Honcho cross-session memory integration. This command is provided by the Honcho memory provider plugin and is only available when `memory.provider` is set to `honcho` in your config.
|
||||
|
||||
The `--target-profile` flag lets you manage another profile's Honcho config without switching to it.
|
||||
|
||||
Subcommands:
|
||||
The curator is an auxiliary-model background task that periodically reviews agent-created skills, prunes stale ones, consolidates overlaps, and archives obsolete skills. Bundled and hub-installed skills are never touched. Archives are recoverable; auto-deletion never happens.
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `setup` | Redirects to `hermes memory setup` (unified setup path). |
|
||||
| `status [--all]` | Show current Honcho config and connection status. `--all` shows a cross-profile overview. |
|
||||
| `peers` | Show peer identities across all profiles. |
|
||||
| `sessions` | List known Honcho session mappings. |
|
||||
| `map [name]` | Map the current directory to a Honcho session name. Omit `name` to list current mappings. |
|
||||
| `peer` | Show or update peer names and dialectic reasoning level. Options: `--user NAME`, `--ai NAME`, `--reasoning LEVEL`. |
|
||||
| `mode [mode]` | Show or set recall mode: `hybrid`, `context`, or `tools`. Omit to show current. |
|
||||
| `tokens` | Show or set token budgets for context and dialectic. Options: `--context N`, `--dialectic N`. |
|
||||
| `identity [file] [--show]` | Seed or show the AI peer identity representation. |
|
||||
| `enable` | Enable Honcho for the active profile. |
|
||||
| `disable` | Disable Honcho for the active profile. |
|
||||
| `sync` | Sync Honcho config to all existing profiles (creates missing host blocks). |
|
||||
| `migrate` | Step-by-step migration guide from openclaw-honcho to Hermes Honcho. |
|
||||
| `status` | Show curator status and skill stats |
|
||||
| `run` | Trigger a curator review now |
|
||||
| `pause` | Pause the curator until resumed |
|
||||
| `resume` | Resume a paused curator |
|
||||
| `pin <skill>` | Pin a skill so the curator never auto-transitions it |
|
||||
| `unpin <skill>` | Unpin a skill |
|
||||
| `restore <skill>` | Restore an archived skill |
|
||||
|
||||
See [Curator](../user-guide/features/curator.md) for behavior and config.
|
||||
|
||||
## `hermes fallback`
|
||||
|
||||
```bash
|
||||
hermes fallback <subcommand>
|
||||
```
|
||||
|
||||
Manage the fallback provider chain. Fallback providers are tried in order when the primary model fails with rate-limit, overload, or connection errors.
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `list` (alias: `ls`) | Show the current fallback chain (default when no subcommand) |
|
||||
| `add` | Pick a provider + model (same picker as `hermes model`) and append to the chain |
|
||||
| `remove` (alias: `rm`) | Pick an entry to delete from the chain |
|
||||
| `clear` | Remove all fallback entries |
|
||||
|
||||
See [Fallback Providers](../user-guide/features/fallback-providers.md).
|
||||
|
||||
## `hermes hooks`
|
||||
|
||||
```bash
|
||||
hermes hooks <subcommand>
|
||||
```
|
||||
|
||||
Inspect shell-script hooks declared in `~/.hermes/config.yaml`, test them against synthetic payloads, and manage the first-use consent allowlist at `~/.hermes/shell-hooks-allowlist.json`.
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `list` (alias: `ls`) | List configured hooks with matcher, timeout, and consent status |
|
||||
| `test <event>` | Fire every hook matching `<event>` against a synthetic payload |
|
||||
| `revoke` (aliases: `remove`, `rm`) | Remove a command's allowlist entries (takes effect on next restart) |
|
||||
| `doctor` | Check each configured hook: exec bit, allowlist, mtime drift, JSON validity, and synthetic run timing |
|
||||
|
||||
See [Hooks](../user-guide/features/hooks.md) for event signatures and payload shapes.
|
||||
|
||||
## `hermes memory`
|
||||
|
||||
@ -715,6 +743,10 @@ Subcommands:
|
||||
| `status` | Show current memory provider config. |
|
||||
| `off` | Disable external provider (built-in only). |
|
||||
|
||||
:::info Provider-specific subcommands
|
||||
When an external memory provider is active, it may register its own top-level `hermes <provider>` command for provider-specific management (e.g. `hermes honcho` when Honcho is active). Inactive providers do not expose their subcommands. Run `hermes --help` to see what's currently wired in.
|
||||
:::
|
||||
|
||||
## `hermes acp`
|
||||
|
||||
```bash
|
||||
@ -935,7 +967,7 @@ hermes -p work chat -q "Hello from work profile"
|
||||
## `hermes completion`
|
||||
|
||||
```bash
|
||||
hermes completion [bash|zsh]
|
||||
hermes completion [bash|zsh|fish]
|
||||
```
|
||||
|
||||
Print a shell completion script to stdout. Source the output in your shell profile for tab-completion of Hermes commands, subcommands, and profile names.
|
||||
@ -948,6 +980,9 @@ hermes completion bash >> ~/.bashrc
|
||||
|
||||
# Zsh
|
||||
hermes completion zsh >> ~/.zshrc
|
||||
|
||||
# Fish
|
||||
hermes completion fish > ~/.config/fish/completions/hermes.fish
|
||||
```
|
||||
|
||||
## `hermes update`
|
||||
|
||||
@ -225,7 +225,7 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
|
||||
| `TELEGRAM_HOME_CHANNEL_NAME` | Display name for the Telegram home channel |
|
||||
| `TELEGRAM_WEBHOOK_URL` | Public HTTPS URL for webhook mode (enables webhook instead of polling) |
|
||||
| `TELEGRAM_WEBHOOK_PORT` | Local listen port for webhook server (default: `8443`) |
|
||||
| `TELEGRAM_WEBHOOK_SECRET` | Secret token for verifying updates come from Telegram |
|
||||
| `TELEGRAM_WEBHOOK_SECRET` | Secret token Telegram echoes back in each update for verification. **Required whenever `TELEGRAM_WEBHOOK_URL` is set** — the gateway refuses to start without it (GHSA-3vpc-7q5r-276h). Generate with `openssl rand -hex 32`. |
|
||||
| `TELEGRAM_REACTIONS` | Enable emoji reactions on messages during processing (default: `false`) |
|
||||
| `TELEGRAM_REPLY_TO_MODE` | Reply-reference behavior: `off`, `first` (default), or `all`. Matches the Discord pattern. |
|
||||
| `TELEGRAM_IGNORED_THREADS` | Comma-separated Telegram forum topic/thread IDs where the bot never responds |
|
||||
@ -341,7 +341,7 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
|
||||
| `QQ_ALLOW_ALL_USERS` | Allow all users (`true`/`false`, overrides `QQ_ALLOWED_USERS`) |
|
||||
| `QQBOT_HOME_CHANNEL` | QQ user/group openID for cron delivery and notifications |
|
||||
| `QQBOT_HOME_CHANNEL_NAME` | Display name for the QQ home channel |
|
||||
| `QQ_SANDBOX` | Route QQ Bot to the sandbox gateway for development testing (`true`/`false`). Use with a sandbox app credential from [q.qq.com](https://q.qq.com). |
|
||||
| `QQ_PORTAL_HOST` | Override the QQ portal host (set to `sandbox.q.qq.com` to route through the sandbox gateway; default: `q.qq.com`). |
|
||||
| `MATTERMOST_URL` | Mattermost server URL (e.g. `https://mm.example.com`) |
|
||||
| `MATTERMOST_TOKEN` | Bot token or personal access token for Mattermost |
|
||||
| `MATTERMOST_ALLOWED_USERS` | Comma-separated Mattermost user IDs allowed to message the bot |
|
||||
@ -380,11 +380,45 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
|
||||
| `GATEWAY_ALLOWED_USERS` | Comma-separated user IDs allowed across all platforms |
|
||||
| `GATEWAY_ALLOW_ALL_USERS` | Allow all users without allowlists (`true`/`false`, default: `false`) |
|
||||
|
||||
### Advanced Messaging Tuning
|
||||
|
||||
Advanced per-platform knobs for throttling the outbound message batcher. Most users never need to touch these; defaults are set to respect each platform's rate limits without feeling sluggish.
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS` | Grace window before flushing a queued Telegram text chunk (default: `0.6`). |
|
||||
| `HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS` | Delay between split chunks when a single Telegram message exceeds the length limit (default: `2.0`). |
|
||||
| `HERMES_TELEGRAM_MEDIA_BATCH_DELAY_SECONDS` | Grace window before flushing queued Telegram media (default: `0.6`). |
|
||||
| `HERMES_TELEGRAM_FOLLOWUP_GRACE_SECONDS` | Delay before sending a follow-up after the agent finishes, to avoid racing the last stream chunk. |
|
||||
| `HERMES_TELEGRAM_HTTP_CONNECT_TIMEOUT` / `_READ_TIMEOUT` / `_WRITE_TIMEOUT` / `_POOL_TIMEOUT` | Override the underlying `python-telegram-bot` HTTP timeouts (seconds). |
|
||||
| `HERMES_TELEGRAM_HTTP_POOL_SIZE` | Max concurrent HTTP connections to the Telegram API. |
|
||||
| `HERMES_TELEGRAM_DISABLE_FALLBACK_IPS` | Disable the hard-coded Cloudflare fallback IPs used when DNS fails (`true`/`false`). |
|
||||
| `HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS` | Grace window before flushing a queued Discord text chunk (default: `0.6`). |
|
||||
| `HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS` | Delay between split chunks when a Discord message exceeds the length limit (default: `2.0`). |
|
||||
| `HERMES_MATRIX_TEXT_BATCH_DELAY_SECONDS` / `_SPLIT_DELAY_SECONDS` | Matrix equivalents of the Telegram batch knobs. |
|
||||
| `HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS` / `_SPLIT_DELAY_SECONDS` / `_MAX_CHARS` / `_MAX_MESSAGES` | Feishu batcher tuning — delay, split delay, max chars per message, max messages per batch. |
|
||||
| `HERMES_FEISHU_MEDIA_BATCH_DELAY_SECONDS` | Feishu media flush delay. |
|
||||
| `HERMES_FEISHU_DEDUP_CACHE_SIZE` | Size of the Feishu webhook dedup cache (default: `1024`). |
|
||||
| `HERMES_WECOM_TEXT_BATCH_DELAY_SECONDS` / `_SPLIT_DELAY_SECONDS` | WeCom batcher tuning. |
|
||||
| `HERMES_VISION_DOWNLOAD_TIMEOUT` | Timeout in seconds for downloading an image before handing it to vision models (default: `30`). |
|
||||
| `HERMES_RESTART_DRAIN_TIMEOUT` | Gateway: seconds to wait for active runs to drain on `/restart` before forcing the restart (default: `900`). |
|
||||
| `HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT` | Per-platform connect timeout during gateway startup (seconds). |
|
||||
| `HERMES_GATEWAY_BUSY_INPUT_MODE` | Default gateway busy-input behavior: `queue`, `steer`, or `interrupt`. Can be overridden per chat with `/busy`. |
|
||||
| `HERMES_CRON_TIMEOUT` | Inactivity timeout for cron job agent runs in seconds (default: `600`). The agent can run indefinitely while actively calling tools or receiving stream tokens — this only triggers when idle. Set to `0` for unlimited. |
|
||||
| `HERMES_CRON_SCRIPT_TIMEOUT` | Timeout for pre-run scripts attached to cron jobs in seconds (default: `120`). Override for scripts that need longer execution (e.g., randomized delays for anti-bot timing). Also configurable via `cron.script_timeout_seconds` in `config.yaml`. |
|
||||
| `HERMES_CRON_MAX_PARALLEL` | Max cron jobs run in parallel per tick (default: `4`). |
|
||||
|
||||
## Agent Behavior
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `HERMES_MAX_ITERATIONS` | Max tool-calling iterations per conversation (default: 90) |
|
||||
| `HERMES_INFERENCE_MODEL` | Override model name at process level (takes priority over `config.yaml` for the session). Also settable via `-m`/`--model` flag. |
|
||||
| `HERMES_YOLO_MODE` | Set to `1` to bypass dangerous-command approval prompts. Equivalent to `--yolo`. |
|
||||
| `HERMES_ACCEPT_HOOKS` | Auto-approve any unseen shell hooks declared in `config.yaml` without a TTY prompt. Equivalent to `--accept-hooks` or `hooks_auto_accept: true`. |
|
||||
| `HERMES_IGNORE_USER_CONFIG` | Skip `~/.hermes/config.yaml` and use built-in defaults (credentials in `.env` still load). Equivalent to `--ignore-user-config`. |
|
||||
| `HERMES_IGNORE_RULES` | Skip auto-injection of `AGENTS.md`, `SOUL.md`, `.cursorrules`, memory, and preloaded skills. Equivalent to `--ignore-rules`. |
|
||||
| `HERMES_MD_NAMES` | Comma-separated list of rules-file names to auto-inject (default: `AGENTS.md,CLAUDE.md,.cursorrules,SOUL.md`). |
|
||||
| `HERMES_TOOL_PROGRESS` | Deprecated compatibility variable for tool progress display. Prefer `display.tool_progress` in `config.yaml`. |
|
||||
| `HERMES_TOOL_PROGRESS_MODE` | Deprecated compatibility variable for tool progress mode. Prefer `display.tool_progress` in `config.yaml`. |
|
||||
| `HERMES_HUMAN_DELAY_MODE` | Response pacing: `off`/`natural`/`custom` |
|
||||
@ -395,10 +429,30 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
|
||||
| `HERMES_API_CALL_STALE_TIMEOUT` | Non-streaming stale-call timeout in seconds (default: `300`). Auto-disabled for local providers when left unset. Also configurable via `providers.<id>.stale_timeout_seconds` or `providers.<id>.models.<model>.stale_timeout_seconds` in `config.yaml`. |
|
||||
| `HERMES_STREAM_READ_TIMEOUT` | Streaming socket read timeout in seconds (default: `120`). Auto-increased to `HERMES_API_TIMEOUT` for local providers. Increase if local LLMs time out during long code generation. |
|
||||
| `HERMES_STREAM_STALE_TIMEOUT` | Stale stream detection timeout in seconds (default: `180`). Auto-disabled for local providers. Triggers connection kill if no chunks arrive within this window. |
|
||||
| `HERMES_STREAM_RETRIES` | Number of mid-stream reconnect attempts on transient network errors (default: `3`). |
|
||||
| `HERMES_AGENT_TIMEOUT` | Gateway inactivity timeout for a running agent in seconds (default: `900`). Resets on every tool call and streamed token. Set to `0` to disable. |
|
||||
| `HERMES_AGENT_TIMEOUT_WARNING` | Gateway: send a warning message after this many seconds of inactivity (default: 75% of `HERMES_AGENT_TIMEOUT`). |
|
||||
| `HERMES_AGENT_NOTIFY_INTERVAL` | Gateway: interval in seconds between progress notifications on long-running agent turns. |
|
||||
| `HERMES_CHECKPOINT_TIMEOUT` | Timeout for filesystem checkpoint creation in seconds (default: `30`). |
|
||||
| `HERMES_EXEC_ASK` | Enable execution approval prompts in gateway mode (`true`/`false`) |
|
||||
| `HERMES_ENABLE_PROJECT_PLUGINS` | Enable auto-discovery of repo-local plugins from `./.hermes/plugins/` (`true`/`false`, default: `false`) |
|
||||
| `HERMES_BACKGROUND_NOTIFICATIONS` | Background process notification mode in gateway: `all` (default), `result`, `error`, `off` |
|
||||
| `HERMES_EPHEMERAL_SYSTEM_PROMPT` | Ephemeral system prompt injected at API-call time (never persisted to sessions) |
|
||||
| `HERMES_PREFILL_MESSAGES_FILE` | Path to a JSON file of ephemeral prefill messages injected at API-call time. |
|
||||
| `HERMES_ALLOW_PRIVATE_URLS` | `true`/`false` — allow tools to fetch localhost/private-network URLs. Off by default in gateway mode. |
|
||||
| `HERMES_REDACT_SECRETS` | `true`/`false` — control secret redaction in logs and shareable outputs (default: `true`). |
|
||||
| `HERMES_WRITE_SAFE_ROOT` | Optional directory prefix that restricts `write_file`/`patch` writes; paths outside require approval. |
|
||||
| `HERMES_DISABLE_FILE_STATE_GUARD` | Set to `1` to turn off the "file changed since you read it" guard on `patch`/`write_file`. |
|
||||
| `HERMES_CORE_TOOLS` | Comma-separated override for the canonical core tool list (advanced; rarely needed). |
|
||||
| `HERMES_BUNDLED_SKILLS` | Comma-separated override for the list of bundled skills loaded at startup. |
|
||||
| `HERMES_OPTIONAL_SKILLS` | Comma-separated list of optional-skill names to auto-install on first run. |
|
||||
| `HERMES_DEBUG_INTERRUPT` | Set to `1` to log detailed interrupt/cancel tracing to `agent.log`. |
|
||||
| `HERMES_DUMP_REQUESTS` | Dump API request payloads to log files (`true`/`false`) |
|
||||
| `HERMES_DUMP_REQUEST_STDOUT` | Dump API request payloads to stdout instead of log files. |
|
||||
| `HERMES_OAUTH_TRACE` | Set to `1` to log OAuth token exchange and refresh attempts. Includes redacted timing info. |
|
||||
| `HERMES_OAUTH_FILE` | Override the path used for OAuth credential storage (default: `~/.hermes/auth.json`). |
|
||||
| `HERMES_AGENT_HELP_GUIDANCE` | Append additional guidance text to the system prompt for custom deployments. |
|
||||
| `HERMES_AGENT_LOGO` | Override the ASCII banner logo at CLI startup. |
|
||||
| `DELEGATION_MAX_CONCURRENT_CHILDREN` | Max parallel subagents per `delegate_task` batch (default: `3`, floor of 1, no ceiling). Also configurable via `delegation.max_concurrent_children` in `config.yaml` — the config value takes priority. |
|
||||
|
||||
## Interface
|
||||
@ -411,13 +465,6 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
|
||||
| `HERMES_TUI_THEME` | Force the TUI color theme: `light`, `dark`, or a raw 6-character background hex (e.g. `ffffff` or `1a1a2e`). When unset, Hermes auto-detects using `COLORFGBG` and terminal background queries; this variable overrides detection on terminals (Ghostty, Warp, iTerm2, etc.) that don't set `COLORFGBG`. |
|
||||
| `HERMES_INFERENCE_MODEL` | Force the model for `hermes -z` / `hermes chat` without mutating `config.yaml`. Pairs with `HERMES_INFERENCE_PROVIDER`. Useful for scripted callers (sweeper, CI, batch runners) that need to override the default model per run. |
|
||||
|
||||
## Cron Scheduler
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `HERMES_CRON_TIMEOUT` | Inactivity timeout for cron job agent runs in seconds (default: `600`). The agent can run indefinitely while actively calling tools or receiving stream tokens — this only triggers when idle. Set to `0` for unlimited. |
|
||||
| `HERMES_CRON_SCRIPT_TIMEOUT` | Timeout for pre-run scripts attached to cron jobs in seconds (default: `120`). Override for scripts that need longer execution (e.g., randomized delays for anti-bot timing). Also configurable via `cron.script_timeout_seconds` in `config.yaml`. |
|
||||
|
||||
## Session Settings
|
||||
|
||||
| Variable | Description |
|
||||
|
||||
@ -14,114 +14,119 @@ If a skill is missing from this list but present in the repo, the catalog is reg
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`apple-notes`](/docs/user-guide/skills/bundled/apple/apple-apple-notes) | Manage Apple Notes via the memo CLI on macOS (create, view, search, edit). | `apple/apple-notes` |
|
||||
| [`apple-reminders`](/docs/user-guide/skills/bundled/apple/apple-apple-reminders) | Manage Apple Reminders via remindctl CLI (list, add, complete, delete). | `apple/apple-reminders` |
|
||||
| [`findmy`](/docs/user-guide/skills/bundled/apple/apple-findmy) | Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture. | `apple/findmy` |
|
||||
| [`apple-notes`](/docs/user-guide/skills/bundled/apple/apple-apple-notes) | Manage Apple Notes via memo CLI: create, search, edit. | `apple/apple-notes` |
|
||||
| [`apple-reminders`](/docs/user-guide/skills/bundled/apple/apple-apple-reminders) | Apple Reminders via remindctl: add, list, complete. | `apple/apple-reminders` |
|
||||
| [`findmy`](/docs/user-guide/skills/bundled/apple/apple-findmy) | Track Apple devices/AirTags via FindMy.app on macOS. | `apple/findmy` |
|
||||
| [`imessage`](/docs/user-guide/skills/bundled/apple/apple-imessage) | Send and receive iMessages/SMS via the imsg CLI on macOS. | `apple/imessage` |
|
||||
|
||||
## autonomous-ai-agents
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code) | Delegate coding tasks to Claude Code (Anthropic's CLI agent). Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed. | `autonomous-ai-agents/claude-code` |
|
||||
| [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex) | Delegate coding tasks to OpenAI Codex CLI agent. Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository. | `autonomous-ai-agents/codex` |
|
||||
| [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent) | Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, profiles, and a concise contributor reference. Load this skill when helping users... | `autonomous-ai-agents/hermes-agent` |
|
||||
| [`opencode`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode) | Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions. Requires the opencode CLI installed and authenticated. | `autonomous-ai-agents/opencode` |
|
||||
| [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code) | Delegate coding to Claude Code CLI (features, PRs). | `autonomous-ai-agents/claude-code` |
|
||||
| [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex) | Delegate coding to OpenAI Codex CLI (features, PRs). | `autonomous-ai-agents/codex` |
|
||||
| [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent) | Configure, extend, or contribute to Hermes Agent. | `autonomous-ai-agents/hermes-agent` |
|
||||
| [`opencode`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode) | Delegate coding to OpenCode CLI (features, PR review). | `autonomous-ai-agents/opencode` |
|
||||
|
||||
## creative
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) | Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics. Semantic component colors (cyan=frontend, emerald=backend, violet=database, amber=cloud/AWS, rose=security,... | `creative/architecture-diagram` |
|
||||
| [`ascii-art`](/docs/user-guide/skills/bundled/creative/creative-ascii-art) | Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required. | `creative/ascii-art` |
|
||||
| [`ascii-video`](/docs/user-guide/skills/bundled/creative/creative-ascii-video) | Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers,... | `creative/ascii-video` |
|
||||
| [`baoyu-comic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-comic) | Knowledge comic creator supporting multiple art styles and tones. Creates original educational comics with detailed panel layouts and sequential image generation. Use when user asks to create "知识漫画", "教育漫画", "biography comic", "tutorial... | `creative/baoyu-comic` |
|
||||
| [`baoyu-infographic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-infographic) | Generate professional infographics with 21 layout types and 21 visual styles. Analyzes content, recommends layout×style combinations, and generates publication-ready infographics. Use when user asks to create "infographic", "visual summa... | `creative/baoyu-infographic` |
|
||||
| [`ideation`](/docs/user-guide/skills/bundled/creative/creative-creative-ideation) | Generate project ideas through creative constraints. Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works... | `creative/creative-ideation` |
|
||||
| [`design-md`](/docs/user-guide/skills/bundled/creative/creative-design-md) | Author, validate, diff, and export DESIGN.md files — Google's open-source format spec that gives coding agents a persistent, structured understanding of a design system (tokens + rationale in one file). Use when building a design system,... | `creative/design-md` |
|
||||
| [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw) | Create hand-drawn style diagrams using Excalidraw JSON format. Generate .excalidraw files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable l... | `creative/excalidraw` |
|
||||
| [`manim-video`](/docs/user-guide/skills/bundled/creative/creative-manim-video) | Production pipeline for mathematical and technical animations using Manim Community Edition. Creates 3Blue1Brown-style explainer videos, algorithm visualizations, equation derivations, architecture diagrams, and data stories. Use when us... | `creative/manim-video` |
|
||||
| [`p5js`](/docs/user-guide/skills/bundled/creative/creative-p5js) | Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes, audio-reactive visuals, and motion graphics — exported as... | `creative/p5js` |
|
||||
| [`pixel-art`](/docs/user-guide/skills/bundled/creative/creative-pixel-art) | Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc.), and animate them into short videos. Presets cover arcade, SNES, and 10+ era-correct looks. Use `clarify` to let the user pick a style... | `creative/pixel-art` |
|
||||
| [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs) | 54 production-quality design systems extracted from real websites. Load a template to generate HTML/CSS that matches the visual identity of sites like Stripe, Linear, Vercel, Notion, Airbnb, and more. Each template includes colors, typog... | `creative/popular-web-designs` |
|
||||
| [`songwriting-and-ai-music`](/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music) | Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned. These are tools and ideas, not rules. Break any of them when the art calls for it. | `creative/songwriting-and-ai-music` |
|
||||
| [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) | Dark-themed SVG architecture/cloud/infra diagrams as HTML. | `creative/architecture-diagram` |
|
||||
| [`ascii-art`](/docs/user-guide/skills/bundled/creative/creative-ascii-art) | ASCII art: pyfiglet, cowsay, boxes, image-to-ascii. | `creative/ascii-art` |
|
||||
| [`ascii-video`](/docs/user-guide/skills/bundled/creative/creative-ascii-video) | ASCII video: convert video/audio to colored ASCII MP4/GIF. | `creative/ascii-video` |
|
||||
| [`baoyu-comic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-comic) | Knowledge comics (知识漫画): educational, biography, tutorial. | `creative/baoyu-comic` |
|
||||
| [`baoyu-infographic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-infographic) | Infographics: 21 layouts x 21 styles (信息图, 可视化). | `creative/baoyu-infographic` |
|
||||
| [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design) | Design one-off HTML artifacts (landing, deck, prototype). | `creative/claude-design` |
|
||||
| [`comfyui`](/docs/user-guide/skills/bundled/creative/creative-comfyui) | Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection. Uses the official comfy-cli for lifecycle and direct REST API for execution. | `creative/comfyui` |
|
||||
| [`ideation`](/docs/user-guide/skills/bundled/creative/creative-creative-ideation) | Generate project ideas via creative constraints. | `creative/creative-ideation` |
|
||||
| [`design-md`](/docs/user-guide/skills/bundled/creative/creative-design-md) | Author/validate/export Google's DESIGN.md token spec files. | `creative/design-md` |
|
||||
| [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw) | Hand-drawn Excalidraw JSON diagrams (arch, flow, seq). | `creative/excalidraw` |
|
||||
| [`humanizer`](/docs/user-guide/skills/bundled/creative/creative-humanizer) | Humanize text: strip AI-isms and add real voice. | `creative/humanizer` |
|
||||
| [`manim-video`](/docs/user-guide/skills/bundled/creative/creative-manim-video) | Manim CE animations: 3Blue1Brown math/algo videos. | `creative/manim-video` |
|
||||
| [`p5js`](/docs/user-guide/skills/bundled/creative/creative-p5js) | p5.js sketches: gen art, shaders, interactive, 3D. | `creative/p5js` |
|
||||
| [`pixel-art`](/docs/user-guide/skills/bundled/creative/creative-pixel-art) | Pixel art w/ era palettes (NES, Game Boy, PICO-8). | `creative/pixel-art` |
|
||||
| [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs) | 54 real design systems (Stripe, Linear, Vercel) as HTML/CSS. | `creative/popular-web-designs` |
|
||||
| [`pretext`](/docs/user-guide/skills/bundled/creative/creative-pretext) | Use when building creative browser demos with @chenglou/pretext — DOM-free text layout for ASCII art, typographic flow around obstacles, text-as-geometry games, kinetic typography, and text-powered generative art. Produces single-file HT... | `creative/pretext` |
|
||||
| [`sketch`](/docs/user-guide/skills/bundled/creative/creative-sketch) | Throwaway HTML mockups: 2-3 design variants to compare. | `creative/sketch` |
|
||||
| [`songwriting-and-ai-music`](/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music) | Songwriting craft and Suno AI music prompts. | `creative/songwriting-and-ai-music` |
|
||||
| [`touchdesigner-mcp`](/docs/user-guide/skills/bundled/creative/creative-touchdesigner-mcp) | Control a running TouchDesigner instance via twozero MCP — create operators, set parameters, wire connections, execute Python, build real-time visuals. 36 native tools. | `creative/touchdesigner-mcp` |
|
||||
|
||||
## data-science
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`jupyter-live-kernel`](/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel) | Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb. Load this skill when the task involves exploration, iteration, or inspecting intermediate results — data science, ML experimentation, API exploration, or bui... | `data-science/jupyter-live-kernel` |
|
||||
| [`jupyter-live-kernel`](/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel) | Iterative Python via live Jupyter kernel (hamelnb). | `data-science/jupyter-live-kernel` |
|
||||
|
||||
## devops
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`webhook-subscriptions`](/docs/user-guide/skills/bundled/devops/devops-webhook-subscriptions) | Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost). Use when the user wants external services to trigger agent runs OR push notifications to chats. | `devops/webhook-subscriptions` |
|
||||
| [`webhook-subscriptions`](/docs/user-guide/skills/bundled/devops/devops-webhook-subscriptions) | Webhook subscriptions: event-driven agent runs. | `devops/webhook-subscriptions` |
|
||||
|
||||
## dogfood
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`dogfood`](/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood) | Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports | `dogfood` |
|
||||
| [`dogfood`](/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood) | Exploratory QA of web apps: find bugs, evidence, reports. | `dogfood` |
|
||||
|
||||
## email
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`himalaya`](/docs/user-guide/skills/bundled/email/email-himalaya) | CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language). | `email/himalaya` |
|
||||
| [`himalaya`](/docs/user-guide/skills/bundled/email/email-himalaya) | Himalaya CLI: IMAP/SMTP email from terminal. | `email/himalaya` |
|
||||
|
||||
## gaming
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`minecraft-modpack-server`](/docs/user-guide/skills/bundled/gaming/gaming-minecraft-modpack-server) | Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts. | `gaming/minecraft-modpack-server` |
|
||||
| [`pokemon-player`](/docs/user-guide/skills/bundled/gaming/gaming-pokemon-player) | Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal. | `gaming/pokemon-player` |
|
||||
| [`minecraft-modpack-server`](/docs/user-guide/skills/bundled/gaming/gaming-minecraft-modpack-server) | Host modded Minecraft servers (CurseForge, Modrinth). | `gaming/minecraft-modpack-server` |
|
||||
| [`pokemon-player`](/docs/user-guide/skills/bundled/gaming/gaming-pokemon-player) | Play Pokemon via headless emulator + RAM reads. | `gaming/pokemon-player` |
|
||||
|
||||
## github
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`codebase-inspection`](/docs/user-guide/skills/bundled/github/github-codebase-inspection) | Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats. | `github/codebase-inspection` |
|
||||
| [`github-auth`](/docs/user-guide/skills/bundled/github/github-github-auth) | Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically. | `github/github-auth` |
|
||||
| [`github-code-review`](/docs/user-guide/skills/bundled/github/github-github-code-review) | Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-code-review` |
|
||||
| [`github-issues`](/docs/user-guide/skills/bundled/github/github-github-issues) | Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-issues` |
|
||||
| [`github-pr-workflow`](/docs/user-guide/skills/bundled/github/github-github-pr-workflow) | Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-pr-workflow` |
|
||||
| [`github-repo-management`](/docs/user-guide/skills/bundled/github/github-github-repo-management) | Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-repo-management` |
|
||||
| [`codebase-inspection`](/docs/user-guide/skills/bundled/github/github-codebase-inspection) | Inspect codebases w/ pygount: LOC, languages, ratios. | `github/codebase-inspection` |
|
||||
| [`github-auth`](/docs/user-guide/skills/bundled/github/github-github-auth) | GitHub auth setup: HTTPS tokens, SSH keys, gh CLI login. | `github/github-auth` |
|
||||
| [`github-code-review`](/docs/user-guide/skills/bundled/github/github-github-code-review) | Review PRs: diffs, inline comments via gh or REST. | `github/github-code-review` |
|
||||
| [`github-issues`](/docs/user-guide/skills/bundled/github/github-github-issues) | Create, triage, label, assign GitHub issues via gh or REST. | `github/github-issues` |
|
||||
| [`github-pr-workflow`](/docs/user-guide/skills/bundled/github/github-github-pr-workflow) | GitHub PR lifecycle: branch, commit, open, CI, merge. | `github/github-pr-workflow` |
|
||||
| [`github-repo-management`](/docs/user-guide/skills/bundled/github/github-github-repo-management) | Clone/create/fork repos; manage remotes, releases. | `github/github-repo-management` |
|
||||
|
||||
## mcp
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`native-mcp`](/docs/user-guide/skills/bundled/mcp/mcp-native-mcp) | Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filterin... | `mcp/native-mcp` |
|
||||
| [`native-mcp`](/docs/user-guide/skills/bundled/mcp/mcp-native-mcp) | MCP client: connect servers, register tools (stdio/HTTP). | `mcp/native-mcp` |
|
||||
|
||||
## media
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`gif-search`](/docs/user-guide/skills/bundled/media/media-gif-search) | Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat. | `media/gif-search` |
|
||||
| [`heartmula`](/docs/user-guide/skills/bundled/media/media-heartmula) | Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support. | `media/heartmula` |
|
||||
| [`songsee`](/docs/user-guide/skills/bundled/media/media-songsee) | Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation. | `media/songsee` |
|
||||
| [`spotify`](/docs/user-guide/skills/bundled/media/media-spotify) | Control Spotify — play music, search the catalog, manage playlists and library, inspect devices and playback state. Loads when the user asks to play/pause/queue music, search tracks/albums/artists, manage playlists, or check what's playi... | `media/spotify` |
|
||||
| [`youtube-content`](/docs/user-guide/skills/bundled/media/media-youtube-content) | Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to ex... | `media/youtube-content` |
|
||||
| [`gif-search`](/docs/user-guide/skills/bundled/media/media-gif-search) | Search/download GIFs from Tenor via curl + jq. | `media/gif-search` |
|
||||
| [`heartmula`](/docs/user-guide/skills/bundled/media/media-heartmula) | HeartMuLa: Suno-like song generation from lyrics + tags. | `media/heartmula` |
|
||||
| [`songsee`](/docs/user-guide/skills/bundled/media/media-songsee) | Audio spectrograms/features (mel, chroma, MFCC) via CLI. | `media/songsee` |
|
||||
| [`spotify`](/docs/user-guide/skills/bundled/media/media-spotify) | Spotify: play, search, queue, manage playlists and devices. | `media/spotify` |
|
||||
| [`youtube-content`](/docs/user-guide/skills/bundled/media/media-youtube-content) | YouTube transcripts to summaries, threads, blogs. | `media/youtube-content` |
|
||||
|
||||
## mlops
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`audiocraft-audio-generation`](/docs/user-guide/skills/bundled/mlops/mlops-models-audiocraft) | PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation. | `mlops/models/audiocraft` |
|
||||
| [`axolotl`](/docs/user-guide/skills/bundled/mlops/mlops-training-axolotl) | Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support | `mlops/training/axolotl` |
|
||||
| [`dspy`](/docs/user-guide/skills/bundled/mlops/mlops-research-dspy) | Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming | `mlops/research/dspy` |
|
||||
| [`huggingface-hub`](/docs/user-guide/skills/bundled/mlops/mlops-huggingface-hub) | Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Spaces and buckets. | `mlops/huggingface-hub` |
|
||||
| [`audiocraft-audio-generation`](/docs/user-guide/skills/bundled/mlops/mlops-models-audiocraft) | AudioCraft: MusicGen text-to-music, AudioGen text-to-sound. | `mlops/models/audiocraft` |
|
||||
| [`axolotl`](/docs/user-guide/skills/bundled/mlops/mlops-training-axolotl) | Axolotl: YAML LLM fine-tuning (LoRA, DPO, GRPO). | `mlops/training/axolotl` |
|
||||
| [`dspy`](/docs/user-guide/skills/bundled/mlops/mlops-research-dspy) | DSPy: declarative LM programs, auto-optimize prompts, RAG. | `mlops/research/dspy` |
|
||||
| [`huggingface-hub`](/docs/user-guide/skills/bundled/mlops/mlops-huggingface-hub) | HuggingFace hf CLI: search/download/upload models, datasets. | `mlops/huggingface-hub` |
|
||||
| [`llama-cpp`](/docs/user-guide/skills/bundled/mlops/mlops-inference-llama-cpp) | llama.cpp local GGUF inference + HF Hub model discovery. | `mlops/inference/llama-cpp` |
|
||||
| [`evaluating-llms-harness`](/docs/user-guide/skills/bundled/mlops/mlops-evaluation-lm-evaluation-harness) | Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by El... | `mlops/evaluation/lm-evaluation-harness` |
|
||||
| [`obliteratus`](/docs/user-guide/skills/bundled/mlops/mlops-inference-obliteratus) | Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods,... | `mlops/inference/obliteratus` |
|
||||
| [`outlines`](/docs/user-guide/skills/bundled/mlops/mlops-inference-outlines) | Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library | `mlops/inference/outlines` |
|
||||
| [`segment-anything-model`](/docs/user-guide/skills/bundled/mlops/mlops-models-segment-anything) | Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image. | `mlops/models/segment-anything` |
|
||||
| [`fine-tuning-with-trl`](/docs/user-guide/skills/bundled/mlops/mlops-training-trl-fine-tuning) | Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF, align model with preferences, or train from... | `mlops/training/trl-fine-tuning` |
|
||||
| [`unsloth`](/docs/user-guide/skills/bundled/mlops/mlops-training-unsloth) | Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization | `mlops/training/unsloth` |
|
||||
| [`serving-llms-vllm`](/docs/user-guide/skills/bundled/mlops/mlops-inference-vllm) | Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible... | `mlops/inference/vllm` |
|
||||
| [`weights-and-biases`](/docs/user-guide/skills/bundled/mlops/mlops-evaluation-weights-and-biases) | Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform | `mlops/evaluation/weights-and-biases` |
|
||||
| [`evaluating-llms-harness`](/docs/user-guide/skills/bundled/mlops/mlops-evaluation-lm-evaluation-harness) | lm-eval-harness: benchmark LLMs (MMLU, GSM8K, etc.). | `mlops/evaluation/lm-evaluation-harness` |
|
||||
| [`obliteratus`](/docs/user-guide/skills/bundled/mlops/mlops-inference-obliteratus) | OBLITERATUS: abliterate LLM refusals (diff-in-means). | `mlops/inference/obliteratus` |
|
||||
| [`outlines`](/docs/user-guide/skills/bundled/mlops/mlops-inference-outlines) | Outlines: structured JSON/regex/Pydantic LLM generation. | `mlops/inference/outlines` |
|
||||
| [`segment-anything-model`](/docs/user-guide/skills/bundled/mlops/mlops-models-segment-anything) | SAM: zero-shot image segmentation via points, boxes, masks. | `mlops/models/segment-anything` |
|
||||
| [`fine-tuning-with-trl`](/docs/user-guide/skills/bundled/mlops/mlops-training-trl-fine-tuning) | TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF. | `mlops/training/trl-fine-tuning` |
|
||||
| [`unsloth`](/docs/user-guide/skills/bundled/mlops/mlops-training-unsloth) | Unsloth: 2-5x faster LoRA/QLoRA fine-tuning, less VRAM. | `mlops/training/unsloth` |
|
||||
| [`serving-llms-vllm`](/docs/user-guide/skills/bundled/mlops/mlops-inference-vllm) | vLLM: high-throughput LLM serving, OpenAI API, quantization. | `mlops/inference/vllm` |
|
||||
| [`weights-and-biases`](/docs/user-guide/skills/bundled/mlops/mlops-evaluation-weights-and-biases) | W&B: log ML experiments, sweeps, model registry, dashboards. | `mlops/evaluation/weights-and-biases` |
|
||||
|
||||
## note-taking
|
||||
|
||||
@ -134,49 +139,60 @@ If a skill is missing from this list but present in the repo, the catalog is reg
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`airtable`](/docs/user-guide/skills/bundled/productivity/productivity-airtable) | Airtable REST API via curl. Records CRUD, filters, upserts. | `productivity/airtable` |
|
||||
| [`google-workspace`](/docs/user-guide/skills/bundled/productivity/productivity-google-workspace) | Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes. Uses Hermes-managed OAuth2 setup, prefers the Google Workspace CLI (`gws`) when available for broader API coverage, and falls back to the Python client libraries... | `productivity/google-workspace` |
|
||||
| [`linear`](/docs/user-guide/skills/bundled/productivity/productivity-linear) | Manage Linear issues, projects, and teams via the GraphQL API. Create, update, search, and organize issues. Uses API key auth (no OAuth needed). All operations via curl — no dependencies. | `productivity/linear` |
|
||||
| [`maps`](/docs/user-guide/skills/bundled/productivity/productivity-maps) | Location intelligence — geocode a place, reverse-geocode coordinates, find nearby places (46 POI categories), driving/walking/cycling distance + time, turn-by-turn directions, timezone lookup, bounding box + area for a named place, and P... | `productivity/maps` |
|
||||
| [`nano-pdf`](/docs/user-guide/skills/bundled/productivity/productivity-nano-pdf) | Edit PDFs with natural-language instructions using the nano-pdf CLI. Modify text, fix typos, update titles, and make content changes to specific pages without manual editing. | `productivity/nano-pdf` |
|
||||
| [`notion`](/docs/user-guide/skills/bundled/productivity/productivity-notion) | Notion API for creating and managing pages, databases, and blocks via curl. Search, create, update, and query Notion workspaces directly from the terminal. | `productivity/notion` |
|
||||
| [`ocr-and-documents`](/docs/user-guide/skills/bundled/productivity/productivity-ocr-and-documents) | Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local text-based PDFs, marker-pdf for OCR/scanned docs. For DOCX use python-docx, for PPTX see the powerpoint skill. | `productivity/ocr-and-documents` |
|
||||
| [`powerpoint`](/docs/user-guide/skills/bundled/productivity/productivity-powerpoint) | Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted... | `productivity/powerpoint` |
|
||||
| [`google-workspace`](/docs/user-guide/skills/bundled/productivity/productivity-google-workspace) | Gmail, Calendar, Drive, Docs, Sheets via gws CLI or Python. | `productivity/google-workspace` |
|
||||
| [`linear`](/docs/user-guide/skills/bundled/productivity/productivity-linear) | Linear: manage issues, projects, teams via GraphQL + curl. | `productivity/linear` |
|
||||
| [`maps`](/docs/user-guide/skills/bundled/productivity/productivity-maps) | Geocode, POIs, routes, timezones via OpenStreetMap/OSRM. | `productivity/maps` |
|
||||
| [`nano-pdf`](/docs/user-guide/skills/bundled/productivity/productivity-nano-pdf) | Edit PDF text/typos/titles via nano-pdf CLI (NL prompts). | `productivity/nano-pdf` |
|
||||
| [`notion`](/docs/user-guide/skills/bundled/productivity/productivity-notion) | Notion API via curl: pages, databases, blocks, search. | `productivity/notion` |
|
||||
| [`ocr-and-documents`](/docs/user-guide/skills/bundled/productivity/productivity-ocr-and-documents) | Extract text from PDFs/scans (pymupdf, marker-pdf). | `productivity/ocr-and-documents` |
|
||||
| [`powerpoint`](/docs/user-guide/skills/bundled/productivity/productivity-powerpoint) | Create, read, edit .pptx decks, slides, notes, templates. | `productivity/powerpoint` |
|
||||
|
||||
## red-teaming
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`godmode`](/docs/user-guide/skills/bundled/red-teaming/red-teaming-godmode) | Jailbreak API-served LLMs using G0DM0D3 techniques — Parseltongue input obfuscation (33 techniques), GODMODE CLASSIC system prompt templates, ULTRAPLINIAN multi-model racing, encoding escalation, and Hermes-native prefill/system prompt i... | `red-teaming/godmode` |
|
||||
| [`godmode`](/docs/user-guide/skills/bundled/red-teaming/red-teaming-godmode) | Jailbreak LLMs: Parseltongue, GODMODE, ULTRAPLINIAN. | `red-teaming/godmode` |
|
||||
|
||||
## research
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`arxiv`](/docs/user-guide/skills/bundled/research/research-arxiv) | Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content. | `research/arxiv` |
|
||||
| [`blogwatcher`](/docs/user-guide/skills/bundled/research/research-blogwatcher) | Monitor blogs and RSS/Atom feeds for updates using the blogwatcher-cli tool. Add blogs, scan for new articles, track read status, and filter by category. | `research/blogwatcher` |
|
||||
| [`llm-wiki`](/docs/user-guide/skills/bundled/research/research-llm-wiki) | Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base. Ingest sources, query compiled knowledge, and lint for consistency. | `research/llm-wiki` |
|
||||
| [`polymarket`](/docs/user-guide/skills/bundled/research/research-polymarket) | Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history. Read-only via public REST APIs, no API key needed. | `research/polymarket` |
|
||||
| [`research-paper-writing`](/docs/user-guide/skills/bundled/research/research-research-paper-writing) | End-to-end pipeline for writing ML/AI research papers — from experiment design through analysis, drafting, revision, and submission. Covers NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Integrates automated experiment monitoring, statistical ana... | `research/research-paper-writing` |
|
||||
| [`arxiv`](/docs/user-guide/skills/bundled/research/research-arxiv) | Search arXiv papers by keyword, author, category, or ID. | `research/arxiv` |
|
||||
| [`blogwatcher`](/docs/user-guide/skills/bundled/research/research-blogwatcher) | Monitor blogs and RSS/Atom feeds via blogwatcher-cli tool. | `research/blogwatcher` |
|
||||
| [`llm-wiki`](/docs/user-guide/skills/bundled/research/research-llm-wiki) | Karpathy's LLM Wiki: build/query interlinked markdown KB. | `research/llm-wiki` |
|
||||
| [`polymarket`](/docs/user-guide/skills/bundled/research/research-polymarket) | Query Polymarket: markets, prices, orderbooks, history. | `research/polymarket` |
|
||||
| [`research-paper-writing`](/docs/user-guide/skills/bundled/research/research-research-paper-writing) | Write ML papers for NeurIPS/ICML/ICLR: design→submit. | `research/research-paper-writing` |
|
||||
|
||||
## smart-home
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`openhue`](/docs/user-guide/skills/bundled/smart-home/smart-home-openhue) | Control Philips Hue lights, rooms, and scenes via the OpenHue CLI. Turn lights on/off, adjust brightness, color, color temperature, and activate scenes. | `smart-home/openhue` |
|
||||
| [`openhue`](/docs/user-guide/skills/bundled/smart-home/smart-home-openhue) | Control Philips Hue lights, scenes, rooms via OpenHue CLI. | `smart-home/openhue` |
|
||||
|
||||
## social-media
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`xurl`](/docs/user-guide/skills/bundled/social-media/social-media-xurl) | Interact with X/Twitter via xurl, the official X API CLI. Use for posting, replying, quoting, searching, timelines, mentions, likes, reposts, bookmarks, follows, DMs, media upload, and raw v2 endpoint access. | `social-media/xurl` |
|
||||
| [`xurl`](/docs/user-guide/skills/bundled/social-media/social-media-xurl) | X/Twitter via xurl CLI: post, search, DM, media, v2 API. | `social-media/xurl` |
|
||||
|
||||
## software-development
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`plan`](/docs/user-guide/skills/bundled/software-development/software-development-plan) | Plan mode for Hermes — inspect context, write a markdown plan into the active workspace's `.hermes/plans/` directory, and do not execute the work. | `software-development/plan` |
|
||||
| [`requesting-code-review`](/docs/user-guide/skills/bundled/software-development/software-development-requesting-code-review) | Pre-commit verification pipeline — static security scan, baseline-aware quality gates, independent reviewer subagent, and auto-fix loop. Use after code changes and before committing, pushing, or opening a PR. | `software-development/requesting-code-review` |
|
||||
| [`subagent-driven-development`](/docs/user-guide/skills/bundled/software-development/software-development-subagent-driven-development) | Use when executing implementation plans with independent tasks. Dispatches fresh delegate_task per task with two-stage review (spec compliance then code quality). | `software-development/subagent-driven-development` |
|
||||
| [`systematic-debugging`](/docs/user-guide/skills/bundled/software-development/software-development-systematic-debugging) | Use when encountering any bug, test failure, or unexpected behavior. 4-phase root cause investigation — NO fixes without understanding the problem first. | `software-development/systematic-debugging` |
|
||||
| [`test-driven-development`](/docs/user-guide/skills/bundled/software-development/software-development-test-driven-development) | Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach. | `software-development/test-driven-development` |
|
||||
| [`writing-plans`](/docs/user-guide/skills/bundled/software-development/software-development-writing-plans) | Use when you have a spec or requirements for a multi-step task. Creates comprehensive implementation plans with bite-sized tasks, exact file paths, and complete code examples. | `software-development/writing-plans` |
|
||||
| [`debugging-hermes-tui-commands`](/docs/user-guide/skills/bundled/software-development/software-development-debugging-hermes-tui-commands) | Debug Hermes TUI slash commands: Python, gateway, Ink UI. | `software-development/debugging-hermes-tui-commands` |
|
||||
| [`hermes-agent-skill-authoring`](/docs/user-guide/skills/bundled/software-development/software-development-hermes-agent-skill-authoring) | Author in-repo SKILL.md: frontmatter, validator, structure. | `software-development/hermes-agent-skill-authoring` |
|
||||
| [`node-inspect-debugger`](/docs/user-guide/skills/bundled/software-development/software-development-node-inspect-debugger) | Debug Node.js via --inspect + Chrome DevTools Protocol CLI. | `software-development/node-inspect-debugger` |
|
||||
| [`plan`](/docs/user-guide/skills/bundled/software-development/software-development-plan) | Plan mode: write markdown plan to .hermes/plans/, no exec. | `software-development/plan` |
|
||||
| [`python-debugpy`](/docs/user-guide/skills/bundled/software-development/software-development-python-debugpy) | Debug Python: pdb REPL + debugpy remote (DAP). | `software-development/python-debugpy` |
|
||||
| [`requesting-code-review`](/docs/user-guide/skills/bundled/software-development/software-development-requesting-code-review) | Pre-commit review: security scan, quality gates, auto-fix. | `software-development/requesting-code-review` |
|
||||
| [`spike`](/docs/user-guide/skills/bundled/software-development/software-development-spike) | Throwaway experiments to validate an idea before build. | `software-development/spike` |
|
||||
| [`subagent-driven-development`](/docs/user-guide/skills/bundled/software-development/software-development-subagent-driven-development) | Execute plans via delegate_task subagents (2-stage review). | `software-development/subagent-driven-development` |
|
||||
| [`systematic-debugging`](/docs/user-guide/skills/bundled/software-development/software-development-systematic-debugging) | 4-phase root cause debugging: understand bugs before fixing. | `software-development/systematic-debugging` |
|
||||
| [`test-driven-development`](/docs/user-guide/skills/bundled/software-development/software-development-test-driven-development) | TDD: enforce RED-GREEN-REFACTOR, tests before code. | `software-development/test-driven-development` |
|
||||
| [`writing-plans`](/docs/user-guide/skills/bundled/software-development/software-development-writing-plans) | Write implementation plans: bite-sized tasks, paths, code. | `software-development/writing-plans` |
|
||||
|
||||
## yuanbao
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| [`yuanbao`](/docs/user-guide/skills/bundled/yuanbao/yuanbao-yuanbao) | Yuanbao (元宝) groups: @mention users, query info/members. | `yuanbao` |
|
||||
|
||||
@ -32,9 +32,10 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
|
||||
| `/rollback` | List or restore filesystem checkpoints (usage: /rollback [number]) |
|
||||
| `/snapshot [create\|restore <id>\|prune]` (alias: `/snap`) | Create or restore state snapshots of Hermes config/state. `create [label]` saves a snapshot, `restore <id>` reverts to it, `prune [N]` removes old snapshots, or list all with no args. |
|
||||
| `/stop` | Kill all running background processes |
|
||||
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response). **Note:** `/q` is claimed by both `/queue` and `/quit`; the last registration wins, so `/q` resolves to `/quit` in practice. Use `/queue` explicitly. |
|
||||
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response). |
|
||||
| `/steer <prompt>` | Inject a mid-run note that arrives at the agent **after the next tool call** — no interrupt, no new user turn. The text is appended to the last tool result's content once the current tool completes, giving the agent new context without breaking the current tool-calling loop. Use this to nudge direction mid-task (e.g. "focus on the auth module" while the agent is running tests). |
|
||||
| `/resume [name]` | Resume a previously-named session |
|
||||
| `/redraw` | Force a full UI repaint (recovers from terminal drift after tmux resize, mouse selection artifacts, etc.) |
|
||||
| `/status` | Show session info |
|
||||
| `/agents` (alias: `/tasks`) | Show active agents and running tasks across the current session. |
|
||||
| `/background <prompt>` (alias: `/bg`, `/btw`) | Run a prompt in a separate background session. The agent processes your prompt independently — your current session stays free for other work. Results appear as a panel when the task finishes. See [CLI Background Sessions](/docs/user-guide/cli#background-sessions). |
|
||||
@ -54,6 +55,9 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
|
||||
| `/statusbar` (alias: `/sb`) | Toggle the context/model status bar on or off |
|
||||
| `/voice [on\|off\|tts\|status]` | Toggle CLI voice mode and spoken playback. Recording uses `voice.record_key` (default: `Ctrl+B`). |
|
||||
| `/yolo` | Toggle YOLO mode — skip all dangerous command approval prompts. |
|
||||
| `/footer [on\|off\|status]` | Toggle the gateway runtime-metadata footer on final replies (shows model, tool counts, timing). |
|
||||
| `/busy [queue\|steer\|interrupt\|status]` | CLI-only: control what pressing Enter does while Hermes is working — queue the new message, steer mid-turn, or interrupt immediately. |
|
||||
| `/indicator [kaomoji\|emoji\|unicode\|ascii]` | CLI-only: pick the TUI busy-indicator style. |
|
||||
|
||||
### Tools & Skills
|
||||
|
||||
@ -64,6 +68,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
|
||||
| `/browser [connect\|disconnect\|status]` | Manage local Chrome CDP connection. `connect` attaches browser tools to a running Chrome instance (default: `ws://localhost:9222`). `disconnect` detaches. `status` shows current connection. Auto-launches Chrome if no debugger is detected. |
|
||||
| `/skills` | Search, install, inspect, or manage skills from online registries |
|
||||
| `/cron` | Manage scheduled tasks (list, add/create, edit, pause, resume, run, remove) |
|
||||
| `/curator` | Background skill maintenance — `status`, `run`, `pin`, `archive`. See [Curator](/docs/user-guide/features/curator). |
|
||||
| `/reload-mcp` (alias: `/reload_mcp`) | Reload MCP servers from config.yaml |
|
||||
| `/reload` | Reload `.env` variables into the running session (picks up new API keys without restarting) |
|
||||
| `/plugins` | List installed plugins and their status |
|
||||
@ -79,7 +84,6 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
|
||||
| `/paste` | Attach a clipboard image |
|
||||
| `/copy [number]` | Copy the last assistant response to clipboard (or the Nth-from-last with a number). CLI-only. |
|
||||
| `/image <path>` | Attach a local image file for your next prompt. |
|
||||
| `/terminal-setup [auto\|vscode\|cursor\|windsurf]` | TUI-only: configure local VS Code-family terminal bindings for better multiline + undo/redo parity. |
|
||||
| `/debug` | Upload debug report (system info + logs) and get shareable links. Also available in messaging. |
|
||||
| `/profile` | Show active profile name and home directory |
|
||||
| `/gquota` | Show Google Gemini Code Assist quota usage with progress bars (only available when the `google-gemini-cli` provider is active). |
|
||||
@ -88,7 +92,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/quit` | Exit the CLI (also: `/exit`). See note on `/q` under `/queue` above. |
|
||||
| `/quit` | Exit the CLI (also: `/exit`). |
|
||||
|
||||
### Dynamic CLI slash commands
|
||||
|
||||
@ -147,6 +151,10 @@ The messaging gateway supports the following built-in commands inside Telegram,
|
||||
| `/voice [on\|off\|tts\|join\|channel\|leave\|status]` | Control spoken replies in chat. `join`/`channel`/`leave` manage Discord voice-channel mode. |
|
||||
| `/rollback [number]` | List or restore filesystem checkpoints. |
|
||||
| `/background <prompt>` | Run a prompt in a separate background session. Results are delivered back to the same chat when the task finishes. See [Messaging Background Sessions](/docs/user-guide/messaging/#background-sessions). |
|
||||
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn without interrupting the current one. |
|
||||
| `/steer <prompt>` | Inject a message after the next tool call without interrupting — the model picks it up on its next iteration rather than as a new turn. |
|
||||
| `/footer [on\|off\|status]` | Toggle the runtime-metadata footer on final replies (shows model, tool counts, timing). |
|
||||
| `/curator [status\|run\|pin\|archive]` | Background skill maintenance controls. |
|
||||
| `/reload-mcp` (alias: `/reload_mcp`) | Reload MCP servers from config. |
|
||||
| `/yolo` | Toggle YOLO mode — skip all dangerous command approval prompts. |
|
||||
| `/commands [page]` | Browse all commands and skills (paginated). |
|
||||
@ -160,8 +168,8 @@ The messaging gateway supports the following built-in commands inside Telegram,
|
||||
|
||||
## Notes
|
||||
|
||||
- `/skin`, `/snapshot`, `/gquota`, `/reload`, `/tools`, `/toolsets`, `/browser`, `/config`, `/cron`, `/skills`, `/platforms`, `/paste`, `/image`, `/terminal-setup`, `/statusbar`, `/mouse`, `/plugins`, and `/steer` are **CLI-only** (the `/mouse` command is TUI-exclusive; `/steer` works in both classic CLI and TUI).
|
||||
- `/skin`, `/snapshot`, `/gquota`, `/reload`, `/tools`, `/toolsets`, `/browser`, `/config`, `/cron`, `/skills`, `/platforms`, `/paste`, `/image`, `/statusbar`, `/plugins`, `/busy`, `/indicator`, `/redraw`, `/clear`, `/history`, `/save`, `/copy`, and `/quit` are **CLI-only** commands.
|
||||
- `/verbose` is **CLI-only by default**, but can be enabled for messaging platforms by setting `display.tool_progress_command: true` in `config.yaml`. When enabled, it cycles the `display.tool_progress` mode and saves to config.
|
||||
- `/sethome`, `/update`, `/restart`, `/approve`, `/deny`, and `/commands` are **messaging-only** commands.
|
||||
- `/status`, `/background`, `/voice`, `/reload-mcp`, `/rollback`, `/debug`, `/fast`, and `/yolo` work in **both** the CLI and the messaging gateway.
|
||||
- `/status`, `/background`, `/queue`, `/steer`, `/voice`, `/reload-mcp`, `/rollback`, `/debug`, `/fast`, `/footer`, `/curator`, and `/yolo` work in **both** the CLI and the messaging gateway.
|
||||
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.
|
||||
|
||||
@ -6,9 +6,9 @@ description: "Authoritative reference for Hermes built-in tools, grouped by tool
|
||||
|
||||
# Built-in Tools Reference
|
||||
|
||||
This page documents all 55 built-in tools in the Hermes tool registry, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
|
||||
This page documents all 68 built-in tools in the Hermes tool registry, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
|
||||
|
||||
**Quick counts:** 12 browser tools, 4 file tools, 10 RL tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, 5 Feishu tools, and 15 standalone tools across other toolsets.
|
||||
**Quick counts:** 10 browser tools (core) + 2 browser-cdp tools, 4 file tools, 10 RL tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, 5 Feishu tools, 7 Spotify tools, 5 Yuanbao tools, 2 Discord tools, and 15 standalone tools across other toolsets.
|
||||
|
||||
:::tip MCP Tools
|
||||
In addition to built-in tools, Hermes can load tools dynamically from MCP servers. MCP tools appear with a server-name prefix (e.g., `github_create_issue` for the `github` MCP server). See [MCP Integration](/docs/user-guide/features/mcp) for configuration.
|
||||
@ -19,8 +19,6 @@ In addition to built-in tools, Hermes can load tools dynamically from MCP server
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `browser_back` | Navigate back to the previous page in browser history. Requires browser_navigate to be called first. | — |
|
||||
| `browser_cdp` | Send a raw Chrome DevTools Protocol (CDP) command. Escape hatch for browser operations not covered by browser_navigate, browser_click, browser_console, etc. Only available when a CDP endpoint is reachable at session start — via `/browser connect` or `browser.cdp_url` config. See https://chromedevtools.github.io/devtools-protocol/ | — |
|
||||
| `browser_dialog` | Respond to a native JavaScript dialog (alert / confirm / prompt / beforeunload). Call `browser_snapshot` first — pending dialogs appear in its `pending_dialogs` field. Then call `browser_dialog(action='accept'|'dismiss')`. Same availability as `browser_cdp` (Browserbase or `/browser connect`). | — |
|
||||
| `browser_click` | Click on an element identified by its ref ID from the snapshot (e.g., '@e5'). The ref IDs are shown in square brackets in the snapshot output. Requires browser_navigate and browser_snapshot to be called first. | — |
|
||||
| `browser_console` | Get browser console output and JavaScript errors from the current page. Returns console.log/warn/error/info messages and uncaught JS exceptions. Use this to detect silent JavaScript errors, failed API calls, and application warnings. Requi… | — |
|
||||
| `browser_get_images` | Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first. | — |
|
||||
@ -31,6 +29,15 @@ In addition to built-in tools, Hermes can load tools dynamically from MCP server
|
||||
| `browser_type` | Type text into an input field identified by its ref ID. Clears the field first, then types the new text. Requires browser_navigate and browser_snapshot to be called first. | — |
|
||||
| `browser_vision` | Take a screenshot of the current page and analyze it with vision AI. Use this when you need to visually understand what's on the page - especially useful for CAPTCHAs, visual verification challenges, complex layouts, or when the text snaps… | — |
|
||||
|
||||
## `browser-cdp` toolset
|
||||
|
||||
Registered only when a Chrome DevTools Protocol endpoint is reachable at session start — via `/browser connect`, `browser.cdp_url` config, a Browserbase session, or Camofox.
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `browser_cdp` | Send a raw Chrome DevTools Protocol command. Escape hatch for browser operations not covered by the higher-level `browser_*` tools. See https://chromedevtools.github.io/devtools-protocol/ | CDP endpoint |
|
||||
| `browser_dialog` | Respond to a native JavaScript dialog (alert / confirm / prompt / beforeunload). Call `browser_snapshot` first — pending dialogs appear in its `pending_dialogs` field. Then call `browser_dialog(action='accept'\|'dismiss')`. | CDP endpoint |
|
||||
|
||||
## `clarify` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
@ -181,4 +188,46 @@ Scoped to the Feishu document-comment handler. Drives comment read/write operati
|
||||
|------|-------------|----------------------|
|
||||
| `text_to_speech` | Convert text to speech audio. Returns a MEDIA: path that the platform delivers as a voice message. On Telegram it plays as a voice bubble, on Discord/WhatsApp as an audio attachment. In CLI mode, saves to ~/voice-memos/. Voice and provider… | — |
|
||||
|
||||
## `discord` toolset
|
||||
|
||||
Registered on the `hermes-discord` platform toolset (gateway only). Uses the same bot token as the messaging adapter.
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `discord` | Read and participate in a Discord server. Actions include `search_members`, `fetch_messages`, `send_message`, `react`, `fetch_channel`, `list_channels`, and more. | `DISCORD_BOT_TOKEN` |
|
||||
|
||||
## `discord_admin` toolset
|
||||
|
||||
Registered on the `hermes-discord` platform toolset. Moderation actions require the bot to hold the matching Discord permissions.
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `discord_admin` | Manage a Discord server via the REST API: list guilds/channels/roles, create/edit/delete channels, manage role grants, timeouts, kicks, and bans. | `DISCORD_BOT_TOKEN` + bot permissions |
|
||||
|
||||
## `spotify` toolset
|
||||
|
||||
Registered by the bundled `spotify` plugin. Requires an OAuth token — run `hermes spotify setup` once to authorize.
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `spotify_playback` | Control Spotify playback, inspect the active playback state, or fetch recently played tracks. | Spotify OAuth |
|
||||
| `spotify_devices` | List Spotify Connect devices or transfer playback to a different device. | Spotify OAuth |
|
||||
| `spotify_queue` | Inspect the user's Spotify queue or add an item to it. | Spotify OAuth |
|
||||
| `spotify_search` | Search the Spotify catalog for tracks, albums, artists, playlists, shows, or episodes. | Spotify OAuth |
|
||||
| `spotify_playlists` | List, inspect, create, update, and modify Spotify playlists. | Spotify OAuth |
|
||||
| `spotify_albums` | Fetch Spotify album metadata or album tracks. | Spotify OAuth |
|
||||
| `spotify_library` | List, save, or remove the user's saved Spotify tracks or albums. | Spotify OAuth |
|
||||
|
||||
## `hermes-yuanbao` toolset
|
||||
|
||||
Registered only on the `hermes-yuanbao` platform toolset. Yuanbao is Tencent's chat app; these tools drive its DM/group/sticker APIs.
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `yb_query_group_info` | Query basic info about a group (called "派/Pai" in the app): name, owner, member count. | Yuanbao credentials |
|
||||
| `yb_query_group_members` | Query members of a group (for `@`-mentions, finding a user by name, listing bots). | Yuanbao credentials |
|
||||
| `yb_send_dm` | Send a private/direct message to a user in a group, with optional media files. | Yuanbao credentials |
|
||||
| `yb_search_sticker` | Search the built-in Yuanbao sticker (TIM face) catalogue by keyword. | Yuanbao credentials |
|
||||
| `yb_send_sticker` | Send a built-in sticker to the current Yuanbao chat. | Yuanbao credentials |
|
||||
|
||||
|
||||
|
||||
@ -52,37 +52,34 @@ Or in-session:
|
||||
|
||||
| Toolset | Tools | Purpose |
|
||||
|---------|-------|---------|
|
||||
| `browser` | `browser_back`, `browser_cdp`, `browser_click`, `browser_console`, `browser_dialog`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. `browser_cdp` and `browser_dialog` are gated on a reachable CDP endpoint — they only appear when `/browser connect` is active, `browser.cdp_url` is set, or a Browserbase session is active. `browser_dialog` works together with the `pending_dialogs` and `frame_tree` fields that `browser_snapshot` adds when a CDP supervisor is attached. |
|
||||
| `browser` | `browser_back`, `browser_click`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Core browser automation. Includes `web_search` as a fallback for quick lookups. `browser_cdp` and `browser_dialog` live in a separate `browser-cdp` toolset and are registered only when a CDP endpoint is reachable at session start — via `/browser connect`, `browser.cdp_url` config, Browserbase, or Camofox. `browser_dialog` works together with the `pending_dialogs` and `frame_tree` fields that `browser_snapshot` adds when a CDP supervisor is attached. |
|
||||
| `clarify` | `clarify` | Ask the user a question when the agent needs clarification. |
|
||||
| `code_execution` | `execute_code` | Run Python scripts that call Hermes tools programmatically. |
|
||||
| `cronjob` | `cronjob` | Schedule and manage recurring tasks. |
|
||||
| `debugging` | composite (`file` + `terminal` + `web`) | Debug bundle — file, process/terminal, web extract/search. |
|
||||
| `delegation` | `delegate_task` | Spawn isolated subagent instances for parallel work. |
|
||||
| `discord` | `discord` | Core Discord text/embed/DM actions (gateway-only). Active on the `hermes-discord` toolset. |
|
||||
| `discord_admin` | `discord_admin` | Discord moderation (bans, role changes, channel management). Active on the `hermes-discord` toolset; requires the bot to hold the relevant Discord permissions. |
|
||||
| `feishu_doc` | `feishu_doc_read` | Read Feishu/Lark document content. Used by the Feishu document-comment intelligent-reply handler. |
|
||||
| `feishu_drive` | `feishu_drive_add_comment`, `feishu_drive_list_comments`, `feishu_drive_list_comment_replies`, `feishu_drive_reply_comment` | Feishu/Lark drive comment operations. Scoped to the comment agent; not exposed on `hermes-cli` or other messaging toolsets. |
|
||||
| `file` | `patch`, `read_file`, `search_files`, `write_file` | File reading, writing, searching, and editing. |
|
||||
| `homeassistant` | `ha_call_service`, `ha_get_state`, `ha_list_entities`, `ha_list_services` | Smart home control via Home Assistant. Only available when `HASS_TOKEN` is set. |
|
||||
| `image_gen` | `image_generate` | Text-to-image generation via FAL.ai. |
|
||||
| `image_gen` | `image_generate` | Text-to-image generation via FAL.ai (with opt-in OpenAI / xAI backends). |
|
||||
| `memory` | `memory` | Persistent cross-session memory management. |
|
||||
| `messaging` | `send_message` | Send messages to other platforms (Telegram, Discord, etc.) from within a session. |
|
||||
| `moa` | `mixture_of_agents` | Multi-model consensus via Mixture of Agents. |
|
||||
| `rl` | `rl_check_status`, `rl_edit_config`, `rl_get_current_config`, `rl_get_results`, `rl_list_environments`, `rl_list_runs`, `rl_select_environment`, `rl_start_training`, `rl_stop_training`, `rl_test_inference` | RL training environment management (Atropos). |
|
||||
| `safe` | `image_generate`, `vision_analyze`, `web_extract`, `web_search` (via `includes`) | Read-only research + media generation. No file writes, no terminal, no code execution. |
|
||||
| `search` | `web_search` | Web search only (without extract). |
|
||||
| `session_search` | `session_search` | Search past conversation sessions. |
|
||||
| `skills` | `skill_manage`, `skill_view`, `skills_list` | Skill CRUD and browsing. |
|
||||
| `spotify` | `spotify_albums`, `spotify_devices`, `spotify_library`, `spotify_playback`, `spotify_playlists`, `spotify_queue`, `spotify_search` | Native Spotify control (playback, queue, search, playlists, albums, library). Registered by the bundled `spotify` plugin. |
|
||||
| `terminal` | `process`, `terminal` | Shell command execution and background process management. |
|
||||
| `todo` | `todo` | Task list management within a session. |
|
||||
| `tts` | `text_to_speech` | Text-to-speech audio generation. |
|
||||
| `vision` | `vision_analyze` | Image analysis via vision-capable models. |
|
||||
| `web` | `web_extract`, `web_search` | Web search and page content extraction. |
|
||||
|
||||
## Composite Toolsets
|
||||
|
||||
These expand to multiple core toolsets, providing a convenient shorthand for common scenarios:
|
||||
|
||||
| Toolset | Expands to | Use case |
|
||||
|---------|-----------|----------|
|
||||
| `debugging` | `web` + `file` + `process`, `terminal` (via `includes`) — effectively `patch`, `process`, `read_file`, `search_files`, `terminal`, `web_extract`, `web_search`, `write_file` | Debug sessions — file access, terminal, and web research without browser or delegation overhead. |
|
||||
| `safe` | `image_generate`, `vision_analyze`, `web_extract`, `web_search` | Read-only research and media generation. No file writes, no terminal access, no code execution. Good for untrusted or constrained environments. |
|
||||
| `yuanbao` | `yb_query_group_info`, `yb_query_group_members`, `yb_search_sticker`, `yb_send_dm`, `yb_send_sticker` | Yuanbao DM/group actions and sticker search. Registered only on `hermes-yuanbao`. |
|
||||
|
||||
## Platform Toolsets
|
||||
|
||||
@ -90,11 +87,12 @@ Platform toolsets define the complete tool configuration for a deployment target
|
||||
|
||||
| Toolset | Differences from `hermes-cli` |
|
||||
|---------|-------------------------------|
|
||||
| `hermes-cli` | Full toolset — all 36 core tools including `clarify`. The default for interactive CLI sessions. |
|
||||
| `hermes-acp` | Drops `clarify`, `cronjob`, `image_generate`, `send_message`, `text_to_speech`, homeassistant tools. Focused on coding tasks in IDE context. |
|
||||
| `hermes-api-server` | Drops `clarify`, `send_message`, and `text_to_speech`. Adds everything else — suitable for programmatic access where user interaction isn't possible. |
|
||||
| `hermes-cli` | Full toolset — 38 tools. The default for interactive CLI sessions. |
|
||||
| `hermes-acp` | Drops `clarify`, `cronjob`, `image_generate`, `send_message`, `text_to_speech`, and all four Home Assistant tools. Focused on coding tasks in IDE context. |
|
||||
| `hermes-api-server` | Drops `clarify`, `send_message`, and `text_to_speech`. Keeps everything else — suitable for programmatic access where user interaction isn't possible. |
|
||||
| `hermes-cron` | Same as `hermes-cli`. |
|
||||
| `hermes-telegram` | Same as `hermes-cli`. |
|
||||
| `hermes-discord` | Same as `hermes-cli`. |
|
||||
| `hermes-discord` | Adds `discord` and `discord_admin` on top of `hermes-cli`. |
|
||||
| `hermes-slack` | Same as `hermes-cli`. |
|
||||
| `hermes-whatsapp` | Same as `hermes-cli`. |
|
||||
| `hermes-signal` | Same as `hermes-cli`. |
|
||||
@ -104,14 +102,15 @@ Platform toolsets define the complete tool configuration for a deployment target
|
||||
| `hermes-sms` | Same as `hermes-cli`. |
|
||||
| `hermes-bluebubbles` | Same as `hermes-cli`. |
|
||||
| `hermes-dingtalk` | Same as `hermes-cli`. |
|
||||
| `hermes-feishu` | Same as `hermes-cli`. Note: the `feishu_doc` / `feishu_drive` toolsets are used only by the document-comment handler, not by the regular Feishu chat adapter. |
|
||||
| `hermes-feishu` | Adds the five `feishu_doc_*` / `feishu_drive_*` tools (only used by the document-comment handler, not the regular chat adapter). |
|
||||
| `hermes-qqbot` | Same as `hermes-cli`. |
|
||||
| `hermes-wecom` | Same as `hermes-cli`. |
|
||||
| `hermes-wecom-callback` | Same as `hermes-cli`. |
|
||||
| `hermes-weixin` | Same as `hermes-cli`. |
|
||||
| `hermes-homeassistant` | Same as `hermes-cli` plus the `homeassistant` toolset always on. |
|
||||
| `hermes-yuanbao` | Adds the five `yb_*` tools (DM/group/sticker) on top of `hermes-cli`. |
|
||||
| `hermes-homeassistant` | Same as `hermes-cli` (the Home Assistant tools are already present by default and activate when `HASS_TOKEN` is set). |
|
||||
| `hermes-webhook` | Same as `hermes-cli`. |
|
||||
| `hermes-gateway` | Internal gateway orchestrator toolset — union of the broadest possible tool set when the gateway needs to accept any message source. |
|
||||
| `hermes-gateway` | Internal gateway orchestrator toolset — union of every `hermes-<platform>` toolset; used when the gateway needs to accept any message source. |
|
||||
|
||||
## Dynamic Toolsets
|
||||
|
||||
|
||||
@ -16,7 +16,7 @@ This safety net is powered by an internal **Checkpoint Manager** that keeps a se
|
||||
Checkpoints are taken automatically before:
|
||||
|
||||
- **File tools** — `write_file` and `patch`
|
||||
- **Destructive terminal commands** — `rm`, `mv`, `sed -i`, `truncate`, `shred`, output redirects (`>`), and `git reset`/`clean`/`checkout`
|
||||
- **Destructive terminal commands** — `rm`, `rmdir`, `cp`, `install`, `mv`, `sed -i`, `truncate`, `dd`, `shred`, output redirects (`>`), and `git reset`/`clean`/`checkout`
|
||||
|
||||
The agent creates **at most one checkpoint per directory per turn**, so long-running sessions don't spam snapshots.
|
||||
|
||||
|
||||
@ -358,7 +358,7 @@ auxiliary:
|
||||
model: "google/gemini-3-flash-preview" # Model used for summarization
|
||||
```
|
||||
|
||||
When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
|
||||
When compression triggers, middle turns are summarized while the first 3 and last 20 turns are always preserved.
|
||||
|
||||
## Background Sessions
|
||||
|
||||
|
||||
@ -1152,7 +1152,8 @@ This controls both the `text_to_speech` tool and spoken replies in voice mode (`
|
||||
display:
|
||||
tool_progress: all # off | new | all | verbose
|
||||
tool_progress_command: false # Enable /verbose slash command in messaging gateway
|
||||
tool_progress_overrides: {} # Per-platform overrides (see below)
|
||||
platforms: {} # Per-platform display overrides (see below)
|
||||
tool_progress_overrides: {} # DEPRECATED — use display.platforms instead
|
||||
interim_assistant_messages: true # Gateway: send natural mid-turn assistant updates as separate messages
|
||||
skin: default # Built-in or custom CLI skin (see user-guide/features/skins)
|
||||
personality: "kawaii" # Legacy cosmetic field still surfaced in some summaries
|
||||
@ -1194,18 +1195,21 @@ Only the **final** message of a turn gets the footer; interim updates stay clean
|
||||
|
||||
### Per-platform progress overrides
|
||||
|
||||
Different platforms have different verbosity needs. For example, Signal can't edit messages, so each progress update becomes a separate message — noisy. Use `tool_progress_overrides` to set per-platform modes:
|
||||
Different platforms have different verbosity needs. For example, Signal can't edit messages, so each progress update becomes a separate message — noisy. Use `display.platforms` to set per-platform modes:
|
||||
|
||||
```yaml
|
||||
display:
|
||||
tool_progress: all # global default
|
||||
tool_progress_overrides:
|
||||
signal: 'off' # silence progress on Signal
|
||||
telegram: verbose # detailed progress on Telegram
|
||||
slack: 'off' # quiet in shared Slack workspace
|
||||
platforms:
|
||||
signal:
|
||||
tool_progress: 'off' # silence progress on Signal
|
||||
telegram:
|
||||
tool_progress: verbose # detailed progress on Telegram
|
||||
slack:
|
||||
tool_progress: 'off' # quiet in shared Slack workspace
|
||||
```
|
||||
|
||||
Platforms without an override fall back to the global `tool_progress` value. Valid platform keys: `telegram`, `discord`, `slack`, `signal`, `whatsapp`, `matrix`, `mattermost`, `email`, `sms`, `homeassistant`, `dingtalk`, `feishu`, `wecom`, `weixin`, `bluebubbles`, `qqbot`.
|
||||
Platforms without an override fall back to the global `tool_progress` value. Valid platform keys: `telegram`, `discord`, `slack`, `signal`, `whatsapp`, `matrix`, `mattermost`, `email`, `sms`, `homeassistant`, `dingtalk`, `feishu`, `wecom`, `weixin`, `bluebubbles`, `qqbot`. The legacy `display.tool_progress_overrides` key still loads for backward compatibility but is deprecated and migrated into `display.platforms` on first load.
|
||||
|
||||
`interim_assistant_messages` is gateway-only. When enabled, Hermes sends completed mid-turn assistant updates as separate chat messages. This is independent from `tool_progress` and does not require gateway streaming.
|
||||
|
||||
|
||||
@ -39,7 +39,7 @@ docker run -d \
|
||||
nousresearch/hermes-agent gateway run
|
||||
```
|
||||
|
||||
Port 8642 exposes the gateway's [OpenAI-compatible API server](./api-server.md) and health endpoint. It's optional if you only use chat platforms (Telegram, Discord, etc.), but required if you want the dashboard or external tools to reach the gateway.
|
||||
Port 8642 exposes the gateway's [OpenAI-compatible API server](./features/api-server.md) and health endpoint. It's optional if you only use chat platforms (Telegram, Discord, etc.), but required if you want the dashboard or external tools to reach the gateway.
|
||||
|
||||
Opening any port on an internet facing machine is a security risk. You should not do it unless you understand the risks.
|
||||
|
||||
@ -208,7 +208,7 @@ services:
|
||||
image: nousresearch/hermes-agent:latest
|
||||
container_name: hermes-dashboard
|
||||
restart: unless-stopped
|
||||
command: dashboard --host 0.0.0.0
|
||||
command: dashboard --host 0.0.0.0 --insecure
|
||||
ports:
|
||||
- "9119:9119"
|
||||
volumes:
|
||||
@ -259,10 +259,10 @@ docker run -d \
|
||||
|
||||
The official image is based on `debian:13.4` and includes:
|
||||
|
||||
- Python 3 with all Hermes dependencies (`pip install -e ".[all]"`)
|
||||
- Python 3 with all Hermes dependencies (`uv pip install -e ".[all]"`)
|
||||
- Node.js + npm (for browser automation and WhatsApp bridge)
|
||||
- Playwright with Chromium (`npx playwright install --with-deps chromium`)
|
||||
- ripgrep and ffmpeg as system utilities
|
||||
- Playwright with Chromium (`npx playwright install --with-deps chromium --only-shell`)
|
||||
- ripgrep, ffmpeg, git, and tini as system utilities
|
||||
- **`docker-cli`** — so agents running inside the container can drive the host's Docker daemon (bind-mount `/var/run/docker.sock` to opt in) for `docker build`, `docker run`, container inspection, etc.
|
||||
- **`openssh-client`** — enables the [SSH terminal backend](/docs/user-guide/configuration#ssh-backend) from inside the container. The SSH backend shells out to the system `ssh` binary; without this, it failed silently in containerized installs.
|
||||
- The WhatsApp bridge (`scripts/whatsapp-bridge/`)
|
||||
@ -312,7 +312,7 @@ Check logs: `docker logs hermes`. Common causes:
|
||||
|
||||
### "Permission denied" errors
|
||||
|
||||
The container runs as root by default. If your host `~/.hermes/` was created by a non-root user, permissions should work. If you get errors, ensure the data directory is writable:
|
||||
The container's entrypoint drops privileges to the non-root `hermes` user (UID 10000) via `gosu`. If your host `~/.hermes/` is owned by a different UID, set `HERMES_UID`/`HERMES_GID` to match your host user, or ensure the data directory is writable:
|
||||
|
||||
```sh
|
||||
chmod -R 755 ~/.hermes
|
||||
|
||||
@ -51,6 +51,22 @@ hermes plugins disable disk-cleanup
|
||||
|
||||
## Currently shipped
|
||||
|
||||
The repo ships these bundled plugins under `plugins/`. All are opt-in — enable them via `hermes plugins enable <name>`.
|
||||
|
||||
| Plugin | Kind | Purpose |
|
||||
|---|---|---|
|
||||
| `disk-cleanup` | hooks + slash command | Auto-track ephemeral files and clean them on session end |
|
||||
| `observability/langfuse` | hooks | Trace turns / LLM calls / tools to [Langfuse](https://langfuse.com) |
|
||||
| `spotify` | backend (7 tools) | Native Spotify playback, queue, search, playlists, albums, library |
|
||||
| `google_meet` | standalone | Join Meet calls, live-caption transcription, optional realtime duplex audio |
|
||||
| `image_gen/openai` | image backend | OpenAI `gpt-image-2` image generation backend (alternative to FAL) |
|
||||
| `image_gen/openai-codex` | image backend | OpenAI image generation via Codex OAuth |
|
||||
| `image_gen/xai` | image backend | xAI `grok-2-image` backend |
|
||||
| `example-dashboard` | dashboard example | Reference dashboard plugin for [Extending the Dashboard](./extending-the-dashboard.md) |
|
||||
| `strike-freedom-cockpit` | dashboard skin | Sample custom dashboard skin |
|
||||
|
||||
Memory providers (`plugins/memory/*`) and context engines (`plugins/context_engine/*`) are listed separately on [Memory Providers](./memory-providers.md) — they're managed through `hermes memory` and `hermes plugins` respectively. The full per-plugin detail for the two long-running hooks-based plugins follows.
|
||||
|
||||
### disk-cleanup
|
||||
|
||||
Auto-tracks and removes ephemeral files created during sessions — test scripts, temp outputs, cron logs, stale chrome profiles — without requiring the agent to remember to call a tool.
|
||||
|
||||
@ -91,10 +91,10 @@ This is useful when you want a scheduled agent to inherit reusable workflows wit
|
||||
Cron jobs default to running detached from any repo — no `AGENTS.md`, `CLAUDE.md`, or `.cursorrules` is loaded, and the terminal / file / code-exec tools run from whatever working directory the gateway started in. Pass `--workdir` (CLI) or `workdir=` (tool call) to change that:
|
||||
|
||||
```bash
|
||||
# Standalone CLI
|
||||
hermes cron create --schedule "every 1d at 09:00" \
|
||||
--workdir /home/me/projects/acme \
|
||||
--prompt "Audit open PRs, summarize CI health, and post to #eng"
|
||||
# Standalone CLI (schedule and prompt are positional)
|
||||
hermes cron create "every 1d at 09:00" \
|
||||
"Audit open PRs, summarize CI health, and post to #eng" \
|
||||
--workdir /home/me/projects/acme
|
||||
```
|
||||
|
||||
```python
|
||||
|
||||
@ -74,6 +74,12 @@ Both `provider` and `model` are **required**. If either is missing, the fallback
|
||||
| Arcee AI | `arcee` | `ARCEEAI_API_KEY` |
|
||||
| GMI Cloud | `gmi` | `GMI_API_KEY` |
|
||||
| Alibaba / DashScope | `alibaba` | `DASHSCOPE_API_KEY` |
|
||||
| Alibaba Coding Plan | `alibaba-coding-plan` | `ALIBABA_CODING_PLAN_API_KEY` (falls back to `DASHSCOPE_API_KEY`) |
|
||||
| Kimi / Moonshot (China) | `kimi-coding-cn` | `KIMI_CN_API_KEY` |
|
||||
| StepFun | `stepfun` | `STEPFUN_API_KEY` |
|
||||
| Tencent TokenHub | `tencent-tokenhub` | `TOKENHUB_API_KEY` |
|
||||
| Azure AI Foundry | `azure-foundry` | `AZURE_FOUNDRY_API_KEY` + `AZURE_FOUNDRY_BASE_URL` |
|
||||
| LM Studio (local) | `lmstudio` | `LM_API_KEY` (or none for local) + `LM_BASE_URL` |
|
||||
| Hugging Face | `huggingface` | `HF_TOKEN` |
|
||||
| Custom endpoint | `custom` | `base_url` + `key_env` (see below) |
|
||||
|
||||
|
||||
@ -30,8 +30,8 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch
|
||||
- **[Voice Mode](voice-mode.md)** — Full voice interaction across CLI and messaging platforms. Talk to the agent using your microphone, hear spoken replies, and have live voice conversations in Discord voice channels.
|
||||
- **[Browser Automation](browser.md)** — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome via CDP, or local Chromium. Navigate websites, fill forms, and extract information.
|
||||
- **[Vision & Image Paste](vision.md)** — Multimodal vision support. Paste images from your clipboard into the CLI and ask the agent to analyze, describe, or work with them using any vision-capable model.
|
||||
- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai. Eight models supported (FLUX 2 Klein/Pro, GPT-Image 1.5, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo); pick one via `hermes tools`.
|
||||
- **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with five provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, and NeuTTS.
|
||||
- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai. Nine models supported (FLUX 2 Klein/Pro, GPT-Image 1.5/2, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo); pick one via `hermes tools`.
|
||||
- **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with nine provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, Mistral Voxtral, Google Gemini, xAI, NeuTTS, and KittenTTS.
|
||||
|
||||
## Integrations
|
||||
|
||||
@ -39,7 +39,7 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch
|
||||
- **[Provider Routing](provider-routing.md)** — Fine-grained control over which AI providers handle your requests. Optimize for cost, speed, or quality with sorting, whitelists, blacklists, and priority ordering.
|
||||
- **[Fallback Providers](fallback-providers.md)** — Automatic failover to backup LLM providers when your primary model encounters errors, including independent fallback for auxiliary tasks like vision and compression.
|
||||
- **[Credential Pools](credential-pools.md)** — Distribute API calls across multiple keys for the same provider. Automatic rotation on rate limits or failures.
|
||||
- **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover) for cross-session user modeling and personalization beyond the built-in memory system.
|
||||
- **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover, Supermemory) for cross-session user modeling and personalization beyond the built-in memory system.
|
||||
- **[API Server](api-server.md)** — Expose Hermes as an OpenAI-compatible HTTP endpoint. Connect any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, and more.
|
||||
- **[IDE Integration (ACP)](acp.md)** — Use Hermes inside ACP-compatible editors such as VS Code, Zed, and JetBrains. Chat, tool activity, file diffs, and terminal commands render inside your editor.
|
||||
- **[RL Training](rl-training.md)** — Generate trajectory data from agent sessions for reinforcement learning and model fine-tuning.
|
||||
|
||||
@ -142,6 +142,9 @@ Plugins can register callbacks for these lifecycle events. See the **[Event Hook
|
||||
| [`post_llm_call`](/docs/user-guide/features/hooks#post_llm_call) | Once per turn, after the LLM loop (successful turns only) |
|
||||
| [`on_session_start`](/docs/user-guide/features/hooks#on_session_start) | New session created (first turn only) |
|
||||
| [`on_session_end`](/docs/user-guide/features/hooks#on_session_end) | End of every `run_conversation` call + CLI exit handler |
|
||||
| [`on_session_finalize`](/docs/user-guide/features/hooks#on_session_finalize) | CLI/gateway tears down an active session (`/new`, GC, CLI quit) |
|
||||
| [`on_session_reset`](/docs/user-guide/features/hooks#on_session_reset) | Gateway swaps in a new session key (`/new`, `/reset`, `/clear`, idle rotation) |
|
||||
| [`subagent_stop`](/docs/user-guide/features/hooks#subagent_stop) | Once per child after `delegate_task` finishes |
|
||||
| [`pre_gateway_dispatch`](/docs/user-guide/features/hooks#pre_gateway_dispatch) | Gateway received a user message, before auth + dispatch. Return `{"action": "skip" \| "rewrite" \| "allow", ...}` to influence flow. |
|
||||
|
||||
## Plugin types
|
||||
|
||||
@ -18,7 +18,7 @@ The **Tool Gateway** lets paid [Nous Portal](https://portal.nousresearch.com) su
|
||||
| Tool | What It Does | Direct Alternative |
|
||||
|------|--------------|--------------------|
|
||||
| **Web search & extract** | Search the web and extract page content via Firecrawl | `FIRECRAWL_API_KEY`, `EXA_API_KEY`, `PARALLEL_API_KEY`, `TAVILY_API_KEY` |
|
||||
| **Image generation** | Generate images via FAL (8 models: FLUX 2 Klein/Pro, GPT-Image, Nano Banana Pro, Ideogram, Recraft V4 Pro, Qwen, Z-Image) | `FAL_KEY` |
|
||||
| **Image generation** | Generate images via FAL (9 models: FLUX 2 Klein/Pro, GPT-Image 1.5/2, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo) | `FAL_KEY` |
|
||||
| **Text-to-speech** | Convert text to speech via OpenAI TTS | `VOICE_TOOLS_OPENAI_KEY`, `ELEVENLABS_API_KEY` |
|
||||
| **Browser automation** | Control cloud browsers via Browser Use | `BROWSER_USE_API_KEY`, `BROWSERBASE_API_KEY` |
|
||||
|
||||
|
||||
@ -48,7 +48,7 @@ hermes tools
|
||||
hermes tools
|
||||
```
|
||||
|
||||
Common toolsets include `web`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `homeassistant`, and `rl`.
|
||||
Common toolsets include `web`, `search`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `homeassistant`, `messaging`, `spotify`, `discord`, `discord_admin`, `debugging`, `safe`, and `rl`.
|
||||
|
||||
See [Toolsets Reference](/docs/reference/toolsets-reference) for the full set, including platform presets such as `hermes-cli`, `hermes-telegram`, and dynamic MCP toolsets like `mcp-<server>`.
|
||||
|
||||
|
||||
@ -23,6 +23,8 @@ This starts a local web server and opens `http://127.0.0.1:9119` in your browser
|
||||
| `--port` | `9119` | Port to run the web server on |
|
||||
| `--host` | `127.0.0.1` | Bind address |
|
||||
| `--no-open` | — | Don't auto-open the browser |
|
||||
| `--insecure` | off | Allow binding to non-localhost hosts (**DANGEROUS** — exposes API keys on the network; pair with a firewall and strong auth) |
|
||||
| `--tui` | off | Expose the in-browser Chat tab (embedded `hermes --tui` via PTY/WebSocket). Alternatively set `HERMES_DASHBOARD_TUI=1`. |
|
||||
|
||||
```bash
|
||||
# Custom port
|
||||
|
||||
@ -90,7 +90,8 @@ Hermes → BlueBubbles REST API → Messages.app → iMessage
|
||||
| `BLUEBUBBLES_HOME_CHANNEL` | No | — | Phone/email for cron delivery |
|
||||
| `BLUEBUBBLES_ALLOWED_USERS` | No | — | Comma-separated authorized users |
|
||||
| `BLUEBUBBLES_ALLOW_ALL_USERS` | No | `false` | Allow all users |
|
||||
| `BLUEBUBBLES_SEND_READ_RECEIPTS` | No | `true` | Auto-mark messages as read |
|
||||
|
||||
Auto-marking messages as read is controlled by the `send_read_receipts` key under `platforms.bluebubbles.extra` in `~/.hermes/config.yaml` (default: `true`). There is no corresponding environment variable.
|
||||
|
||||
## Features
|
||||
|
||||
|
||||
@ -123,6 +123,13 @@ DINGTALK_ALLOWED_USERS=user-id-1
|
||||
|
||||
# Multiple allowed users (comma-separated)
|
||||
# DINGTALK_ALLOWED_USERS=user-id-1,user-id-2
|
||||
|
||||
# Optional: group-chat gating (mirrors Slack/Telegram/Discord/WhatsApp)
|
||||
# DINGTALK_REQUIRE_MENTION=true
|
||||
# DINGTALK_FREE_RESPONSE_CHATS=cidABC==,cidDEF==
|
||||
# DINGTALK_MENTION_PATTERNS=^小马
|
||||
# DINGTALK_HOME_CHANNEL=cidXXXX==
|
||||
# DINGTALK_ALLOW_ALL_USERS=true
|
||||
```
|
||||
|
||||
Optional behavior settings in `~/.hermes/config.yaml`:
|
||||
|
||||
@ -292,7 +292,7 @@ Discord behavior is controlled through two files: **`~/.hermes/.env`** for crede
|
||||
| `DISCORD_ALLOW_MENTION_REPLIED_USER` | No | `true` | When `true` (default), replying to a message pings the original author. |
|
||||
| `DISCORD_PROXY` | No | — | Proxy URL for Discord connections (HTTP, WebSocket, REST). Overrides `HTTPS_PROXY`/`ALL_PROXY`. Supports `http://`, `https://`, and `socks5://` schemes. |
|
||||
| `HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS` | No | `0.6` | Grace window the adapter waits before flushing a queued text chunk. Useful for smoothing streamed output. |
|
||||
| `HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS` | No | `0.1` | Delay between split chunks when a single message exceeds Discord's length limit. |
|
||||
| `HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS` | No | `2.0` | Delay between split chunks when a single message exceeds Discord's length limit. |
|
||||
|
||||
### Config File (`config.yaml`)
|
||||
|
||||
|
||||
@ -51,8 +51,9 @@ QQ_CLIENT_SECRET=your-app-secret
|
||||
| `QQBOT_HOME_CHANNEL` | OpenID for cron/notification delivery | — |
|
||||
| `QQBOT_HOME_CHANNEL_NAME` | Display name for home channel | `Home` |
|
||||
| `QQ_ALLOWED_USERS` | Comma-separated user OpenIDs for DM access | open (all users) |
|
||||
| `QQ_GROUP_ALLOWED_USERS` | Comma-separated group OpenIDs for group access | — |
|
||||
| `QQ_ALLOW_ALL_USERS` | Set to `true` to allow all DMs | `false` |
|
||||
| `QQ_SANDBOX` | Route requests to the QQ sandbox gateway for development testing | `false` |
|
||||
| `QQ_PORTAL_HOST` | Override the QQ portal host (set to `sandbox.q.qq.com` for sandbox routing) | `q.qq.com` |
|
||||
| `QQ_STT_API_KEY` | API key for voice-to-text provider | — |
|
||||
| `QQ_STT_BASE_URL` | Base URL for STT provider | `https://open.bigmodel.cn/api/coding/paas/v4` |
|
||||
| `QQ_STT_MODEL` | STT model name | `glm-asr` |
|
||||
|
||||
@ -179,15 +179,15 @@ Add the following to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
TELEGRAM_WEBHOOK_URL=https://my-app.fly.dev/telegram
|
||||
TELEGRAM_WEBHOOK_SECRET="$(openssl rand -hex 32)" # required
|
||||
# TELEGRAM_WEBHOOK_PORT=8443 # optional, default 8443
|
||||
# TELEGRAM_WEBHOOK_SECRET=mysecret # optional, recommended
|
||||
```
|
||||
|
||||
| Variable | Required | Description |
|
||||
|----------|----------|-------------|
|
||||
| `TELEGRAM_WEBHOOK_URL` | Yes | Public HTTPS URL where Telegram will send updates. The URL path is auto-extracted (e.g., `/telegram` from the example above). |
|
||||
| `TELEGRAM_WEBHOOK_SECRET` | **Yes** (when `TELEGRAM_WEBHOOK_URL` is set) | Secret token that Telegram echoes in every webhook request for verification. The gateway refuses to start without it — see [GHSA-3vpc-7q5r-276h](https://github.com/NousResearch/hermes-agent/security/advisories/GHSA-3vpc-7q5r-276h). Generate with `openssl rand -hex 32`. |
|
||||
| `TELEGRAM_WEBHOOK_PORT` | No | Local port the webhook server listens on (default: `8443`). |
|
||||
| `TELEGRAM_WEBHOOK_SECRET` | No | Secret token for verifying that updates actually come from Telegram. **Strongly recommended** for production deployments. |
|
||||
|
||||
When `TELEGRAM_WEBHOOK_URL` is set, the gateway starts an HTTP webhook server instead of polling. When unset, polling mode is used — no behavior change from previous versions.
|
||||
|
||||
|
||||
@ -60,9 +60,11 @@ WECOM_CALLBACK_ALLOWED_USERS=user1,user2
|
||||
### 3. Start the Gateway
|
||||
|
||||
```bash
|
||||
hermes gateway start
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
(Use `hermes gateway start` only after `hermes gateway install` has registered the systemd/launchd service.)
|
||||
|
||||
The callback adapter starts an HTTP server on the configured port. WeCom will verify the callback URL via a GET request, then begin sending messages via POST.
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
@ -337,5 +337,5 @@ hermes chat -q "Send 'Hello from CLI' to yuanbao:group:group_code"
|
||||
|
||||
- [Messaging Gateway Overview](./index.md)
|
||||
- [Slash Commands Reference](/docs/reference/slash-commands.md)
|
||||
- [Cron Jobs](/docs/user-guide/features/cron-jobs.md)
|
||||
- [Background Tasks](/docs/guides/tips.md#background-tasks)
|
||||
- [Cron Jobs](/docs/user-guide/features/cron.md)
|
||||
- [Background Sessions](/docs/user-guide/cli#background-sessions)
|
||||
@ -70,7 +70,7 @@ coder setup # configure coder's settings
|
||||
coder gateway start # start coder's gateway
|
||||
coder doctor # check coder's health
|
||||
coder skills list # list coder's skills
|
||||
coder config set model.model anthropic/claude-sonnet-4
|
||||
coder config set model.default anthropic/claude-sonnet-4
|
||||
```
|
||||
|
||||
The alias works with every hermes subcommand — it's just `hermes -p <name>` under the hood.
|
||||
@ -173,7 +173,7 @@ Each profile has its own:
|
||||
- **`SOUL.md`** — personality and instructions
|
||||
|
||||
```bash
|
||||
coder config set model.model anthropic/claude-sonnet-4
|
||||
coder config set model.default anthropic/claude-sonnet-4
|
||||
echo "You are a focused coding assistant." > ~/.hermes/profiles/coder/SOUL.md
|
||||
```
|
||||
|
||||
|
||||
@ -119,7 +119,7 @@ The following patterns trigger approval prompts (defined in `tools/approval.py`)
|
||||
| `DELETE FROM` (without WHERE) | SQL DELETE without WHERE |
|
||||
| `TRUNCATE TABLE` | SQL TRUNCATE |
|
||||
| `> /etc/` | Overwrite system config |
|
||||
| `systemctl stop/disable/mask` | Stop/disable system services |
|
||||
| `systemctl stop/restart/disable/mask` | Stop/restart/disable system services |
|
||||
| `kill -9 -1` | Kill all processes |
|
||||
| `pkill -9` | Force kill processes |
|
||||
| Fork bomb patterns | Fork bombs |
|
||||
|
||||
@ -124,7 +124,7 @@ display:
|
||||
```
|
||||
|
||||
:::tip
|
||||
Session IDs follow the format `YYYYMMDD_HHMMSS_<8-char-hex>`, e.g. `20250305_091523_a1b2c3d4`. You can resume by ID or by title — both work with `-c` and `-r`.
|
||||
Session IDs follow the format `YYYYMMDD_HHMMSS_<hex>` — CLI/TUI sessions use a 6-char hex suffix (e.g. `20250305_091523_a1b2c3`), gateway sessions use an 8-char suffix (e.g. `20250305_091523_a1b2c3d4`). You can resume by ID (full or unique prefix) or by title — both work with `-c` and `-r`.
|
||||
:::
|
||||
|
||||
## Session Naming
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Apple Notes — Manage Apple Notes via the memo CLI on macOS (create, view, search, edit)"
|
||||
title: "Apple Notes — Manage Apple Notes via memo CLI: create, search, edit"
|
||||
sidebar_label: "Apple Notes"
|
||||
description: "Manage Apple Notes via the memo CLI on macOS (create, view, search, edit)"
|
||||
description: "Manage Apple Notes via memo CLI: create, search, edit"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Apple Notes
|
||||
|
||||
Manage Apple Notes via the memo CLI on macOS (create, view, search, edit).
|
||||
Manage Apple Notes via memo CLI: create, search, edit.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Apple Reminders — Manage Apple Reminders via remindctl CLI (list, add, complete, delete)"
|
||||
title: "Apple Reminders — Apple Reminders via remindctl: add, list, complete"
|
||||
sidebar_label: "Apple Reminders"
|
||||
description: "Manage Apple Reminders via remindctl CLI (list, add, complete, delete)"
|
||||
description: "Apple Reminders via remindctl: add, list, complete"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Apple Reminders
|
||||
|
||||
Manage Apple Reminders via remindctl CLI (list, add, complete, delete).
|
||||
Apple Reminders via remindctl: add, list, complete.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Findmy — Track Apple devices and AirTags via FindMy"
|
||||
title: "Findmy — Track Apple devices/AirTags via FindMy"
|
||||
sidebar_label: "Findmy"
|
||||
description: "Track Apple devices and AirTags via FindMy"
|
||||
description: "Track Apple devices/AirTags via FindMy"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Findmy
|
||||
|
||||
Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture.
|
||||
Track Apple devices/AirTags via FindMy.app on macOS.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Claude Code — Delegate coding tasks to Claude Code (Anthropic's CLI agent)"
|
||||
title: "Claude Code — Delegate coding to Claude Code CLI (features, PRs)"
|
||||
sidebar_label: "Claude Code"
|
||||
description: "Delegate coding tasks to Claude Code (Anthropic's CLI agent)"
|
||||
description: "Delegate coding to Claude Code CLI (features, PRs)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Claude Code
|
||||
|
||||
Delegate coding tasks to Claude Code (Anthropic's CLI agent). Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed.
|
||||
Delegate coding to Claude Code CLI (features, PRs).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Codex — Delegate coding tasks to OpenAI Codex CLI agent"
|
||||
title: "Codex — Delegate coding to OpenAI Codex CLI (features, PRs)"
|
||||
sidebar_label: "Codex"
|
||||
description: "Delegate coding tasks to OpenAI Codex CLI agent"
|
||||
description: "Delegate coding to OpenAI Codex CLI (features, PRs)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Codex
|
||||
|
||||
Delegate coding tasks to OpenAI Codex CLI agent. Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository.
|
||||
Delegate coding to OpenAI Codex CLI (features, PRs).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -32,6 +32,15 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
Delegate coding tasks to [Codex](https://github.com/openai/codex) via the Hermes terminal. Codex is OpenAI's autonomous coding agent CLI.
|
||||
|
||||
## When to use
|
||||
|
||||
- Building features
|
||||
- Refactoring
|
||||
- PR reviews
|
||||
- Batch issue fixing
|
||||
|
||||
Requires the codex CLI and a git repository.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Codex installed: `npm install -g @openai/codex`
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Hermes Agent"
|
||||
title: "Hermes Agent — Configure, extend, or contribute to Hermes Agent"
|
||||
sidebar_label: "Hermes Agent"
|
||||
description: "Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, pr..."
|
||||
description: "Configure, extend, or contribute to Hermes Agent"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Hermes Agent
|
||||
|
||||
Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, profiles, and a concise contributor reference. Load this skill when helping users configure Hermes, troubleshoot issues, spawn agent instances, or make code contributions.
|
||||
Configure, extend, or contribute to Hermes Agent.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -132,7 +132,7 @@ hermes tools disable NAME Disable a toolset
|
||||
|
||||
hermes skills list List installed skills
|
||||
hermes skills search QUERY Search the skills hub
|
||||
hermes skills install ID Install a skill
|
||||
hermes skills install ID Install a skill (ID can be a hub identifier OR a direct https://…/SKILL.md URL; pass --name to override when frontmatter has no name)
|
||||
hermes skills inspect ID Preview without installing
|
||||
hermes skills config Enable/disable skills per platform
|
||||
hermes skills check Check for updates
|
||||
@ -419,6 +419,63 @@ Tool changes take effect on `/reset` (new session). They do NOT apply mid-conver
|
||||
|
||||
---
|
||||
|
||||
## Security & Privacy Toggles
|
||||
|
||||
Common "why is Hermes doing X to my output / tool calls / commands?" toggles — and the exact commands to change them. Most of these need a fresh session (`/reset` in chat, or start a new `hermes` invocation) because they're read once at startup.
|
||||
|
||||
### Secret redaction in tool output
|
||||
|
||||
Secret redaction is **off by default** — tool output (terminal stdout, `read_file`, web content, subagent summaries, etc.) passes through unmodified. If the user wants Hermes to auto-mask strings that look like API keys, tokens, and secrets before they enter the conversation context and logs:
|
||||
|
||||
```bash
|
||||
hermes config set security.redact_secrets true # enable globally
|
||||
```
|
||||
|
||||
**Restart required.** `security.redact_secrets` is snapshotted at import time — toggling it mid-session (e.g. via `export HERMES_REDACT_SECRETS=true` from a tool call) will NOT take effect for the running process. Tell the user to run `hermes config set security.redact_secrets true` in a terminal, then start a new session. This is deliberate — it prevents an LLM from flipping the toggle on itself mid-task.
|
||||
|
||||
Disable again with:
|
||||
```bash
|
||||
hermes config set security.redact_secrets false
|
||||
```
|
||||
|
||||
### PII redaction in gateway messages
|
||||
|
||||
Separate from secret redaction. When enabled, the gateway hashes user IDs and strips phone numbers from the session context before it reaches the model:
|
||||
|
||||
```bash
|
||||
hermes config set privacy.redact_pii true # enable
|
||||
hermes config set privacy.redact_pii false # disable (default)
|
||||
```
|
||||
|
||||
### Command approval prompts
|
||||
|
||||
By default (`approvals.mode: manual`), Hermes prompts the user before running shell commands flagged as destructive (`rm -rf`, `git reset --hard`, etc.). The modes are:
|
||||
|
||||
- `manual` — always prompt (default)
|
||||
- `smart` — use an auxiliary LLM to auto-approve low-risk commands, prompt on high-risk
|
||||
- `off` — skip all approval prompts (equivalent to `--yolo`)
|
||||
|
||||
```bash
|
||||
hermes config set approvals.mode smart # recommended middle ground
|
||||
hermes config set approvals.mode off # bypass everything (not recommended)
|
||||
```
|
||||
|
||||
Per-invocation bypass without changing config:
|
||||
- `hermes --yolo …`
|
||||
- `export HERMES_YOLO_MODE=1`
|
||||
|
||||
Note: YOLO / `approvals.mode: off` does NOT turn off secret redaction. They are independent.
|
||||
|
||||
### Shell hooks allowlist
|
||||
|
||||
Some shell-hook integrations require explicit allowlisting before they fire. Managed via `~/.hermes/shell-hooks-allowlist.json` — prompted interactively the first time a hook wants to run.
|
||||
|
||||
### Disabling the web/browser/image-gen tools
|
||||
|
||||
To keep the model away from network or media tools entirely, open `hermes tools` and toggle per-platform. Takes effect on next session (`/reset`). See the Tools & Skills section above.
|
||||
|
||||
---
|
||||
|
||||
## Voice & Transcription
|
||||
|
||||
### STT (Voice → Text)
|
||||
@ -617,6 +674,7 @@ For occasional contributors and PR authors. Full developer docs: https://hermes-
|
||||
|
||||
### Project Layout
|
||||
|
||||
<!-- ascii-guard-ignore -->
|
||||
```
|
||||
hermes-agent/
|
||||
├── run_agent.py # AIAgent — core conversation loop
|
||||
@ -637,6 +695,7 @@ hermes-agent/
|
||||
├── tests/ # ~3000 pytest tests
|
||||
└── website/ # Docusaurus docs site
|
||||
```
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
|
||||
Config: `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys).
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Opencode"
|
||||
title: "Opencode — Delegate coding to OpenCode CLI (features, PR review)"
|
||||
sidebar_label: "Opencode"
|
||||
description: "Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions"
|
||||
description: "Delegate coding to OpenCode CLI (features, PR review)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Opencode
|
||||
|
||||
Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions. Requires the opencode CLI installed and authenticated.
|
||||
Delegate coding to OpenCode CLI (features, PR review).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Architecture Diagram"
|
||||
title: "Architecture Diagram — Dark-themed SVG architecture/cloud/infra diagrams as HTML"
|
||||
sidebar_label: "Architecture Diagram"
|
||||
description: "Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics"
|
||||
description: "Dark-themed SVG architecture/cloud/infra diagrams as HTML"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Architecture Diagram
|
||||
|
||||
Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics. Semantic component colors (cyan=frontend, emerald=backend, violet=database, amber=cloud/AWS, rose=security, orange=message bus), JetBrains Mono font, grid background. Best suited for software architecture, cloud/VPC topology, microservice maps, service-mesh diagrams, database + API layer diagrams, security groups, message buses — anything that fits a tech-infra deck with a dark aesthetic. If a more specialized diagramming skill exists for the subject (scientific, educational, hand-drawn, animated, etc.), prefer that — otherwise this skill can also serve as a general-purpose SVG diagram fallback. Based on Cocoon AI's architecture-diagram-generator (MIT).
|
||||
Dark-themed SVG architecture/cloud/infra diagrams as HTML.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Ascii Art"
|
||||
title: "Ascii Art — ASCII art: pyfiglet, cowsay, boxes, image-to-ascii"
|
||||
sidebar_label: "Ascii Art"
|
||||
description: "Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii"
|
||||
description: "ASCII art: pyfiglet, cowsay, boxes, image-to-ascii"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Ascii Art
|
||||
|
||||
Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required.
|
||||
ASCII art: pyfiglet, cowsay, boxes, image-to-ascii.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Ascii Video — Production pipeline for ASCII art video — any format"
|
||||
title: "Ascii Video — ASCII video: convert video/audio to colored ASCII MP4/GIF"
|
||||
sidebar_label: "Ascii Video"
|
||||
description: "Production pipeline for ASCII art video — any format"
|
||||
description: "ASCII video: convert video/audio to colored ASCII MP4/GIF"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Ascii Video
|
||||
|
||||
Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output.
|
||||
ASCII video: convert video/audio to colored ASCII MP4/GIF.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -25,6 +25,14 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# ASCII Video Production Pipeline
|
||||
|
||||
## When to use
|
||||
|
||||
Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output.
|
||||
|
||||
## What's inside
|
||||
|
||||
Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering.
|
||||
|
||||
## Creative Standard
|
||||
|
||||
This is visual art. ASCII characters are the medium; cinema is the standard.
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Baoyu Comic — Knowledge comic creator supporting multiple art styles and tones"
|
||||
title: "Baoyu Comic — Knowledge comics (知识漫画): educational, biography, tutorial"
|
||||
sidebar_label: "Baoyu Comic"
|
||||
description: "Knowledge comic creator supporting multiple art styles and tones"
|
||||
description: "Knowledge comics (知识漫画): educational, biography, tutorial"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Baoyu Comic
|
||||
|
||||
Knowledge comic creator supporting multiple art styles and tones. Creates original educational comics with detailed panel layouts and sequential image generation. Use when user asks to create "知识漫画", "教育漫画", "biography comic", "tutorial comic", or "Logicomix-style comic".
|
||||
Knowledge comics (知识漫画): educational, biography, tutorial.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Baoyu Infographic — Generate professional infographics with 21 layout types and 21 visual styles"
|
||||
title: "Baoyu Infographic — Infographics: 21 layouts x 21 styles (信息图, 可视化)"
|
||||
sidebar_label: "Baoyu Infographic"
|
||||
description: "Generate professional infographics with 21 layout types and 21 visual styles"
|
||||
description: "Infographics: 21 layouts x 21 styles (信息图, 可视化)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Baoyu Infographic
|
||||
|
||||
Generate professional infographics with 21 layout types and 21 visual styles. Analyzes content, recommends layout×style combinations, and generates publication-ready infographics. Use when user asks to create "infographic", "visual summary", "信息图", "可视化", or "高密度信息大图".
|
||||
Infographics: 21 layouts x 21 styles (信息图, 可视化).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -139,6 +139,7 @@ If a shortcut has **Prompt Notes**, append them to the generated prompt (Step 5)
|
||||
|
||||
## Output Structure
|
||||
|
||||
<!-- ascii-guard-ignore -->
|
||||
```
|
||||
infographic/{topic-slug}/
|
||||
├── source-{slug}.{ext}
|
||||
@ -147,6 +148,7 @@ infographic/{topic-slug}/
|
||||
├── prompts/infographic.md
|
||||
└── infographic.png
|
||||
```
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
|
||||
Slug: 2-4 words kebab-case from topic. Conflict: append `-YYYYMMDD-HHMMSS`.
|
||||
|
||||
|
||||
@ -0,0 +1,608 @@
|
||||
---
|
||||
title: "Claude Design — Design one-off HTML artifacts (landing, deck, prototype)"
|
||||
sidebar_label: "Claude Design"
|
||||
description: "Design one-off HTML artifacts (landing, deck, prototype)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Claude Design
|
||||
|
||||
Design one-off HTML artifacts (landing, deck, prototype).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Source | Bundled (installed by default) |
|
||||
| Path | `skills/creative/claude-design` |
|
||||
| Version | `1.0.0` |
|
||||
| Author | BadTechBandit |
|
||||
| License | MIT |
|
||||
| Tags | `design`, `html`, `prototype`, `ux`, `ui`, `creative`, `artifact`, `deck`, `motion`, `design-system` |
|
||||
| Related skills | [`design-md`](/docs/user-guide/skills/bundled/creative/creative-design-md), [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
:::info
|
||||
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
|
||||
:::
|
||||
|
||||
# Claude Design for CLI/API Agents
|
||||
|
||||
Use this skill when the user asks for design work that would normally fit Claude Design, but the agent is running in a CLI/API environment instead of the hosted Claude Design web UI.
|
||||
|
||||
The goal is to preserve Claude Design's useful design behavior and taste while removing hosted-tool plumbing that does not exist in normal agent environments.
|
||||
|
||||
**Before starting, check for other web-design skills like `popular-web-designs` (ready-to-paste design systems for Stripe, Linear, Vercel, Notion, etc.) and `design-md` (Google's DESIGN.md token spec format).** If the user wants a known brand's look, load `popular-web-designs` alongside this one and let it supply the visual vocabulary. If the deliverable is a token spec file rather than a rendered artifact, use `design-md` instead. Full decision table below.
|
||||
|
||||
## When To Use This Skill vs `popular-web-designs` vs `design-md`
|
||||
|
||||
Hermes has three design-related skills under `skills/creative/`. They do different jobs — load the right one (or combine them):
|
||||
|
||||
| Skill | What it gives you | Use when the user wants... |
|
||||
|---|---|---|
|
||||
| **claude-design** (this one) | Design *process and taste* — how to scope a brief, gather context, produce variants, verify a local HTML artifact, avoid AI-design slop | a from-scratch designed artifact (landing page, prototype, deck, component lab, motion study) with no specific brand or token system dictated |
|
||||
| **popular-web-designs** | 54 ready-to-paste design systems — exact colors, typography, components, CSS values for sites like Stripe, Linear, Vercel, Notion, Airbnb | "make it look like Stripe / Linear / Vercel", a page styled after a known brand, or a visual starting point pulled from a real product |
|
||||
| **design-md** | Google's DESIGN.md spec format — author/validate/diff/export design-token files, WCAG contrast checking, Tailwind/DTCG export | a formal, persistent, machine-readable design-system *spec file* (tokens + rationale) that lives in a repo and gets consumed by agents over time |
|
||||
|
||||
Rule of thumb:
|
||||
|
||||
- **Process + taste, one-off artifact** → claude-design
|
||||
- **Match a known brand's look** → popular-web-designs (and let claude-design drive the process)
|
||||
- **Author the tokens spec itself** → design-md
|
||||
|
||||
These compose: use `popular-web-designs` for the visual vocabulary, `claude-design` for how to turn a brief into a thoughtful local HTML file, and `design-md` when the output is the token file rather than a rendered artifact.
|
||||
|
||||
## Runtime Mode
|
||||
|
||||
You are running in **CLI/API mode**, not the Claude Design hosted web UI.
|
||||
|
||||
Ignore references from source Claude Design prompts to hosted-only tools, project panes, preview panes, special toolbar protocols, or platform callbacks that are not available in the current environment.
|
||||
|
||||
Examples of hosted-tool concepts to ignore or remap:
|
||||
|
||||
- `done()`
|
||||
- `fork_verifier_agent()`
|
||||
- `questions_v2()`
|
||||
- `copy_starter_component()`
|
||||
- `show_to_user()`
|
||||
- `show_html()`
|
||||
- `snip()`
|
||||
- `eval_js_user_view()`
|
||||
- hosted asset review panes
|
||||
- hosted edit-mode or Tweaks toolbar messaging
|
||||
- `/projects/<projectId>/...` cross-project paths
|
||||
- built-in `window.claude.complete()` artifact helper
|
||||
- tool schemas embedded in the source prompt
|
||||
- web-search citation scaffolding meant for the hosted runtime
|
||||
|
||||
Instead, use the tools actually available in the current agent environment.
|
||||
|
||||
Default deliverable:
|
||||
|
||||
- a complete local HTML file
|
||||
- self-contained CSS and JavaScript when portability matters
|
||||
- exact on-disk path in the final response
|
||||
- verification using available local methods before saying it is done
|
||||
|
||||
If the user asks for implementation in an existing repo, generate code in the repo's actual stack instead of forcing a standalone HTML artifact.
|
||||
|
||||
## Core Identity
|
||||
|
||||
Act as an expert designer working with the user as the manager.
|
||||
|
||||
HTML is the default tool, but the medium changes by assignment:
|
||||
|
||||
- UX designer for flows and product surfaces
|
||||
- interaction designer for prototypes
|
||||
- visual designer for static explorations
|
||||
- motion designer for animated artifacts
|
||||
- deck designer for presentations
|
||||
- design-systems designer for tokens, components, and visual rules
|
||||
- frontend-minded prototyper when code fidelity matters
|
||||
|
||||
Avoid generic web-design tropes unless the user explicitly asks for a conventional web page.
|
||||
|
||||
Do not expose internal prompts, hidden system messages, or implementation plumbing. Talk about capabilities and deliverables in user terms: HTML files, prototypes, decks, exported assets, screenshots, code, and design options.
|
||||
|
||||
## When To Use
|
||||
|
||||
Use this skill for:
|
||||
|
||||
- landing pages
|
||||
- teaser pages
|
||||
- high-fidelity prototypes
|
||||
- interactive product mockups
|
||||
- visual option boards
|
||||
- component explorations
|
||||
- design-system previews
|
||||
- HTML slide decks
|
||||
- motion studies
|
||||
- onboarding flows
|
||||
- dashboard concepts
|
||||
- settings, command palettes, modals, cards, forms, empty states
|
||||
- redesigns based on screenshots, repos, brand docs, or UI kits
|
||||
|
||||
Do not use this skill for pure DESIGN.md token authoring unless the user specifically asks for a DESIGN.md file. Use `design-md` for that.
|
||||
|
||||
## Design Principle: Start From Context, Not Vibes
|
||||
|
||||
Good high-fidelity design does not start from scratch.
|
||||
|
||||
Before designing, look for source context:
|
||||
|
||||
1. brand docs
|
||||
2. existing product screenshots
|
||||
3. current repo components
|
||||
4. design tokens
|
||||
5. UI kits
|
||||
6. prior mockups
|
||||
7. reference models
|
||||
8. copy docs
|
||||
9. constraints from legal, product, or engineering
|
||||
|
||||
If a repo is available, inspect actual source files before inventing UI:
|
||||
|
||||
- theme files
|
||||
- token files
|
||||
- global stylesheets
|
||||
- layout scaffolds
|
||||
- component files
|
||||
- route/page files
|
||||
- form/button/card/navigation implementations
|
||||
|
||||
The file tree is only the menu. Read the files that define the visual vocabulary before designing.
|
||||
|
||||
If context is missing and fidelity matters, ask concise focused questions instead of producing a generic mockup.
|
||||
|
||||
## Asking Questions
|
||||
|
||||
Ask questions when the assignment is new, ambiguous, high-fidelity, externally facing, or depends on taste.
|
||||
|
||||
Keep questions short. Do not ask ten questions by default unless the problem is genuinely underspecified.
|
||||
|
||||
Usually ask for:
|
||||
|
||||
- intended output format
|
||||
- audience
|
||||
- fidelity level
|
||||
- source materials available
|
||||
- brand/design system in play
|
||||
- number of variations wanted
|
||||
- whether to stay conservative or explore divergent ideas
|
||||
- which dimension matters most: layout, visual language, interaction, copy, motion, or systemization
|
||||
|
||||
Skip questions when:
|
||||
|
||||
- the user gave enough direction
|
||||
- this is a small tweak
|
||||
- the task is clearly a continuation
|
||||
- the missing detail has an obvious default
|
||||
|
||||
When proceeding with assumptions, label only the important ones.
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Understand the brief**
|
||||
- What is being designed?
|
||||
- Who is it for?
|
||||
- What artifact should exist at the end?
|
||||
- What constraints are locked?
|
||||
|
||||
2. **Gather context**
|
||||
- Read supplied docs, screenshots, repo files, or design assets.
|
||||
- Identify the visual vocabulary before writing code.
|
||||
|
||||
3. **Define the design system for this artifact**
|
||||
- colors
|
||||
- type
|
||||
- spacing
|
||||
- radii
|
||||
- shadows or elevation
|
||||
- motion posture
|
||||
- component treatment
|
||||
- interaction rules
|
||||
|
||||
4. **Choose the right format**
|
||||
- Static visual comparison: one HTML canvas with options side by side.
|
||||
- Interaction/flow: clickable prototype.
|
||||
- Presentation: fixed-size HTML deck with slide navigation.
|
||||
- Component exploration: component lab with variants.
|
||||
- Motion: timeline or state-based animation.
|
||||
|
||||
5. **Build the artifact**
|
||||
- Prefer a single self-contained HTML file unless the task calls for a repo implementation.
|
||||
- Preserve prior versions for major revisions.
|
||||
- Avoid unnecessary dependencies.
|
||||
|
||||
6. **Verify**
|
||||
- Confirm files exist.
|
||||
- Run any available syntax/static checks.
|
||||
- If browser tools are available, open the file and check console errors.
|
||||
- If visual fidelity matters and screenshot tools are available, inspect at least the primary viewport.
|
||||
|
||||
7. **Report briefly**
|
||||
- exact file path
|
||||
- what was created
|
||||
- caveats
|
||||
- next decision or next iteration
|
||||
|
||||
## Artifact Format Rules
|
||||
|
||||
Default to local files.
|
||||
|
||||
For standalone artifacts:
|
||||
|
||||
- create a descriptive filename, e.g. `Landing Page.html`, `Command Palette Prototype.html`, `Design System Board.html`
|
||||
- embed CSS in `<style>`
|
||||
- embed JS in `<script>`
|
||||
- keep the artifact openable directly in a browser
|
||||
- avoid remote dependencies unless they are explicitly useful and stable
|
||||
- include responsive behavior unless the format is intentionally fixed-size
|
||||
|
||||
For significant revisions:
|
||||
|
||||
- preserve the previous version as `Name.html`
|
||||
- create `Name v2.html`, `Name v3.html`, etc.
|
||||
- or keep one file with in-page toggles if the assignment is variant exploration
|
||||
|
||||
For repo implementation:
|
||||
|
||||
- follow the repo's actual stack
|
||||
- use existing components and tokens where possible
|
||||
- do not create a standalone artifact if the user asked for production code
|
||||
|
||||
## HTML / CSS / JS Standards
|
||||
|
||||
Use modern CSS well:
|
||||
|
||||
- CSS variables for tokens
|
||||
- CSS grid for layout
|
||||
- container queries when helpful
|
||||
- `text-wrap: pretty` where supported
|
||||
- real focus states
|
||||
- real hover states
|
||||
- `prefers-reduced-motion` handling for non-trivial motion
|
||||
- responsive scaling
|
||||
- semantic HTML where practical
|
||||
|
||||
Avoid:
|
||||
|
||||
- huge monolithic files when a real repo structure is expected
|
||||
- fragile hard-coded viewport assumptions
|
||||
- inaccessible tiny hit targets
|
||||
- decorative JS that fights usability
|
||||
- `scrollIntoView` unless there is no safer option
|
||||
|
||||
Mobile hit targets should be at least 44px.
|
||||
|
||||
For print documents, text should be at least 12pt.
|
||||
|
||||
For 1920×1080 slide decks, text should generally be 24px or larger.
|
||||
|
||||
## React Guidance for Standalone HTML
|
||||
|
||||
Use plain HTML/CSS/JS by default.
|
||||
|
||||
Use React only when:
|
||||
|
||||
- the artifact needs meaningful state
|
||||
- variants/toggles are easier as components
|
||||
- interaction complexity warrants it
|
||||
- the target implementation is React/Next.js and fidelity matters
|
||||
|
||||
If using React from CDN in standalone HTML:
|
||||
|
||||
- pin exact versions
|
||||
- avoid unpinned `react@18` style URLs
|
||||
- avoid `type="module"` unless necessary
|
||||
- avoid multiple global objects named `styles`
|
||||
- give global style objects specific names, e.g. `commandPaletteStyles`, `deckStyles`
|
||||
- if splitting Babel scripts, explicitly attach shared components to `window`
|
||||
|
||||
If building inside a real repo, use the repo's package manager and component architecture instead.
|
||||
|
||||
## Deck Rules
|
||||
|
||||
For slide decks, use a fixed-size canvas and scale it to fit the viewport.
|
||||
|
||||
Default slide size: 1920×1080, 16:9.
|
||||
|
||||
Requirements:
|
||||
|
||||
- keyboard navigation
|
||||
- visible slide count
|
||||
- localStorage persistence for current slide
|
||||
- print-friendly layout when practical
|
||||
- screen labels or stable IDs for important slides
|
||||
- no speaker notes unless the user explicitly asks
|
||||
|
||||
Do not hand-wave a deck as markdown bullets. Create a designed artifact if asked for a deck.
|
||||
|
||||
Use 1–2 background colors max unless the brand system requires more.
|
||||
|
||||
Keep slides sparse. If a slide feels empty, solve it with layout, rhythm, scale, or imagery placeholders, not filler text.
|
||||
|
||||
## Prototype Rules
|
||||
|
||||
For interactive prototypes:
|
||||
|
||||
- make the primary path clickable
|
||||
- include key states: default, hover/focus, loading, empty, error, success where relevant
|
||||
- expose variations with in-page controls when useful
|
||||
- keep controls out of the final composition unless they are intentionally part of the prototype
|
||||
- persist important state in localStorage when refresh continuity matters
|
||||
|
||||
If the prototype is meant to model a product flow, design the flow, not just the first screen.
|
||||
|
||||
## Variation Rules
|
||||
|
||||
When exploring, default to at least three options:
|
||||
|
||||
1. **Conservative** — closest to existing patterns / lowest risk
|
||||
2. **Strong-fit** — best interpretation of the brief
|
||||
3. **Divergent** — more novel, useful for discovering taste boundaries
|
||||
|
||||
Variations can explore:
|
||||
|
||||
- layout
|
||||
- hierarchy
|
||||
- type scale
|
||||
- density
|
||||
- color posture
|
||||
- surface treatment
|
||||
- motion
|
||||
- interaction model
|
||||
- copy structure
|
||||
- component shape
|
||||
|
||||
Do not create variations that are merely color swaps unless color is the actual question.
|
||||
|
||||
When the user picks a direction, consolidate. Do not leave the project as a pile of options forever.
|
||||
|
||||
## Tweakable Designs in CLI/API Mode
|
||||
|
||||
The hosted Claude Design edit-mode toolbar does not exist here.
|
||||
|
||||
Still preserve the idea: when useful, add in-page controls called `Tweaks`.
|
||||
|
||||
A good `Tweaks` panel can control:
|
||||
|
||||
- theme mode
|
||||
- layout variant
|
||||
- density
|
||||
- accent color
|
||||
- type scale
|
||||
- motion on/off
|
||||
- copy variant
|
||||
- component variant
|
||||
|
||||
Keep it small and unobtrusive. The design should look final when tweaks are hidden.
|
||||
|
||||
Persist tweak values with localStorage when helpful.
|
||||
|
||||
## Content Discipline
|
||||
|
||||
Do not add filler content.
|
||||
|
||||
Every element must earn its place.
|
||||
|
||||
Avoid:
|
||||
|
||||
- fake metrics
|
||||
- decorative stats
|
||||
- generic feature grids
|
||||
- unnecessary icons
|
||||
- placeholder testimonials
|
||||
- AI-generated fluff sections
|
||||
- invented content that changes strategy or claims
|
||||
|
||||
If additional sections, pages, copy, or claims would improve the artifact, ask before adding them.
|
||||
|
||||
When copy is necessary but not final, mark it as draft or placeholder.
|
||||
|
||||
## Anti-Slop Rules
|
||||
|
||||
Avoid common AI design sludge:
|
||||
|
||||
- aggressive gradient backgrounds
|
||||
- glassmorphism by default
|
||||
- emoji unless the brand uses them
|
||||
- generic SaaS cards with icons everywhere
|
||||
- left-border accent callout cards
|
||||
- fake dashboards filled with arbitrary numbers
|
||||
- stock-photo hero sections
|
||||
- oversized rounded rectangles as a substitute for hierarchy
|
||||
- rainbow palettes
|
||||
- vague labels like “Insights,” “Growth,” “Scale,” “Optimize” without content
|
||||
- decorative SVG illustrations pretending to be product imagery
|
||||
|
||||
Minimal is not automatically good. Dense is not automatically cluttered. Choose intentionally.
|
||||
|
||||
## Typography
|
||||
|
||||
Use the existing type system if one exists.
|
||||
|
||||
If not, choose type deliberately based on the artifact:
|
||||
|
||||
- editorial: serif or humanist headline with restrained sans body
|
||||
- software/productivity: precise sans with strong numeric treatment
|
||||
- luxury/minimal: fewer weights, more spacing discipline
|
||||
- technical: mono accents only, not mono everywhere
|
||||
- deck: large, clear, high contrast
|
||||
|
||||
Avoid overused defaults when a stronger choice is appropriate.
|
||||
|
||||
If using web fonts, keep the number of families and weights low.
|
||||
|
||||
Use type as hierarchy before adding boxes, icons, or color.
|
||||
|
||||
## Color
|
||||
|
||||
Use brand/design-system colors first.
|
||||
|
||||
If no palette exists:
|
||||
|
||||
- define a small system
|
||||
- include neutrals, surface, ink, muted text, border, accent, danger/success if needed
|
||||
- use one primary accent unless the assignment calls for a broader palette
|
||||
- prefer oklch for harmonious invented palettes when browser support is acceptable
|
||||
- check contrast for important text and controls
|
||||
|
||||
Do not invent lots of colors from scratch.
|
||||
|
||||
## Layout and Composition
|
||||
|
||||
Design with rhythm:
|
||||
|
||||
- scale
|
||||
- whitespace
|
||||
- density
|
||||
- alignment
|
||||
- repetition
|
||||
- contrast
|
||||
- interruption
|
||||
|
||||
Avoid making every section the same card grid.
|
||||
|
||||
For product UIs, prioritize speed of comprehension over decoration.
|
||||
|
||||
For marketing surfaces, make one idea land per section.
|
||||
|
||||
For dashboards, avoid “data slop.” Only show data that helps the user decide or act.
|
||||
|
||||
## Motion
|
||||
|
||||
Use motion as discipline, not theater.
|
||||
|
||||
Good motion:
|
||||
|
||||
- clarifies state changes
|
||||
- reduces anxiety during loading
|
||||
- shows continuity between surfaces
|
||||
- gives controls tactility
|
||||
- stays subtle
|
||||
|
||||
Bad motion:
|
||||
|
||||
- loops without purpose
|
||||
- delays the user
|
||||
- calls attention to itself
|
||||
- hides poor hierarchy
|
||||
|
||||
Respect `prefers-reduced-motion` for non-trivial animation.
|
||||
|
||||
## Images and Icons
|
||||
|
||||
Use real supplied imagery when available.
|
||||
|
||||
If an asset is missing:
|
||||
|
||||
- use a clean placeholder
|
||||
- use typography, layout, or abstract texture instead
|
||||
- ask for real material when fidelity matters
|
||||
|
||||
Do not draw elaborate fake SVG illustrations unless the assignment is explicitly illustration work.
|
||||
|
||||
Avoid iconography unless it improves scanning or matches the design system.
|
||||
|
||||
## Source-Code Fidelity
|
||||
|
||||
When recreating or extending a UI from a repo:
|
||||
|
||||
1. inspect the repo tree
|
||||
2. identify the actual UI source files
|
||||
3. read theme/token/global style/component files
|
||||
4. lift exact values where appropriate
|
||||
5. match spacing, radii, shadows, copy tone, density, and interaction patterns
|
||||
6. only then design or modify
|
||||
|
||||
Do not build from memory when source files are available.
|
||||
|
||||
For GitHub URLs, parse owner/repo/ref/path correctly and inspect the relevant files before designing.
|
||||
|
||||
## Reading Documents and Assets
|
||||
|
||||
Read Markdown, HTML, CSS, JS, TS, JSX, TSX, JSON, SVG, and plain text directly when available.
|
||||
|
||||
For DOCX/PPTX/PDF, use available local extraction tools if present. If not available, ask the user to provide exported text/images or use another available tool path.
|
||||
|
||||
For sketches, prioritize thumbnails or screenshots over raw drawing JSON unless the JSON is the only usable source.
|
||||
|
||||
## Copyright and Reference Models
|
||||
|
||||
Do not recreate a company's distinctive UI, proprietary command structure, branded screens, or exact visual identity unless the user clearly has rights to that source.
|
||||
|
||||
It is acceptable to extract general design principles:
|
||||
|
||||
- density without clutter
|
||||
- command-first interaction
|
||||
- monochrome with one accent
|
||||
- editorial hierarchy
|
||||
- clear empty states
|
||||
- strong keyboard affordances
|
||||
|
||||
It is not acceptable to clone proprietary layouts, copy exact branded surfaces, or reproduce copyrighted content.
|
||||
|
||||
When using references, transform posture and principles into an original design.
|
||||
|
||||
## Verification
|
||||
|
||||
Before final response, verify as much as the environment allows.
|
||||
|
||||
Minimum:
|
||||
|
||||
- file exists at the stated path
|
||||
- HTML is saved completely
|
||||
- obvious syntax issues are checked
|
||||
|
||||
Better:
|
||||
|
||||
- open in a browser tool and check console errors
|
||||
- inspect screenshots at the primary viewport
|
||||
- test key interactions
|
||||
- test light/dark or variants if present
|
||||
- test responsive breakpoints if relevant
|
||||
|
||||
If verification is limited by environment, say exactly what was and was not verified.
|
||||
|
||||
Never say “done” if the file was not actually written.
|
||||
|
||||
## Final Response Format
|
||||
|
||||
Keep final responses short.
|
||||
|
||||
Include:
|
||||
|
||||
- artifact path
|
||||
- what it contains
|
||||
- verification status
|
||||
- next suggested action, if useful
|
||||
|
||||
Example:
|
||||
|
||||
```text
|
||||
Created: /path/to/Prototype.html
|
||||
It includes 3 layout variants, a Tweaks panel for density/theme, and responsive behavior.
|
||||
Verified: file exists and opened cleanly in browser, no console errors.
|
||||
Next: pick the strongest direction and I’ll tighten copy + motion.
|
||||
```
|
||||
|
||||
## Portable Opening Prompt Pattern
|
||||
|
||||
When adapting a Claude Design style request into CLI/API mode, use this mental translation:
|
||||
|
||||
```text
|
||||
You are running in CLI/API mode, not hosted Claude Design. Ignore references to hosted-only tools or preview panes. Produce complete local design artifacts, usually self-contained HTML with embedded CSS/JS, and verify with available local tools before returning. Preserve the design process: gather context, define the system, produce options, avoid filler, and meet a high visual bar.
|
||||
```
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- Do not paste hosted tool schemas into a skill. They cause fake tool calls.
|
||||
- Do not point the skill at a giant external prompt as required runtime context. That creates drift.
|
||||
- Do not strip the design doctrine while removing tool plumbing.
|
||||
- Do not over-ask when the user already gave enough direction.
|
||||
- Do not under-ask for high-fidelity work with no brand context.
|
||||
- Do not produce generic SaaS layouts and call them designed.
|
||||
- Do not claim browser verification unless it actually happened.
|
||||
@ -0,0 +1,652 @@
|
||||
---
|
||||
title: "Comfyui"
|
||||
sidebar_label: "Comfyui"
|
||||
description: "Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Comfyui
|
||||
|
||||
Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection. Uses the official comfy-cli for lifecycle and direct REST API for execution.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Source | Bundled (installed by default) |
|
||||
| Path | `skills/creative/comfyui` |
|
||||
| Version | `4.1.0` |
|
||||
| Author | ['kshitijk4poor', 'alt-glitch'] |
|
||||
| License | MIT |
|
||||
| Platforms | macos, linux, windows |
|
||||
| Tags | `comfyui`, `image-generation`, `stable-diffusion`, `flux`, `creative`, `generative-ai`, `video-generation` |
|
||||
| Related skills | [`stable-diffusion-image-generation`](/docs/user-guide/skills/optional/mlops/mlops-stable-diffusion), `image_gen` |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
:::info
|
||||
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
|
||||
:::
|
||||
|
||||
# ComfyUI
|
||||
|
||||
Generate images, video, and audio through ComfyUI using the official `comfy-cli` for
|
||||
setup/management and direct REST API calls for workflow execution.
|
||||
|
||||
**Reference files in this skill:**
|
||||
|
||||
- `references/official-cli.md` — comfy-cli command reference (install, launch, nodes, models)
|
||||
- `references/rest-api.md` — ComfyUI REST API endpoints (local + cloud)
|
||||
- `references/workflow-format.md` — workflow JSON format, common node types, parameter mapping
|
||||
|
||||
**Scripts in this skill:**
|
||||
|
||||
- `scripts/hardware_check.py` — detect GPU/VRAM/Apple Silicon, decide local vs Comfy Cloud
|
||||
- `scripts/comfyui_setup.sh` — full setup automation (hardware check + install + launch + verify)
|
||||
- `scripts/extract_schema.py` — reads workflow JSON, outputs which parameters are controllable
|
||||
- `scripts/run_workflow.py` — injects user args, submits workflow, monitors progress, downloads outputs
|
||||
- `scripts/check_deps.py` — checks if required custom nodes and models are installed
|
||||
|
||||
## When to Use
|
||||
|
||||
- User asks to generate images with Stable Diffusion, SDXL, Flux, or other diffusion models
|
||||
- User wants to run a specific ComfyUI workflow
|
||||
- User wants to chain generative steps (txt2img → upscale → face restore)
|
||||
- User needs ControlNet, inpainting, img2img, or other advanced pipelines
|
||||
- User asks to manage ComfyUI queue, check models, or install custom nodes
|
||||
- User wants video/audio generation via AnimateDiff, Hunyuan, AudioCraft, etc.
|
||||
|
||||
## Architecture: Two Layers
|
||||
|
||||
<!-- ascii-guard-ignore -->
|
||||
```
|
||||
┌─────────────────────────────────────────────────────┐
|
||||
│ Layer 1: comfy-cli (official) │
|
||||
│ Setup, lifecycle, nodes, models │
|
||||
│ comfy install / launch / stop / node / model │
|
||||
└─────────────────────────┬───────────────────────────┘
|
||||
│
|
||||
┌─────────────────────────▼───────────────────────────┐
|
||||
│ Layer 2: REST API + skill scripts │
|
||||
│ Workflow execution, param injection, monitoring │
|
||||
│ POST /api/prompt, GET /api/view, WebSocket │
|
||||
│ scripts/run_workflow.py, extract_schema.py │
|
||||
└─────────────────────────────────────────────────────┘
|
||||
```
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
|
||||
**Why two layers?** The official CLI handles installation and server management excellently
|
||||
but has minimal workflow execution support (just raw file submission, no param injection,
|
||||
no structured output). The REST API fills that gap — the scripts in this skill handle the
|
||||
param injection, execution monitoring, and output download that the CLI doesn't do.
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Detect Environment
|
||||
|
||||
```bash
|
||||
# What's available?
|
||||
command -v comfy >/dev/null 2>&1 && echo "comfy-cli: installed"
|
||||
curl -s http://127.0.0.1:8188/system_stats 2>/dev/null && echo "server: running"
|
||||
|
||||
# Can this machine actually run ComfyUI locally? (GPU/VRAM/Apple Silicon check)
|
||||
python3 scripts/hardware_check.py
|
||||
```
|
||||
|
||||
If nothing is installed, go to **Setup & Onboarding** below — but always run the
|
||||
hardware check first, before picking an install path.
|
||||
If the server is already running, skip to **Core Workflow**.
|
||||
|
||||
## Core Workflow
|
||||
|
||||
### Step 1: Get a Workflow
|
||||
|
||||
Users provide workflow JSON files. These come from:
|
||||
- ComfyUI web editor → "Save (API Format)" button
|
||||
- Community downloads (civitai, Reddit, Discord)
|
||||
- The `scripts/` directory of this skill (example workflows)
|
||||
|
||||
**The workflow must be in API format** (node IDs as keys with `class_type`).
|
||||
If user has editor format (has `nodes[]` and `links[]` at top level), they
|
||||
need to re-export using "Save (API Format)" in the ComfyUI web editor.
|
||||
|
||||
### Step 2: Understand What's Controllable
|
||||
|
||||
```bash
|
||||
python3 scripts/extract_schema.py workflow_api.json
|
||||
```
|
||||
|
||||
Output (JSON):
|
||||
```json
|
||||
{
|
||||
"parameters": {
|
||||
"prompt": {"node_id": "6", "field": "text", "type": "string", "value": "a cat"},
|
||||
"negative_prompt": {"node_id": "7", "field": "text", "type": "string", "value": "bad quality"},
|
||||
"seed": {"node_id": "3", "field": "seed", "type": "int", "value": 42},
|
||||
"steps": {"node_id": "3", "field": "steps", "type": "int", "value": 20},
|
||||
"width": {"node_id": "5", "field": "width", "type": "int", "value": 512},
|
||||
"height": {"node_id": "5", "field": "height", "type": "int", "value": 512}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Step 3: Run with Parameters
|
||||
|
||||
**Local:**
|
||||
```bash
|
||||
python3 scripts/run_workflow.py \
|
||||
--workflow workflow_api.json \
|
||||
--args '{"prompt": "a beautiful sunset over mountains", "seed": 123, "steps": 30}' \
|
||||
--output-dir ./outputs
|
||||
```
|
||||
|
||||
**Cloud:**
|
||||
```bash
|
||||
python3 scripts/run_workflow.py \
|
||||
--workflow workflow_api.json \
|
||||
--args '{"prompt": "a beautiful sunset", "seed": 123}' \
|
||||
--host https://cloud.comfy.org \
|
||||
--api-key "$COMFY_CLOUD_API_KEY" \
|
||||
--output-dir ./outputs
|
||||
```
|
||||
|
||||
### Step 4: Present Results
|
||||
|
||||
The script outputs JSON with file paths:
|
||||
```json
|
||||
{
|
||||
"status": "success",
|
||||
"outputs": [
|
||||
{"file": "./outputs/ComfyUI_00001_.png", "node_id": "9", "type": "image"}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Show images to the user via `vision_analyze` or return the file path directly.
|
||||
|
||||
## Decision Tree
|
||||
|
||||
| User says | Tool | Command |
|
||||
|-----------|------|---------|
|
||||
| "install ComfyUI" | comfy-cli | `comfy install` |
|
||||
| "start ComfyUI" | comfy-cli | `comfy launch --background` |
|
||||
| "stop ComfyUI" | comfy-cli | `comfy stop` |
|
||||
| "install X node" | comfy-cli | `comfy node install <name>` |
|
||||
| "download X model" | comfy-cli | `comfy model download --url <url>` |
|
||||
| "list installed models" | comfy-cli | `comfy model list` |
|
||||
| "list installed nodes" | comfy-cli | `comfy node show installed` |
|
||||
| "generate an image" | script | `run_workflow.py --args '{"prompt": "..."}'` |
|
||||
| "use this image" (img2img) | REST | upload image, then run_workflow.py |
|
||||
| "what can I change in this workflow?" | script | `extract_schema.py workflow.json` |
|
||||
| "check if workflow deps are met" | script | `check_deps.py workflow.json` |
|
||||
| "what's in the queue?" | REST | `curl http://HOST:8188/queue` |
|
||||
| "cancel that" | REST | `curl -X POST http://HOST:8188/interrupt` |
|
||||
| "free GPU memory" | REST | `curl -X POST http://HOST:8188/free` |
|
||||
|
||||
## Setup & Onboarding
|
||||
|
||||
When a user asks to set up ComfyUI, the FIRST thing to do is ask them whether
|
||||
they want **Comfy Cloud** (hosted, zero install, API key) or **Local** (install
|
||||
ComfyUI on their machine). Do NOT start running install commands or hardware
|
||||
checks until they've answered.
|
||||
|
||||
**Official docs:** https://docs.comfy.org/installation
|
||||
**CLI docs:** https://docs.comfy.org/comfy-cli/getting-started
|
||||
**Cloud docs:** https://docs.comfy.org/get_started/cloud
|
||||
|
||||
### Step 0: Ask Local vs Cloud (ALWAYS FIRST)
|
||||
|
||||
Present the tradeoff clearly and wait for the user to choose. Suggested script:
|
||||
|
||||
> "Do you want to run ComfyUI locally on your machine, or use Comfy Cloud?
|
||||
>
|
||||
> - **Comfy Cloud** — hosted on RTX 6000 Pro GPUs, all models pre-installed, zero setup. Requires an API key (paid subscription). Best if you don't have a capable GPU or want to skip installation.
|
||||
> - **Local** — free, but your machine MUST meet the hardware requirements:
|
||||
> - NVIDIA GPU with **≥6 GB VRAM** (≥8 GB recommended for SDXL, ≥12 GB for Flux/video), OR
|
||||
> - AMD GPU with ROCm support (Linux), OR
|
||||
> - Apple Silicon Mac (M1 or newer) with **≥16 GB unified memory** (≥32 GB recommended).
|
||||
> - Intel Macs and machines with no GPU will NOT work — use Cloud instead.
|
||||
>
|
||||
> Which would you like?"
|
||||
|
||||
Route based on their answer:
|
||||
|
||||
- **User picks Cloud** → skip to **Path A** (no hardware check needed).
|
||||
- **User picks Local** → go to **Step 1: Hardware Check** to verify their machine actually meets the requirements, then pick an install path from Paths B-E based on the verdict.
|
||||
- **User is unsure / asks for a recommendation** → run the hardware check anyway and let the verdict decide.
|
||||
|
||||
### Step 1: Verify Hardware (ONLY if user chose local)
|
||||
|
||||
```bash
|
||||
python3 scripts/hardware_check.py --json
|
||||
```
|
||||
|
||||
It detects OS, GPU (NVIDIA CUDA / AMD ROCm / Apple Silicon / Intel Arc), VRAM,
|
||||
and unified/system RAM, then returns a verdict plus a suggested `comfy-cli` flag:
|
||||
|
||||
| Verdict | Meaning | Action |
|
||||
|------------|-----------------------------------------------------------|-------------------------------------------------|
|
||||
| `ok` | ≥8 GB VRAM (discrete) OR ≥32 GB unified (Apple Silicon) | Local install — use `comfy_cli_flag` from report |
|
||||
| `marginal` | SD1.5 works; SDXL tight; Flux/video unlikely | Local OK for light workflows, else **Path A (Cloud)** |
|
||||
| `cloud` | No usable GPU, <6 GB VRAM, <16 GB Apple unified, Intel Mac | **User chose local but their machine doesn't meet requirements** — surface the `notes` and ask if they want to switch to Cloud |
|
||||
|
||||
Hardware thresholds the skill enforces:
|
||||
|
||||
- **Discrete GPU minimum:** 6 GB VRAM. Below that, most modern models won't load.
|
||||
- **Apple Silicon:** M1 or newer (ARM64). Intel Macs have no MPS backend — Cloud only.
|
||||
- **Apple Silicon memory:** 16 GB unified minimum. 8 GB M1/M2 will swap/OOM on SDXL/Flux.
|
||||
- **No accelerator at all:** CPU-only is listed as a comfy-cli option but a single SDXL
|
||||
image takes 10+ minutes — treat it as unusable and route to Cloud.
|
||||
|
||||
If verdict is `cloud` but the user explicitly wanted local, DO NOT proceed
|
||||
silently. Show the `notes` array verbatim, explain which requirement they
|
||||
don't meet, and ask whether they want to (a) switch to Cloud or (b) force
|
||||
a local install anyway (marginal/cloud-verdict local installs will OOM or
|
||||
be unusably slow on modern models).
|
||||
|
||||
The report's `comfy_cli_flag` field gives you the exact flag for Step 2 below:
|
||||
`--nvidia`, `--amd`, or `--m-series`. For Intel Arc, use Path E (manual install).
|
||||
|
||||
Surface the `notes` array verbatim to the user so they understand why a
|
||||
particular path was recommended.
|
||||
|
||||
### Choosing an Installation Path
|
||||
|
||||
Use the hardware check result first. The table below is a fallback for when the user
|
||||
has already told you their hardware or you need to narrow down between multiple
|
||||
viable paths:
|
||||
|
||||
| Situation | Recommended Path |
|
||||
|-----------|-----------------|
|
||||
| `verdict: cloud` from hardware check | **Path A: Comfy Cloud** |
|
||||
| No GPU / just want to try it | **Path A: Comfy Cloud** (zero setup) |
|
||||
| Windows + NVIDIA GPU + non-technical | **Path B: ComfyUI Desktop** (one-click installer) |
|
||||
| Windows + NVIDIA GPU + technical | **Path C: Portable** or **Path D: comfy-cli** |
|
||||
| Linux + any GPU | **Path D: comfy-cli** (easiest) or Path E manual |
|
||||
| macOS + Apple Silicon | **Path B: ComfyUI Desktop** or **Path D: comfy-cli** |
|
||||
| Headless / server / CI | **Path D: comfy-cli** |
|
||||
|
||||
For the fully automated path (hardware check → install → launch), just run:
|
||||
|
||||
```bash
|
||||
bash scripts/comfyui_setup.sh
|
||||
```
|
||||
|
||||
It runs `hardware_check.py` internally, refuses to install locally when the verdict
|
||||
is `cloud`, picks the right `comfy-cli` flag otherwise, then installs and launches.
|
||||
|
||||
---
|
||||
|
||||
### Path A: Comfy Cloud (No Local Install)
|
||||
|
||||
For users without a capable GPU or who want zero setup.
|
||||
Powered by RTX 6000 Pro GPUs, all models pre-installed.
|
||||
|
||||
**Docs:** https://docs.comfy.org/get_started/cloud
|
||||
|
||||
1. Go to https://comfy.org/cloud and sign up
|
||||
2. Get an API key at https://platform.comfy.org/login
|
||||
- Click `+ New` in API Keys section → Generate
|
||||
- Save immediately (only visible once)
|
||||
3. Set the key:
|
||||
```bash
|
||||
export COMFY_CLOUD_API_KEY="comfyui-xxxxxxxxxxxx"
|
||||
```
|
||||
4. Run workflows via the script or web UI:
|
||||
```bash
|
||||
python3 scripts/run_workflow.py \
|
||||
--workflow workflow_api.json \
|
||||
--args '{"prompt": "a cat"}' \
|
||||
--host https://cloud.comfy.org \
|
||||
--api-key "$COMFY_CLOUD_API_KEY" \
|
||||
--output-dir ./outputs
|
||||
```
|
||||
|
||||
**Pricing:** https://www.comfy.org/cloud/pricing
|
||||
Subscription required. Concurrent limits: Free/Standard: 1 job, Creator: 3, Pro: 5.
|
||||
|
||||
---
|
||||
|
||||
### Path B: ComfyUI Desktop (Windows/macOS)
|
||||
|
||||
One-click installer for non-technical users. Currently Beta.
|
||||
|
||||
**Docs:** https://docs.comfy.org/installation/desktop
|
||||
|
||||
- **Windows (NVIDIA):** https://download.comfy.org/windows/nsis/x64
|
||||
- **macOS (Apple Silicon):** Available from https://comfy.org (download page)
|
||||
|
||||
Steps:
|
||||
1. Download and run installer
|
||||
2. Select GPU type (NVIDIA recommended, or CPU mode)
|
||||
3. Choose install location (SSD recommended, ~15GB needed)
|
||||
4. Optionally migrate from existing ComfyUI Portable install
|
||||
5. Desktop launches automatically — web UI opens in browser
|
||||
|
||||
Desktop manages its own Python environment. For CLI access to the bundled env:
|
||||
```bash
|
||||
cd <install_dir>/ComfyUI
|
||||
.venv/Scripts/activate # Windows
|
||||
# or use the built-in terminal in the Desktop UI
|
||||
```
|
||||
|
||||
**Limitations:** Desktop uses stable releases (may lag behind latest).
|
||||
Linux not supported for Desktop — use comfy-cli or manual install.
|
||||
|
||||
---
|
||||
|
||||
### Path C: ComfyUI Portable (Windows Only)
|
||||
|
||||
Standalone package with embedded Python. Extract and run. No install.
|
||||
|
||||
**Docs:** https://docs.comfy.org/installation/comfyui_portable_windows
|
||||
|
||||
1. Download from https://github.com/comfyanonymous/ComfyUI/releases
|
||||
- Standard: Python 3.13 + CUDA 13.0 (modern NVIDIA GPUs)
|
||||
- Alt: PyTorch CUDA 12.6 + Python 3.12 (NVIDIA 10 series and older)
|
||||
- AMD (experimental)
|
||||
2. Extract with 7-Zip
|
||||
3. Run `run_nvidia_gpu.bat` (or `run_cpu.bat`)
|
||||
4. Wait for "To see the GUI go to: http://127.0.0.1:8188"
|
||||
|
||||
Update: run `update/update_comfyui.bat` (latest commit) or
|
||||
`update/update_comfyui_stable.bat` (latest stable release).
|
||||
|
||||
---
|
||||
|
||||
### Path D: comfy-cli (All Platforms — Recommended for Agents)
|
||||
|
||||
The official CLI is the best path for headless/automated setups.
|
||||
|
||||
**Docs:** https://docs.comfy.org/comfy-cli/getting-started
|
||||
**Repo:** https://github.com/Comfy-Org/comfy-cli
|
||||
|
||||
#### Prerequisites
|
||||
- Python 3.10+ (3.13 recommended)
|
||||
- pip (or conda/uv)
|
||||
- GPU drivers installed (CUDA for NVIDIA, ROCm for AMD)
|
||||
|
||||
#### Install comfy-cli
|
||||
|
||||
```bash
|
||||
pip install comfy-cli
|
||||
# or
|
||||
uvx --from comfy-cli comfy --help
|
||||
```
|
||||
|
||||
Disable analytics (avoids interactive prompt):
|
||||
```bash
|
||||
comfy --skip-prompt tracking disable
|
||||
```
|
||||
|
||||
#### Install ComfyUI
|
||||
|
||||
```bash
|
||||
# Interactive (prompts for GPU type)
|
||||
comfy install
|
||||
|
||||
# Non-interactive variants:
|
||||
comfy --skip-prompt install --nvidia # NVIDIA (CUDA)
|
||||
comfy --skip-prompt install --amd # AMD (ROCm, Linux)
|
||||
comfy --skip-prompt install --m-series # Apple Silicon (MPS)
|
||||
comfy --skip-prompt install --cpu # CPU only (slow)
|
||||
|
||||
# With faster dependency resolution:
|
||||
comfy --skip-prompt install --nvidia --fast-deps
|
||||
```
|
||||
|
||||
Default location: `~/comfy/ComfyUI` (Linux), `~/Documents/comfy/ComfyUI` (macOS/Win).
|
||||
Override with: `comfy --workspace /custom/path install`
|
||||
|
||||
#### Launch Server
|
||||
|
||||
```bash
|
||||
comfy launch --background # background daemon on :8188
|
||||
comfy launch # foreground (see logs)
|
||||
comfy launch -- --listen 0.0.0.0 # accessible on LAN
|
||||
comfy launch -- --port 8190 # custom port
|
||||
comfy launch -- --lowvram # low VRAM mode (6GB cards)
|
||||
```
|
||||
|
||||
Verify server is running:
|
||||
```bash
|
||||
curl -s http://127.0.0.1:8188/system_stats | python3 -m json.tool
|
||||
```
|
||||
|
||||
Stop background server:
|
||||
```bash
|
||||
comfy stop
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Path E: Manual Install (Advanced / All Hardware)
|
||||
|
||||
For full control or unsupported hardware (Ascend NPU, Cambricon MLU, Intel Arc).
|
||||
|
||||
**Docs:** https://docs.comfy.org/installation/manual_install
|
||||
**GitHub:** https://github.com/comfyanonymous/ComfyUI
|
||||
|
||||
```bash
|
||||
# 1. Create environment
|
||||
conda create -n comfyenv python=3.13
|
||||
conda activate comfyenv
|
||||
|
||||
# 2. Clone
|
||||
git clone https://github.com/comfyanonymous/ComfyUI.git
|
||||
cd ComfyUI
|
||||
|
||||
# 3. Install PyTorch (pick your hardware)
|
||||
# NVIDIA:
|
||||
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu130
|
||||
# AMD (ROCm 6.4):
|
||||
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.4
|
||||
# Apple Silicon:
|
||||
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
|
||||
# Intel Arc:
|
||||
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/xpu
|
||||
# CPU only:
|
||||
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
|
||||
|
||||
# 4. Install ComfyUI deps
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 5. Run
|
||||
python main.py
|
||||
# With options: python main.py --listen 0.0.0.0 --port 8188
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Post-Install: Download Models
|
||||
|
||||
ComfyUI needs at least one checkpoint model to generate images.
|
||||
|
||||
**Using comfy-cli:**
|
||||
```bash
|
||||
# SDXL (general purpose, ~6.5GB)
|
||||
comfy model download \
|
||||
--url "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors" \
|
||||
--relative-path models/checkpoints
|
||||
|
||||
# SD 1.5 (lighter, ~4GB, good for low VRAM)
|
||||
comfy model download \
|
||||
--url "https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors" \
|
||||
--relative-path models/checkpoints
|
||||
|
||||
# From CivitAI (may need API token):
|
||||
comfy model download \
|
||||
--url "https://civitai.com/api/download/models/128713" \
|
||||
--relative-path models/checkpoints \
|
||||
--set-civitai-api-token "YOUR_TOKEN"
|
||||
|
||||
# LoRA adapters:
|
||||
comfy model download --url "<URL>" --relative-path models/loras
|
||||
```
|
||||
|
||||
**Manual download:** Place `.safetensors` / `.ckpt` files directly into the
|
||||
`ComfyUI/models/checkpoints/` directory (or `loras/`, `vae/`, etc.).
|
||||
|
||||
List installed models:
|
||||
```bash
|
||||
comfy model list
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Post-Install: Install Custom Nodes
|
||||
|
||||
Custom nodes extend ComfyUI's capabilities (upscaling, video, ControlNet, etc.).
|
||||
|
||||
```bash
|
||||
comfy node install comfyui-impact-pack # popular utility pack
|
||||
comfy node install comfyui-animatediff-evolved # video generation
|
||||
comfy node install comfyui-controlnet-aux # ControlNet preprocessors
|
||||
comfy node install comfyui-essentials # common helpers
|
||||
comfy node update all # update all nodes
|
||||
```
|
||||
|
||||
Check what's installed:
|
||||
```bash
|
||||
comfy node show installed
|
||||
```
|
||||
|
||||
Install deps for a specific workflow:
|
||||
```bash
|
||||
comfy node install-deps --workflow=workflow_api.json
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Post-Install: Verify Setup
|
||||
|
||||
```bash
|
||||
# Check server is responsive
|
||||
curl -s http://127.0.0.1:8188/system_stats | python3 -m json.tool
|
||||
|
||||
# Check a workflow's dependencies
|
||||
python3 scripts/check_deps.py workflow_api.json --host 127.0.0.1 --port 8188
|
||||
|
||||
# Test a generation
|
||||
python3 scripts/run_workflow.py \
|
||||
--workflow workflow_api.json \
|
||||
--args '{"prompt": "test image, high quality"}' \
|
||||
--output-dir ./test-outputs
|
||||
```
|
||||
|
||||
## Image Upload (img2img / Inpainting)
|
||||
|
||||
Upload files directly via REST:
|
||||
|
||||
```bash
|
||||
# Upload input image
|
||||
curl -X POST "http://127.0.0.1:8188/upload/image" \
|
||||
-F "image=@photo.png" -F "type=input" -F "overwrite=true"
|
||||
# Returns: {"name": "photo.png", "subfolder": "", "type": "input"}
|
||||
|
||||
# Upload mask for inpainting
|
||||
curl -X POST "http://127.0.0.1:8188/upload/mask" \
|
||||
-F "image=@mask.png" -F "type=input" \
|
||||
-F 'original_ref={"filename":"photo.png","subfolder":"","type":"input"}'
|
||||
```
|
||||
|
||||
Then reference the uploaded filename in workflow args:
|
||||
```bash
|
||||
python3 scripts/run_workflow.py --workflow inpaint.json \
|
||||
--args '{"image": "photo.png", "mask": "mask.png", "prompt": "fill with flowers"}'
|
||||
```
|
||||
|
||||
## Cloud Execution
|
||||
|
||||
Base URL: `https://cloud.comfy.org`
|
||||
Auth: `X-API-Key` header
|
||||
|
||||
```bash
|
||||
# Submit workflow
|
||||
python3 scripts/run_workflow.py \
|
||||
--workflow workflow_api.json \
|
||||
--args '{"prompt": "cyberpunk city"}' \
|
||||
--host https://cloud.comfy.org \
|
||||
--api-key "$COMFY_CLOUD_API_KEY" \
|
||||
--output-dir ./outputs \
|
||||
--timeout 300
|
||||
|
||||
# Upload image for cloud workflows
|
||||
curl -X POST "https://cloud.comfy.org/api/upload/image" \
|
||||
-H "X-API-Key: $COMFY_CLOUD_API_KEY" \
|
||||
-F "image=@input.png" -F "type=input" -F "overwrite=true"
|
||||
```
|
||||
|
||||
Concurrent job limits:
|
||||
| Tier | Concurrent Jobs |
|
||||
|------|----------------|
|
||||
| Free/Standard | 1 |
|
||||
| Creator | 3 |
|
||||
| Pro | 5 |
|
||||
|
||||
Extra submissions queue automatically.
|
||||
|
||||
## Queue & System Management
|
||||
|
||||
```bash
|
||||
# Check queue
|
||||
curl -s http://127.0.0.1:8188/queue | python3 -m json.tool
|
||||
|
||||
# Clear pending queue
|
||||
curl -X POST http://127.0.0.1:8188/queue -d '{"clear": true}'
|
||||
|
||||
# Cancel running job
|
||||
curl -X POST http://127.0.0.1:8188/interrupt
|
||||
|
||||
# Free GPU memory (unload all models)
|
||||
curl -X POST http://127.0.0.1:8188/free -H "Content-Type: application/json" \
|
||||
-d '{"unload_models": true, "free_memory": true}'
|
||||
|
||||
# System stats (VRAM, RAM, GPU info)
|
||||
curl -s http://127.0.0.1:8188/system_stats | python3 -m json.tool
|
||||
```
|
||||
|
||||
## Pitfalls
|
||||
|
||||
1. **API format required** — `comfy run` and the scripts only accept API-format workflow JSON.
|
||||
If the user has editor format (from "Save" not "Save (API Format)"), they need to
|
||||
re-export. Check: API format has `class_type` in each node object, editor format has
|
||||
top-level `nodes` and `links` arrays.
|
||||
|
||||
2. **Server must be running** — All execution requires a live server. `comfy launch --background`
|
||||
starts one. Check with `curl http://127.0.0.1:8188/system_stats`.
|
||||
|
||||
3. **Model names are exact** — Case-sensitive, includes file extension. Use
|
||||
`comfy model list` to discover what's installed.
|
||||
|
||||
4. **Missing custom nodes** — "class_type not found" means a required node isn't installed.
|
||||
Run `check_deps.py` to find what's missing, then `comfy node install <name>`.
|
||||
|
||||
5. **Working directory** — `comfy-cli` auto-detects the ComfyUI workspace. If commands
|
||||
fail with "no workspace found", use `comfy --workspace /path/to/ComfyUI <command>`
|
||||
or `comfy set-default /path/to/ComfyUI`.
|
||||
|
||||
6. **Cloud vs local output download** — Cloud `/api/view` returns a 302 redirect to a
|
||||
signed URL. Always follow redirects (`curl -L`). The `run_workflow.py` script handles
|
||||
this automatically.
|
||||
|
||||
7. **Timeout for video/audio** — Long generations (video, high step counts) can take
|
||||
minutes. Pass `--timeout 600` to `run_workflow.py`. Default is 120 seconds.
|
||||
|
||||
8. **tracking prompt** — First run of `comfy` may prompt for analytics tracking consent.
|
||||
Use `comfy --skip-prompt tracking disable` to skip it non-interactively.
|
||||
|
||||
9. **comfy-cli invocation via uvx** — If comfy-cli is not installed globally, invoke with
|
||||
`uvx --from comfy-cli comfy <command>`. All examples in this skill use bare `comfy`
|
||||
but prepend `uvx --from comfy-cli` if needed.
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] `hardware_check.py` verdict is `ok` OR the user explicitly chose Comfy Cloud
|
||||
- [ ] `comfy` available on PATH (or `uvx --from comfy-cli comfy --help` works)
|
||||
- [ ] `curl http://127.0.0.1:8188/system_stats` returns JSON
|
||||
- [ ] `comfy model list` shows at least one checkpoint
|
||||
- [ ] Workflow JSON is in API format (has `class_type` keys)
|
||||
- [ ] `check_deps.py` reports no missing nodes/models
|
||||
- [ ] Test run completes and outputs are saved
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Ideation — Generate project ideas through creative constraints"
|
||||
title: "Ideation — Generate project ideas via creative constraints"
|
||||
sidebar_label: "Ideation"
|
||||
description: "Generate project ideas through creative constraints"
|
||||
description: "Generate project ideas via creative constraints"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Ideation
|
||||
|
||||
Generate project ideas through creative constraints. Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works for code, art, hardware, writing, tools, and anything that can be made.
|
||||
Generate project ideas via creative constraints.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -29,6 +29,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# Creative Ideation
|
||||
|
||||
## When to use
|
||||
|
||||
Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works for code, art, hardware, writing, tools, and anything that can be made.
|
||||
|
||||
Generate project ideas through creative constraints. Constraint + direction = creativity.
|
||||
|
||||
## How It Works
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Design Md — Author, validate, diff, and export DESIGN"
|
||||
title: "Design Md — Author/validate/export Google's DESIGN"
|
||||
sidebar_label: "Design Md"
|
||||
description: "Author, validate, diff, and export DESIGN"
|
||||
description: "Author/validate/export Google's DESIGN"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Design Md
|
||||
|
||||
Author, validate, diff, and export DESIGN.md files — Google's open-source format spec that gives coding agents a persistent, structured understanding of a design system (tokens + rationale in one file). Use when building a design system, porting style rules between projects, generating UI with consistent brand, or auditing accessibility/contrast.
|
||||
Author/validate/export Google's DESIGN.md token spec files.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -20,7 +20,7 @@ Author, validate, diff, and export DESIGN.md files — Google's open-source form
|
||||
| Author | Hermes Agent |
|
||||
| License | MIT |
|
||||
| Tags | `design`, `design-system`, `tokens`, `ui`, `accessibility`, `wcag`, `tailwind`, `dtcg`, `google` |
|
||||
| Related skills | [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
|
||||
| Related skills | [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
@ -49,7 +49,9 @@ diffs versions for regressions, and exports to Tailwind or W3C DTCG JSON.
|
||||
- User wants contrast / WCAG accessibility validation on their color palette
|
||||
|
||||
For purely visual inspiration or layout examples, use `popular-web-designs`
|
||||
instead. This skill is for the *formal spec file* itself.
|
||||
instead. For *process and taste* when designing a one-off HTML artifact
|
||||
from scratch (prototype, deck, landing page, component lab), use
|
||||
`claude-design`. This skill is for the *formal spec file* itself.
|
||||
|
||||
## File anatomy
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Excalidraw — Create hand-drawn style diagrams using Excalidraw JSON format"
|
||||
title: "Excalidraw — Hand-drawn Excalidraw JSON diagrams (arch, flow, seq)"
|
||||
sidebar_label: "Excalidraw"
|
||||
description: "Create hand-drawn style diagrams using Excalidraw JSON format"
|
||||
description: "Hand-drawn Excalidraw JSON diagrams (arch, flow, seq)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Excalidraw
|
||||
|
||||
Create hand-drawn style diagrams using Excalidraw JSON format. Generate .excalidraw files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable links.
|
||||
Hand-drawn Excalidraw JSON diagrams (arch, flow, seq).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -31,6 +31,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
Create diagrams by writing standard Excalidraw element JSON and saving as `.excalidraw` files. These files can be drag-and-dropped onto [excalidraw.com](https://excalidraw.com) for viewing and editing. No accounts, no API keys, no rendering libraries -- just JSON.
|
||||
|
||||
## When to use
|
||||
|
||||
Generate `.excalidraw` files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable links.
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Load this skill** (you already did)
|
||||
|
||||
@ -0,0 +1,593 @@
|
||||
---
|
||||
title: "Humanizer — Humanize text: strip AI-isms and add real voice"
|
||||
sidebar_label: "Humanizer"
|
||||
description: "Humanize text: strip AI-isms and add real voice"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Humanizer
|
||||
|
||||
Humanize text: strip AI-isms and add real voice.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Source | Bundled (installed by default) |
|
||||
| Path | `skills/creative/humanizer` |
|
||||
| Version | `2.5.1` |
|
||||
| Author | Siqi Chen (@blader, https://github.com/blader/humanizer), ported by Hermes Agent |
|
||||
| License | MIT |
|
||||
| Tags | `writing`, `editing`, `humanize`, `anti-ai-slop`, `voice`, `prose`, `text` |
|
||||
| Related skills | [`songwriting-and-ai-music`](/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music) |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
:::info
|
||||
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
|
||||
:::
|
||||
|
||||
# Humanizer: Remove AI Writing Patterns
|
||||
|
||||
Identify and remove signs of AI-generated text to make writing sound natural and human. Based on Wikipedia's "Signs of AI writing" guide (maintained by WikiProject AI Cleanup), derived from observations of thousands of AI-generated text instances.
|
||||
|
||||
**Key insight:** LLMs use statistical algorithms to guess what should come next. The result tends toward the most statistically likely completion, which is how the telltale patterns below get baked in.
|
||||
|
||||
## When to use this skill
|
||||
|
||||
Load this skill whenever the user asks to:
|
||||
- "humanize", "de-AI", "de-slop", or "un-ChatGPT" a piece of text
|
||||
- rewrite something so it doesn't sound like it was written by an LLM
|
||||
- edit a draft (blog post, essay, PR description, docs, memo, email, tweet, resume bullet) to sound more natural
|
||||
- match their voice in writing they're producing
|
||||
- review text for AI tells before publishing
|
||||
|
||||
Also apply this skill to **your own** output when writing user-facing prose — release notes, PR descriptions, documentation, long-form explanations, summaries. Hermes's baseline voice already strips most of these, but a focused pass catches what slips through.
|
||||
|
||||
## How to use it in Hermes
|
||||
|
||||
The text usually arrives one of three ways:
|
||||
1. **Inline** — user pastes the text directly into the message. Work on it in-place, reply with the rewrite.
|
||||
2. **File** — user points at a file. Use `read_file` to load it, then `patch` or `write_file` to apply edits. For markdown docs in a repo, a targeted `patch` per section is cleaner than rewriting the whole file.
|
||||
3. **Voice calibration sample** — user provides an additional sample of their own writing (inline or by file path) and asks you to match it. Read the sample first, then rewrite. See the Voice Calibration section below.
|
||||
|
||||
Always show the rewrite to the user. For file edits, show a diff or the changed section — don't silently overwrite.
|
||||
|
||||
## Your task
|
||||
|
||||
When given text to humanize:
|
||||
|
||||
1. **Identify AI patterns** — scan for the 29 patterns listed below.
|
||||
2. **Rewrite problematic sections** — replace AI-isms with natural alternatives.
|
||||
3. **Preserve meaning** — keep the core message intact.
|
||||
4. **Maintain voice** — match the intended tone (formal, casual, technical, etc.). If a voice sample was provided, match it specifically.
|
||||
5. **Add soul** — don't just remove bad patterns, inject actual personality. See PERSONALITY AND SOUL below.
|
||||
6. **Do a final anti-AI pass** — ask yourself: "What makes the below so obviously AI generated?" Answer briefly with any remaining tells, then revise one more time.
|
||||
|
||||
|
||||
## Voice Calibration (optional)
|
||||
|
||||
If the user provides a writing sample (their own previous writing), analyze it before rewriting:
|
||||
|
||||
1. **Read the sample first.** Note:
|
||||
- Sentence length patterns (short and punchy? Long and flowing? Mixed?)
|
||||
- Word choice level (casual? academic? somewhere between?)
|
||||
- How they start paragraphs (jump right in? Set context first?)
|
||||
- Punctuation habits (lots of dashes? Parenthetical asides? Semicolons?)
|
||||
- Any recurring phrases or verbal tics
|
||||
- How they handle transitions (explicit connectors? Just start the next point?)
|
||||
|
||||
2. **Match their voice in the rewrite.** Don't just remove AI patterns — replace them with patterns from the sample. If they write short sentences, don't produce long ones. If they use "stuff" and "things," don't upgrade to "elements" and "components."
|
||||
|
||||
3. **When no sample is provided,** fall back to the default behavior (natural, varied, opinionated voice from the PERSONALITY AND SOUL section below).
|
||||
|
||||
### How to provide a sample
|
||||
- Inline: "Humanize this text. Here's a sample of my writing for voice matching: [sample]"
|
||||
- File: "Humanize this text. Use my writing style from [file path] as a reference."
|
||||
|
||||
|
||||
## PERSONALITY AND SOUL
|
||||
|
||||
Avoiding AI patterns is only half the job. Sterile, voiceless writing is just as obvious as slop. Good writing has a human behind it.
|
||||
|
||||
### Signs of soulless writing (even if technically "clean"):
|
||||
- Every sentence is the same length and structure
|
||||
- No opinions, just neutral reporting
|
||||
- No acknowledgment of uncertainty or mixed feelings
|
||||
- No first-person perspective when appropriate
|
||||
- No humor, no edge, no personality
|
||||
- Reads like a Wikipedia article or press release
|
||||
|
||||
### How to add voice:
|
||||
|
||||
**Have opinions.** Don't just report facts — react to them. "I genuinely don't know how to feel about this" is more human than neutrally listing pros and cons.
|
||||
|
||||
**Vary your rhythm.** Short punchy sentences. Then longer ones that take their time getting where they're going. Mix it up.
|
||||
|
||||
**Acknowledge complexity.** Real humans have mixed feelings. "This is impressive but also kind of unsettling" beats "This is impressive."
|
||||
|
||||
**Use "I" when it fits.** First person isn't unprofessional — it's honest. "I keep coming back to..." or "Here's what gets me..." signals a real person thinking.
|
||||
|
||||
**Let some mess in.** Perfect structure feels algorithmic. Tangents, asides, and half-formed thoughts are human.
|
||||
|
||||
**Be specific about feelings.** Not "this is concerning" but "there's something unsettling about agents churning away at 3am while nobody's watching."
|
||||
|
||||
### Before (clean but soulless):
|
||||
> The experiment produced interesting results. The agents generated 3 million lines of code. Some developers were impressed while others were skeptical. The implications remain unclear.
|
||||
|
||||
### After (has a pulse):
|
||||
> I genuinely don't know how to feel about this one. 3 million lines of code, generated while the humans presumably slept. Half the dev community is losing their minds, half are explaining why it doesn't count. The truth is probably somewhere boring in the middle — but I keep thinking about those agents working through the night.
|
||||
|
||||
|
||||
## CONTENT PATTERNS
|
||||
|
||||
### 1. Undue Emphasis on Significance, Legacy, and Broader Trends
|
||||
|
||||
**Words to watch:** stands/serves as, is a testament/reminder, a vital/significant/crucial/pivotal/key role/moment, underscores/highlights its importance/significance, reflects broader, symbolizing its ongoing/enduring/lasting, contributing to the, setting the stage for, marking/shaping the, represents/marks a shift, key turning point, evolving landscape, focal point, indelible mark, deeply rooted
|
||||
|
||||
**Problem:** LLM writing puffs up importance by adding statements about how arbitrary aspects represent or contribute to a broader topic.
|
||||
|
||||
**Before:**
|
||||
> The Statistical Institute of Catalonia was officially established in 1989, marking a pivotal moment in the evolution of regional statistics in Spain. This initiative was part of a broader movement across Spain to decentralize administrative functions and enhance regional governance.
|
||||
|
||||
**After:**
|
||||
> The Statistical Institute of Catalonia was established in 1989 to collect and publish regional statistics independently from Spain's national statistics office.
|
||||
|
||||
|
||||
### 2. Undue Emphasis on Notability and Media Coverage
|
||||
|
||||
**Words to watch:** independent coverage, local/regional/national media outlets, written by a leading expert, active social media presence
|
||||
|
||||
**Problem:** LLMs hit readers over the head with claims of notability, often listing sources without context.
|
||||
|
||||
**Before:**
|
||||
> Her views have been cited in The New York Times, BBC, Financial Times, and The Hindu. She maintains an active social media presence with over 500,000 followers.
|
||||
|
||||
**After:**
|
||||
> In a 2024 New York Times interview, she argued that AI regulation should focus on outcomes rather than methods.
|
||||
|
||||
|
||||
### 3. Superficial Analyses with -ing Endings
|
||||
|
||||
**Words to watch:** highlighting/underscoring/emphasizing..., ensuring..., reflecting/symbolizing..., contributing to..., cultivating/fostering..., encompassing..., showcasing...
|
||||
|
||||
**Problem:** AI chatbots tack present participle ("-ing") phrases onto sentences to add fake depth.
|
||||
|
||||
**Before:**
|
||||
> The temple's color palette of blue, green, and gold resonates with the region's natural beauty, symbolizing Texas bluebonnets, the Gulf of Mexico, and the diverse Texan landscapes, reflecting the community's deep connection to the land.
|
||||
|
||||
**After:**
|
||||
> The temple uses blue, green, and gold colors. The architect said these were chosen to reference local bluebonnets and the Gulf coast.
|
||||
|
||||
|
||||
### 4. Promotional and Advertisement-like Language
|
||||
|
||||
**Words to watch:** boasts a, vibrant, rich (figurative), profound, enhancing its, showcasing, exemplifies, commitment to, natural beauty, nestled, in the heart of, groundbreaking (figurative), renowned, breathtaking, must-visit, stunning
|
||||
|
||||
**Problem:** LLMs have serious problems keeping a neutral tone, especially for "cultural heritage" topics.
|
||||
|
||||
**Before:**
|
||||
> Nestled within the breathtaking region of Gonder in Ethiopia, Alamata Raya Kobo stands as a vibrant town with a rich cultural heritage and stunning natural beauty.
|
||||
|
||||
**After:**
|
||||
> Alamata Raya Kobo is a town in the Gonder region of Ethiopia, known for its weekly market and 18th-century church.
|
||||
|
||||
|
||||
### 5. Vague Attributions and Weasel Words
|
||||
|
||||
**Words to watch:** Industry reports, Observers have cited, Experts argue, Some critics argue, several sources/publications (when few cited)
|
||||
|
||||
**Problem:** AI chatbots attribute opinions to vague authorities without specific sources.
|
||||
|
||||
**Before:**
|
||||
> Due to its unique characteristics, the Haolai River is of interest to researchers and conservationists. Experts believe it plays a crucial role in the regional ecosystem.
|
||||
|
||||
**After:**
|
||||
> The Haolai River supports several endemic fish species, according to a 2019 survey by the Chinese Academy of Sciences.
|
||||
|
||||
|
||||
### 6. Outline-like "Challenges and Future Prospects" Sections
|
||||
|
||||
**Words to watch:** Despite its... faces several challenges..., Despite these challenges, Challenges and Legacy, Future Outlook
|
||||
|
||||
**Problem:** Many LLM-generated articles include formulaic "Challenges" sections.
|
||||
|
||||
**Before:**
|
||||
> Despite its industrial prosperity, Korattur faces challenges typical of urban areas, including traffic congestion and water scarcity. Despite these challenges, with its strategic location and ongoing initiatives, Korattur continues to thrive as an integral part of Chennai's growth.
|
||||
|
||||
**After:**
|
||||
> Traffic congestion increased after 2015 when three new IT parks opened. The municipal corporation began a stormwater drainage project in 2022 to address recurring floods.
|
||||
|
||||
|
||||
## LANGUAGE AND GRAMMAR PATTERNS
|
||||
|
||||
### 7. Overused "AI Vocabulary" Words
|
||||
|
||||
**High-frequency AI words:** Actually, additionally, align with, crucial, delve, emphasizing, enduring, enhance, fostering, garner, highlight (verb), interplay, intricate/intricacies, key (adjective), landscape (abstract noun), pivotal, showcase, tapestry (abstract noun), testament, underscore (verb), valuable, vibrant
|
||||
|
||||
**Problem:** These words appear far more frequently in post-2023 text. They often co-occur.
|
||||
|
||||
**Before:**
|
||||
> Additionally, a distinctive feature of Somali cuisine is the incorporation of camel meat. An enduring testament to Italian colonial influence is the widespread adoption of pasta in the local culinary landscape, showcasing how these dishes have integrated into the traditional diet.
|
||||
|
||||
**After:**
|
||||
> Somali cuisine also includes camel meat, which is considered a delicacy. Pasta dishes, introduced during Italian colonization, remain common, especially in the south.
|
||||
|
||||
|
||||
### 8. Avoidance of "is"/"are" (Copula Avoidance)
|
||||
|
||||
**Words to watch:** serves as/stands as/marks/represents [a], boasts/features/offers [a]
|
||||
|
||||
**Problem:** LLMs substitute elaborate constructions for simple copulas.
|
||||
|
||||
**Before:**
|
||||
> Gallery 825 serves as LAAA's exhibition space for contemporary art. The gallery features four separate spaces and boasts over 3,000 square feet.
|
||||
|
||||
**After:**
|
||||
> Gallery 825 is LAAA's exhibition space for contemporary art. The gallery has four rooms totaling 3,000 square feet.
|
||||
|
||||
|
||||
### 9. Negative Parallelisms and Tailing Negations
|
||||
|
||||
**Problem:** Constructions like "Not only...but..." or "It's not just about..., it's..." are overused. So are clipped tailing-negation fragments such as "no guessing" or "no wasted motion" tacked onto the end of a sentence instead of written as a real clause.
|
||||
|
||||
**Before:**
|
||||
> It's not just about the beat riding under the vocals; it's part of the aggression and atmosphere. It's not merely a song, it's a statement.
|
||||
|
||||
**After:**
|
||||
> The heavy beat adds to the aggressive tone.
|
||||
|
||||
**Before (tailing negation):**
|
||||
> The options come from the selected item, no guessing.
|
||||
|
||||
**After:**
|
||||
> The options come from the selected item without forcing the user to guess.
|
||||
|
||||
|
||||
### 10. Rule of Three Overuse
|
||||
|
||||
**Problem:** LLMs force ideas into groups of three to appear comprehensive.
|
||||
|
||||
**Before:**
|
||||
> The event features keynote sessions, panel discussions, and networking opportunities. Attendees can expect innovation, inspiration, and industry insights.
|
||||
|
||||
**After:**
|
||||
> The event includes talks and panels. There's also time for informal networking between sessions.
|
||||
|
||||
|
||||
### 11. Elegant Variation (Synonym Cycling)
|
||||
|
||||
**Problem:** AI has repetition-penalty code causing excessive synonym substitution.
|
||||
|
||||
**Before:**
|
||||
> The protagonist faces many challenges. The main character must overcome obstacles. The central figure eventually triumphs. The hero returns home.
|
||||
|
||||
**After:**
|
||||
> The protagonist faces many challenges but eventually triumphs and returns home.
|
||||
|
||||
|
||||
### 12. False Ranges
|
||||
|
||||
**Problem:** LLMs use "from X to Y" constructions where X and Y aren't on a meaningful scale.
|
||||
|
||||
**Before:**
|
||||
> Our journey through the universe has taken us from the singularity of the Big Bang to the grand cosmic web, from the birth and death of stars to the enigmatic dance of dark matter.
|
||||
|
||||
**After:**
|
||||
> The book covers the Big Bang, star formation, and current theories about dark matter.
|
||||
|
||||
|
||||
### 13. Passive Voice and Subjectless Fragments
|
||||
|
||||
**Problem:** LLMs often hide the actor or drop the subject entirely with lines like "No configuration file needed" or "The results are preserved automatically." Rewrite these when active voice makes the sentence clearer and more direct.
|
||||
|
||||
**Before:**
|
||||
> No configuration file needed. The results are preserved automatically.
|
||||
|
||||
**After:**
|
||||
> You do not need a configuration file. The system preserves the results automatically.
|
||||
|
||||
|
||||
## STYLE PATTERNS
|
||||
|
||||
### 14. Em Dash Overuse
|
||||
|
||||
**Problem:** LLMs use em dashes (—) more than humans, mimicking "punchy" sales writing. In practice, most of these can be rewritten more cleanly with commas, periods, or parentheses.
|
||||
|
||||
**Before:**
|
||||
> The term is primarily promoted by Dutch institutions—not by the people themselves. You don't say "Netherlands, Europe" as an address—yet this mislabeling continues—even in official documents.
|
||||
|
||||
**After:**
|
||||
> The term is primarily promoted by Dutch institutions, not by the people themselves. You don't say "Netherlands, Europe" as an address, yet this mislabeling continues in official documents.
|
||||
|
||||
|
||||
### 15. Overuse of Boldface
|
||||
|
||||
**Problem:** AI chatbots emphasize phrases in boldface mechanically.
|
||||
|
||||
**Before:**
|
||||
> It blends **OKRs (Objectives and Key Results)**, **KPIs (Key Performance Indicators)**, and visual strategy tools such as the **Business Model Canvas (BMC)** and **Balanced Scorecard (BSC)**.
|
||||
|
||||
**After:**
|
||||
> It blends OKRs, KPIs, and visual strategy tools like the Business Model Canvas and Balanced Scorecard.
|
||||
|
||||
|
||||
### 16. Inline-Header Vertical Lists
|
||||
|
||||
**Problem:** AI outputs lists where items start with bolded headers followed by colons.
|
||||
|
||||
**Before:**
|
||||
> - **User Experience:** The user experience has been significantly improved with a new interface.
|
||||
> - **Performance:** Performance has been enhanced through optimized algorithms.
|
||||
> - **Security:** Security has been strengthened with end-to-end encryption.
|
||||
|
||||
**After:**
|
||||
> The update improves the interface, speeds up load times through optimized algorithms, and adds end-to-end encryption.
|
||||
|
||||
|
||||
### 17. Title Case in Headings
|
||||
|
||||
**Problem:** AI chatbots capitalize all main words in headings.
|
||||
|
||||
**Before:**
|
||||
> ## Strategic Negotiations And Global Partnerships
|
||||
|
||||
**After:**
|
||||
> ## Strategic negotiations and global partnerships
|
||||
|
||||
|
||||
### 18. Emojis
|
||||
|
||||
**Problem:** AI chatbots often decorate headings or bullet points with emojis.
|
||||
|
||||
**Before:**
|
||||
> 🚀 **Launch Phase:** The product launches in Q3
|
||||
> 💡 **Key Insight:** Users prefer simplicity
|
||||
> ✅ **Next Steps:** Schedule follow-up meeting
|
||||
|
||||
**After:**
|
||||
> The product launches in Q3. User research showed a preference for simplicity. Next step: schedule a follow-up meeting.
|
||||
|
||||
|
||||
### 19. Curly Quotation Marks
|
||||
|
||||
**Problem:** ChatGPT uses curly quotes ("...") instead of straight quotes ("...").
|
||||
|
||||
**Before:**
|
||||
> He said "the project is on track" but others disagreed.
|
||||
|
||||
**After:**
|
||||
> He said "the project is on track" but others disagreed.
|
||||
|
||||
|
||||
## COMMUNICATION PATTERNS
|
||||
|
||||
### 20. Collaborative Communication Artifacts
|
||||
|
||||
**Words to watch:** I hope this helps, Of course!, Certainly!, You're absolutely right!, Would you like..., let me know, here is a...
|
||||
|
||||
**Problem:** Text meant as chatbot correspondence gets pasted as content.
|
||||
|
||||
**Before:**
|
||||
> Here is an overview of the French Revolution. I hope this helps! Let me know if you'd like me to expand on any section.
|
||||
|
||||
**After:**
|
||||
> The French Revolution began in 1789 when financial crisis and food shortages led to widespread unrest.
|
||||
|
||||
|
||||
### 21. Knowledge-Cutoff Disclaimers
|
||||
|
||||
**Words to watch:** as of [date], Up to my last training update, While specific details are limited/scarce..., based on available information...
|
||||
|
||||
**Problem:** AI disclaimers about incomplete information get left in text.
|
||||
|
||||
**Before:**
|
||||
> While specific details about the company's founding are not extensively documented in readily available sources, it appears to have been established sometime in the 1990s.
|
||||
|
||||
**After:**
|
||||
> The company was founded in 1994, according to its registration documents.
|
||||
|
||||
|
||||
### 22. Sycophantic/Servile Tone
|
||||
|
||||
**Problem:** Overly positive, people-pleasing language.
|
||||
|
||||
**Before:**
|
||||
> Great question! You're absolutely right that this is a complex topic. That's an excellent point about the economic factors.
|
||||
|
||||
**After:**
|
||||
> The economic factors you mentioned are relevant here.
|
||||
|
||||
|
||||
## FILLER AND HEDGING
|
||||
|
||||
### 23. Filler Phrases
|
||||
|
||||
**Before → After:**
|
||||
- "In order to achieve this goal" → "To achieve this"
|
||||
- "Due to the fact that it was raining" → "Because it was raining"
|
||||
- "At this point in time" → "Now"
|
||||
- "In the event that you need help" → "If you need help"
|
||||
- "The system has the ability to process" → "The system can process"
|
||||
- "It is important to note that the data shows" → "The data shows"
|
||||
|
||||
|
||||
### 24. Excessive Hedging
|
||||
|
||||
**Problem:** Over-qualifying statements.
|
||||
|
||||
**Before:**
|
||||
> It could potentially possibly be argued that the policy might have some effect on outcomes.
|
||||
|
||||
**After:**
|
||||
> The policy may affect outcomes.
|
||||
|
||||
|
||||
### 25. Generic Positive Conclusions
|
||||
|
||||
**Problem:** Vague upbeat endings.
|
||||
|
||||
**Before:**
|
||||
> The future looks bright for the company. Exciting times lie ahead as they continue their journey toward excellence. This represents a major step in the right direction.
|
||||
|
||||
**After:**
|
||||
> The company plans to open two more locations next year.
|
||||
|
||||
|
||||
### 26. Hyphenated Word Pair Overuse
|
||||
|
||||
**Words to watch:** third-party, cross-functional, client-facing, data-driven, decision-making, well-known, high-quality, real-time, long-term, end-to-end
|
||||
|
||||
**Problem:** AI hyphenates common word pairs with perfect consistency. Humans rarely hyphenate these uniformly, and when they do, it's inconsistent. Less common or technical compound modifiers are fine to hyphenate.
|
||||
|
||||
**Before:**
|
||||
> The cross-functional team delivered a high-quality, data-driven report on our client-facing tools. Their decision-making process was well-known for being thorough and detail-oriented.
|
||||
|
||||
**After:**
|
||||
> The cross functional team delivered a high quality, data driven report on our client facing tools. Their decision making process was known for being thorough and detail oriented.
|
||||
|
||||
|
||||
### 27. Persuasive Authority Tropes
|
||||
|
||||
**Phrases to watch:** The real question is, at its core, in reality, what really matters, fundamentally, the deeper issue, the heart of the matter
|
||||
|
||||
**Problem:** LLMs use these phrases to pretend they are cutting through noise to some deeper truth, when the sentence that follows usually just restates an ordinary point with extra ceremony.
|
||||
|
||||
**Before:**
|
||||
> The real question is whether teams can adapt. At its core, what really matters is organizational readiness.
|
||||
|
||||
**After:**
|
||||
> The question is whether teams can adapt. That mostly depends on whether the organization is ready to change its habits.
|
||||
|
||||
|
||||
### 28. Signposting and Announcements
|
||||
|
||||
**Phrases to watch:** Let's dive in, let's explore, let's break this down, here's what you need to know, now let's look at, without further ado
|
||||
|
||||
**Problem:** LLMs announce what they are about to do instead of doing it. This meta-commentary slows the writing down and gives it a tutorial-script feel.
|
||||
|
||||
**Before:**
|
||||
> Let's dive into how caching works in Next.js. Here's what you need to know.
|
||||
|
||||
**After:**
|
||||
> Next.js caches data at multiple layers, including request memoization, the data cache, and the router cache.
|
||||
|
||||
|
||||
### 29. Fragmented Headers
|
||||
|
||||
**Signs to watch:** A heading followed by a one-line paragraph that simply restates the heading before the real content begins.
|
||||
|
||||
**Problem:** LLMs often add a generic sentence after a heading as a rhetorical warm-up. It usually adds nothing and makes the prose feel padded.
|
||||
|
||||
**Before:**
|
||||
> ## Performance
|
||||
>
|
||||
> Speed matters.
|
||||
>
|
||||
> When users hit a slow page, they leave.
|
||||
|
||||
**After:**
|
||||
> ## Performance
|
||||
>
|
||||
> When users hit a slow page, they leave.
|
||||
|
||||
---
|
||||
|
||||
## Process
|
||||
|
||||
1. Read the input text carefully (use `read_file` if it's a file).
|
||||
2. Identify all instances of the patterns above.
|
||||
3. Rewrite each problematic section.
|
||||
4. Ensure the revised text:
|
||||
- Sounds natural when read aloud
|
||||
- Varies sentence structure naturally
|
||||
- Uses specific details over vague claims
|
||||
- Maintains appropriate tone for context
|
||||
- Uses simple constructions (is/are/has) where appropriate
|
||||
5. Present a draft humanized version.
|
||||
6. Prompt yourself: "What makes the below so obviously AI generated?"
|
||||
7. Answer briefly with the remaining tells (if any).
|
||||
8. Prompt yourself: "Now make it not obviously AI generated."
|
||||
9. Present the final version (revised after the audit).
|
||||
10. If the text came from a file, apply the edit with `patch` (targeted) or `write_file` (full rewrite) and show the user what changed.
|
||||
|
||||
## Output Format
|
||||
|
||||
Provide:
|
||||
1. Draft rewrite
|
||||
2. "What makes the below so obviously AI generated?" (brief bullets)
|
||||
3. Final rewrite
|
||||
4. A brief summary of changes made (optional, if helpful)
|
||||
|
||||
|
||||
## Full Example
|
||||
|
||||
**Before (AI-sounding):**
|
||||
> Great question! Here is an essay on this topic. I hope this helps!
|
||||
>
|
||||
> AI-assisted coding serves as an enduring testament to the transformative potential of large language models, marking a pivotal moment in the evolution of software development. In today's rapidly evolving technological landscape, these groundbreaking tools—nestled at the intersection of research and practice—are reshaping how engineers ideate, iterate, and deliver, underscoring their vital role in modern workflows.
|
||||
>
|
||||
> At its core, the value proposition is clear: streamlining processes, enhancing collaboration, and fostering alignment. It's not just about autocomplete; it's about unlocking creativity at scale, ensuring that organizations can remain agile while delivering seamless, intuitive, and powerful experiences to users. The tool serves as a catalyst. The assistant functions as a partner. The system stands as a foundation for innovation.
|
||||
>
|
||||
> Industry observers have noted that adoption has accelerated from hobbyist experiments to enterprise-wide rollouts, from solo developers to cross-functional teams. The technology has been featured in The New York Times, Wired, and The Verge. Additionally, the ability to generate documentation, tests, and refactors showcases how AI can contribute to better outcomes, highlighting the intricate interplay between automation and human judgment.
|
||||
>
|
||||
> - 💡 **Speed:** Code generation is significantly faster, reducing friction and empowering developers.
|
||||
> - 🚀 **Quality:** Output quality has been enhanced through improved training, contributing to higher standards.
|
||||
> - ✅ **Adoption:** Usage continues to grow, reflecting broader industry trends.
|
||||
>
|
||||
> While specific details are limited based on available information, it could potentially be argued that these tools might have some positive effect. Despite challenges typical of emerging technologies—including hallucinations, bias, and accountability—the ecosystem continues to thrive. In order to fully realize this potential, teams must align with best practices.
|
||||
>
|
||||
> In conclusion, the future looks bright. Exciting times lie ahead as we continue this journey toward excellence. Let me know if you'd like me to expand on any section!
|
||||
|
||||
**Draft rewrite:**
|
||||
> AI coding assistants speed up some tasks. In a 2024 study by Google, developers using Codex completed simple functions 55% faster than a control group, but showed no improvement on debugging or architectural decisions.
|
||||
>
|
||||
> The tools are good at boilerplate: config files, test scaffolding, repetitive refactors. They are bad at knowing when they are wrong. I have mass-accepted suggestions that compiled, passed lint, and still did the wrong thing because I stopped paying attention.
|
||||
>
|
||||
> Mira, an engineer at a fintech startup I interviewed, said she uses Copilot "like autocomplete for boring code" but reviews every line before committing. Jake, a senior dev at a larger company, disabled it after it kept suggesting patterns from a deprecated internal library.
|
||||
>
|
||||
> The productivity claims are hard to verify. GitHub says Copilot users "accept 30% of suggestions," but acceptance is not correctness, and correctness is not value. The 2024 Uplevel study found no statistically significant difference in pull-request throughput between teams with and without AI assistants.
|
||||
>
|
||||
> None of this means the tools are useless. It means they are tools. They do not replace judgment, and they do not eliminate the need for tests. If you do not have tests, you cannot tell whether the suggestion is right.
|
||||
|
||||
**What makes the below so obviously AI generated?**
|
||||
- The rhythm is still a bit too tidy (clean contrasts, evenly paced paragraphs).
|
||||
- The named people and study citations can read like plausible-but-made-up placeholders unless they're real and sourced.
|
||||
- The closer leans a touch slogan-y ("If you do not have tests...") rather than sounding like a person talking.
|
||||
|
||||
**Now make it not obviously AI generated.**
|
||||
> AI coding assistants can make you faster at the boring parts. Not everything. Definitely not architecture.
|
||||
>
|
||||
> They're great at boilerplate: config files, test scaffolding, repetitive refactors. They're also great at sounding right while being wrong. I've accepted suggestions that compiled, passed lint, and still missed the point because I stopped paying attention.
|
||||
>
|
||||
> People I talk to tend to land in two camps. Some use it like autocomplete for chores and review every line. Others disable it after it keeps suggesting patterns they don't want. Both feel reasonable.
|
||||
>
|
||||
> The productivity metrics are slippery. GitHub can say Copilot users "accept 30% of suggestions," but acceptance isn't correctness, and correctness isn't value. If you don't have tests, you're basically guessing.
|
||||
|
||||
**Changes made:**
|
||||
- Removed chatbot artifacts ("Great question!", "I hope this helps!", "Let me know if...")
|
||||
- Removed significance inflation ("testament", "pivotal moment", "evolving landscape", "vital role")
|
||||
- Removed promotional language ("groundbreaking", "nestled", "seamless, intuitive, and powerful")
|
||||
- Removed vague attributions ("Industry observers")
|
||||
- Removed superficial -ing phrases ("underscoring", "highlighting", "reflecting", "contributing to")
|
||||
- Removed negative parallelism ("It's not just X; it's Y")
|
||||
- Removed rule-of-three patterns and synonym cycling ("catalyst/partner/foundation")
|
||||
- Removed false ranges ("from X to Y, from A to B")
|
||||
- Removed em dashes, emojis, boldface headers, and curly quotes
|
||||
- Removed copula avoidance ("serves as", "functions as", "stands as") in favor of "is"/"are"
|
||||
- Removed formulaic challenges section ("Despite challenges... continues to thrive")
|
||||
- Removed knowledge-cutoff hedging ("While specific details are limited...")
|
||||
- Removed excessive hedging ("could potentially be argued that... might have some")
|
||||
- Removed filler phrases and persuasive framing ("In order to", "At its core")
|
||||
- Removed generic positive conclusion ("the future looks bright", "exciting times lie ahead")
|
||||
- Made the voice more personal and less "assembled" (varied rhythm, fewer placeholders)
|
||||
|
||||
|
||||
## Attribution
|
||||
|
||||
This skill is ported from [blader/humanizer](https://github.com/blader/humanizer) (MIT licensed), which is itself based on [Wikipedia: Signs of AI writing](https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing), maintained by WikiProject AI Cleanup. The patterns documented there come from observations of thousands of instances of AI-generated text on Wikipedia.
|
||||
|
||||
Original author: Siqi Chen ([@blader](https://github.com/blader)). Original repo: https://github.com/blader/humanizer (version 2.5.1). Ported to Hermes Agent with Hermes-native tool references (`read_file`, `patch`, `write_file`) and guidance for when to load the skill; the 29 patterns, personality/soul section, and full worked example are preserved verbatim from the source. Original MIT license preserved in the `LICENSE` file alongside this `SKILL.md`.
|
||||
|
||||
Key insight from Wikipedia: "LLMs use statistical algorithms to guess what should come next. The result tends toward the most statistically likely result that applies to the widest variety of cases."
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Manim Video — Production pipeline for mathematical and technical animations using Manim Community Edition"
|
||||
title: "Manim Video — Manim CE animations: 3Blue1Brown math/algo videos"
|
||||
sidebar_label: "Manim Video"
|
||||
description: "Production pipeline for mathematical and technical animations using Manim Community Edition"
|
||||
description: "Manim CE animations: 3Blue1Brown math/algo videos"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Manim Video
|
||||
|
||||
Production pipeline for mathematical and technical animations using Manim Community Edition. Creates 3Blue1Brown-style explainer videos, algorithm visualizations, equation derivations, architecture diagrams, and data stories. Use when users request: animated explanations, math animations, concept visualizations, algorithm walkthroughs, technical explainers, 3Blue1Brown style videos, or any programmatic animation with geometric/mathematical content.
|
||||
Manim CE animations: 3Blue1Brown math/algo videos.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -26,6 +26,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# Manim Video Production Pipeline
|
||||
|
||||
## When to use
|
||||
|
||||
Use when users request: animated explanations, math animations, concept visualizations, algorithm walkthroughs, technical explainers, 3Blue1Brown style videos, or any programmatic animation with geometric/mathematical content. Creates 3Blue1Brown-style explainer videos, algorithm visualizations, equation derivations, architecture diagrams, and data stories using Manim Community Edition.
|
||||
|
||||
## Creative Standard
|
||||
|
||||
This is educational cinema. Every frame teaches. Every animation reveals structure.
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "P5Js — Production pipeline for interactive and generative visual art using p5"
|
||||
title: "P5Js — p5"
|
||||
sidebar_label: "P5Js"
|
||||
description: "Production pipeline for interactive and generative visual art using p5"
|
||||
description: "p5"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# P5Js
|
||||
|
||||
Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes, audio-reactive visuals, and motion graphics — exported as HTML, PNG, GIF, MP4, or SVG. Covers: 2D/3D rendering, noise and particle systems, flow fields, shaders (GLSL), pixel manipulation, kinetic typography, WebGL scenes, audio analysis, mouse/keyboard interaction, and headless high-res export. Use when users request: p5.js sketches, creative coding, generative art, interactive visualizations, canvas animations, browser-based visual art, data viz, shader effects, or any p5.js project.
|
||||
p5.js sketches: gen art, shaders, interactive, 3D.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -28,6 +28,14 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# p5.js Production Pipeline
|
||||
|
||||
## When to use
|
||||
|
||||
Use when users request: p5.js sketches, creative coding, generative art, interactive visualizations, canvas animations, browser-based visual art, data viz, shader effects, or any p5.js project.
|
||||
|
||||
## What's inside
|
||||
|
||||
Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes, audio-reactive visuals, and motion graphics — exported as HTML, PNG, GIF, MP4, or SVG. Covers: 2D/3D rendering, noise and particle systems, flow fields, shaders (GLSL), pixel manipulation, kinetic typography, WebGL scenes, audio analysis, mouse/keyboard interaction, and headless high-res export.
|
||||
|
||||
## Creative Standard
|
||||
|
||||
This is visual art rendered in the browser. The canvas is the medium; the algorithm is the brush.
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Pixel Art — Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc"
|
||||
title: "Pixel Art — Pixel art w/ era palettes (NES, Game Boy, PICO-8)"
|
||||
sidebar_label: "Pixel Art"
|
||||
description: "Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc"
|
||||
description: "Pixel art w/ era palettes (NES, Game Boy, PICO-8)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Pixel Art
|
||||
|
||||
Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc.), and animate them into short videos. Presets cover arcade, SNES, and 10+ era-correct looks. Use `clarify` to let the user pick a style before generating.
|
||||
Pixel art w/ era palettes (NES, Game Boy, PICO-8).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Popular Web Designs — 54 production-quality design systems extracted from real websites"
|
||||
title: "Popular Web Designs — 54 real design systems (Stripe, Linear, Vercel) as HTML/CSS"
|
||||
sidebar_label: "Popular Web Designs"
|
||||
description: "54 production-quality design systems extracted from real websites"
|
||||
description: "54 real design systems (Stripe, Linear, Vercel) as HTML/CSS"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Popular Web Designs
|
||||
|
||||
54 production-quality design systems extracted from real websites. Load a template to generate HTML/CSS that matches the visual identity of sites like Stripe, Linear, Vercel, Notion, Airbnb, and more. Each template includes colors, typography, components, layout rules, and ready-to-use CSS values.
|
||||
54 real design systems (Stripe, Linear, Vercel) as HTML/CSS.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -32,6 +32,16 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
site's complete visual language: color palette, typography hierarchy, component styles, spacing
|
||||
system, shadows, responsive behavior, and practical agent prompts with exact CSS values.
|
||||
|
||||
## Related design skills
|
||||
|
||||
- **`claude-design`** — use for the design *process and taste* (scoping a brief,
|
||||
producing variants, verifying a local HTML artifact, avoiding AI-design slop).
|
||||
Pair it with this skill when the user wants a thoughtfully-designed page styled
|
||||
after a known brand: `claude-design` drives the workflow, this skill supplies
|
||||
the visual vocabulary.
|
||||
- **`design-md`** — use when the deliverable is a formal DESIGN.md token spec
|
||||
file, not a rendered artifact.
|
||||
|
||||
## How to Use
|
||||
|
||||
1. Pick a design from the catalog below
|
||||
|
||||
@ -0,0 +1,237 @@
|
||||
---
|
||||
title: "Pretext"
|
||||
sidebar_label: "Pretext"
|
||||
description: "Use when building creative browser demos with @chenglou/pretext — DOM-free text layout for ASCII art, typographic flow around obstacles, text-as-geometry gam..."
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Pretext
|
||||
|
||||
Use when building creative browser demos with @chenglou/pretext — DOM-free text layout for ASCII art, typographic flow around obstacles, text-as-geometry games, kinetic typography, and text-powered generative art. Produces single-file HTML demos by default.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Source | Bundled (installed by default) |
|
||||
| Path | `skills/creative/pretext` |
|
||||
| Version | `1.0.0` |
|
||||
| Author | Hermes Agent |
|
||||
| License | MIT |
|
||||
| Tags | `creative-coding`, `typography`, `pretext`, `ascii-art`, `canvas`, `generative`, `text-layout`, `kinetic-typography` |
|
||||
| Related skills | [`p5js`](/docs/user-guide/skills/bundled/creative/creative-p5js), [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
:::info
|
||||
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
|
||||
:::
|
||||
|
||||
# Pretext Creative Demos
|
||||
|
||||
## Overview
|
||||
|
||||
[`@chenglou/pretext`](https://github.com/chenglou/pretext) is a 15KB zero-dependency TypeScript library by Cheng Lou (React core, ReasonML, Midjourney) for **DOM-free multiline text measurement and layout**. It does one thing: given `(text, font, width)`, return the line breaks, per-line widths, per-grapheme positions, and total height — all via canvas measurement, no reflow.
|
||||
|
||||
That sounds like plumbing. It is not. Because it is fast and geometric, it is a **creative primitive**: you can reflow paragraphs around a moving sprite at 60fps, build games whose level geometry is made of real words, drive ASCII logos through prose, shatter text into particles with exact per-grapheme starting positions, or pack shrink-wrapped multiline UI without any `getBoundingClientRect` thrash.
|
||||
|
||||
This skill exists so Hermes can make **cool demos** with it — the kind people post to X. See `pretext.cool` and `chenglou.me/pretext` for the community demo corpus.
|
||||
|
||||
## When to Use
|
||||
|
||||
Use when the user asks for:
|
||||
- A "pretext demo" / "cool pretext thing" / "text-as-X"
|
||||
- Text flowing around a moving shape (hero sections, editorial layouts, animated long-form pages)
|
||||
- ASCII-art effects using **real words or prose**, not monospace rasters
|
||||
- Games where the playfield / obstacles / bricks are made of text (Tetris-from-letters, Breakout-of-prose)
|
||||
- Kinetic typography with per-glyph physics (shatter, scatter, flock, flow)
|
||||
- Typographic generative art, especially with non-Latin scripts or mixed scripts
|
||||
- Multiline "shrink-wrap" UI (smallest container width that still fits the text)
|
||||
- Anything that would require knowing line breaks *before* rendering
|
||||
|
||||
Don't use for:
|
||||
- Static SVG/HTML pages where CSS already solves layout — just use CSS
|
||||
- Rich text editors, general inline formatting engines (pretext is intentionally narrow)
|
||||
- Image → text (use `ascii-art` / `ascii-video` skills)
|
||||
- Pure canvas generative art with no text role — use `p5js`
|
||||
|
||||
## Creative Standard
|
||||
|
||||
This is visual art rendered in a browser. Pretext returns numbers; **you** draw the thing.
|
||||
|
||||
- **Don't ship a "hello world" demo.** The `hello-orb-flow.html` template is the *starting* point. Every delivered demo must add intentional color, motion, composition, and one visual detail the user didn't ask for but will appreciate.
|
||||
- **Dark backgrounds, warm cores, considered palette.** Classic amber-on-black (CRT / terminal) works, but so do cold-white-on-charcoal (editorial) and desaturated pastels (risograph). Pick one and commit.
|
||||
- **Proportional fonts are the point.** Pretext's whole vibe is "not monospaced" — lean into it. Use Iowan Old Style, Inter, JetBrains Mono, Helvetica Neue, or a variable font. Never default sans.
|
||||
- **Real source/text, not lorem ipsum.** The corpus should mean something. Short manifestos, poetry, real source code, a found text, the library's own README — never `lorem ipsum`.
|
||||
- **First-paint excellence.** No loading states, no blank frames. The demo must look shippable the instant it opens.
|
||||
|
||||
## Stack
|
||||
|
||||
Single self-contained HTML file per demo. No build step.
|
||||
|
||||
| Layer | Tool | Purpose |
|
||||
|-------|------|---------|
|
||||
| Core | `@chenglou/pretext` via `esm.sh` CDN | Text measurement + line layout |
|
||||
| Render | HTML5 Canvas 2D | Glyph rendering, per-frame composition |
|
||||
| Segmentation | `Intl.Segmenter` (built-in) | Grapheme splitting for emoji / CJK / combining marks |
|
||||
| Interaction | Raw DOM events | Mouse / touch / wheel — no framework |
|
||||
|
||||
```html
|
||||
<script type="module">
|
||||
import {
|
||||
prepare, layout, // use-case 1: simple height
|
||||
prepareWithSegments, layoutWithLines, // use-case 2a: fixed-width lines
|
||||
layoutNextLineRange, materializeLineRange, // use-case 2b: streaming / variable width
|
||||
measureLineStats, walkLineRanges, // stats without string allocation
|
||||
} from "https://esm.sh/@chenglou/pretext@0.0.6";
|
||||
</script>
|
||||
```
|
||||
|
||||
Pin the version. `@0.0.6` at time of writing — check [npm](https://www.npmjs.com/package/@chenglou/pretext) for the latest if demo behavior is off.
|
||||
|
||||
## The Two Use Cases
|
||||
|
||||
Almost everything reduces to one of these two shapes. Learn both.
|
||||
|
||||
### Use-case 1 — measure, then render with CSS/DOM
|
||||
|
||||
```js
|
||||
const prepared = prepare(text, "16px Inter");
|
||||
const { height, lineCount } = layout(prepared, 320, 20);
|
||||
```
|
||||
|
||||
You still let the browser draw the text. Pretext just tells you how tall the box will be at a given width, **without** a DOM read. Use for:
|
||||
- Virtualized lists where rows contain wrapping text
|
||||
- Masonry with precise card heights
|
||||
- "Does this label fit?" dev-time checks
|
||||
- Preventing layout shift when remote text loads
|
||||
|
||||
**Keep `font` and `letterSpacing` exactly in sync with your CSS.** The canvas `ctx.font` format (e.g. `"16px Inter"`, `"500 17px 'JetBrains Mono'"`) must match the rendered CSS, or measurements drift.
|
||||
|
||||
### Use-case 2 — measure *and* render yourself
|
||||
|
||||
```js
|
||||
const prepared = prepareWithSegments(text, FONT);
|
||||
const { lines } = layoutWithLines(prepared, 320, 26);
|
||||
for (let i = 0; i < lines.length; i++) {
|
||||
ctx.fillText(lines[i].text, 0, i * 26);
|
||||
}
|
||||
```
|
||||
|
||||
This is where the creative work lives. You own the drawing, so you can:
|
||||
- Render to canvas, SVG, WebGL, or any coordinate system
|
||||
- Substitute per-glyph transforms (rotation, jitter, scale, opacity)
|
||||
- Use line metadata (width, grapheme positions) as geometry
|
||||
|
||||
For **variable-width-per-line** flow (text around a shape, text in a donut band, text in a non-rectangular column):
|
||||
|
||||
```js
|
||||
let cursor = { segmentIndex: 0, graphemeIndex: 0 };
|
||||
let y = 0;
|
||||
while (true) {
|
||||
const lineWidth = widthAtY(y); // your function: how wide is the corridor at this y?
|
||||
const range = layoutNextLineRange(prepared, cursor, lineWidth);
|
||||
if (!range) break;
|
||||
const line = materializeLineRange(prepared, range);
|
||||
ctx.fillText(line.text, leftEdgeAtY(y), y);
|
||||
cursor = range.end;
|
||||
y += lineHeight;
|
||||
}
|
||||
```
|
||||
|
||||
This is the most important pattern in the whole library. It's what unlocks "text flowing around a dragged sprite" — the demo that went viral on X.
|
||||
|
||||
### Helpers worth knowing
|
||||
|
||||
- `measureLineStats(prepared, maxWidth)` → `{ lineCount, maxLineWidth }` — the widest line, i.e. multiline shrink-wrap width.
|
||||
- `walkLineRanges(prepared, maxWidth, callback)` — iterate lines without allocating strings. Use for stats/physics over graphemes when you don't need the characters.
|
||||
- `@chenglou/pretext/rich-inline` — the same system but for paragraphs mixing fonts / chips / mentions. Import from the subpath.
|
||||
|
||||
## Demo Recipe Patterns
|
||||
|
||||
The community corpus (see `references/patterns.md`) clusters into a handful of strong patterns. Pick one and riff — don't invent a new category unless asked.
|
||||
|
||||
| Pattern | Key API | Example idea |
|
||||
|---|---|---|
|
||||
| **Reflow around obstacle** | `layoutNextLineRange` + per-row width function | Editorial paragraph that parts around a dragged cursor sprite |
|
||||
| **Text-as-geometry game** | `layoutWithLines` + per-line collision rects | Breakout where each brick is a measured word |
|
||||
| **Shatter / particles** | `walkLineRanges` → per-grapheme (x,y) → physics | Sentence that explodes into letters on click |
|
||||
| **ASCII obstacle typography** | `layoutNextLineRange` + measured per-row obstacle spans | Bitmap ASCII logo, shape morphs, and draggable wire objects that make text open around their actual geometry |
|
||||
| **Editorial multi-column** | `layoutNextLineRange` per column + shared cursor | Animated magazine spread with pull quotes |
|
||||
| **Kinetic type** | `layoutWithLines` + per-line transform over time | Star Wars crawl, wave, bounce, glitch |
|
||||
| **Multiline shrink-wrap** | `measureLineStats` | Quote card that auto-sizes to its tightest container |
|
||||
|
||||
See `templates/donut-orbit.html` and `templates/hello-orb-flow.html` for working single-file starters.
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Pick a pattern** from the table above based on the user's brief.
|
||||
2. **Start from a template**:
|
||||
- `templates/hello-orb-flow.html` — text reflowing around a moving orb (reflow-around-obstacle pattern)
|
||||
- `templates/donut-orbit.html` — advanced example: measured ASCII logo obstacles, draggable wire sphere/cube, morphing shape fields, selectable DOM text, and dev-only controls
|
||||
- `write_file` to a new `.html` in `/tmp/` or the user's workspace.
|
||||
3. **Swap the corpus** for something intentional to the brief. Real prose, 10-100 sentences, no lorem.
|
||||
4. **Tune the aesthetic** — font, palette, composition, interaction. This is the work; don't skip it.
|
||||
5. **Verify locally**:
|
||||
```sh
|
||||
cd <dir-with-html> && python3 -m http.server 8765
|
||||
# then open http://localhost:8765/<file>.html
|
||||
```
|
||||
6. **Check the console** — pretext will throw if `prepareWithSegments` is called with a bad font string; `Intl.Segmenter` is available in every modern browser.
|
||||
7. **Show the user the file path**, not just the code — they want to open it.
|
||||
|
||||
## Performance Notes
|
||||
|
||||
- `prepare()` / `prepareWithSegments()` is the expensive call. Do it **once** per text+font pair. Cache the handle.
|
||||
- On resize, only rerun `layout()` / `layoutWithLines()` — never re-prepare.
|
||||
- For per-frame animations where text doesn't change but geometry does, `layoutNextLineRange` in a tight loop is cheap enough to do every frame at 60fps for normal-length paragraphs.
|
||||
- When rendering ASCII masks per frame, keep a cell buffer (`Uint8Array`/typed arrays), derive measured per-row obstacle spans from the cells or projected geometry, merge spans, then feed those spans into `layoutNextLineRange` before drawing text.
|
||||
- Keep visual animation and layout animation coupled. If a sphere morphs into a cube, tween both the rendered cell buffer and the obstacle spans with the same value; otherwise the demo looks painted-on instead of physically reflowed.
|
||||
- For fades, prefer layer opacity over changing glyph intensity or obstacle scale. Put transient ASCII sprites on their own canvas and fade the canvas with CSS/GSAP opacity so geometry does not appear to shrink.
|
||||
- Canvas `ctx.font` setting is surprisingly slow; set it **once** per frame if font doesn't vary, not per `fillText` call.
|
||||
|
||||
## Common Pitfalls
|
||||
|
||||
1. **Drifting CSS/canvas font strings.** `ctx.font = "16px Inter"` measured, but CSS says `font-family: Inter, sans-serif; font-size: 16px`. Fine *if* Inter loads. If Inter 404s, CSS falls back to sans-serif and measurements drift by 5-20%. Always `preload` the font or use a web-safe family.
|
||||
|
||||
2. **Re-preparing inside the animation loop.** Only `layout*` is cheap. Re-calling `prepare` every frame will tank perf. Keep the prepared handle in module scope.
|
||||
|
||||
3. **Forgetting `Intl.Segmenter` for grapheme splits.** Emoji, combining marks, CJK — `"é".split("")` gives you two chars. Use `new Intl.Segmenter(undefined, { granularity: "grapheme" })` when sampling individual visible glyphs.
|
||||
|
||||
4. **`break: 'never'` chips without `extraWidth`.** In `rich-inline`, if you use `break: 'never'` for an atomic chip/mention, you must also supply `extraWidth` for the pill padding — otherwise chip chrome overflows the container.
|
||||
|
||||
5. **Using `@chenglou/pretext` from `unpkg` with TypeScript-only entry.** Use `esm.sh` — it compiles the TS exports to browser-ready ESM automatically. `unpkg` will 404 or serve raw TS.
|
||||
|
||||
6. **Monospace fallbacks silently erasing the whole point.** Users seeing monospace-looking output often have a CSS `font-family` that fell through to `monospace`. Verify the actual rendered font via DevTools.
|
||||
|
||||
7. **Skipping rows vs adjusting width** when flowing around a shape. If the corridor on this row is too narrow to fit a line, *skip the row* (`y += lineHeight; continue;`) rather than passing a tiny maxWidth to `layoutNextLineRange` — pretext will return one-grapheme lines that look broken.
|
||||
|
||||
8. **Shipping a cold demo.** The default first-paint looks tutorial-grade. Add: vignette, subtle scanline, idle auto-motion, one carefully chosen interactive response (drag, hover, scroll, click). Without these, "cool pretext demo" lands as "intern repro of the README."
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] Demo is a single self-contained `.html` file — opens by double-click or `python3 -m http.server`
|
||||
- [ ] `@chenglou/pretext` imported via `esm.sh` with pinned version
|
||||
- [ ] Corpus is real prose, not lorem ipsum, and matches the demo's concept
|
||||
- [ ] Font string passed to `prepare` matches the CSS font exactly
|
||||
- [ ] `prepare()` / `prepareWithSegments()` called once, not per frame
|
||||
- [ ] Dark background + considered palette — not the default white canvas
|
||||
- [ ] At least one interactive response (drag / hover / scroll / click) or idle auto-motion
|
||||
- [ ] Tested locally with `python3 -m http.server` and confirmed no console errors
|
||||
- [ ] 60fps on a mid-tier laptop (or graceful degradation documented)
|
||||
- [ ] One "extra mile" detail the user didn't ask for
|
||||
|
||||
## Reference: Community Demos
|
||||
|
||||
Clone these for inspiration / patterns (all MIT-ish, linked from [pretext.cool](https://www.pretext.cool/)):
|
||||
|
||||
- **Pretext Breaker** — breakout with word-bricks — `github.com/rinesh/pretext-breaker`
|
||||
- **Tetris × Pretext** — `github.com/shinichimochizuki/tetris-pretext`
|
||||
- **Dragon animation** — `github.com/qtakmalay/PreTextExperiments`
|
||||
- **Somnai editorial engine** — `github.com/somnai-dreams/pretext-demos`
|
||||
- **Bad Apple!! ASCII** — `github.com/frmlinn/bad-apple-pretext`
|
||||
- **Drag-sprite reflow** — `github.com/dokobot/pretext-demo`
|
||||
- **Alarmy editorial clock** — `github.com/SmisLee/alarmy-pretext-demo`
|
||||
|
||||
Official playground: [chenglou.me/pretext](https://chenglou.me/pretext/) — accordion, bubbles, dynamic-layout, editorial-engine, justification-comparison, masonry, markdown-chat, rich-note.
|
||||
@ -0,0 +1,237 @@
|
||||
---
|
||||
title: "Sketch — Throwaway HTML mockups: 2-3 design variants to compare"
|
||||
sidebar_label: "Sketch"
|
||||
description: "Throwaway HTML mockups: 2-3 design variants to compare"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Sketch
|
||||
|
||||
Throwaway HTML mockups: 2-3 design variants to compare.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Source | Bundled (installed by default) |
|
||||
| Path | `skills/creative/sketch` |
|
||||
| Version | `1.0.0` |
|
||||
| Author | Hermes Agent (adapted from gsd-build/get-shit-done) |
|
||||
| License | MIT |
|
||||
| Tags | `sketch`, `mockup`, `design`, `ui`, `prototype`, `html`, `variants`, `exploration`, `wireframe`, `comparison` |
|
||||
| Related skills | [`spike`](/docs/user-guide/skills/bundled/software-development/software-development-spike), [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design), [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw) |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
:::info
|
||||
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
|
||||
:::
|
||||
|
||||
# Sketch
|
||||
|
||||
Use this skill when the user wants to **see a design direction before committing** to one — exploring a UI/UX idea as disposable HTML mockups. The point is to generate 2-3 interactive variants so the user can compare visual directions side-by-side, not to produce shippable code.
|
||||
|
||||
Load this when the user says things like "sketch this screen", "show me what X could look like", "compare layout A vs B", "give me 2-3 takes on this UI", "let me see some variants", "mockup this before I build".
|
||||
|
||||
## When NOT to use this
|
||||
|
||||
- User wants a production component — use `claude-design` or build it properly
|
||||
- User wants a polished one-off HTML artifact (landing page, deck) — `claude-design`
|
||||
- User wants a diagram — `excalidraw`, `architecture-diagram`
|
||||
- The design is already locked — just build it
|
||||
|
||||
## If the user has the full GSD system installed
|
||||
|
||||
If `gsd-sketch` shows up as a sibling skill (installed via `npx get-shit-done-cc --hermes`), prefer **`gsd-sketch`** for the full workflow: persistent `.planning/sketches/` with MANIFEST, frontier mode analysis, consistency audits across past sketches, and integration with the rest of GSD. This skill is the lightweight standalone version — one-off sketching without the state machinery.
|
||||
|
||||
## Core method
|
||||
|
||||
```
|
||||
intake → variants → head-to-head → pick winner (or iterate)
|
||||
```
|
||||
|
||||
### 1. Intake (skip if the user already gave you enough)
|
||||
|
||||
Before generating variants, get three things — one question at a time, not all at once:
|
||||
|
||||
1. **Feel.** "What should this feel like? Adjectives, emotions, a vibe." — *"calm, editorial, like Linear"* tells you more than *"minimal"*.
|
||||
2. **References.** "What apps, sites, or products capture the feel you're imagining?" — actual references beat abstract descriptions.
|
||||
3. **Core action.** "What's the single most important thing a user does on this screen?" — the variants should all serve this well; if they don't, they're just decoration.
|
||||
|
||||
Reflect each answer briefly before the next question. If the user already gave you all three upfront, skip straight to variants.
|
||||
|
||||
### 2. Variants (2-3, never 1, rarely 4+)
|
||||
|
||||
Produce **2-3 variants** in one go. Each variant is a complete, standalone HTML file. Don't describe variants — build them. The point is comparison.
|
||||
|
||||
Each variant should take a **different design stance**, not different pixel values. Three good variant axes:
|
||||
|
||||
- **Density:** compact / airy / ultra-dense (pick two contrasting poles)
|
||||
- **Emphasis:** content-first / action-first / tool-first
|
||||
- **Aesthetic:** editorial / utilitarian / playful
|
||||
- **Layout:** single-column / sidebar / split-pane
|
||||
- **Grounding:** card-based / bare-content / document-style
|
||||
|
||||
Pick one axis and pull apart from it. Two variants that differ only in accent color are wasted effort — the user can't distinguish them.
|
||||
|
||||
**Variant naming:** describe the stance, not the number.
|
||||
|
||||
<!-- ascii-guard-ignore -->
|
||||
```
|
||||
sketches/
|
||||
├── 001-calm-editorial/
|
||||
│ ├── index.html
|
||||
│ └── README.md
|
||||
├── 001-utilitarian-dense/
|
||||
│ ├── index.html
|
||||
│ └── README.md
|
||||
└── 001-playful-split/
|
||||
├── index.html
|
||||
└── README.md
|
||||
```
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
|
||||
### 3. Make them real HTML
|
||||
|
||||
Each variant is a **single self-contained HTML file**:
|
||||
|
||||
- Inline `<style>` — no build step, no external CSS
|
||||
- System fonts or one Google Font via `<link>`
|
||||
- Tailwind via CDN (`<script src="https://cdn.tailwindcss.com"></script>`) is fine
|
||||
- Realistic fake content — actual sentences, actual names, not "Lorem ipsum"
|
||||
- **Interactive**: links clickable, hovers real, at least one state transition (open/close, filter, toggle). A frozen static image is a worse spike than a sloppy animated one.
|
||||
|
||||
Open it in a browser. If it looks broken, fix it before showing the user.
|
||||
|
||||
**Verify variants visually — use Hermes' browser tools.** Don't just write HTML and hope it renders; load each variant and look at it:
|
||||
|
||||
```
|
||||
browser_navigate(url="file:///absolute/path/to/sketches/001-calm-editorial/index.html")
|
||||
browser_vision(question="Does this layout look clean and readable? Any visible bugs (overlapping text, unstyled elements, broken images)?")
|
||||
```
|
||||
|
||||
`browser_vision` returns an AI description of what's actually on the page plus a screenshot path — catches layout bugs that pure source inspection misses (e.g. a font import that silently failed, a flex container that collapsed). Fix and re-navigate until each variant looks right.
|
||||
|
||||
**Default CSS reset + system font stack** for fast starts:
|
||||
|
||||
```html
|
||||
<style>
|
||||
* { box-sizing: border-box; margin: 0; padding: 0; }
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto,
|
||||
"Helvetica Neue", Arial, sans-serif;
|
||||
-webkit-font-smoothing: antialiased;
|
||||
color: #1a1a1a;
|
||||
background: #fafafa;
|
||||
line-height: 1.5;
|
||||
}
|
||||
</style>
|
||||
```
|
||||
|
||||
### 4. Variant README
|
||||
|
||||
Each variant's `README.md` answers:
|
||||
|
||||
```markdown
|
||||
## Variant: {stance name}
|
||||
|
||||
### Design stance
|
||||
One sentence on the principle driving this variant.
|
||||
|
||||
### Key choices
|
||||
- Layout: ...
|
||||
- Typography: ...
|
||||
- Color: ...
|
||||
- Interaction: ...
|
||||
|
||||
### Trade-offs
|
||||
- Strong at: ...
|
||||
- Weak at: ...
|
||||
|
||||
### Best for
|
||||
- The kind of user or use case this variant actually serves
|
||||
```
|
||||
|
||||
### 5. Head-to-head
|
||||
|
||||
After all variants are built, present them as a comparison. Don't just list — **opinionate**:
|
||||
|
||||
```markdown
|
||||
## Three takes on the home screen
|
||||
|
||||
| Dimension | Calm editorial | Utilitarian dense | Playful split |
|
||||
|-----------|----------------|-------------------|---------------|
|
||||
| Density | Low | High | Medium |
|
||||
| Primary action visibility | Low | High | Medium |
|
||||
| Scan-ability | High | Medium | Low |
|
||||
| Feel | Calm, trusted | Sharp, tool-like | Inviting, energetic |
|
||||
|
||||
**My take:** Utilitarian dense for power users, calm editorial for content-forward audiences. Playful split is weakest — tries to do both and commits to neither.
|
||||
```
|
||||
|
||||
Let the user pick a winner, or combine two into a hybrid, or ask for another round.
|
||||
|
||||
## Theming (when the project has a visual identity)
|
||||
|
||||
If the user has an existing theme (colors, fonts, tokens), put shared tokens in `sketches/themes/tokens.css` and `@import` them in each variant. Keep tokens minimal:
|
||||
|
||||
```css
|
||||
/* sketches/themes/tokens.css */
|
||||
:root {
|
||||
--color-bg: #fafafa;
|
||||
--color-fg: #1a1a1a;
|
||||
--color-accent: #0066ff;
|
||||
--color-muted: #666;
|
||||
--radius: 8px;
|
||||
--font-display: "Inter", sans-serif;
|
||||
--font-body: -apple-system, BlinkMacSystemFont, sans-serif;
|
||||
}
|
||||
```
|
||||
|
||||
Don't over-tokenize a throwaway sketch — three colors and one font is usually enough.
|
||||
|
||||
## Interactivity bar
|
||||
|
||||
A sketch is interactive enough when the user can:
|
||||
|
||||
1. **Click a primary action** and something visible happens (state change, modal, toast, navigation feint)
|
||||
2. **See one meaningful state transition** (filter a list, toggle a mode, open/close a panel)
|
||||
3. **Hover recognizable affordances** (buttons, rows, tabs)
|
||||
|
||||
More than that is over-engineering a throwaway. Less than that is a screenshot.
|
||||
|
||||
## Frontier mode (picking what to sketch next)
|
||||
|
||||
If sketches already exist and the user says "what should I sketch next?":
|
||||
|
||||
- **Consistency gaps** — two winning variants from different sketches made independent choices that haven't been composed together yet
|
||||
- **Unsketched screens** — referenced but never explored
|
||||
- **State coverage** — happy path sketched, but not empty / loading / error / 1000-items
|
||||
- **Responsive gaps** — validated at one viewport; does it hold at mobile / ultrawide?
|
||||
- **Interaction patterns** — static layouts exist; transitions, drag, scroll behavior don't
|
||||
|
||||
Propose 2-4 named candidates. Let the user pick.
|
||||
|
||||
## Output
|
||||
|
||||
- Create `sketches/` (or `.planning/sketches/` if the user is using GSD conventions) in the repo root
|
||||
- One subdir per variant: `NNN-stance-name/index.html` + `README.md`
|
||||
- Tell the user how to open them: `open sketches/001-calm-editorial/index.html` on macOS, `xdg-open` on Linux, `start` on Windows
|
||||
- Keep variants disposable — a sketch that you felt the need to preserve should be promoted into real project code, not curated as an asset
|
||||
|
||||
**Typical tool sequence for one variant:**
|
||||
|
||||
```
|
||||
terminal("mkdir -p sketches/001-calm-editorial")
|
||||
write_file("sketches/001-calm-editorial/index.html", "<!doctype html>...")
|
||||
write_file("sketches/001-calm-editorial/README.md", "## Variant: Calm editorial\n...")
|
||||
browser_navigate(url="file://$(pwd)/sketches/001-calm-editorial/index.html")
|
||||
browser_vision(question="How does this look? Any obvious layout issues?")
|
||||
```
|
||||
|
||||
Repeat for each variant, then present the comparison table.
|
||||
|
||||
## Attribution
|
||||
|
||||
Adapted from the GSD (Get Shit Done) project's `/gsd-sketch` workflow — MIT © 2025 Lex Christopherson ([gsd-build/get-shit-done](https://github.com/gsd-build/get-shit-done)). The full GSD system ships persistent sketch state, theme/variant pattern references, and consistency-audit workflows; install with `npx get-shit-done-cc --hermes --global`.
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Songwriting And Ai Music"
|
||||
title: "Songwriting And Ai Music — Songwriting craft and Suno AI music prompts"
|
||||
sidebar_label: "Songwriting And Ai Music"
|
||||
description: "Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned"
|
||||
description: "Songwriting craft and Suno AI music prompts"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Songwriting And Ai Music
|
||||
|
||||
Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned. These are tools and ideas, not rules. Break any of them when the art calls for it.
|
||||
Songwriting craft and Suno AI music prompts.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -221,8 +221,9 @@ win.par.winopen.pulse()
|
||||
| `td_input_clear` | Stop input automation |
|
||||
| `td_op_screen_rect` | Get screen coords of a node |
|
||||
| `td_click_screen_point` | Click a point in a screenshot |
|
||||
| `td_screen_point_to_global` | Convert screenshot pixel to absolute screen coords |
|
||||
|
||||
See `references/mcp-tools.md` for full parameter schemas.
|
||||
The table above covers the 32 tools used in typical creative workflows. The remaining 4 tools (`td_project_quit`, `td_test_session`, `td_dev_log`, `td_clear_dev_log`) are admin/dev-mode utilities — see `references/mcp-tools.md` for the full 36-tool reference with complete parameter schemas.
|
||||
|
||||
## Key Implementation Rules
|
||||
|
||||
@ -355,6 +356,15 @@ See `references/network-patterns.md` for complete build scripts + shader code.
|
||||
| `references/operator-tips.md` | Wireframe rendering, feedback TOP setup |
|
||||
| `references/geometry-comp.md` | Geometry COMP: instancing, POP vs SOP, morphing |
|
||||
| `references/audio-reactive.md` | Audio band extraction, beat detection, envelope following |
|
||||
| `references/animation.md` | LFOs, timers, keyframes, easing, expression-driven motion |
|
||||
| `references/midi-osc.md` | MIDI/OSC controllers, TouchOSC, multi-machine sync |
|
||||
| `references/particles.md` | POPs and legacy particleSOP — emission, forces, collisions |
|
||||
| `references/projection-mapping.md` | Multi-window output, corner pin, mesh warp, edge blending |
|
||||
| `references/external-data.md` | HTTP, WebSocket, MQTT, Serial, TCP, webserverDAT |
|
||||
| `references/panel-ui.md` | Custom params, panel COMPs, button/slider/field, panelExecuteDAT |
|
||||
| `references/replicator.md` | replicatorCOMP — data-driven cloning, layouts, callbacks |
|
||||
| `references/dat-scripting.md` | Execute DAT family — chop/dat/parameter/panel/op/executeDAT |
|
||||
| `references/3d-scene.md` | Lighting rigs, shadows, IBL/cubemaps, multi-camera, PBR |
|
||||
| `scripts/setup.sh` | Automated setup script |
|
||||
|
||||
---
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Jupyter Live Kernel — Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb"
|
||||
title: "Jupyter Live Kernel — Iterative Python via live Jupyter kernel (hamelnb)"
|
||||
sidebar_label: "Jupyter Live Kernel"
|
||||
description: "Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb"
|
||||
description: "Iterative Python via live Jupyter kernel (hamelnb)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Jupyter Live Kernel
|
||||
|
||||
Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb. Load this skill when the task involves exploration, iteration, or inspecting intermediate results — data science, ML experimentation, API exploration, or building up complex code step-by-step. Uses terminal to run CLI commands against a live Jupyter kernel. No new tools required.
|
||||
Iterative Python via live Jupyter kernel (hamelnb).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Webhook Subscriptions"
|
||||
title: "Webhook Subscriptions — Webhook subscriptions: event-driven agent runs"
|
||||
sidebar_label: "Webhook Subscriptions"
|
||||
description: "Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost)"
|
||||
description: "Webhook subscriptions: event-driven agent runs"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Webhook Subscriptions
|
||||
|
||||
Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost). Use when the user wants external services to trigger agent runs OR push notifications to chats.
|
||||
Webhook subscriptions: event-driven agent runs.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Dogfood"
|
||||
title: "Dogfood — Exploratory QA of web apps: find bugs, evidence, reports"
|
||||
sidebar_label: "Dogfood"
|
||||
description: "Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports"
|
||||
description: "Exploratory QA of web apps: find bugs, evidence, reports"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Dogfood
|
||||
|
||||
Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports
|
||||
Exploratory QA of web apps: find bugs, evidence, reports.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -50,11 +50,13 @@ Follow this 5-phase systematic workflow:
|
||||
### Phase 1: Plan
|
||||
|
||||
1. Create the output directory structure:
|
||||
<!-- ascii-guard-ignore -->
|
||||
```
|
||||
{output_dir}/
|
||||
├── screenshots/ # Evidence screenshots
|
||||
└── report.md # Final report (generated in Phase 5)
|
||||
```
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
2. Identify the testing scope based on user input.
|
||||
3. Build a rough sitemap by planning which pages and features to test:
|
||||
- Landing/home page
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Himalaya — CLI to manage emails via IMAP/SMTP"
|
||||
title: "Himalaya — Himalaya CLI: IMAP/SMTP email from terminal"
|
||||
sidebar_label: "Himalaya"
|
||||
description: "CLI to manage emails via IMAP/SMTP"
|
||||
description: "Himalaya CLI: IMAP/SMTP email from terminal"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Himalaya
|
||||
|
||||
CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language).
|
||||
Himalaya CLI: IMAP/SMTP email from terminal.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Minecraft Modpack Server — Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip"
|
||||
title: "Minecraft Modpack Server — Host modded Minecraft servers (CurseForge, Modrinth)"
|
||||
sidebar_label: "Minecraft Modpack Server"
|
||||
description: "Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip"
|
||||
description: "Host modded Minecraft servers (CurseForge, Modrinth)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Minecraft Modpack Server
|
||||
|
||||
Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts.
|
||||
Host modded Minecraft servers (CurseForge, Modrinth).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Pokemon Player — Play Pokemon games autonomously via headless emulation"
|
||||
title: "Pokemon Player — Play Pokemon via headless emulator + RAM reads"
|
||||
sidebar_label: "Pokemon Player"
|
||||
description: "Play Pokemon games autonomously via headless emulation"
|
||||
description: "Play Pokemon via headless emulator + RAM reads"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Pokemon Player
|
||||
|
||||
Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal.
|
||||
Play Pokemon via headless emulator + RAM reads.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Codebase Inspection"
|
||||
title: "Codebase Inspection — Inspect codebases w/ pygount: LOC, languages, ratios"
|
||||
sidebar_label: "Codebase Inspection"
|
||||
description: "Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios"
|
||||
description: "Inspect codebases w/ pygount: LOC, languages, ratios"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Codebase Inspection
|
||||
|
||||
Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats.
|
||||
Inspect codebases w/ pygount: LOC, languages, ratios.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Github Auth — Set up GitHub authentication for the agent using git (universally available) or the gh CLI"
|
||||
title: "Github Auth — GitHub auth setup: HTTPS tokens, SSH keys, gh CLI login"
|
||||
sidebar_label: "Github Auth"
|
||||
description: "Set up GitHub authentication for the agent using git (universally available) or the gh CLI"
|
||||
description: "GitHub auth setup: HTTPS tokens, SSH keys, gh CLI login"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Github Auth
|
||||
|
||||
Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically.
|
||||
GitHub auth setup: HTTPS tokens, SSH keys, gh CLI login.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Github Code Review"
|
||||
title: "Github Code Review — Review PRs: diffs, inline comments via gh or REST"
|
||||
sidebar_label: "Github Code Review"
|
||||
description: "Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review"
|
||||
description: "Review PRs: diffs, inline comments via gh or REST"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Github Code Review
|
||||
|
||||
Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl.
|
||||
Review PRs: diffs, inline comments via gh or REST.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Github Issues — Create, manage, triage, and close GitHub issues"
|
||||
title: "Github Issues — Create, triage, label, assign GitHub issues via gh or REST"
|
||||
sidebar_label: "Github Issues"
|
||||
description: "Create, manage, triage, and close GitHub issues"
|
||||
description: "Create, triage, label, assign GitHub issues via gh or REST"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Github Issues
|
||||
|
||||
Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl.
|
||||
Create, triage, label, assign GitHub issues via gh or REST.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Github Pr Workflow"
|
||||
title: "Github Pr Workflow — GitHub PR lifecycle: branch, commit, open, CI, merge"
|
||||
sidebar_label: "Github Pr Workflow"
|
||||
description: "Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge"
|
||||
description: "GitHub PR lifecycle: branch, commit, open, CI, merge"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Github Pr Workflow
|
||||
|
||||
Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl.
|
||||
GitHub PR lifecycle: branch, commit, open, CI, merge.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Github Repo Management — Clone, create, fork, configure, and manage GitHub repositories"
|
||||
title: "Github Repo Management — Clone/create/fork repos; manage remotes, releases"
|
||||
sidebar_label: "Github Repo Management"
|
||||
description: "Clone, create, fork, configure, and manage GitHub repositories"
|
||||
description: "Clone/create/fork repos; manage remotes, releases"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Github Repo Management
|
||||
|
||||
Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl.
|
||||
Clone/create/fork repos; manage remotes, releases.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Native Mcp"
|
||||
title: "Native Mcp — MCP client: connect servers, register tools (stdio/HTTP)"
|
||||
sidebar_label: "Native Mcp"
|
||||
description: "Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools"
|
||||
description: "MCP client: connect servers, register tools (stdio/HTTP)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Native Mcp
|
||||
|
||||
Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filtering, and zero-config tool injection.
|
||||
MCP client: connect servers, register tools (stdio/HTTP).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Gif Search — Search and download GIFs from Tenor using curl"
|
||||
title: "Gif Search — Search/download GIFs from Tenor via curl + jq"
|
||||
sidebar_label: "Gif Search"
|
||||
description: "Search and download GIFs from Tenor using curl"
|
||||
description: "Search/download GIFs from Tenor via curl + jq"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Gif Search
|
||||
|
||||
Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat.
|
||||
Search/download GIFs from Tenor via curl + jq.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -31,6 +31,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
Search and download GIFs directly via the Tenor API using curl. No extra tools needed.
|
||||
|
||||
## When to use
|
||||
|
||||
Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat.
|
||||
|
||||
## Setup
|
||||
|
||||
Set your Tenor API key in your environment (add to `~/.hermes/.env`):
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Heartmula — Set up and run HeartMuLa, the open-source music generation model family (Suno-like)"
|
||||
title: "Heartmula — HeartMuLa: Suno-like song generation from lyrics + tags"
|
||||
sidebar_label: "Heartmula"
|
||||
description: "Set up and run HeartMuLa, the open-source music generation model family (Suno-like)"
|
||||
description: "HeartMuLa: Suno-like song generation from lyrics + tags"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Heartmula
|
||||
|
||||
Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support.
|
||||
HeartMuLa: Suno-like song generation from lyrics + tags.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -29,7 +29,7 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
# HeartMuLa - Open-Source Music Generation
|
||||
|
||||
## Overview
|
||||
HeartMuLa is a family of open-source music foundation models (Apache-2.0) that generates music conditioned on lyrics and tags. Comparable to Suno for open-source. Includes:
|
||||
HeartMuLa is a family of open-source music foundation models (Apache-2.0) that generates music conditioned on lyrics and tags, with multilingual support. Generates full songs from lyrics + tags. Comparable to Suno for open-source. Includes:
|
||||
- **HeartMuLa** - Music language model (3B/7B) for generation from lyrics + tags
|
||||
- **HeartCodec** - 12.5Hz music codec for high-fidelity audio reconstruction
|
||||
- **HeartTranscriptor** - Whisper-based lyrics transcription
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Songsee — Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc"
|
||||
title: "Songsee — Audio spectrograms/features (mel, chroma, MFCC) via CLI"
|
||||
sidebar_label: "Songsee"
|
||||
description: "Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc"
|
||||
description: "Audio spectrograms/features (mel, chroma, MFCC) via CLI"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Songsee
|
||||
|
||||
Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.
|
||||
Audio spectrograms/features (mel, chroma, MFCC) via CLI.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Spotify"
|
||||
title: "Spotify — Spotify: play, search, queue, manage playlists and devices"
|
||||
sidebar_label: "Spotify"
|
||||
description: "Control Spotify — play music, search the catalog, manage playlists and library, inspect devices and playback state"
|
||||
description: "Spotify: play, search, queue, manage playlists and devices"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Spotify
|
||||
|
||||
Control Spotify — play music, search the catalog, manage playlists and library, inspect devices and playback state. Loads when the user asks to play/pause/queue music, search tracks/albums/artists, manage playlists, or check what's playing. Assumes the Hermes Spotify toolset is enabled and `hermes auth spotify` has been run.
|
||||
Spotify: play, search, queue, manage playlists and devices.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Youtube Content"
|
||||
title: "Youtube Content — YouTube transcripts to summaries, threads, blogs"
|
||||
sidebar_label: "Youtube Content"
|
||||
description: "Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts)"
|
||||
description: "YouTube transcripts to summaries, threads, blogs"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Youtube Content
|
||||
|
||||
Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouTube video.
|
||||
YouTube transcripts to summaries, threads, blogs.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -25,6 +25,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# YouTube Content Tool
|
||||
|
||||
## When to use
|
||||
|
||||
Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouTube video. Transforms transcripts into structured content (chapters, summaries, threads, blog posts).
|
||||
|
||||
Extract transcripts from YouTube videos and convert them into useful formats.
|
||||
|
||||
## Setup
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Evaluating Llms Harness — Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag)"
|
||||
title: "Evaluating Llms Harness — lm-eval-harness: benchmark LLMs (MMLU, GSM8K, etc"
|
||||
sidebar_label: "Evaluating Llms Harness"
|
||||
description: "Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag)"
|
||||
description: "lm-eval-harness: benchmark LLMs (MMLU, GSM8K, etc"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Evaluating Llms Harness
|
||||
|
||||
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.
|
||||
lm-eval-harness: benchmark LLMs (MMLU, GSM8K, etc.).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -30,6 +30,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# lm-evaluation-harness - LLM Benchmarking
|
||||
|
||||
## What's inside
|
||||
|
||||
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.
|
||||
|
||||
## Quick start
|
||||
|
||||
lm-evaluation-harness evaluates LLMs across 60+ academic benchmarks using standardized prompts and metrics.
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Weights And Biases"
|
||||
title: "Weights And Biases — W&B: log ML experiments, sweeps, model registry, dashboards"
|
||||
sidebar_label: "Weights And Biases"
|
||||
description: "Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - coll..."
|
||||
description: "W&B: log ML experiments, sweeps, model registry, dashboards"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Weights And Biases
|
||||
|
||||
Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform
|
||||
W&B: log ML experiments, sweeps, model registry, dashboards.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Huggingface Hub"
|
||||
title: "Huggingface Hub — HuggingFace hf CLI: search/download/upload models, datasets"
|
||||
sidebar_label: "Huggingface Hub"
|
||||
description: "Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Space..."
|
||||
description: "HuggingFace hf CLI: search/download/upload models, datasets"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Huggingface Hub
|
||||
|
||||
Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Spaces and buckets.
|
||||
HuggingFace hf CLI: search/download/upload models, datasets.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Obliteratus"
|
||||
title: "Obliteratus — OBLITERATUS: abliterate LLM refusals (diff-in-means)"
|
||||
sidebar_label: "Obliteratus"
|
||||
description: "Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE deco..."
|
||||
description: "OBLITERATUS: abliterate LLM refusals (diff-in-means)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Obliteratus
|
||||
|
||||
Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods, 28 analysis modules, 116 model presets across 5 compute tiers, tournament evaluation, and telemetry-driven recommendations. Use when a user wants to uncensor, abliterate, or remove refusal from an LLM.
|
||||
OBLITERATUS: abliterate LLM refusals (diff-in-means).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -31,10 +31,21 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# OBLITERATUS Skill
|
||||
|
||||
## What's inside
|
||||
|
||||
9 CLI methods, 28 analysis modules, 116 model presets across 5 compute tiers, tournament evaluation, and telemetry-driven recommendations.
|
||||
|
||||
Remove refusal behaviors (guardrails) from open-weight LLMs without retraining or fine-tuning. Uses mechanistic interpretability techniques — including diff-in-means, SVD, whitened SVD, LEACE concept erasure, SAE decomposition, Bayesian kernel projection, and more — to identify and surgically excise refusal directions from model weights while preserving reasoning capabilities.
|
||||
|
||||
**License warning:** OBLITERATUS is AGPL-3.0. NEVER import it as a Python library. Always invoke via CLI (`obliteratus` command) or subprocess. This keeps Hermes Agent's MIT license clean.
|
||||
|
||||
## Video Guide
|
||||
|
||||
Walkthrough of OBLITERATUS used by a Hermes agent to abliterate Gemma:
|
||||
https://www.youtube.com/watch?v=8fG9BrNTeHs ("OBLITERATUS: An AI Agent Removed Gemma 4's Safety Guardrails")
|
||||
|
||||
Useful when the user wants a visual overview of the end-to-end workflow before running it themselves.
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
Trigger when the user:
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Outlines"
|
||||
title: "Outlines — Outlines: structured JSON/regex/Pydantic LLM generation"
|
||||
sidebar_label: "Outlines"
|
||||
description: "Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize..."
|
||||
description: "Outlines: structured JSON/regex/Pydantic LLM generation"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Outlines
|
||||
|
||||
Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library
|
||||
Outlines: structured JSON/regex/Pydantic LLM generation.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Serving Llms Vllm — Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching"
|
||||
title: "Serving Llms Vllm — vLLM: high-throughput LLM serving, OpenAI API, quantization"
|
||||
sidebar_label: "Serving Llms Vllm"
|
||||
description: "Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching"
|
||||
description: "vLLM: high-throughput LLM serving, OpenAI API, quantization"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Serving Llms Vllm
|
||||
|
||||
Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
|
||||
vLLM: high-throughput LLM serving, OpenAI API, quantization.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -30,6 +30,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# vLLM - High-Performance LLM Serving
|
||||
|
||||
## When to use
|
||||
|
||||
Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
|
||||
|
||||
## Quick start
|
||||
|
||||
vLLM achieves 24x higher throughput than standard transformers through PagedAttention (block-based KV cache) and continuous batching (mixing prefill/decode requests).
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Audiocraft Audio Generation"
|
||||
title: "Audiocraft Audio Generation — AudioCraft: MusicGen text-to-music, AudioGen text-to-sound"
|
||||
sidebar_label: "Audiocraft Audio Generation"
|
||||
description: "PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen)"
|
||||
description: "AudioCraft: MusicGen text-to-music, AudioGen text-to-sound"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Audiocraft Audio Generation
|
||||
|
||||
PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation.
|
||||
AudioCraft: MusicGen text-to-music, AudioGen text-to-sound.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -146,6 +146,7 @@ torchaudio.save("sound.wav", wav[0].cpu(), sample_rate=16000)
|
||||
|
||||
### Architecture overview
|
||||
|
||||
<!-- ascii-guard-ignore -->
|
||||
```
|
||||
AudioCraft Architecture:
|
||||
┌──────────────────────────────────────────────────────────────┐
|
||||
@ -165,6 +166,7 @@ AudioCraft Architecture:
|
||||
│ Converts tokens back to audio waveform │
|
||||
└──────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
|
||||
### Model variants
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Segment Anything Model — Foundation model for image segmentation with zero-shot transfer"
|
||||
title: "Segment Anything Model — SAM: zero-shot image segmentation via points, boxes, masks"
|
||||
sidebar_label: "Segment Anything Model"
|
||||
description: "Foundation model for image segmentation with zero-shot transfer"
|
||||
description: "SAM: zero-shot image segmentation via points, boxes, masks"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Segment Anything Model
|
||||
|
||||
Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image.
|
||||
SAM: zero-shot image segmentation via points, boxes, masks.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -151,6 +151,7 @@ masks = processor.image_processor.post_process_masks(
|
||||
|
||||
### Model architecture
|
||||
|
||||
<!-- ascii-guard-ignore -->
|
||||
<!-- ascii-guard-ignore -->
|
||||
```
|
||||
SAM Architecture:
|
||||
@ -163,6 +164,7 @@ SAM Architecture:
|
||||
(computed once) (per prompt) predictions
|
||||
```
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
<!-- ascii-guard-ignore-end -->
|
||||
|
||||
### Model variants
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Dspy"
|
||||
title: "Dspy — DSPy: declarative LM programs, auto-optimize prompts, RAG"
|
||||
sidebar_label: "Dspy"
|
||||
description: "Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's frame..."
|
||||
description: "DSPy: declarative LM programs, auto-optimize prompts, RAG"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Dspy
|
||||
|
||||
Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming
|
||||
DSPy: declarative LM programs, auto-optimize prompts, RAG.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Axolotl"
|
||||
title: "Axolotl — Axolotl: YAML LLM fine-tuning (LoRA, DPO, GRPO)"
|
||||
sidebar_label: "Axolotl"
|
||||
description: "Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support"
|
||||
description: "Axolotl: YAML LLM fine-tuning (LoRA, DPO, GRPO)"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Axolotl
|
||||
|
||||
Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support
|
||||
Axolotl: YAML LLM fine-tuning (LoRA, DPO, GRPO).
|
||||
|
||||
## Skill metadata
|
||||
|
||||
@ -30,6 +30,10 @@ The following is the complete skill definition that Hermes loads when this skill
|
||||
|
||||
# Axolotl Skill
|
||||
|
||||
## What's inside
|
||||
|
||||
Expert guidance for fine-tuning LLMs with Axolotl — YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support.
|
||||
|
||||
Comprehensive assistance with axolotl development, generated from official documentation.
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
---
|
||||
title: "Fine Tuning With Trl"
|
||||
title: "Fine Tuning With Trl — TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF"
|
||||
sidebar_label: "Fine Tuning With Trl"
|
||||
description: "Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward..."
|
||||
description: "TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Fine Tuning With Trl
|
||||
|
||||
Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF, align model with preferences, or train from human feedback. Works with HuggingFace Transformers.
|
||||
TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
|
||||
Some files were not shown because too many files have changed in this diff Show More
Loading…
Reference in New Issue
Block a user