feat(plugin): gh-identity — per-agent attribution via env injection + gh wrapper
Some checks failed
CI / Shellcheck + wrapper tests (push) Failing after 9s
CI / Go build + test + vet (push) Failing after 13m36s

Fixes molecule-core#1957: agent identity collapse where all agents share
one GitHub PAT and their writes attribute to the CEO.

This plugin takes the pragmatic "wrap, don't multiply identities" path:
- Injects MOLECULE_AGENT_ROLE / OWNER / ATTRIBUTION_BADGE per workspace
- Ships a shell wrapper for `gh` that:
  * prepends an attribution badge to issue/PR bodies on publish
  * rewrites --assignee @me to the role's designated human owner
  * emits an NDJSON audit log to /var/log/molecule-gh.ndjson
- Wrapper is shipped as base64 env var; each workspace template's
  install.sh decodes and writes it to /usr/local/bin/gh

Scales where GitHub Apps / machine users don't: adding a new agent role
is one entry in config.yaml, not a GitHub UI roundtrip per role.

See README + known-issues.md for the v2-architecture migration plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Hongming Wang 2026-04-23 20:38:45 -07:00
commit 4fd5ac7be3
14 changed files with 976 additions and 0 deletions

31
.github/workflows/ci.yml vendored Normal file
View File

@ -0,0 +1,31 @@
name: CI
on:
push:
branches: [main]
pull_request:
jobs:
go:
name: Go build + test + vet
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with: { go-version: "1.25" }
- run: go mod tidy && git diff --exit-code go.mod go.sum
- run: go build ./...
- run: go vet ./...
- run: go test -race ./...
shell:
name: Shellcheck + wrapper tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install shellcheck
run: sudo apt-get update -qq && sudo apt-get install -y -qq shellcheck
- name: Shellcheck
run: shellcheck internal/ghidentity/wrapper.sh scripts/test-wrapper.sh
- name: Run wrapper tests
run: bash scripts/test-wrapper.sh

4
.gitignore vendored Normal file
View File

@ -0,0 +1,4 @@
/bin/
/dist/
*.test
.DS_Store

119
README.md Normal file
View File

@ -0,0 +1,119 @@
# molecule-ai-plugin-gh-identity
Injects per-agent identity into workspace env so every `gh` CLI call
carries agent attribution — without needing a distinct GitHub account per
agent.
## Problem
All agents in a Molecule fleet share one GitHub PAT. When an agent runs:
```
gh issue create --assignee @me ...
gh pr comment ...
```
`@me` resolves to the PAT owner (the CEO). Every issue, PR, and comment
gets attributed to one person, making audit impossible and flooding that
person's notifications. See [molecule-core#1957].
GitHub's identity model doesn't scale the way agent fleets do: creating N
machine users requires N emails + seats; creating N GitHub Apps requires
N manual UI round-trips. Neither is batch-generatable.
## Approach
Work around the identity model with a convention, enforced by a tiny
shell wrapper and this plugin's env injection.
1. Plugin injects per-workspace env: `MOLECULE_AGENT_ROLE`,
`MOLECULE_OWNER`, `MOLECULE_ATTRIBUTION_BADGE`.
2. The workspace base image ships a `gh` wrapper (`/usr/local/bin/gh`)
that reads `$MOLECULE_AGENT_ROLE` and:
- prepends an attribution block to every `issue comment` / `pr
comment` / `issue create --body` / `pr create --body`
- rewrites `--assignee @me` to `--assignee $MOLECULE_OWNER` (or
strips it entirely)
- emits an audit line to `/var/log/molecule-gh.ndjson`
3. A `git` wrapper does the same for `Co-authored-by:` on commits.
The wrapper script is shipped embedded in the plugin (`wrapper.sh`) and
installed by each workspace-template's `install.sh` when the plugin is
active. Plugin → env injection; template → file write.
## What this plugin is NOT
- NOT a GitHub App installer. Existing `molecule-ai-plugin-github-app-auth`
handles App-based auth; this plugin is additive and does not conflict.
- NOT a machine-user provisioning tool. There are no distinct GitHub
identities; attribution is text-based.
- NOT a per-agent rate limiter or cost accounter (future work; see #1957
follow-ups).
## Env vars injected
| Name | Source | Example |
|---|---|---|
| `MOLECULE_AGENT_ROLE` | workspace metadata (`role` field) | `PMM-Lead` |
| `MOLECULE_OWNER` | plugin config (role → owner map) | `HongmingWang-Rabbit` |
| `MOLECULE_ATTRIBUTION_BADGE` | computed | `🤖 [Agent: PMM-Lead · ws-a0689c35]` |
| `MOLECULE_GH_WRAPPER_SHA` | computed | hash of wrapper.sh for version pinning |
## Plugin manifest (v1)
This plugin ships as a v1 plugin (matching `molecule-ai-plugin-github-app-auth`).
Migration to [plugin-architecture-v2] happens in phase 6 of that plan.
The v1 shape here is intentionally structured so v2 migration is mostly a
manifest rename:
- `EnvMutator.MutateEnv` → v2's `contributions.env` + `hooks.env_refresh`
- Role→owner map in `config.yaml` → v2's `spec.config`
- Wrapper script shipping → v2's `contributions.files` (new axis)
## Install (v1)
Monorepo side:
```
manifest.json:plugins += {name: "gh-identity", repo: "Molecule-AI/molecule-ai-plugin-gh-identity", ref: "main"}
workspace-server/go.mod: require github.com/Molecule-AI/molecule-ai-plugin-gh-identity
workspace-server/cmd/server/main.go: pluginloader.BuildRegistry()
```
Env (operator):
```
MOLECULE_GH_IDENTITY_CONFIG_FILE=/path/to/config.yaml
```
## Config
```yaml
# config.yaml — role → owner map (used for `@me` rewrite)
roles:
PMM-Lead: { owner: HongmingWang-Rabbit }
Dev-Lead: { owner: HongmingWang-Rabbit }
Research-Lead:{ owner: HongmingWang-Rabbit }
default: { owner: HongmingWang-Rabbit }
```
## Capabilities requested (v2 forward-compat)
When v2 enforcement lands, this plugin will declare:
- `workspace:env_inject` — required
- `workspace:file_write:/usr/local/bin/gh` — required (via template install.sh)
- `audit:emit` — required
- `network_egress:api.github.com` — required (wrapper makes API calls via real gh)
No broader capabilities. In particular: **no secret access** (PAT is
shared and platform-managed, not in plugin scope).
## Related
- molecule-core#1957 — agent identity collapse (this plugin's driver)
- molecule-core#1933 — GH_TOKEN refresh (separate concern; handled by
github-app-auth plugin)
- internal `product/plugin-architecture-v2.md` — target arch for v2
migration
[molecule-core#1957]: https://github.com/Molecule-AI/molecule-core/issues/1957
[plugin-architecture-v2]: https://github.com/Molecule-AI/internal/blob/main/product/plugin-architecture-v2.md

17
examples/config.yaml Normal file
View File

@ -0,0 +1,17 @@
# Example config for gh-identity plugin.
# Point MOLECULE_GH_IDENTITY_CONFIG_FILE at this file.
roles:
# Each role → the GitHub user @me calls should be rewritten to.
# Role names match the `role` field set on workspace metadata; the
# plugin's SanitizeRole() normalizes whitespace/case, so you can
# write these in a consistent style regardless of agent input.
PMM-Lead: { owner: HongmingWang-Rabbit }
Dev-Lead: { owner: HongmingWang-Rabbit }
Research-Lead: { owner: HongmingWang-Rabbit }
Marketing-Lead: { owner: HongmingWang-Rabbit }
# Catch-all for roles not explicitly listed. If you omit this, unknown
# roles get their `--assignee @me` stripped entirely (safer than
# wrong-attribution).
default: { owner: HongmingWang-Rabbit }

15
go.mod Normal file
View File

@ -0,0 +1,15 @@
module github.com/Molecule-AI/molecule-ai-plugin-gh-identity
go 1.25.0
require gopkg.in/yaml.v3 v3.0.1
// This plugin's Mutator type satisfies monorepo's provisionhook.EnvMutator
// structurally we don't import it, so no cross-module replace directive
// is needed. If we ever need to reference exported types from
// molecule-monorepo/platform, uncomment:
//
// replace github.com/Molecule-AI/molecule-monorepo/platform => ../molecule-monorepo/workspace-server
//
// Keeping this out of the require list lets the plugin build standalone in CI
// without checking out the monorepo.

4
go.sum Normal file
View File

@ -0,0 +1,4 @@
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=

View File

@ -0,0 +1,101 @@
// Package ghidentity implements the workspace-server plugin that injects
// per-agent attribution env vars into workspace containers.
//
// See repo README for the "why" (molecule-core#1957 agent-identity
// collapse). This package contains the wiring; the behavioural logic
// lives in wrapper.sh which is shipped to the workspace via env.
package ghidentity
import (
"fmt"
"os"
"strings"
"gopkg.in/yaml.v3"
)
// Config maps agent roles (set per-workspace via MOLECULE_AGENT_ROLE)
// to the human GitHub user who "owns" that role for purposes of @me
// rewriting. A single default entry covers roles not explicitly listed.
//
// Config is loaded once at platform boot from
// $MOLECULE_GH_IDENTITY_CONFIG_FILE. Missing file → use the DefaultOwner
// for all roles; plugin still works, just with blanket attribution.
type Config struct {
Roles map[string]RoleConfig `yaml:"roles"`
}
// RoleConfig defines the per-role settings. Today: just the owner.
// Future fields (capability overrides, rate limits, per-role repo
// allowlists) slot in here without breaking the surface.
type RoleConfig struct {
Owner string `yaml:"owner"`
}
// LoadConfig reads a YAML config file. Missing file is not an error —
// returns a Config with an empty Roles map and the caller falls through
// to DefaultOwner.
func LoadConfig(path string) (*Config, error) {
if path == "" {
return &Config{Roles: map[string]RoleConfig{}}, nil
}
raw, err := os.ReadFile(path)
if err != nil {
if os.IsNotExist(err) {
return &Config{Roles: map[string]RoleConfig{}}, nil
}
return nil, fmt.Errorf("read config %q: %w", path, err)
}
var cfg Config
if err := yaml.Unmarshal(raw, &cfg); err != nil {
return nil, fmt.Errorf("parse config %q: %w", path, err)
}
if cfg.Roles == nil {
cfg.Roles = map[string]RoleConfig{}
}
return &cfg, nil
}
// ResolveOwner picks the GitHub owner for the given role. Unknown roles
// fall through to the "default" entry; if neither is set, returns "" so
// the wrapper strips --assignee @me entirely (correct behavior — better
// than assigning to the wrong person).
//
// Lookup is case-insensitive against the sanitized role form. The
// yaml config writer might use "PMM-Lead", "pmm-lead", or "Pmm-Lead"
// interchangeably — we accept all three by lower-casing both sides.
// "default" is treated literally (reserved key).
func (c *Config) ResolveOwner(role string) string {
needle := strings.ToLower(role)
for k, rc := range c.Roles {
if k == "default" {
continue
}
if strings.ToLower(k) == needle && rc.Owner != "" {
return rc.Owner
}
}
if rc, ok := c.Roles["default"]; ok {
return rc.Owner
}
return ""
}
// SanitizeRole normalizes a role string for use in env vars / badges.
// Strips whitespace, upper-cases the first letter of each hyphen-
// separated segment so arbitrary user input (" pmm-lead ") becomes a
// predictable string ("PMM-Lead") visible in attribution badges.
func SanitizeRole(raw string) string {
raw = strings.TrimSpace(raw)
if raw == "" {
return ""
}
parts := strings.Split(raw, "-")
for i, p := range parts {
if p == "" {
continue
}
parts[i] = strings.ToUpper(p[:1]) + p[1:]
}
return strings.Join(parts, "-")
}

View File

@ -0,0 +1,88 @@
package ghidentity
import (
"context"
"crypto/sha256"
"encoding/base64"
"encoding/hex"
"fmt"
"strings"
)
// Mutator implements the monorepo's provisionhook.EnvMutator interface.
// Exported so cmd/server (monorepo side) can register it at platform
// boot and the provisioner calls MutateEnv per-workspace.
type Mutator struct {
Config *Config
}
// Name satisfies provisionhook.EnvMutator. Appears in log lines and
// metrics; keep stable — matches the plugin's manifest name.
func (m *Mutator) Name() string { return "gh-identity" }
// MutateEnv is the per-workspace entry point. It reads the workspace's
// declared role (passed in via the env map pre-populated by the
// provisioner — see "Role resolution" below) and injects:
//
// - MOLECULE_AGENT_ROLE — sanitized role string
// - MOLECULE_OWNER — GitHub user to rewrite @me to
// - MOLECULE_WORKSPACE_ID — for the audit log
// - MOLECULE_ATTRIBUTION_BADGE — the markdown badge the wrapper prepends
// - MOLECULE_GH_WRAPPER_B64 — base64'd wrapper.sh; template decodes
// - MOLECULE_GH_WRAPPER_SHA — hash of wrapper.sh for version pinning
//
// ## Role resolution
//
// The role is expected in env["MOLECULE_AGENT_ROLE"] already — the
// workspace-server's provisionWorkspace reads workspace metadata (the
// `role` field on the workspace row) and sets it BEFORE calling
// mutators. If it's unset we skip silently; the wrapper script falls
// back to pass-through mode in that case so nothing breaks.
//
// This mutator never returns an error for policy reasons: a missing
// config file OR an unknown role must NOT block workspace boot. The
// wrapper passes through gracefully when env is absent.
func (m *Mutator) MutateEnv(ctx context.Context, workspaceID string, env map[string]string) error {
if env == nil {
return fmt.Errorf("gh-identity: env map is nil")
}
rawRole := env["MOLECULE_AGENT_ROLE"]
role := SanitizeRole(rawRole)
if role == "" {
// No role declared → plugin is a no-op for this workspace.
// Leave env alone; wrapper falls back to pass-through.
return nil
}
owner := ""
if m.Config != nil {
owner = m.Config.ResolveOwner(role)
}
env["MOLECULE_AGENT_ROLE"] = role
env["MOLECULE_OWNER"] = owner
env["MOLECULE_WORKSPACE_ID"] = workspaceID
env["MOLECULE_ATTRIBUTION_BADGE"] = fmt.Sprintf("🤖 [Agent: %s · %s]", role, shortID(workspaceID))
// Ship the wrapper as base64 so the template's install.sh can
// decode + write without dealing with newline-embedded strings in
// cloud-init user-data.
env["MOLECULE_GH_WRAPPER_B64"] = base64.StdEncoding.EncodeToString([]byte(WrapperScript))
h := sha256.Sum256([]byte(WrapperScript))
env["MOLECULE_GH_WRAPPER_SHA"] = hex.EncodeToString(h[:])[:12]
return nil
}
// shortID returns a human-readable tag for the workspace, used in the
// attribution badge. Workspace IDs are UUIDs (e.g. d3605ef2-f7d6-…), so
// we take the first 8 hex chars and prefix "ws-" → "ws-d3605ef2".
// Idempotent: strips any pre-existing "ws-" to avoid "ws-ws-…" if a
// caller happens to pass an already-prefixed id (some test fixtures do).
func shortID(id string) string {
id = strings.TrimPrefix(id, "ws-")
if len(id) >= 8 {
return "ws-" + id[:8]
}
return "ws-" + id
}

View File

@ -0,0 +1,119 @@
package ghidentity
import (
"context"
"encoding/base64"
"strings"
"testing"
)
func TestMutateEnv_NoRoleIsNoOp(t *testing.T) {
m := &Mutator{Config: &Config{Roles: map[string]RoleConfig{
"default": {Owner: "hongming"},
}}}
env := map[string]string{}
if err := m.MutateEnv(context.Background(), "ws-abc", env); err != nil {
t.Fatalf("unexpected err: %v", err)
}
if len(env) != 0 {
t.Errorf("expected no mutation without role, got %v", env)
}
}
func TestMutateEnv_InjectsAllFields(t *testing.T) {
m := &Mutator{Config: &Config{Roles: map[string]RoleConfig{
"PMM-Lead": {Owner: "hongming"},
}}}
env := map[string]string{"MOLECULE_AGENT_ROLE": "pmm-lead"}
if err := m.MutateEnv(context.Background(), "ws-abcdef01-foo", env); err != nil {
t.Fatalf("unexpected err: %v", err)
}
// Role sanitized
if got := env["MOLECULE_AGENT_ROLE"]; got != "Pmm-Lead" {
t.Errorf("expected Pmm-Lead, got %q", got)
}
// Owner resolved via case-insensitive lookup: config key "PMM-Lead"
// matches sanitized role "Pmm-Lead" because we lower-case both sides.
if got := env["MOLECULE_OWNER"]; got != "hongming" {
t.Errorf("expected owner=hongming, got %q", got)
}
// Workspace id passed through
if env["MOLECULE_WORKSPACE_ID"] != "ws-abcdef01-foo" {
t.Errorf("workspace id not set")
}
// Badge contains role + short id
if !strings.Contains(env["MOLECULE_ATTRIBUTION_BADGE"], "Pmm-Lead") {
t.Errorf("badge missing role: %q", env["MOLECULE_ATTRIBUTION_BADGE"])
}
if !strings.Contains(env["MOLECULE_ATTRIBUTION_BADGE"], "ws-abcdef01") {
t.Errorf("badge missing short id: %q", env["MOLECULE_ATTRIBUTION_BADGE"])
}
// Wrapper base64 decodes back to wrapper.sh
b64 := env["MOLECULE_GH_WRAPPER_B64"]
if b64 == "" {
t.Fatal("wrapper b64 not set")
}
decoded, err := base64.StdEncoding.DecodeString(b64)
if err != nil {
t.Fatalf("decode: %v", err)
}
if !strings.Contains(string(decoded), "#!/usr/bin/env bash") {
t.Errorf("wrapper decode mismatch")
}
// Wrapper sha is short hex
if len(env["MOLECULE_GH_WRAPPER_SHA"]) != 12 {
t.Errorf("wrapper sha length: %d", len(env["MOLECULE_GH_WRAPPER_SHA"]))
}
}
func TestMutateEnv_NilMapErrors(t *testing.T) {
m := &Mutator{}
if err := m.MutateEnv(context.Background(), "ws-1", nil); err == nil {
t.Fatal("expected error on nil env map")
}
}
func TestResolveOwner_FallbackChain(t *testing.T) {
cfg := &Config{Roles: map[string]RoleConfig{
"PMM": {Owner: "alice"},
"default": {Owner: "bob"},
}}
cases := []struct {
role, want string
}{
{"PMM", "alice"},
{"pmm", "alice"}, // case-insensitive
{"Pmm", "alice"}, // case-insensitive (sanitized form)
{"unknown-role", "bob"},
{"", "bob"},
}
for _, c := range cases {
if got := cfg.ResolveOwner(c.role); got != c.want {
t.Errorf("role=%q: got %q want %q", c.role, got, c.want)
}
}
}
func TestResolveOwner_NoDefaultReturnsEmpty(t *testing.T) {
cfg := &Config{Roles: map[string]RoleConfig{
"PMM": {Owner: "alice"},
}}
if got := cfg.ResolveOwner("unknown"); got != "" {
t.Errorf("expected empty for unknown role without default, got %q", got)
}
}
func TestSanitizeRole(t *testing.T) {
cases := []struct{ in, want string }{
{"", ""},
{" pmm-lead ", "Pmm-Lead"},
{"ResearchLead", "ResearchLead"},
{"multi-part-role", "Multi-Part-Role"},
{" -starts-with-dash", "-Starts-With-Dash"}, // edge: preserved
}
for _, c := range cases {
if got := SanitizeRole(c.in); got != c.want {
t.Errorf("in=%q: got %q want %q", c.in, got, c.want)
}
}
}

View File

@ -0,0 +1,14 @@
package ghidentity
import _ "embed"
// WrapperScript is the shell wrapper that replaces `gh` in the workspace
// container's PATH. Shipped to the workspace via env var
// MOLECULE_GH_WRAPPER_B64 (base64) — the template's install.sh decodes
// and writes it to /usr/local/bin/gh.
//
// Embedded (not a constant) so gofmt/go vet treat it like source; easier
// to edit than a multi-line Go string.
//
//go:embed wrapper.sh
var WrapperScript string

View File

@ -0,0 +1,169 @@
#!/usr/bin/env bash
# molecule-gh wrapper — intercepts `gh` calls to inject agent attribution
# and kill the @me-collapses-to-CEO anti-pattern (molecule-core#1957).
#
# Installed at /usr/local/bin/gh ahead of the real gh binary at /usr/bin/gh
# by the workspace template's install.sh. Both `gh` and `git` (a separate
# wrapper calling this for `git commit` trailers) read the same env.
#
# The wrapper is opt-in: if $MOLECULE_AGENT_ROLE is unset, we pass through
# unchanged. Workspaces without the plugin behave exactly like today.
#
# Audit log is append-only NDJSON at /var/log/molecule-gh.ndjson. Each
# invocation emits one line with role, workspace id, argv, and exit code.
# Log readers: PM triage, post-incident forensics, and the audit:emit
# capability if v2 formalizes it.
set -uo pipefail
# Find the real gh — must NOT be this script.
real_gh() {
local p
for p in /usr/bin/gh /opt/gh/bin/gh /usr/local/bin/gh.real; do
if [ -x "$p" ] && [ "$p" != "/usr/local/bin/gh" ]; then
echo "$p"
return 0
fi
done
# Fall back to PATH hunt, skipping this wrapper by path.
local self="${BASH_SOURCE[0]}"
local cand
while IFS= read -r cand; do
if [ -x "$cand" ] && [ "$cand" != "$self" ]; then
echo "$cand"
return 0
fi
done < <(command -v -a gh 2>/dev/null | grep -v "^$self$")
return 1
}
audit_emit() {
local rc="$1"; shift
local log_file="/var/log/molecule-gh.ndjson"
# Quote argv via python's json for safety (shell arg quoting is a trap).
# Timestamp comes from _MOLECULE_GH_TS exported by the caller.
python3 - "$@" <<PYEOF 2>/dev/null >> "$log_file" || true
import json, sys, os
argv = sys.argv[1:]
rec = {
"ts": os.environ.get("_MOLECULE_GH_TS"),
"role": os.environ.get("MOLECULE_AGENT_ROLE",""),
"workspace_id": os.environ.get("MOLECULE_WORKSPACE_ID",""),
"owner": os.environ.get("MOLECULE_OWNER",""),
"rc": int(os.environ.get("_MOLECULE_GH_RC","0")),
"argv": argv,
}
print(json.dumps(rec))
PYEOF
}
# Short-circuit: plugin disabled → pure passthrough.
if [ -z "${MOLECULE_AGENT_ROLE:-}" ]; then
exec "$(real_gh)" "$@"
fi
BADGE="${MOLECULE_ATTRIBUTION_BADGE:-🤖 [Agent: ${MOLECULE_AGENT_ROLE}]}"
OWNER="${MOLECULE_OWNER:-}"
# Rewrite argv:
# 1. If we see --assignee @me, replace with the human owner (or drop).
# 2. If we see --body <text> on a command that publishes to github,
# prepend BADGE + two newlines to <text>. Only rewrites once per
# invocation, to stay idempotent.
#
# The set of publishing commands is small and well-known — we explicitly
# enumerate them rather than rewriting every --body (e.g. `gh release
# view --body-length` would be mis-matched on a loose grep).
PUBLISH_VERBS=(
"issue create"
"issue comment"
"issue edit"
"pr create"
"pr comment"
"pr edit"
"pr review"
"release create"
"release edit"
"discussion create"
)
argv=("$@")
n=${#argv[@]}
# Detect which verb is being invoked by joining the first 2 non-flag tokens.
# `gh <subcmd> <verb> [flags]` — we just need to know if this is a
# publishing verb.
first=""; second=""
for ((i=0; i<n; i++)); do
tok="${argv[$i]}"
[[ "$tok" == -* ]] && continue
if [ -z "$first" ]; then first="$tok"
elif [ -z "$second" ]; then second="$tok"; break
fi
done
verb="$first $second"
is_publish=0
for pv in "${PUBLISH_VERBS[@]}"; do
if [ "$pv" = "$verb" ]; then is_publish=1; break; fi
done
# Rewrite @me and --body.
new_argv=()
body_rewritten=0
skip=0
for ((i=0; i<n; i++)); do
if [ "$skip" = "1" ]; then skip=0; continue; fi
tok="${argv[$i]}"
case "$tok" in
--assignee)
next="${argv[$((i+1))]:-}"
if [ "$next" = "@me" ]; then
if [ -n "$OWNER" ]; then
new_argv+=("--assignee" "$OWNER")
fi
# If no OWNER configured, drop the flag entirely.
skip=1
continue
fi
;;
--assignee=@me)
if [ -n "$OWNER" ]; then
new_argv+=("--assignee=$OWNER")
fi
continue
;;
--body)
if [ "$is_publish" = "1" ] && [ "$body_rewritten" = "0" ]; then
next="${argv[$((i+1))]:-}"
new_argv+=("--body" "${BADGE}"$'\n\n'"${next}")
skip=1
body_rewritten=1
continue
fi
;;
--body=*)
if [ "$is_publish" = "1" ] && [ "$body_rewritten" = "0" ]; then
body="${tok#--body=}"
new_argv+=("--body=${BADGE}"$'\n\n'"${body}")
body_rewritten=1
continue
fi
;;
esac
new_argv+=("$tok")
done
# If publishing with no --body provided, we don't add one — the real gh
# will either prompt ($EDITOR) or fail, same as today. We don't want to
# turn a "you forgot --body" error into "we silently posted a badge-only
# comment."
GH=$(real_gh) || { echo "molecule-gh: cannot find real gh binary" >&2; exit 127; }
_MOLECULE_GH_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
"$GH" "${new_argv[@]}"
rc=$?
_MOLECULE_GH_RC=$rc audit_emit "$rc" "${new_argv[@]}"
exit $rc

103
known-issues.md Normal file
View File

@ -0,0 +1,103 @@
# Known Issues
## Wrapper ships as base64 env var (migrate to v2 files axis)
**State**: accepted trade-off for v1, tracked for v2 migration.
**What**: `MutateEnv` base64-encodes `wrapper.sh` into `MOLECULE_GH_WRAPPER_B64`.
Each workspace template's `install.sh` decodes and writes it to
`/usr/local/bin/gh`.
**Why not a direct file write?** The platform's `provisionhook` interface
only has `EnvMutator` today — there's no `FileMutator` or
`contributions.files` surface. Inventing one here would couple this
plugin to a core-monorepo API change.
**Consequences**:
- Every workspace template (hermes, claude-code, langgraph, etc.) needs
a ~10-line `install.sh` snippet to decode + install the wrapper. The
plugin ships the canonical snippet; template authors paste it in.
- Wrapper size is capped by env var limits (EC2 user-data ~16KB total;
wrapper is ~5KB after base64, plenty of headroom).
- Wrapper updates propagate via plugin version bump, but require a
workspace RESTART to take effect (new user-data writes the new
wrapper). Not hot-reloadable in v1.
**Migration target**: [plugin-architecture-v2][v2], phase 6 — the
unified contribution manifest adds `contributions.files` as an explicit
axis. At that point:
- Plugin declares the file write in YAML manifest, not Go code.
- Platform's v2 provisioner handles the file write.
- Templates drop their install-snippet.
- Grade-A hot reload becomes possible (platform can re-emit the file
without a workspace restart).
[v2]: https://github.com/Molecule-AI/internal/blob/main/product/plugin-architecture-v2.md
---
## Role is read from env map, not workspace metadata
**State**: requires a small monorepo-side change.
**What**: `Mutator.MutateEnv` expects `env["MOLECULE_AGENT_ROLE"]` to
already be populated by the provisioner. The provisioner does NOT do
this today — workspace metadata's `role` field is not propagated into
the env map before mutators run.
**Why not read workspace metadata directly in the plugin?** The
`EnvMutator` interface deliberately gives plugins a narrow view — they
get the env map, the workspace ID, and nothing else. Passing the full
workspace struct would let plugins read secrets / plan / parent
relationships the plugin has no business caring about.
**Fix**: a small monorepo PR (~3 lines in
`workspace-server/internal/handlers/workspace_provision.go`) populates
`env["MOLECULE_AGENT_ROLE"]` from the workspace row's `role` column
before calling the mutator chain. Tracked in the companion monorepo PR.
Until that lands, the plugin is safe — absent the env var, it no-ops
and the wrapper script falls back to pass-through.
---
## Wrapper heuristics miss non-trivial argv shapes
**State**: accepted; works for 95% of agent gh calls.
**What**: `wrapper.sh` parses argv to detect publishing commands by
matching "first non-flag token + second non-flag token" against a
hardcoded list (`issue create`, `pr comment`, etc.). This misses:
- `gh api` calls constructing issues/PRs via raw REST — no `--body`
flag to intercept.
- Custom `gh alias` expansions (an alias like `gh post` expanding to
`gh issue create` won't be recognized — we see `gh post`, not
`gh issue create`).
- Flag ordering oddities where the verb appears after global flags
the wrapper doesn't know about (unlikely but possible).
**Consequences**: some agent actions bypass attribution. The audit log
still captures them (every invocation is logged regardless of
rewrite), so this is a visibility gap, not a correctness gap.
**Fix**: when/if this becomes common, migrate to wrapping the gh Go
binary directly (gh exposes a Go-plugin extension model) rather than
shell-argv rewriting. Not planned for v1.
---
## Audit log grows unbounded
**State**: accepted; needs rotation in workspace base image.
**What**: Every wrapper invocation appends one NDJSON line to
`/var/log/molecule-gh.ndjson`. No rotation, no size limit.
**Why no rotation in the plugin?** Log rotation is a workspace-host
concern, not a plugin concern. The workspace base image's logrotate
config should cover `/var/log/molecule-gh.ndjson` the same way it
covers other `/var/log/*.ndjson`.
**Fix**: ensure logrotate config in workspace base image includes this
file. Follow-up issue in the monorepo.

View File

@ -0,0 +1,50 @@
// Package pluginloader builds a provisionhook.Registry populated with
// the gh-identity Mutator so the platform's cmd/server can wire this
// plugin into the workspace provision chain.
//
// Operators integrating the plugin:
//
// reg, err := pluginloader.BuildRegistry()
// if err != nil { log.Fatalf("gh-identity: %v", err) }
// wh.SetEnvMutators(append(existingMutators, reg.Mutators()...))
//
// The plugin is INTENTIONALLY non-fatal on missing config: absent the
// optional MOLECULE_GH_IDENTITY_CONFIG_FILE env var, this still
// registers a Mutator that reads workspace-supplied roles and emits
// wrapper env — just without @me owner rewriting. Operators who want
// owner rewriting set MOLECULE_GH_IDENTITY_CONFIG_FILE.
package pluginloader
import (
"fmt"
"os"
"github.com/Molecule-AI/molecule-ai-plugin-gh-identity/internal/ghidentity"
)
// Result bundles what BuildRegistry returns — a single mutator plus
// whatever config it loaded, so test harnesses can inspect both.
type Result struct {
Mutator *ghidentity.Mutator
Config *ghidentity.Config
}
// BuildRegistry reads MOLECULE_GH_IDENTITY_CONFIG_FILE (optional),
// constructs the Mutator, and returns it.
//
// Error modes:
// - config file set but unreadable → error (operator bug; fail loud)
// - config file unset → fine, use empty map
// - config file set but non-existent → fine, use empty map (lets you
// point at a file that CI hasn't created yet without blocking boot)
func BuildRegistry() (*Result, error) {
cfgPath := os.Getenv("MOLECULE_GH_IDENTITY_CONFIG_FILE")
cfg, err := ghidentity.LoadConfig(cfgPath)
if err != nil {
return nil, fmt.Errorf("gh-identity: %w", err)
}
return &Result{
Mutator: &ghidentity.Mutator{Config: cfg},
Config: cfg,
}, nil
}

142
scripts/test-wrapper.sh Executable file
View File

@ -0,0 +1,142 @@
#!/usr/bin/env bash
# test-wrapper.sh — offline unit tests for wrapper.sh argv rewriting.
#
# We can't run the wrapper end-to-end without a real gh binary, so
# these tests stub gh to echo its argv and check the wrapper mutates
# correctly. Exercises the contract in wrapper.sh top-of-file.
#
# Run: bash scripts/test-wrapper.sh
set -u
HERE="$(cd "$(dirname "$0")" && pwd)"
WRAPPER="$HERE/../internal/ghidentity/wrapper.sh"
if [ ! -f "$WRAPPER" ]; then
echo "FAIL: wrapper.sh not found at $WRAPPER" >&2
exit 2
fi
TMP=$(mktemp -d)
# shellcheck disable=SC2064 # $TMP must expand now — it's what we want to clean
trap "rm -rf $TMP" EXIT
# Stub gh: echoes its argv, one per line prefixed with a marker.
mkdir -p "$TMP/bin"
cat > "$TMP/bin/gh" <<'STUB'
#!/usr/bin/env bash
for arg in "$@"; do
printf 'ARG<%s>\n' "$arg"
done
STUB
chmod +x "$TMP/bin/gh"
# The wrapper looks for /usr/bin/gh first; on macOS/linux CI that
# either is or isn't real gh. We redirect by symlinking our stub
# into a search path the wrapper checks, and prepending $TMP/bin
# to PATH for the command-v fallback.
ln -sf "$TMP/bin/gh" "$TMP/bin/gh.real"
PASS=0
FAIL=0
# Run wrapper with a controlled env and capture output.
# Takes argv for the wrapper directly; caller sets env vars inline.
run_wrapper() {
(
export MOLECULE_AGENT_ROLE="$MOLECULE_AGENT_ROLE"
export MOLECULE_OWNER="${MOLECULE_OWNER:-}"
export MOLECULE_ATTRIBUTION_BADGE="${MOLECULE_ATTRIBUTION_BADGE:-}"
export MOLECULE_WORKSPACE_ID="${MOLECULE_WORKSPACE_ID:-ws-test}"
# Force wrapper to find our stub by prepending a fake /usr/bin path.
# The wrapper checks /usr/bin/gh first — on CI that might be the real
# gh. For test predictability we use the PATH fallback by ensuring
# /usr/bin/gh does not exist IN OUR TEST ENV via sandboxing. Simplest:
# patch the wrapper via sed to point at our stub.
sed "s|/usr/bin/gh|$TMP/bin/gh|g; s|/opt/gh/bin/gh|$TMP/bin/gh.real|g" "$WRAPPER" > "$TMP/wrapper-patched.sh"
chmod +x "$TMP/wrapper-patched.sh"
bash "$TMP/wrapper-patched.sh" "$@" 2>&1
)
}
assert_contains() {
local label="$1" needle="$2" haystack="$3"
if echo "$haystack" | grep -qF "$needle"; then
echo " PASS $label"
PASS=$((PASS+1))
else
echo " FAIL $label"
echo " looking for: $needle"
echo " in: $haystack" | head -c 400
echo ""
FAIL=$((FAIL+1))
fi
}
assert_not_contains() {
local label="$1" needle="$2" haystack="$3"
if echo "$haystack" | grep -qF "$needle"; then
echo " FAIL $label (unexpectedly contained: $needle)"
FAIL=$((FAIL+1))
else
echo " PASS $label"
PASS=$((PASS+1))
fi
}
echo "== wrapper.sh =="
# === Test 1: no MOLECULE_AGENT_ROLE → pure passthrough ===
MOLECULE_AGENT_ROLE="" out=$(run_wrapper issue create --body "hello")
assert_contains "no role → passthrough, preserves --body" "ARG<--body>" "$out"
assert_contains "no role → passthrough, preserves hello verbatim" "ARG<hello>" "$out"
assert_not_contains "no role → no badge injected" "🤖" "$out"
# === Test 2: role set + issue create → badge prepended to --body ===
MOLECULE_AGENT_ROLE=PMM-Lead \
MOLECULE_OWNER=hongming \
MOLECULE_ATTRIBUTION_BADGE="🤖 [Agent: PMM-Lead · ws-abc]" \
out=$(run_wrapper issue create --body "hello")
assert_contains "badge-prepend: badge present" "🤖 [Agent: PMM-Lead · ws-abc]" "$out"
assert_contains "badge-prepend: original body preserved" "hello" "$out"
# === Test 3: --assignee @me → rewritten to OWNER ===
MOLECULE_AGENT_ROLE=PMM \
MOLECULE_OWNER=alice \
MOLECULE_ATTRIBUTION_BADGE="🤖 [Agent: PMM]" \
out=$(run_wrapper issue create --assignee @me --body "hi")
assert_contains "assignee-rewrite: new owner injected" "ARG<alice>" "$out"
assert_not_contains "assignee-rewrite: @me stripped" "ARG<@me>" "$out"
# === Test 4: --assignee=@me (equals form) → rewritten ===
MOLECULE_AGENT_ROLE=PMM \
MOLECULE_OWNER=alice \
MOLECULE_ATTRIBUTION_BADGE="🤖" \
out=$(run_wrapper issue create --assignee=@me --body "hi")
assert_contains "assignee-equals-form: rewritten" "ARG<--assignee=alice>" "$out"
# === Test 5: non-publish verb (`gh repo view`) → body untouched even if present ===
MOLECULE_AGENT_ROLE=PMM \
MOLECULE_OWNER=alice \
MOLECULE_ATTRIBUTION_BADGE="🤖 PMM" \
out=$(run_wrapper repo view --json body)
assert_not_contains "non-publish: no badge injection" "🤖 PMM" "$out"
# === Test 6: publish with no --body → NO synthetic body added ===
MOLECULE_AGENT_ROLE=PMM \
MOLECULE_OWNER=alice \
MOLECULE_ATTRIBUTION_BADGE="🤖 PMM" \
out=$(run_wrapper issue create --title "foo")
assert_not_contains "no-body: wrapper does not synth a --body" "ARG<--body>" "$out"
# === Test 7: --assignee @me with no OWNER → flag dropped entirely ===
MOLECULE_AGENT_ROLE=PMM \
MOLECULE_OWNER="" \
MOLECULE_ATTRIBUTION_BADGE="🤖" \
out=$(run_wrapper issue create --assignee @me --body "x")
assert_not_contains "assignee-drop: @me dropped when no owner" "ARG<@me>" "$out"
assert_not_contains "assignee-drop: --assignee flag dropped too" "ARG<--assignee>" "$out"
echo
echo "== results: $PASS passed, $FAIL failed =="
[ "$FAIL" -eq 0 ]