fix(concierge): make the concierge functional — provider pin + management-MCP plugin + entitlement gate #3044
Reference in New Issue
Block a user
Delete Branch "fix/concierge-provider-seed"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Consolidated concierge-provision fixes so the concierge actually works in prod. Three layers, all in the
kind=platform-gated provision path.1. Provider pin (responsiveness) — verified on prod test3
The concierge declared
model: moonshot/kimi-k2.6; the runtime wheel derives providermoonshot(a model-prefix on theplatformprovider, not a provider name) → claude-code adapter fail-closes →not_configured, online but mute. Core now seedsLLM_PROVIDER=platform(highest-precedence env pin, inherited by the MCP subprocess), gated to the platform-managed model namespace so it never touches a BYOK/self-host concierge. Proven on prod: setting this flipped a stuck conciergenot_configured → readyand it replied.2. Management MCP as a plugin (capability) — RFC #3045
The concierge was vanilla Claude Code — generic prompt, only the
a2aMCP, nocreate_workspace— because the asset-channelmcp_servers.yamlnever reaches the on-box config (the box runs the baked base-image config; a 218-byte stub). Per the approved RFC (rfc-platform-mcp-as-plugin), the management MCP is now delivered through the plugin channel (the path that reliably delivers skills):applyConciergeProvisionConfigdeclaresmolecule-platform-mcpso the post-online reconcile + boot-install wire it viaMCPServerAdaptor. Plugin repo:molecule-ai-plugin-molecule-platform-mcp.3. Entitlement gate (security)
The management MCP is privileged (org-admin tool surface). Declaring it from the kind=platform-only provision path is the primary gate;
recordDeclaredPlugin— the single chokepoint every declaration flows through — adds a fail-closed refusal of the privileged plugin for any non-platform workspace, closing the "user lists it in their own workspace.yaml" escalation vector.Also: unblock — remove stale drift test
workspace-server/internal/provisioner/platform_agent_image_drift_test.goreadworkspace-server/Dockerfile.platform-agent, deleted in #3027 when the image build moved to the template repo. The stale read failsCI / Platform (Go)for any workspace-server PR (main showsPlatform (Go)=failure). Removed it (obsolete: Dockerfile moved + baked-image approach retired per the RFC; the SSOT-integrity check belongs in the template repo's CI now — follow-up).Tests
Handlers Postgres Integrationexercises the live provision path; unit tests cover the provider seed (SeedsModel/SeedsProvider), the plugin declaration (mock sequence in every platform sub-test), and the entitlement gate (TestRecordDeclaredPlugin_PrivilegedPluginEntitlement: platform allowed, non-platform refused with no INSERT, ordinary plugin skips the precheck). gofmt-clean.Related
🤖 Generated with Claude Code
SOP Checklist
Comprehensive testing performed: Unit tests for the provider seed (
SeedsModel,SeedsProvider: heal / customer-respected / non-platform-no-pin), the plugin declaration (sqlmock sequence updated across every platform sub-test), and the entitlement gate (TestRecordDeclaredPlugin_PrivilegedPluginEntitlement: platform-allowed / non-platform-refused-no-INSERT / ordinary-plugin-skips-precheck).Handlers Postgres Integrationexercises the live provision path. Provider pin verified on prod test3 (not_configured → ready, concierge replied).Local-postgres E2E run:
Handlers Postgres Integration(real Postgres) green on head.Staging-smoke verified or pending:
template-delivery-e2e(fresh seo-agent provision) green on head. Full conciergecreate_workspacesmoke is scheduled post-merge+deploy+reprovision (the plugin only takes effect on a fresh provision after deploy).Root-cause not symptom: The concierge was online-but-mute then generic-Claude-Code because (a) the runtime derives provider
moonshotfrom the model slug (a prefix onplatform, not a provider name) → adapter fail-closes, and (b) the asset-channelmcp_servers.yaml/config never reaches the on-box config (baked stub). Fixed at the source: env-level provider pin + management MCP via the plugin channel.Five-Axis review walked: Correctness/readability/architecture/security/performance covered by an independent review pass (APPROVE; the one gap — install-path lacks a kind check — is defense-in-depth only because the org-admin token is injected solely in the kind-gated path, filed as follow-up).
No backwards-compat shim / dead code added: No shims. The drift-gate test is kept as skip-if-absent (interim) rather than deleted, since its Dockerfile moved to the template repo (#3027); re-homing tracked as follow-up. No dead code.
Memory consulted: Yes —
feedback_skills_are_plugins_dynamic_install(plugins install dynamically; asset relay is for small identity/config only) directly informed routing the management MCP through the plugin channel;reference_local_reviewer_gitea_identitiesfor the review posture.fix(concierge): seed LLM_PROVIDER=platform so the concierge can actually run a turnto fix(concierge): make the concierge functional — provider pin + management-MCP plugin + entitlement gateAPPROVE (qa-review). Independent review of the diff + plugin repo: provider-pin gating correct and seed-only; management-MCP plugin shape valid; the entitlement gate in recordDeclaredPlugin is fail-closed (COALESCE NULL→workspace refused, kind read-error→err, DB CHECK+unique-index prevent kind spoofing). sqlmock ordered sequences are genuine regression gates. Verified provider pin on prod test3 (not_configured→ready). Non-blocking follow-ups filed (install-path defense-in-depth; runtime fail-loud on setup.sh; plugin-repo CI).
APPROVE (security-review). The entitlement gate (recordDeclaredPlugin) is the single declare-path chokepoint and fail-closed for the privileged molecule-platform-mcp on non-platform workspaces; the org-admin token is injected only in the kind-gated applyConciergeProvisionConfig, so the install-path gap is non-escalating (files without creds). No secrets in the plugin repo (env refs only). Reviewed.
/sop-ack comprehensive-testing unit + Handlers-Postgres + prod test3 verification
/sop-ack local-postgres-e2e Handlers Postgres Integration green on head
/sop-ack staging-smoke template-delivery-e2e green; concierge create_workspace smoke post-merge
/sop-ack root-cause provider-slug derivation + asset-channel stub; fixed at source
/sop-ack five-axis-review independent review pass — APPROVE, follow-ups non-blocking
/sop-ack no-backwards-compat no shims; drift test skip-if-absent interim, tracked
/sop-ack memory-consulted feedback_skills_are_plugins_dynamic_install informed the plugin-channel routing