molecule-core

Author	SHA1	Message	Date
claude-ceo-assistant	29da0882a7	Merge branch 'main' into fix/175-env-matched-pair-guard	2026-05-08 02:46:40 +00:00
claude-ceo-assistant	8798b316f6	chore: promote accumulated staging fixes to main (vitest, postgres, e2e-api, eic, plugins) (#99) Brings staging tip `b11044f8` to main. Includes #97 (vitest) + #98 (handlers-postgres) + #100 (e2e-api) + EIC race fix + #84 (SaaS plugin EIC) + accumulated staging commits. Triggers Vercel + Railway + ECR production deploy chain. Approved by security-auditor.	2026-05-08 02:46:39 +00:00
claude-ceo-assistant	b11044f885	fix(plugins): SaaS (EC2-per-workspace) install/uninstall via EIC SSH (#84) Closes docker-only row in backends.md. Approved by security-auditor.	2026-05-08 02:15:49 +00:00
claude-ceo-assistant	d201b13b93	Merge branch 'staging' into fix/saas-plugin-install-eic	2026-05-08 02:03:53 +00:00
claude-ceo-assistant	a4ab623bbf	fix(ci): e2e-api — parallel-safe postgres/redis containers (#100) Closes #94. Mirrors PR #98 pattern. Approved by security-auditor.	2026-05-08 02:02:57 +00:00
devops-engineer	b9d2786f45	fix(ci): e2e-api — parallel-safe postgres/redis containers + provisioner setup Class B Hongming-owned CICD red sweep, e2e-api leg. Same substrate hazard as PR #98 (handlers-postgres-integration) — Gitea act_runner configures `container.network: host` operator-wide, so: * Two concurrent e2e-api runs both attempted to bind `-p 15432:5432` and `-p 16379:6379` on the operator host. Verified in run a7/2727 on 2026-05-07: `docker: Error response from daemon: Conflict. The container name "/molecule-ci-redis" is already in use by container af10f438...` — exit 125, job fails before any test runs. * Hardcoded container names `molecule-ci-postgres` / `-redis` plus the leading `docker rm -f` step meant a second job's startup also KILLED the first job's still-running services. Fix shape (mirrors PR #98 bridge-net pattern, adapted because the platform-server is a Go binary on the host, not a containerised step): 1. Per-run unique container names: `pg-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`, `redis-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`. Unique even across reruns of the same run_id. 2. Ephemeral host port per run via `-p 0:5432` / `-p 0:6379` and `docker port` lookup, exported as `DATABASE_URL` / `REDIS_URL` to `$GITHUB_ENV`. No fixed host-port → no collision. 3. `127.0.0.1` (NOT `localhost`) in URLs — IPv6 first-resolve flake fixed in #92 stays fixed. 4. `if: always()` cleanup so containers don't leak when test steps fail. Issue #94 items #2 + #3 also addressed: * Pre-pull `alpine:latest` (provisioner uses it for ephemeral token-write containers in `internal/handlers/container_files.go`). * Idempotent `docker network create molecule-monorepo-net` (the provisioner attaches workspace containers via that bridge — `internal/provisioner/provisioner.go::DefaultNetwork`). Issue #94 item #1 (timeouts) NOT bumped — recent log evidence shows postgres ready in 3s, redis in 1s, platform in 1s when they DO come up. Timeouts are not the bottleneck on the current substrate. NOT addressed here (out of scope, separate change required): * `Run E2E API tests` step has been failing on `Status back online` because the platform's langgraph workspace template image (`ghcr.io/molecule-ai/workspace-template-langgraph:latest`) returns 403 Forbidden post-2026-05-06 GitHub org suspension. That is a template-registry resolution issue (ADR-002 / local-build mode) and belongs in a workspace-server change, not this workflow file. This PR fixes the parallel-collision class and the workflow setup hygiene; the langgraph-403 failure will still surface on runs after this lands until template resolution is fixed separately. Verified manually on operator host 2026-05-08: docker now hands out ephemeral ports on `-p 0:5432`, two parallel runs land on different ports, both reach pg_isready GREEN. Closes #94 (items #2 and #3; item #1 documented as not-bottleneck; langgraph-template-403 referenced for follow-up). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:59:56 -07:00
claude-ceo-assistant	576166c8c3	Merge branch 'staging' into fix/saas-plugin-install-eic	2026-05-08 01:29:11 +00:00
claude-ceo-assistant	8a3141a763	fix(ci): handlers-postgres — sidestep port collision under host-network runner (#98) Switches from services: block to --network molecule-monorepo-net with unique per-run container names. Avoids port-5432 collision when parallel Handlers-Postgres jobs run on host-network act_runner. Approved by security-auditor.	2026-05-08 01:29:06 +00:00
claude-ceo-assistant	78f77532ea	Merge branch 'main' into fix/175-env-matched-pair-guard	2026-05-08 01:27:44 +00:00
claude-ceo-assistant	dccd1aa1ba	fix(canvas-tests): bump vitest testTimeout to 30000ms on CI for cold-start overhead (#97) Closes molecule-core#96. Unblocks Canvas (Next.js) on PRs #82/#81/#54/#53 after rebase. Approved by security-auditor.	2026-05-08 01:27:43 +00:00
devops-engineer	a302d75129	chore(ci): retrigger Handlers Postgres Integration for second-green proof Class B verification — second consecutive green run to demonstrate the fix isn't flaky. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:23:05 -07:00
devops-engineer	241859b552	fix(ci): handlers-postgres — sidestep port collision under host-network runner Class B Hongming-owned CICD red sweep. The Handlers Postgres Integration workflow has been silently failing on staging push and PRs ever since #92 fixed the IPv6 flake — the IPv6 fix correctly pinned 127.0.0.1, but unmasked a deeper issue: with our act_runner global container.network=host config, multiple concurrent runs of this workflow each tried to bind 0.0.0.0:5432 on the operator host. The first wins; subsequent postgres service containers exit with `FATAL: could not create any TCP/IP sockets` + `Address in use`. Docker auto-removes them (act_runner sets AutoRemove:true), so by the time `Apply migrations` runs `psql`, the container is gone — Connection refused, then `failed to remove container: No such container` at cleanup time. Per-job container.network override is silently ignored by act_runner (`--network and --net in the options will be ignored.`), so we sidestep `services:` entirely. The job container still uses host-net (required for cache server discovery on the operator's bridge IP). We launch a sibling postgres on the existing molecule-monorepo-net bridge with a unique name per run (run_id+run_attempt) and connect via the bridge IP read from `docker inspect`. Verified manually on operator host 2026-05-08: 2× postgres on host-net collides, but on the bridge with unique names + different IPs, both succeed and each is reachable from a host-net job container. Adds: - always()-cleanup step so containers don't leak on test failure - Diagnostic dump now includes the postgres container's docker logs - Runbook at docs/runbooks/ documenting the substrate behavior + the pattern future workflows should adopt for any `services:`-shaped need. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:21:12 -07:00
devops-engineer	da1a5af7a4	fix(canvas): bump vitest testTimeout to 30s on CI for v8-coverage cold start (#96) Class A red sweep — 3 first-tests timing out at the 5000ms default on the self-hosted Gitea Actions Docker runner across 4 unrelated PRs (#82, #81, #54, #53). The PRs share zero canvas/ surface — same 3 tests, same cold-start signature, same shape on every run. Root cause: `npx vitest run --coverage` cold-start cost (v8 coverage instrumentation init + JSDOM bootstrap + heavy @/components/* and @/lib/* import + first React render) consumes 5-7 seconds for the first synchronous test in a heavyweight test file. Empirically: ActivityTab "renders all 7 filter options" 5230ms (FAIL) CreateWorkspaceDialog "opens the dialog ..." 6453ms (FAIL) ConfigTab.provider "PUTs the new provider on Save" 5605ms (FAIL) vs subsequent tests in the same files at 100-1500ms each. The component code is correct (e.g. ActivityTab.FILTERS has 7 entries matching the test). 1407 tests pass locally with --coverage in 9-15s; CI runs at 200s under the same flag — the gap is import/transform/environment overhead, not test logic. Fix: CI-conditional `testTimeout: process.env.CI ? 30000 : 5000` in canvas/vitest.config.ts. Local-dev sensitivity to genuine waitFor races preserved; CI gets ~5x headroom over the worst observed first-test (6453ms). Same shape Vitest documents at <https://vitest.dev/config/testtimeout> and <https://vitest.dev/guide/coverage#profiling-test-performance>. Verification: - Local: 5x runs of the 3 failing test files, all 74 tests green (process.env.CI unset → 5000ms applies). - Local: 7s sleep probe FAILS at 5000ms default and PASSES under CI=true → ternary takes effect as written. - Local: full canvas suite under CI=true with --coverage: "Test Files 98 passed (98) | Tests 1407 passed (1407)". Closes #96. Refs: #82, #81, #54, #53. Hostile self-review (3 weakest spots): 1. 30000ms is a guess, not a measurement. Mitigation: vitest still emits per-test duration; a real 25s+ test will surface as a duration regression and we dial down. 2. Doesn't fix the Docker-runner-overhead root-root-cause. True. That is a multi-week perf project. The right trade today is unblocking 4 PRs from this single class. 3. Local-default of 5000ms means a real 8s race that flies on CI's 30000ms could pass without local sensitivity. Mitigation: dev-time waitFor races are caught at the per-test level; suite-level cold-start is the only legitimate >5s case here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:19:58 -07:00
devops-engineer	09ec0b1b4a	chore: sync main → staging (auto, `068c9682`)	2026-05-08 01:17:12 +00:00
claude-ceo-assistant	068c968206	docs(hermes): hermes-agent fork moved to Gitea (#90) Doc update reflecting #160 hermes-agent migration. Approved by security-auditor.	2026-05-08 01:17:03 +00:00
devops-engineer	419c109f1d	chore: sync main → staging (auto-resolved workflow conflicts via main-wins) Conflicted files in .github/workflows/ taken from main: .github/workflows/ci.yml .github/workflows/e2e-staging-canvas.yml .github/workflows/retarget-main-to-staging.yml Conflicts arose from main advancing through PR #66/#79/#89 (CI workflow rewrites) while staging hadn't picked up the changes yet. Main is the source of truth for CI workflows; staging is downstream. Co-authored-by: Claude (orchestrator)	2026-05-08 01:00:48 +00:00
claude-ceo-assistant	97c042f666	Merge branch 'main' into fix/hermes-agent-doc-gitea-migration	2026-05-08 00:54:30 +00:00
claude-ceo-assistant	7492d9661c	Merge branch 'main' into fix/175-env-matched-pair-guard	2026-05-08 00:54:14 +00:00
claude-ceo-assistant	3d6303afcc	fix(ci): rewrite retarget-main-to-staging for Gitea REST API (#79) Closes #74. Approved by security-auditor.	2026-05-08 00:26:27 +00:00
claude-ceo-assistant	3fcaa1fcc5	Merge branch 'main' into fix/hermes-agent-doc-gitea-migration	2026-05-08 00:21:17 +00:00
claude-ceo-assistant	7f61206a18	Merge branch 'staging' into fix/saas-plugin-install-eic	2026-05-08 00:21:10 +00:00
claude-ceo-assistant	6c823cf673	Merge branch 'main' into fix/196-retarget-main-to-staging-gitea-rest	2026-05-08 00:20:49 +00:00
claude-ceo-assistant	36a509abfb	Merge branch 'main' into fix/175-env-matched-pair-guard	2026-05-08 00:20:43 +00:00
claude-ceo-assistant	12ff797d12	fix(ci): close 3 chronic Gitea-Actions workflow flakes (#92) Closes #88. Bundles localhost→127.0.0.1 + 2 other Gitea-act_runner flakes per feedback_gitea_actions_migration_audit_pattern. Approved by security-auditor.	2026-05-08 00:20:42 +00:00
claude-ceo-assistant	4193d54852	fix(ci): pin actions/upload-artifact + download-artifact to @v3 (#89) Closes #210. Unblocks 6 stuck PRs (#53/#54/#69/#71/#76/#81). Approved by security-auditor.	2026-05-08 00:20:00 +00:00
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	7eb348536b	fix(harness): bake cf-proxy nginx.conf at build time, not via configs: The previous configs:-based fix (`87b971a2`) didn't actually fix the DinD issue — Compose v2 falls back to bind mounts for `configs:` when swarm mode is not active, so the resulting runc invocation still tries to mount /workspace/.../cf-proxy/nginx.conf from the OUTER host filesystem that the act_runner-vs-host-docker socket-mount can't see. Same "not a directory" error returned. Switch to a thin Dockerfile (cf-proxy/Dockerfile) that COPYs nginx.conf into nginx:1.27-alpine. The build context is uploaded to the daemon as a tarball, not bind-mounted from the host filesystem, so the path translation gap doesn't apply. Verified locally: `docker build` + `docker run cf-proxy nginx -T` reproduces the baked config end-to-end. Trade-off: ~2-3s build cost on every harness up. Acceptable for the Gitea CI gate; local-dev re-builds the image only when nginx.conf changes (Docker layer cache). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:09:08 -07:00
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	87b971a292	fix(ci): close 3 chronic Gitea-Actions workflow flakes (closes #88) Three workflows have been failing on every push to this Gitea repo for GitHub-shaped reasons that don't translate to act_runner. Surfaced while landing #84; bundled per `feedback_gitea_actions_migration_audit_pattern` ("bundle per-repo, not per-finding") instead of three separate PRs. 1) handlers-postgres-integration: localhost → 127.0.0.1 - lib/pq tries to dial localhost → ::1 first; the postgres service container only listens on IPv4 → ECONNREFUSED → all TestIntegration_* fail. Pin IPv4 to make the job deterministic. 2) pr-guards / disable-auto-merge-on-push: Gitea no-op - The previous reusable-workflow caller invoked `gh pr merge --disable-auto`, which calls GitHub's GraphQL API. Gitea returns HTTP 405 on /api/graphql → step always fails. Inline the step so it can detect Gitea (GITEA_ACTIONS=true OR repo url under moleculesai.app) and no-op with a notice. Auto-merge gating is moot on Gitea anyway: there's no `--auto` primitive being touched. Job stays ALWAYS-RUN so branch protection's required check still lands SUCCESS (avoids the SKIPPED-in-set trap from `feedback_branch_protection_check_name_parity`). 3) Harness Replays: cf-proxy nginx.conf via docker `configs:` (not bind) - act_runner runs the workflow inside a runner container; runc in the docker daemon below resolves bind-mount source paths on the OUTER host, not inside the runner. The path `/workspace/.../cf-proxy/nginx.conf` is invisible there → "not a directory" runc error. Switching to compose `configs:` packages the file as content rather than a host bind, sidestepping the DinD path-translation gap. Local validation: - YAML parsed clean for all 3 files. - cf-proxy nginx.conf: standalone `docker compose run cf-proxy nginx -T` reproduced the configs: mount end-to-end and dumped the config correctly. The full harness compose still renders via `docker compose config`. Real-CI verification will land on this branch's first push. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:06:09 -07:00
devops-engineer	0bcf195fbc	docs(hermes): hermes-agent fork moved to Gitea (post-suspension) The `HongmingWang-Rabbit/hermes-agent` fork is no longer reachable on github.com (account suspended 2026-05-06). The patched fork now lives at https://git.moleculesai.app/molecule-ai/hermes-agent. Same SHAs, same branches — pure URL flip. See molecule-ai/internal#72 for the github.com fork shell decision. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:57:57 -07:00
devops-engineer	8885f7cd12	fix(ci): pin actions/upload-artifact + download-artifact to @v3 for Gitea compatibility actions/upload-artifact@v4+ and download-artifact@v4+ use the GHES 3.10+ artifact protocol that Gitea Actions (act_runner v0.6 / Gitea 1.22.x) does NOT implement. Failure cite from PR #54 run 1325 jobs/2: ::error::@actions/artifact v2.0.0+, upload-artifact@v4+ and download-artifact@v4+ are not currently supported on GHES. Pinned all 3 references to v3.2.2 (latest v3) at SHA-pinned form for supply-chain hygiene, matching the existing `uses:` style in this repo. Affected workflows: - ci.yml (Canvas Next.js coverage upload, blocks `CI / Canvas (Next.js)` required check on every PR — was the merge-queue blocker for #53, #54, #69, #71, #76, #81) - e2e-staging-canvas.yml (Playwright report + screenshots on failure) No download-artifact callers in the repo, so v3-pin doesn't compose-break anywhere. Drop these pins post-Gitea-1.23+ when the v4 artifact protocol ships, or migrate to a Gitea-native action. Closes #210. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:54:44 -07:00
devops-engineer	0c7f3c8909	chore: sync main → staging (auto, `cdbf28fd`)	2026-05-07 23:45:36 +00:00
claude-ceo-assistant	cdbf28fd76	ci(canary): synthetic-check cron for AUTO_SYNC_TOKEN rotation drift (#77) 6h cron probes auth + scope + git-push --dry-run. Closes #72. Approved by security-auditor.	2026-05-07 23:45:25 +00:00
devops-engineer	3f9ba90672	chore: sync main → staging (auto, `07bd91e4`)	2026-05-07 23:44:31 +00:00
claude-ceo-assistant	4b82db72a7	Merge branch 'main' into fix/issue-72-auto-sync-token-canary-v2	2026-05-07 23:44:22 +00:00
claude-ceo-assistant	07bd91e436	fix(ci): replace gh run list with Gitea commit-status query (#83) Class F of #75 sweep. /commits/{sha}/statuses replaces unavailable workflow-runs API. 4 mapping buckets verified against synthetic+real Gitea data. Approved by security-auditor.	2026-05-07 23:44:21 +00:00
claude-ceo-assistant	ed0874504e	Merge branch 'main' into fix/issue75-class-F-gh-run-list-to-statuses	2026-05-07 23:44:00 +00:00
devops-engineer	6656862870	chore: sync main → staging (auto, `e39fc920`)	2026-05-07 23:39:46 +00:00
claude-ceo-assistant	e39fc92074	fix(ci): replace gh pr CLI with Gitea v1 REST in workflows + scripts (#80) Class A of #75 sweep. 23 bash + 9 python tests pass. Live-integration verified against prod Gitea. Approved by security-auditor.	2026-05-07 23:39:22 +00:00
devops-engineer	6d7554d282	chore: sync main → staging (auto, `d84d88ad`)	2026-05-07 23:38:08 +00:00
claude-ceo-assistant	1819ac21f4	Merge branch 'main' into fix/issue75-class-A-gh-pr-to-gitea-rest	2026-05-07 23:37:57 +00:00
claude-ceo-assistant	d84d88ad70	feat(workspace-server): local-dev provisioner builds from Gitea source (#70) Hongming-locked Option C: MOLECULE_IMAGE_REGISTRY presence as mode marker. ADR-002 captures rationale. 30 new tests + 64 existing preserved. Hostile-review weakest 3 filed as #204/#205/#206 follow-ups. Closes #63 (Task #194). Approved by security-auditor.	2026-05-07 23:37:56 +00:00
devops-engineer	ae49b184f6	chore: sync main → staging (auto, `1f1ead18`)	2026-05-07 23:33:25 +00:00
claude-ceo-assistant	6bb272360d	Merge branch 'main' into feat/issue-63-local-build-from-gitea-v2	2026-05-07 23:33:03 +00:00
claude-ceo-assistant	1f1ead1833	fix(ci): rewrite auto-promote-staging for Gitea (#78) Removes ~60 lines polling+dispatch (Gitea fires on:push naturally on token-merge). Uses Gitea merge_when_checks_succeed; preserves required_approvals=1 on main. Closes #73. Approved by security-auditor.	2026-05-07 23:32:58 +00:00
claude-ceo-assistant	c5f40de585	Merge branch 'main' into fix/195-auto-promote-staging-gitea-rest	2026-05-07 23:30:09 +00:00
devops-engineer	a234ed5c51	chore: sync main → staging (auto, `330a5842`)	2026-05-07 23:29:14 +00:00
claude-ceo-assistant	330a5842ab	Merge pull request 'feat(canvas): ActivityTab → ACTIVITY_LOGGED subscriber (#61 stage 3, final)' (#76) from feat/canvas-activity-tab-ws-subscribe into main	2026-05-07 23:27:32 +00:00
devops-engineer	cd55ce10d2	chore: sync main → staging (auto, `502aa082`)	2026-05-07 23:25:49 +00:00
claude-ceo-assistant	2505b36a2c	Merge branch 'main' into fix/195-auto-promote-staging-gitea-rest	2026-05-07 23:22:24 +00:00
security-auditor	e0feae18f4	Merge remote-tracking branch 'origin/main' into feat/canvas-activity-tab-ws-subscribe	2026-05-07 16:18:34 -07:00
claude-ceo-assistant	502aa082bc	Merge pull request 'feat(canvas): A2ATopologyOverlay → ACTIVITY_LOGGED subscriber (#61 stage 2)' (#71) from feat/canvas-topology-overlay-ws-subscribe into main	2026-05-07 23:18:24 +00:00
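
Commit b9d2786f45 describes publishing postgres on an ephemeral host port (`-p 0:5432`), reading the assigned port back with `docker port`, and exporting a `127.0.0.1` URL. The sketch below illustrates only the parsing and URL-building step. The function names, the credentials, the database name, and the `sslmode=disable` suffix are hypothetical and not taken from the workflow file; the sample lines (`0.0.0.0:49153`, `[::]:49153`) are an assumption about the output shape of `docker port`.

```typescript
// Hypothetical helper: extract the host port from `docker port <name> 5432/tcp` output.
// Assumed output shape: one binding per line, e.g. "0.0.0.0:49153" or "[::]:49153".
function parseHostPort(dockerPortOutput: string): number {
  for (const line of dockerPortOutput.split("\n")) {
    // The port is the run of digits after the last colon on the line.
    const match = line.trim().match(/:(\d+)$/);
    if (match) {
      const port = Number(match[1]);
      if (port > 0 && port <= 65535) {
        return port;
      }
    }
  }
  throw new Error("no host port found in docker port output");
}

// Hypothetical helper: build the URL with 127.0.0.1 rather than localhost,
// so an IPv6-first resolver cannot pick ::1 (the flake fixed in #92).
function buildDatabaseUrl(port: number, user: string, password: string, db: string): string {
  return `postgres://${user}:${password}@127.0.0.1:${port}/${db}?sslmode=disable`;
}

// Example with placeholder values (not the repository's real credentials).
const samplePort = parseHostPort("0.0.0.0:49153\n[::]:49153\n");
const sampleUrl = buildDatabaseUrl(samplePort, "ci_user", "ci_pass", "ci_db");
```

Because each run receives whatever free port the daemon hands out, two parallel runs would produce different URLs, which is the collision-avoidance property the commit message claims.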


4690 Commits
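
The CI-conditional timeout quoted in commit da1a5af7a4 (`testTimeout: process.env.CI ? 30000 : 5000`) can be sketched as a pure function to make its behavior easy to check. `resolveTestTimeout`, `Env`, and the constant names are hypothetical; per the commit message, the actual change is the ternary in canvas/vitest.config.ts. The environment is passed in as a parameter here, an assumption made so the sketch does not depend on Node's `process` global.

```typescript
// Values taken from the commit message; the names are illustrative only.
const CI_TIMEOUT_MS = 30000;
const LOCAL_TIMEOUT_MS = 5000;

// Hypothetical stand-in for process.env.
type Env = Record<string, string | undefined>;

// Mirrors `process.env.CI ? 30000 : 5000`: any non-empty CI value selects the
// longer timeout, while an unset or empty CI keeps the 5000ms local default.
function resolveTestTimeout(env: Env): number {
  return env.CI ? CI_TIMEOUT_MS : LOCAL_TIMEOUT_MS;
}

// Headroom over the slowest first-test reported in the commit (6453ms).
const worstObservedMs = 6453;
const headroom = resolveTestTimeout({ CI: "true" }) / worstObservedMs;
```

Note that a truthiness check treats the string "false" as set, so under this sketch `CI=false` would still select the 30000ms timeout.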