fix(ci)(interim): re-add continue-on-error to platform-build (mc#664 fix-forward in flight)
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 18s
E2E API Smoke Test / detect-changes (pull_request) Successful in 21s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 22s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 22s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 20s
gate-check-v3 / gate-check (pull_request) Successful in 18s
qa-review / approved (pull_request) Failing after 13s
security-review / approved (pull_request) Failing after 13s
sop-tier-check / tier-check (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
audit-force-merge / audit (pull_request) Successful in 21s
CI / Python Lint & Test (pull_request) Successful in 7m20s
CI / Platform (Go) (pull_request) Failing after 8m35s
CI / Canvas (Next.js) (pull_request) Successful in 10m33s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Failing after 5s
Phase-3-masked test failures in workspace-server/internal/handlers/ surfaced
when #656 (RFC internal#219 Phase 4) flipped platform-build continue-on-error
from true to false on 0e5152c3. The pre-#656 main was masking these:

4x delegation_test.go (lines 1110/1176/1228/1271):
  TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess
  TestExecuteDelegation_ProxyErrorNon2xx_RemainsFailed
  TestExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed
  TestExecuteDelegation_CleanProxyResponse_Unchanged

Root cause: expectExecuteDelegationBase/Success/Failed helpers do not mock
the DB queries production has issued since ~2026-04-21:
- UPDATE workspaces SET last_outbound_at (commit 2f36bb9a, 2026-04-18,
  async goroutine fired from logA2ASuccess in a2a_proxy_helpers.go)
- SELECT delivery_mode / SELECT runtime FROM workspaces (lookup* in
  a2a_proxy_helpers.go since file split in 64ccf8e1, 2026-04-21)
- INSERT INTO activity_logs (a2a_receive) via LogActivity in
  logA2ASuccess/logA2AError (preexisting, not mocked)
- recordLedgerStatus writes (RFC #2829 #318)

Symptoms: sqlmock unexpected query → production short-circuits → trailing
ExpectExec for completed/failed never fires → mock.ExpectationsWereMet()
reports unmet remaining expectations. The uniform 8.11s wall time is
delegationRetryDelay × 2 attempts, after the first unexpected query triggers
a transient retry path. Halt cond #3 applies (>7 days masked → broader sweep
needed; many subsequent commits stacked on top).

1x mcp_test.go:433 (TestMCPHandler_CommitMemory_GlobalScope_Blocked):
Commit 7d1a189f (2026-05-10) hardened mcp.go:427 to scrub err.Error() from
JSON-RPC error.Message (OFFSEC-001 / #259), returning the constant string
"tool call failed" instead. The test asserts the message contains "GLOBAL".
Production-vs-test contract collision; needs a design call (revert OFFSEC
scrub for this code class, or update the test to assert a different oracle,
e.g. captured logs / specific error code).
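The delegation_test.go symptom chain above (unexpected query → short-circuit → unmet trailing expectation) can be illustrated with a toy model. This is only a sketch: `toyMock`, `runDelegation`, and the query strings are hypothetical stand-ins written for this note. It is not go-sqlmock and not the real handler code; it only imitates strict in-order expectation matching.

```go
package main

import "fmt"

// toyMock is a hypothetical stand-in for a strict, ordered query mock.
// It is NOT go-sqlmock; it only imitates in-order expectation matching.
type toyMock struct {
	expected []string // queries the test declared, in order
}

// Exec consumes the next expectation if it matches the query.
// On a mismatch it returns an error and consumes nothing.
func (m *toyMock) Exec(query string) error {
	if len(m.expected) == 0 || m.expected[0] != query {
		return fmt.Errorf("unexpected query: %s", query)
	}
	m.expected = m.expected[1:]
	return nil
}

// ExpectationsWereMet reports expectations that never fired.
func (m *toyMock) ExpectationsWereMet() error {
	if len(m.expected) > 0 {
		return fmt.Errorf("unmet expectations: %v", m.expected)
	}
	return nil
}

// runDelegation imitates production code that issues queries in order
// and stops at the first database error.
func runDelegation(m *toyMock, queries []string) error {
	for _, q := range queries {
		if err := m.Exec(q); err != nil {
			return err // short-circuit: later queries are never issued
		}
	}
	return nil
}

func main() {
	// Hypothetical query labels; production gained a lookup the helper never mocked.
	production := []string{"INSERT delegation", "SELECT delivery_mode", "UPDATE status completed"}
	stale := &toyMock{expected: []string{"INSERT delegation", "UPDATE status completed"}}

	fmt.Println("run error:", runDelegation(stale, production))
	fmt.Println("teardown:", stale.ExpectationsWereMet())
}
```

In this model the stale helper fails twice: once at the unmocked lookup, and again at teardown because the trailing status update was never reached. That matches the "unmet sqlmock expectations" shape reported for the four delegation tests.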
Halt cond #2 applies (alternate-class finding, not sqlmock-mismatch).

Time-boxed Option A (90 min sqlmock update) does not fit either failure
class within scope. Choosing Option B per brief: interim re-mask of
platform-build only; the other 4 #656 flips (changes, canvas-build,
shellcheck, python-lint) retain continue-on-error: false. This is a
sequenced revert→fix→reflip per feedback_strict_root_only_after_class_a
emergency clause, NOT a permanent re-mask. mc#664 stays open as the
fix-then-reflip tracker.

Process note for charter SOP-N (companion to vendor-truth-review-discipline):
before flipping a job continue-on-error: true → false, do not trust the
combined-status "success" signal alone; pull the actual run log and grep for
--- FAIL / FAIL <package> to confirm the tests really pass. The masked green
on 0e5152c3 came from continue-on-error suppressing the per-job status to
neutral, which the combined-status aggregator counted as not-failure.

Cross-links:
- mc#664 (hongming-pc2 04:35Z Phase-3-masked defect filing)
- mc#656 (the flip that surfaced this; 0e5152c3 first commit to actually run
  the Go tests against internal/handlers/* since the silent stack-up began)
- feedback_strict_root_only_after_class_a (revert→fix→reflip discipline)
- feedback_return_contract_change_audit_caller_tests (mcp case applies)
- feedback_no_such_thing_as_flakes (these are real bugs, not flakes)

Evidence (run 17810 / job 33895 / task 34532 on 0e5152c3):
- 5x --- FAIL lines confirmed in
  actions_log/molecule-ai/molecule-core/e4/34532.log
- delegation_test.go:1110/1176/1228/1271: "unmet sqlmock expectations"
- mcp_test.go:433: "error message should mention GLOBAL, got: tool call failed"

Gitea 1.22.6 quirk #10 confirmation: per the run, job-level continue-on-error
DID still allow the combined commit-status to show neutral/success when the
job logically failed, so the #656 PR check showed green even with these
underlying failures masked. Reproduced.
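The log check from the process note above could be sketched as a small scanner. This is a minimal illustration: `findFailures` is a hypothetical helper, the sample log is made up, and it assumes the log lines carry no timestamp or runner prefix ahead of the `FAIL` package marker.

```go
package main

import (
	"fmt"
	"strings"
)

// findFailures returns the lines of `go test` output that mark a failed
// test ("--- FAIL", possibly indented for subtests) or a failed package
// (a line whose first field is "FAIL").
func findFailures(log string) []string {
	var hits []string
	for _, raw := range strings.Split(log, "\n") {
		line := strings.TrimSpace(raw)
		fields := strings.Fields(line)
		switch {
		case strings.Contains(line, "--- FAIL"):
			hits = append(hits, line)
		case len(fields) > 0 && fields[0] == "FAIL":
			hits = append(hits, line)
		}
	}
	return hits
}

func main() {
	// Made-up sample log; the package path is a placeholder.
	sample := "=== RUN   TestExample\n" +
		"--- FAIL: TestExample (8.11s)\n" +
		"FAIL\texample.com/handlers\t8.2s\n"

	failures := findFailures(sample)
	for _, f := range failures {
		fmt.Println(f)
	}
	if len(failures) > 0 {
		fmt.Println("do not flip continue-on-error to false yet")
	}
}
```

Running a check like this against the raw job log, rather than reading the combined commit status, is what would have caught the masked failures before the #656 flip.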
Co-Authored-By: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
parent 0e5152c342
commit 9aa2b13934
@@ -126,8 +126,29 @@ jobs:
     name: Platform (Go)
     needs: changes
     runs-on: ubuntu-latest
-    # Phase 4 (RFC #219 §1): confirmed green on main 2026-05-12.
-    continue-on-error: false
+    # mc#664 (interim): re-mask platform-build pending fix-forward. Phase 4
+    # (#656) flipped this to continue-on-error: false based on a Phase-3-masked
+    # "green on main 2026-05-12" — the prior continue-on-error: true had
+    # been hiding failing tests in workspace-server/internal/handlers/.
+    # Two distinct failure classes surfaced on 0e5152c3:
+    # (1) 4x delegation_test.go (lines 1110/1176/1228/1271): helpers
+    # expectExecuteDelegationBase/Success/Failed are missing sqlmock
+    # expectations for queries production has issued since ~2026-04-21
+    # (last_outbound_at UPDATE, lookupDeliveryMode/Runtime SELECTs,
+    # a2a_receive INSERT activity_logs, recordLedgerStatus writes).
+    # Halt cond #3 applies (regression > 7 days → broader sweep).
+    # (2) 1x mcp_test.go:433 (TestMCPHandler_CommitMemory_GlobalScope_Blocked):
+    # commit 7d1a189f (2026-05-10) hardened mcp.go to scrub err.Error()
+    # from JSON-RPC responses (OFFSEC-001), but the test asserts the
+    # error message contains "GLOBAL". Production-vs-test contract
+    # collision — needs design call, not mock update.
+    # Time-boxed Option A (90 min) did not fit the cross-cutting scope.
+    # This is a sequenced revert→fix→reflip per
+    # feedback_strict_root_only_after_class_a emergency clause — NOT
+    # a permanent re-mask. Re-flip blocked on mc#664 fix-forward landing.
+    # Other 4 #656 flips (changes, canvas-build, shellcheck, python-lint)
+    # retain continue-on-error: false; only platform-build regresses.
+    continue-on-error: true # mc#664 fix-forward in flight; re-flip when tests pass
     defaults:
       run:
         working-directory: workspace-server