fix(tests/e2e#60+#65): cap org-slug at 32 chars (CP regex) + e2e-pv prefix + always log org-create body #2839
Reference in New Issue
Block a user
Delete Branch "fix/60-staging-e2e-org-create-hardening"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Two linked fixes for the chronic continuous-synth-e2e org-create 400/body-loss failures (the 2026-06-12 staging Platform Boot red run 363934 plus earlier 409s in core#2782):
SLUG HELPER CAP (tests/e2e/lib/collision-proof-slug.sh):
The org-create endpoint rejects slugs over 31 chars with HTTP 400 (CP regex is leading char + 2-31 additional). The previous helper produced slugs up to 61 chars (e2e-smoke- 11 + date 8 + run_id 32 + uuid 8 + 2 separators) which the CP silently rejected. Fix: compute the run_id budget precisely as CP_ORG_SLUG_MAX_LEN minus prefix_len minus 19, with the 8-char uuid anchor preserved at the END. The assert now also catches over-cap slugs at harness startup as defense in depth.
Helper signature changed:
old: make_collision_proof_slug_suffix <run_id>
new: make_collision_proof_slug_suffix <run_id> [prefix_len]
prefix_len defaults to 11 (e2e-smoke-) so existing callers without it still work. Callers in test_staging_full_saas.sh updated to pass 11 (smoke mode) or 4 (non-smoke mode).
ORG-CREATE BODY CAPTURE (tests/e2e/test_staging_full_saas.sh):
Pre-fix the org-create call was a bare
admin_call POST /cp/admin/orgs -d ...under set -euo pipefail. When the 400 fires, --fail-with-body makes curl exit 22, the substitution returns nonzero, and the whole script aborts BEFORE the body-logging line runs. Triage on the 2026-06-12 red had to guess whether the failure was a slug collision (409) or a different state-conflict. Fix: mirror the lines 875-889 pattern (set +e around curl, -o bodyfile plus -w percent-http_code to capture status separately) so the 400/409 body is always on disk regardless of HTTP status. The error message also adds a hint when the slug is over the 31-char cap, steering the operator to the assertion rather than the CP error.Diff stat:
tests/e2e/lib/collision-proof-slug.sh | 114 ++++++++++++++++++----------------
tests/e2e/test_staging_full_saas.sh | 64 +++++++++++++++----
2 files changed, 111 insertions(+), 67 deletions(-)
Manual slug-length test: e2e-smoke-$(make_collision_proof_slug_suffix test-12345 11) yields a 30-char slug with the 8-char uuid preserved at the end and assert_collision_proof_slug returns 0. 1-char headroom on the 31-cap because the date_part plus run_id is short enough not to need truncation in the test case (the helper truncates correctly when needed).
bash -n on both files: clean. shellcheck on the changed lines: no new warnings. The pre-existing SC2002 useless-cat in line 78 of the helper is pre-existing and unchanged.
Refs #60.
fix(tests/e2e#60): cap org-slug at 31 chars + always log org-create bodyto fix(tests/e2e#60+#65): cap org-slug at 32 chars (CP regex) + e2e-pv prefix + always log org-create bodyREQUEST_CHANGES on head
1fec9ebf08.Two blockers:
tests/e2e/lib/collision-proof-slug.sh:117 rejects valid 32-character CP slugs. CP_ORG_SLUG_MAX_LEN is correctly set to 32 and the regex ^[a-z][a-z0-9-]{2,31}$ allows 32 chars, but assert_collision_proof_slug currently fails when len > 31 while printing max=32. This is not just cosmetic: callers such as test_2307_peer_visibility_staging.sh pass prefix_len=8 for the literal e2e-2307- prefix, whose actual length is 9. With the helper's budget this can produce a valid 32-character slug that matches the CP regex, then fail the harness startup assertion. e2e-cncrg-mk has the same off-by-one shape (actual prefix length 13, passed 12). Fix the assertion to reject only >32 and either correct the caller prefix lengths or add tests that catch literal-prefix length drift.
tests/e2e/test_staging_full_saas.sh:369 clobbers the cleanup trap. The file installs trap cleanup_org EXIT INT TERM at line 330, but the new body-file cleanup uses trap 'rm -f "$CREATE_BODYFILE"' EXIT, replacing the org cleanup trap for EXIT. That can leak the staging org/resources after the org-create path succeeds or later steps fail. The body-capture fix should preserve/chain cleanup_org, e.g. remove the temp file inside cleanup_org or use an additional cleanup path that does not overwrite the existing EXIT/INT/TERM trap.
What looks good: the helper preserves the trailing 8-hex entropy by truncating the run_id segment, test_e2e_pv_prefix_caps_to_32 asserts both length and regex match, Shellcheck and all-required are green on the head, and the curl body-capture pattern itself captures status/body as intended. But the two issues above need fixing before this is safe to merge.
Reviewed as reviewer-2 on head
1fec9ebf.The slug-cap hardening matches the #60/#65 failure mode:
test_peer_visibility_mcp_staging.shnow passesprefix_len=7fore2e-pv-, and the helper preserves the 8-hex entropy suffix while capping the generated slug to the CP org-slug regex envelope. I checked all currentmake_collision_proof_slug_suffixcall sites and they now pass explicit prefix lengths, so staging-smoke/continuous-synth/external-runtime/peer-visibility are no longer bypassing the capped helper.The org-create body-capture change in
test_staging_full_saas.shcorrectly moves the curl call into aset +eblock with-obodyfile plus-w '%{http_code}', so 400/409 bodies are logged instead of being swallowed by--fail-with-bodyunderset -e.Verification: local
bash tests/e2e/test_collision_proof_slug_unit.shpassed on1fec9ebf; PR CI shows real green code lanes including Shellcheck job 497690 (53s), Peer Visibility job 497699 (6s), Staging SaaS pr-validate job 497706 (24s), and all-required job 497693 (5s). Combined status is red only from review/ceremony contexts, not code CI.APPROVED.
New commits pushed, approval review dismissed automatically according to repository settings
REQUEST_CHANGES on head
8efca161.The prefix-length fixes are present (
e2e-2307-= 9,e2e-cncrg-worker-= 17,e2e-cncrg-mk-= 13), ande2e-pv-still uses prefix_len=7. CI code lanes are green so far (CI / all-requiredjob 497952, Shellcheck job 497949, Peer Visibility job 497958), but two correctness blockers remain.tests/e2e/lib/collision-proof-slug.shstill rejectslen > 31, even thoughCP_ORG_SLUG_MAX_LEN=32and the CP regex allows 32 total chars. Concrete repro on this head:So the RC #11654 boundary concern is not resolved. The unit suite also does not cover a valid 32-char slug, which is why it still passes.
cleanup_orgsyntactically, but it prependsrm -f "$CREATE_BODYFILE"before invoking the previous trap.cleanup_orgexplicitly relies on its first statement capturing the original$?; the prependedrmchanges$?to the rm result beforecleanup_orgruns. That can turn a failing org-create/body-capture path intoentry_rc=0and weaken the documented exit-code behavior. The chained trap should preserve the original rc before bodyfile cleanup, then call cleanup_org with that rc semantics intact, or otherwise clean the bodyfile from inside cleanup_org / the existing temp-file array path.Please fix both before re-review.
REQUEST_CHANGES on head
8efca16192.One blocker from #11654 remains unresolved: tests/e2e/lib/collision-proof-slug.sh:117 still rejects
${#slug} -gt 31while the configured CP_ORG_SLUG_MAX_LEN is 32 and the CP regex ^[a-z][a-z0-9-]{2,31}$ permits 32 characters. A valid 32-character slug can still fail the harness startup assertion. This must be-gt 32(or use CP_ORG_SLUG_MAX_LEN directly) and the related diagnostic in test_staging_full_saas.sh:411 should also stop treating >31 as over-cap.The other fixes look correct on this head: the literal prefix lengths now match the callers (
e2e-2307-=9,e2e-cncrg-worker-=17,e2e-cncrg-mk-=13), and the bodyfile cleanup no longer replaces teardown outright. It captures the existing EXIT trap and chainsrm -f "$CREATE_BODYFILE"; cleanup_org, so cleanup_org still runs on EXIT. The trap extraction is from the script's own static trap body, not untrusted input, and is acceptable here.CI/Shellcheck is green on
8efca161, but the boundary mismatch above still blocks approval.REQUEST_CHANGES on head
f975eeae84.The slug helper boundary itself is fixed: assert_collision_proof_slug now checks against CP_ORG_SLUG_MAX_LEN=32, so valid 32-char slugs can pass the helper assertion. The corrected prefix lengths also match the literals I checked: e2e-2307-=9, e2e-cncrg-worker-=17, e2e-cncrg-mk-=13.
Two blockers remain:
Exact-head CI is not green: CI / Shellcheck (E2E scripts) is failing on
f975eeae, and CI / all-required is skipped. This needs to be green before approval.The EXIT trap still does not preserve cleanup_org execution. test_staging_full_saas.sh installs:
trap 'ec=$?; rm -f "$CREATE_BODYFILE"; trap -- '''${prev_exit_trap}''' EXIT; (exit $ec)' EXIT
Reinstalling the previous EXIT trap during EXIT handling does not execute it; the trap body then exits with the original code without calling cleanup_org. So the resource teardown path is still skipped/leaky. It needs to actually run the prior cleanup body (or call cleanup_org directly) while preserving the original exit code.
Also, the full_saas diagnostic is still 31-based at lines 408-409 (
${#SLUG} -gt 31, "31-char cap"), so it is not lockstep with CP_ORG_SLUG_MAX_LEN=32 as requested.Please fix the trap to execute cleanup and make the diagnostic 32/constant-consistent, then get Shellcheck/all-required green.
REQUEST_CHANGES:
f975eeaestill is not reviewable-green. The functional fixes look aligned (CP_ORG_SLUG_MAX_LEN=32, corrected prefix lengths, e2e-pv cap, and EXIT trap captures ec before cleanup), but exact-head CI is red: CI / Shellcheck (E2E scripts), run 364608 job 498121, fails on tests/e2e/test_staging_full_saas.sh:384 with SC2154: 'ec is referenced but not assigned'. Please rewrite the trap/cleanup in a shellcheck-clean form while preserving exit-code propagation and cleanup_org chaining, then rerun.APPROVED:
ca909394clears my #11674 blocker. Exact-head Shellcheck is green (run 364679 job 498246, 49s) and all-required is green. The EXIT trap now captures the original exit code first, removes the bodyfile, invokes the captured cleanup_org body inline, and exits with the preserved code, so cleanup is not clobbered. Slug enforcement now uses CP_ORG_SLUG_MAX_LEN=32; a 32-char uuid-suffixed slug passes, 33-char fails with max=32, and the e2e-pv prefix_len=7 path stays <=32 and regex-valid.