feat(executor): emit incident.codex_wedge JSONL on SSE wedge #18
Open: core-devops wants to merge 1 commit from feat/codex-wedge-obs-emit into main
pull from: feat/codex-wedge-obs-emit
merge into: molecule-ai:main
Reference: molecule-ai/molecule-ai-workspace-template-codex#18
Summary
- Surfaces the 2026-05-18-class wedge (codex turn emits zero events for 90s) as a structured log line that the tenant Vector pipeline ships to Loki under {service="molecule-tenant"}. Pairs with the codex-wedge.yml Loki ruler in operator-config (separate PR).
- Does NOT bump @openai/codex@0.130.0; see investigation below.

codex-cli upstream investigation (VENDOR-DOC CHECK)
- Upstream codex-cli 0.131.0 release notes (May 2026) do NOT mention any SSE / chatgpt-subscription / no-events fix.
- openai/codex#23061 and #22793 (stream disconnected before completion) track related streaming instability against subscription auth; neither has a verified upstream fix.
- Stay on 0.130.0 until upstream lands a verified fix AND we reproduce the wedge against a 0.131 image.

Why not switch to auth_mode=openai_api (CTO billing touchpoint)
The wedge is correlated with chatgpt_subscription + gpt-5.5, so api-key fallback is a viable workaround, but it routes traffic through a DIFFERENT OpenAI account. CTO-only decision; this PR adds the obs signal so the call can be made on data.

Schema (frozen: Loki ruler depends on it)
event_type, workspace_id, turn_id, deltas_at_wedge, wedge_duration_seconds, codex_cli_version, model, auth_mode, ts

Loki query example (after both PRs ship)

Test plan
- tests/test_executor.py::test_wedge_emits_incident_jsonl: JSON shape matches ruler
- tests/test_executor.py suite passes locally (10/10)
- [...] service=molecule-tenant in Loki

Surfaces the 2026-05-18-class wedge (codex turn emits zero events for 90s) as a structured log line the tenant Vector pipeline ships to Loki under {service="molecule-tenant"}. Pairs with the codex-wedge Loki ruler in operator-config (separate PR).

Schema (frozen by the matching ruler): event_type, workspace_id, turn_id, deltas_at_wedge, wedge_duration_seconds, codex_cli_version, model, auth_mode, ts

Notes on the broader investigation:
- Upstream codex-cli 0.131.0 release notes (May 2026) do NOT mention any SSE / chatgpt-subscription / no-events fix; open issues #23061 and #22793 (stream-disconnect-before-completion) track related instability with NO verified fix. Therefore this PR INTENTIONALLY does NOT bump the @openai/codex@0.130.0 pin: that would be a blind upgrade against the same hypothesised upstream bug. Bump only after upstream lands a verified SSE / app-server-stream fix and we reproduce the wedge in a 0.131 image.
- We do NOT switch the production prod-team auth_mode to openai_api (api-key): that's a CTO billing touchpoint (different OpenAI account); the obs signal lets us decide that with data instead of hypothesis.
- The wedge-detection logic itself was already added by PR#14 (the 2026-05-18 deadlock fix); this PR only adds the structured-log emission at the same site, plus a regression test.

Auth-mode label derived from credential-env presence (mirrors provider_config._BUILTIN_PROVIDERS selection order) so the line is emitted even if the process wedged before render_provider_toml.py finished writing ~/.codex/config.toml.

Test: tests/test_executor.py::test_wedge_emits_incident_jsonl validates the JSON shape against the ruler's expected fields.

REQUEST_CHANGES after 5-axis review of 89f664e.
Correctness: The wedge incident emission is placed at the existing inactivity timeout site and the test covers the JSON payload for the ChatGPT subscription path. However, _derive_auth_mode_label() does not actually mirror provider_config selection for the MiniMax route: the repo's configured third-party provider is driven by MINIMAX_API_KEY, but this code only checks ANTHROPIC_AUTH_TOKEN / ANTHROPIC_API_KEY before falling back to unknown. A MiniMax workspace wedge would therefore emit the wrong auth_mode, undercutting the Loki grouping this PR adds. Please include the active compat-provider env names, at least MINIMAX_API_KEY, or derive from the same provider registry used at boot.
Robustness: The log emission is exception-contained and emits once per timeout, which is good. Current PR CI is also red on runtime validation / validate, so this needs a green rerun before merge.
Security: No secrets are emitted; the auth label is categorical only.
Performance: One small JSON serialization on the terminal wedge path is negligible.
Readability: The new helper is readable, but its comments overstate parity with provider_config while missing the MiniMax auth path.
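One possible shape for the fix requested under Correctness, as a minimal sketch rather than the code in 89f664e: the label comes from a single ordered table that includes MINIMAX_API_KEY, and the emitter writes one JSON line carrying the nine frozen schema fields while swallowing its own failures. Everything not named in the PR is an assumption made for illustration: the CODEX_AUTH_JSON env name, the "minimax" and "anthropic_compat" label strings, the selection order, the ISO-8601 ts format, the use of stderr as the sink, and the function names. The event_type value is taken from the PR title.

```python
import json
import sys
from datetime import datetime, timezone
from typing import Mapping, Optional, TextIO

# Field names as listed in the PR's frozen schema.
SCHEMA_FIELDS = (
    "event_type", "workspace_id", "turn_id", "deltas_at_wedge",
    "wedge_duration_seconds", "codex_cli_version", "model", "auth_mode", "ts",
)

# ASSUMPTION: ordered (credential env names, label) table. The real order and
# labels would come from provider_config._BUILTIN_PROVIDERS, which is not shown
# in the PR. CODEX_AUTH_JSON and the two compat labels are hypothetical.
AUTH_MODE_BY_ENV = (
    (("CODEX_AUTH_JSON",), "chatgpt_subscription"),
    (("OPENAI_API_KEY",), "openai_api"),
    (("MINIMAX_API_KEY",), "minimax"),
    (("ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"), "anthropic_compat"),
)


def derive_auth_mode_label(env: Mapping[str, str]) -> str:
    """Return a categorical label from credential-env presence only."""
    for names, label in AUTH_MODE_BY_ENV:
        # A non-empty value counts as "present"; the value is never logged.
        if any(env.get(name) for name in names):
            return label
    return "unknown"


def emit_wedge_incident(
    *,
    workspace_id: str,
    turn_id: str,
    deltas_at_wedge: int,
    wedge_duration_seconds: float,
    codex_cli_version: str,
    model: str,
    env: Mapping[str, str],
    stream: TextIO = sys.stderr,
) -> Optional[dict]:
    """Write one JSONL incident line; never raise into the timeout path."""
    try:
        record = {
            "event_type": "incident.codex_wedge",
            "workspace_id": workspace_id,
            "turn_id": turn_id,
            "deltas_at_wedge": deltas_at_wedge,
            "wedge_duration_seconds": wedge_duration_seconds,
            "codex_cli_version": codex_cli_version,
            "model": model,
            "auth_mode": derive_auth_mode_label(env),
            "ts": datetime.now(timezone.utc).isoformat(),  # format is assumed
        }
        stream.write(json.dumps(record) + "\n")
        stream.flush()
        return record
    except Exception:
        # Observability must not mask the wedge handling itself.
        return None
```

Keeping the env names in one table means a newly added compat provider needs a single new row, and a regression test can assert the emitted key set against SCHEMA_FIELDS for each provider path instead of only the ChatGPT subscription one.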