fix(workspace-server): http client timeouts, panic recovery, and error checks (re-created from staging #2045) #2125
Re-implementation of #2045 against main (original was mistakenly based on staging).
Changes:
- http.DefaultClient with 10s timeout client in cp_config.go.
- context.Context to queryPeerMaps and use QueryContext.
- tx.Rollback() in workspace Create handler.
- broadcaster.RecordAndBroadcast, db.ExecContext, tx.Exec, and rows.Err() in 12 files.
Comprehensive testing performed
- go test ./... passes on workspace-server after changes
- go vet ./... clean
- Local-postgres E2E run
- Staging-smoke verified or pending
Root-cause not symptom
Root cause: unchecked errors and missing panic recovery in goroutines led to silent failures and potential crashes. Symptom: flaky tests, missing error logs, goroutine panics taking down handlers.
Five-Axis review walked
if err != nil { log.Printf(...) } or defer recover()).
log.Printf for non-fatal errors, gin.Recovery() for HTTP handlers).
No backwards-compat shim / dead code added
Yes — no shim. Pure additions of error checks and panic recovery; no deprecated code introduced.
Memory/saved-feedback consulted
bundle/importer.go and channels/manager.go.
/sop-ack
19ab60ba80 to c768101cb6
merge-queue: updated this branch with main at e441def8b3a8. Waiting for CI on the refreshed head.
merge-queue: updated this branch with main at 31283a292a34. Waiting for CI on the refreshed head.
merge-queue: updated this branch with main at d768d8667b0f. Waiting for CI on the refreshed head.
APPROVED on current head 7c9d895b0b. 5-axis review: adds bounded HTTP timeout, panic recovery around background goroutines, context-aware DB peer queries, and an idempotent tx rollback cleanup. Changes improve robustness/fail-closed behavior without exposing secrets, weakening auth/gates/merge-control, or adding blocking hot-path work. Imports and call-site updates are consistent; BP-required contexts are present+green.
APPROVE: verified current head. Operational hardening only: CP config HTTP timeout, context-aware peer queries, rollback cleanup, and panic containment/logging in async bridges/sweepers. BP-required contexts present+green and mergeable=true. No gate/auth/merge-control weakening or regression found. Note: post-#2407 qa/security governance contexts are not green, so the uniform gate would still block until satisfied.