From 484d151e99c1bec71c5ffeb3f08dc7df0c6d9dc2 Mon Sep 17 00:00:00 2001
From: Ben
Date: Tue, 21 Apr 2026 19:20:15 +1000
Subject: [PATCH] fix(mcp): reset circuit breaker on successful OAuth reconnect
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously the breaker was only cleared when the post-reconnect retry
call itself succeeded (via _reset_server_error at the end of the try
block). If OAuth recovery succeeded but the retry call happened to fail
for a different reason, control fell through to the needs_reauth path
which called _bump_server_error — adding to an already-tripped count
instead of the fresh count the reconnect justified.

With fix #1 in place this would still self-heal on the next cooldown,
but we should not pay a 60s stall when we already have positive evidence
the server is viable.

Move _reset_server_error(server_name) up to immediately after the
reconnect-and-ready-wait block, before the retry_call. The subsequent
retry still goes through _bump_server_error on failure, so a genuinely
broken server re-trips the breaker as normal — but the retry starts from
a clean count (1 after a failure), not a stale one.
---
 tools/mcp_tool.py | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/mcp_tool.py b/tools/mcp_tool.py
index c393a09f..aecc0cc2 100644
--- a/tools/mcp_tool.py
+++ b/tools/mcp_tool.py
@@ -1429,6 +1429,16 @@ def _handle_auth_error_and_retry(
                 break
             time.sleep(0.25)
 
+        # A successful OAuth recovery is independent evidence that the
+        # server is viable again, so close the circuit breaker here —
+        # not only on retry success. Without this, a reconnect
+        # followed by a failing retry would leave the breaker pinned
+        # above threshold forever (the retry-exception branch below
+        # bumps the count again). The post-reset retry still goes
+        # through _bump_server_error on failure, so a genuinely broken
+        # server will re-trips the breaker as normal.
+        _reset_server_error(server_name)
+
         try:
             result = retry_call()
             try: