Hi all,
We are running SQL Server 2022 (16.0.4195) with Always On availability groups on a 2-node Failover Cluster on Windows Server 2025. The nodes are patched individually overnight using System Center, one node per night. The cluster uses a File Share Witness hosted on a third, external file server (not on either of the cluster nodes).
Here's the issue:
Every time one of the nodes is rebooted (e.g. during patching), the remaining node logs a critical Event ID 1177: "The Cluster service is shutting down because quorum was lost". This results in unnecessary alerts, even though the behavior seems to be transient and self-recovering. I assume there will be a direct impact on the availability groups, but we have not yet added any production databases.
We have verified the following:
QuorumArbitrationTimeMax is set to 90 seconds (same as our older clusters).
The File Share Witness is healthy and reachable.
Dynamic Quorum and Dynamic Witness are enabled.
Cluster Functional Level is 12 (default for WS 2025), whereas older clusters (WS 2019) use level 10 and do not exhibit this issue under the same conditions.
From the logs, it looks like the remaining node temporarily loses quorum just as the other node (the one being rebooted) is shutting down.
This does not happen in any of our older environments, which run Windows Server 2022 or earlier.
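For context, the vote arithmetic we would expect in this setup can be sketched as follows. This is a simplified strict-majority model in Python, not the actual Cluster service logic: it ignores the vote adjustments made by Dynamic Quorum and Dynamic Witness, and the function name has_quorum is made up for illustration.

```python
def has_quorum(votes_online: int, votes_total: int) -> bool:
    """Simplified strict-majority rule: more than half of all votes must be online.

    Assumption: every voter (node or witness) counts as exactly one vote and
    votes are never adjusted dynamically.
    """
    return 2 * votes_online > votes_total


# Two cluster nodes plus one File Share Witness: three votes in total.
TOTAL_VOTES = 3

# One node rebooting, witness counted: surviving node + witness = 2 of 3 votes.
print(has_quorum(2, TOTAL_VOTES))  # True

# One node rebooting, witness vote not counted: surviving node alone = 1 of 3 votes.
print(has_quorum(1, TOTAL_VOTES))  # False
```

Under this model the surviving node together with the witness still holds 2 of 3 votes while the other node reboots, which is why the Event ID 1177 is unexpected.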
Questions:
Has anyone else experienced this behavior on WS 2025 clusters?
Could the stricter quorum handling in Functional Level 12 be the cause?
Any recommendations on how to prevent this (besides ignoring Event ID 1177)?
Thanks in advance for any insights!