Hello Ravinder, I am Henry and I want to share my insight about your issue.
The error message provides two pieces of information:
- "Cluster service on node HP-BASE-NODE1 did not reach the running state.": This means the ClusSvc (Cluster Service) on your new 2022 server was started by the wizard, but it failed before it could successfully join the cluster.
- "The error code is 0x5b4" and "This operation returned because the timeout period expired.": The error code 0x5b4 is ERROR_TIMEOUT. This confirms that the existing cluster nodes tried to communicate with the new node's cluster service, but they didn't get a successful response back in time.
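If you want to confirm the meaning of the code yourself, Windows can translate it. This is a minimal sketch, assuming it is run in Windows PowerShell 5 or later on any Windows machine:

```powershell
# 0x5b4 is hexadecimal; PowerShell stores it as the decimal value 1460.
$code = 0x5b4
$code

# Ask Windows for the message text that belongs to this Win32 error code.
[System.ComponentModel.Win32Exception]::new($code).Message

# The classic command-line equivalent takes the decimal value.
net helpmsg $code
```

Both lookups should return the same timeout text that the wizard displayed.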
The issue is not with the Domain Functional Level upgrade (which is a prerequisite). It is most likely related to networking, permissions, or node configuration.
You can work through the following troubleshooting steps:
- Networking and Firewall. The cluster service communicates over very specific ports. If these are blocked, the join will time out.
- Windows Firewall: Ensure the Windows Firewall on the new 2022 node (HP-BASE-NODE1) and the existing 2016 nodes has the "Failover Clustering" rules enabled for the correct network profile (Domain/Private).
- Physical/Network Firewalls: Check with your network team to ensure there are no hardware firewalls or network ACLs between your new server and the existing cluster nodes that could be blocking traffic.
- Required Ports. See https://learn.microsoft.com/en-us/troubleshoot/windows-server/networking/service-overview-and-network-port-requirements#cluster-service for the full list. The key ports for cluster communication are:
- Port 3343 (UDP and TCP): The Cluster Service port. TCP 3343 is required during a node join operation.
- Port 135 (TCP): RPC Endpoint Mapper.
- Ports 49152-65535 (TCP Dynamic): The default RPC dynamic port range.
- Connectivity Test: From an existing 2016 node, run this PowerShell command to test connectivity to the new node:
Test-NetConnection -ComputerName HP-BASE-NODE1 -Port 3343
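The TcpTestSucceeded value should be True. If it's False, you most likely have a firewall or network routing issue. Keep in mind that the port only answers while the Cluster service is listening on the target node, so it is worth repeating the test from the new node toward an existing node as well.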
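The networking checks above can be combined into one pass. This is only a sketch under a few assumptions: EXISTING-NODE1 is a placeholder for one of your real 2016 nodes, the commands are run from an elevated PowerShell session, and the firewall group name pattern matches the built-in clustering rules on your servers.

```powershell
# HP-BASE-NODE1 comes from the error message; EXISTING-NODE1 is a placeholder.
# Run this from an existing node and then again from the new node.
$targets = 'HP-BASE-NODE1', 'EXISTING-NODE1'

# Test the fixed TCP ports named above. Test-NetConnection only tests TCP,
# so UDP 3343 and the dynamic RPC range are not covered by this loop.
foreach ($target in $targets) {
    foreach ($port in 135, 3343) {
        Test-NetConnection -ComputerName $target -Port $port |
            Select-Object ComputerName, RemotePort, TcpTestSucceeded
    }
}

# List the local clustering firewall rules and whether they are enabled.
# The 'Failover Cluster*' pattern is an assumption; adjust it if your
# rules belong to a differently named group.
Get-NetFirewallRule |
    Where-Object { $_.DisplayGroup -like 'Failover Cluster*' } |
    Select-Object DisplayName, Enabled, Profile, Direction
```

A port that fails in one direction but succeeds in the other usually points at a host firewall on one side rather than at routing.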
- Active Directory and Permissions. When adding a node, the existing cluster's "Cluster Name Object" (CNO) in Active Directory must have sufficient permissions to modify the computer account for the new node.
- CNO Permissions:
- In "Active Directory Users and Computers," ensure "Advanced Features" is enabled under the View menu.
- Find the computer object for your cluster name (not the nodes). This is the CNO.
- Open the Organizational Unit (OU) where your server computer accounts (HP-BASE-NODE1, etc.) reside and go to its Properties -> Security tab.
- Ensure the CNO has "Create Computer Objects" and "Read all properties" permissions on that OU.
- DNS Resolution: Ensure all nodes (new and old) can resolve each other's names (and the cluster name) correctly via DNS, both forward and reverse. Use nslookup from each node to test this.
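As an alternative to running nslookup by hand, the forward and reverse lookups can be scripted. This is a rough sketch: every name except HP-BASE-NODE1 is a placeholder, the OU distinguished name at the end is purely hypothetical, and dsacls is only available where the AD DS management tools are installed.

```powershell
# HP-BASE-NODE1 is from the error message; the other names are placeholders
# for your existing nodes and your cluster name.
$names = 'HP-BASE-NODE1', 'EXISTING-NODE1', 'EXISTING-NODE2', 'CLUSTER-NAME'

foreach ($name in $names) {
    # Forward lookup: name -> IPv4 address(es).
    $forward = Resolve-DnsName -Name $name -Type A -ErrorAction SilentlyContinue
    if (-not $forward) {
        Write-Warning "Forward lookup failed for $name"
        continue
    }
    foreach ($record in $forward | Where-Object { $_.Type -eq 'A' }) {
        # Reverse lookup: IP address -> name (PTR record).
        $reverse = Resolve-DnsName -Name $record.IPAddress -Type PTR -ErrorAction SilentlyContinue
        [pscustomobject]@{
            Name      = $name
            IPAddress = $record.IPAddress
            PtrName   = ($reverse | Where-Object { $_.Type -eq 'PTR' }).NameHost -join ', '
        }
    }
}

# Optional: dump the ACL of the OU to look for the CNO's entries.
# The distinguished name below is hypothetical; replace it with your own OU.
dsacls 'OU=Servers,DC=example,DC=com'
```

An empty PtrName column means the reverse record is missing or could not be resolved from the node where the script ran.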
- Node Configuration and Validation
- Patching: Ensure your new Windows Server 2022 node is fully patched with the latest Windows Updates. Mismatched patch levels between nodes can sometimes cause communication issues.
- Antivirus/Security Software: Temporarily disable any third-party antivirus or security agent on HP-BASE-NODE1. These are notorious for interfering with cluster communication ports.
- Run Cluster Validation: Before you try to add the node again, run the validation test. From one of the existing 2016 nodes, open PowerShell as an Administrator and run:
Test-Cluster -Node "EXISTING-NODE1", "EXISTING-NODE2", "HP-BASE-NODE1"
This will generate a detailed HTML report. Review it carefully: it shows which checks are failing (e.g., "Network Communication," "Validate Active Directory Configuration," etc.) and is the key to solving the problem.
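If the full validation run takes too long, a narrower run plus a look at the clustering event log on the new node may be enough to find the failing check. This is a sketch only: the EXISTING-NODE names are placeholders, and the choice of the "Network" and "System Configuration" categories is my assumption about where the fault is most likely to show up.

```powershell
# Node names other than HP-BASE-NODE1 are placeholders.
$nodes = 'EXISTING-NODE1', 'EXISTING-NODE2', 'HP-BASE-NODE1'

# Run only two validation categories first; this leaves out the storage tests.
$report = Test-Cluster -Node $nodes -Include 'Network', 'System Configuration'

# Test-Cluster returns the report file; show where it was written.
$report.FullName

# Recent entries from the clustering operational log on the new node,
# which may show why the Cluster service did not reach the running state.
Get-WinEvent -ComputerName 'HP-BASE-NODE1' `
    -LogName 'Microsoft-Windows-FailoverClustering/Operational' `
    -MaxEvents 30 |
    Select-Object TimeCreated, Id, LevelDisplayName, Message
```

If the narrow run comes back clean, fall back to the full Test-Cluster command shown above.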
Hope one of these works for you.