VMSS nodes stuck while deallocating with NetworkingInternalOperation error during maintenance

Niels Witte 0 Reputation points
2025-07-25T09:15:39.3766667+00:00

We received a maintenance notification for our VMSS with tracking ID: 7SY2-VV0

Action Required: Urgent preventive repair may affect one or more of your virtual machines in westeurope

You are receiving this notice because you currently use Azure Virtual Machines.

Maintenance Summary: Azure has identified a degradation to a network device that is connected to one or more of your virtual machines that requires urgent repair to prevent unplanned failures. During this preventive repair event, Virtual Machines (VM) will be rebooted.

To avoid any disruption, we will attempt to migrate VMs to a server under a healthy switch. You can also take preventive actions on VMs by following the steps mentioned in the Recommended Action section of this notification. If the virtual machine isn’t moved by the platform and no action is taken before maintenance begins, each virtual machine listed in this notification may be unavailable for up to 15 minutes while it is being migrated to a healthy server. This will occur sometime between 7/29/2025 5:24:05 AM UTC until 8/2/2025 5:24:05 AM UTC in westeurope

Since we do not want to risk our nodes being unavailable for 15 minutes, we decided to perform the manual steps listed in the notification:

For virtual machine scale sets:

  1. Log into the Azure portal and navigate to the virtual machine scale sets.
  2. Select a virtual machine scale set and select 'Instances' from the left-side settings menu.
  3. Select the virtual machine you want to migrate.
  4. Select 'Deallocate' to stop and deallocate the virtual machine. Please ensure the virtual machine's status has transitioned to 'Stop (deallocated)'.
  5. Select 'Start' only after the previous change in status has been completed.

Our scale set consists of 5 nodes, so we stopped and deallocated one node of the scale set (in this case the last one). It has been stuck in the 'Failed' state ever since.

Provisioning failed
Error
ProvisioningState/failed/NetworkingInternalOperation
An unexpected error occured while processing the network profile of the VM. Please retry later.

We tried removing that node and replacing it with a new one, but all operations fail with a similar "NetworkingInternalOperation" error.

Some more background information: the cluster is hosted in West Europe, consists of 5 nodes of type Standard_B4ms, and runs a number of Service Fabric applications and services. The nodes are connected through a virtual network to other Azure services such as Azure SQL databases, storage accounts, Service Bus, and load balancers.

Can you please help us complete the migration to the healthy network environment?

Azure Virtual Machine Scale Sets
Azure compute resources that are used to create and manage groups of heterogeneous load-balanced virtual machines.

1 answer

  1. Niels Witte 0 Reputation points
    2025-07-28T21:28:14.21+00:00

    Completely shutting down the scale set allowed the deallocation to succeed, and starting the scale set again resolved the issue.

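    # Stop every instance in the scale set, then start them all again.
    # PowerShell parameters take a single dash (-VMScaleSetName, not --VMScaleSetName).
    Stop-AzVMSS -ResourceGroupName $myResourceGroup -VMScaleSetName $myVM
    Start-AzVMSS -ResourceGroupName $myResourceGroup -VMScaleSetName $myVM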
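
The notification asks that each instance reach the deallocated state before it is started again. A minimal sketch of how that could be checked from PowerShell follows. It assumes the Az.Compute module and a signed-in session. The helper names `Test-Deallocated` and `Get-VmssDeallocationReport` are hypothetical and not part of the original answer.

```powershell
# Hypothetical helper: returns $true when a list of instance-view status
# codes contains the deallocated power state.
function Test-Deallocated {
    param([string[]]$StatusCodes)
    return $StatusCodes -contains 'PowerState/deallocated'
}

# Hypothetical helper: reports, per instance, whether it is deallocated.
# Assumes Az.Compute is installed and Connect-AzAccount has been run.
function Get-VmssDeallocationReport {
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$VMScaleSetName
    )
    # List the instances, then fetch the instance view of each one.
    $instances = Get-AzVmssVM -ResourceGroupName $ResourceGroupName -VMScaleSetName $VMScaleSetName
    foreach ($vm in $instances) {
        $view = Get-AzVmssVM -ResourceGroupName $ResourceGroupName `
            -VMScaleSetName $VMScaleSetName -InstanceId $vm.InstanceId -InstanceView
        [pscustomobject]@{
            InstanceId  = $vm.InstanceId
            Deallocated = Test-Deallocated -StatusCodes $view.Statuses.Code
        }
    }
}
```

Calling `Get-VmssDeallocationReport -ResourceGroupName $myResourceGroup -VMScaleSetName $myVM` between the stop and start commands would list each instance with a `Deallocated` flag. Whether this would have caught the 'Failed' state described in the question is not something the thread confirms.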
    
