The Azure Kubernetes Service (AKS) Communication Manager streamlines notifications for all your AKS maintenance tasks by using the Azure Resource Notifications and Azure Resource Graph frameworks. It lets you closely monitor your upgrades by sending timely alerts on event triggers and outcomes. If maintenance fails, it notifies you of the reasons for the failure, which reduces the operational effort of observability and follow-up. To set up notifications for all types of autoupgrades that use maintenance windows, follow these steps.
Prerequisites
Configure your cluster for either the autoupgrade channel or the node autoupgrade channel.
Create a planned maintenance window for your autoupgrade configuration.
Note
Once set up, Communication Manager sends two advance notices: one week before maintenance starts and one day before maintenance starts. These notices are in addition to the timely alerts sent during the maintenance operation.
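As a rough illustration of the notice schedule described in this note, the following Python sketch computes the two advance-notice times for a given maintenance start. The function name `advance_notice_times` and the sample start time are hypothetical, and the sketch assumes the notices are sent at exactly the stated offsets; the service's actual delivery times may differ.

```python
from datetime import datetime, timedelta, timezone


def advance_notice_times(maintenance_start: datetime) -> dict[str, datetime]:
    """Return the two advance-notice times for a maintenance start time."""
    return {
        "one_week_before": maintenance_start - timedelta(weeks=1),
        "one_day_before": maintenance_start - timedelta(days=1),
    }


# Hypothetical maintenance window start (UTC), chosen only for illustration.
start = datetime(2025, 3, 15, 2, 0, tzinfo=timezone.utc)
notices = advance_notice_times(start)

print(notices["one_week_before"].isoformat())  # 2025-03-08T02:00:00+00:00
print(notices["one_day_before"].isoformat())   # 2025-03-14T02:00:00+00:00
```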
How to set up communication manager
Go to the resource, select Monitoring, select Alerts, and then select Alert Rules.
For the alert Condition, select Custom log search.
In the "Search query" box that opens, paste one of the following custom queries, and then select the "Review+Create" button.
Query for cluster auto upgrade notifications
containerserviceeventresources
| where type == "microsoft.containerservice/managedclusters/scheduledevents"
| where id contains "/subscriptions/<subid>/resourcegroups/<rgname>/providers/Microsoft.ContainerService/managedClusters/<clustername>"
| where properties has "eventStatus"
| extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
| extend status = substring(status, 0, indexof(status, ",") - 1)
| where status != ""
| where properties has "eventDetails"
| extend upgradeType = case(
    properties has "K8sVersionUpgrade",
    "K8sVersionUpgrade",
    properties has "NodeOSUpgrade",
    "NodeOSUpgrade",
    ""
)
| extend details = parse_json(tostring(properties.eventDetails))
| where properties has "lastUpdateTime"
| extend eventTime = substring(properties, indexof(properties, "lastUpdateTime") + strlen("lastUpdateTime") + 3, 50)
| extend eventTime = substring(eventTime, 0, indexof(eventTime, ",") - 1)
| extend eventTime = todatetime(tostring(eventTime))
| where eventTime >= ago(2h)
| where upgradeType == "K8sVersionUpgrade"
| project
    eventTime,
    upgradeType,
    status,
    properties,
    name,
    details
| order by eventTime asc
Query for Node OS auto upgrade notifications
containerserviceeventresources
| where type == "microsoft.containerservice/managedclusters/scheduledevents"
| where id contains "/subscriptions/<subid>/resourcegroups/<rgname>/providers/Microsoft.ContainerService/managedClusters/<clustername>"
| where properties has "eventStatus"
| extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
| extend status = substring(status, 0, indexof(status, ",") - 1)
| where status != ""
| where properties has "eventDetails"
| extend upgradeType = case(
    properties has "K8sVersionUpgrade",
    "K8sVersionUpgrade",
    properties has "NodeOSUpgrade",
    "NodeOSUpgrade",
    ""
)
| extend details = parse_json(tostring(properties.eventDetails))
| where properties has "lastUpdateTime"
| extend eventTime = substring(properties, indexof(properties, "lastUpdateTime") + strlen("lastUpdateTime") + 3, 50)
| extend eventTime = substring(eventTime, 0, indexof(eventTime, ",") - 1)
| extend eventTime = todatetime(tostring(eventTime))
| where eventTime >= ago(2h)
| where upgradeType == "NodeOSUpgrade"
| project
    eventTime,
    upgradeType,
    status,
    properties,
    name,
    details
| order by eventTime asc
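Both queries pull `status` and `eventTime` out of `properties` with a pair of `substring`/`indexof` calls rather than by property access. The Python sketch below mirrors that arithmetic to show what each step does. It is an illustration only: the helper `extract_value` and the sample `properties` string are hypothetical, and the sketch assumes `properties` is serialized compactly (no spaces around the colon) and that the value is followed by a comma.

```python
def extract_value(properties: str, key: str, window: int = 50) -> str:
    """Mimic the query's substring/indexof pair for a single key."""
    key_pos = properties.find(key)  # indexof(properties, key)
    if key_pos == -1:
        return ""
    # Skip the key plus three characters: closing quote, colon, opening quote.
    value_start = key_pos + len(key) + 3
    # substring(properties, value_start, 50): keep a 50-character window.
    chunk = properties[value_start:value_start + window]
    comma_pos = chunk.find(",")  # indexof(chunk, ",")
    if comma_pos < 1:
        # No comma in the window: this sketch simply returns an empty string.
        return ""
    # substring(chunk, 0, comma_pos - 1): drop the closing quote and the comma.
    return chunk[:comma_pos - 1]


# Hypothetical serialized properties value, for illustration only.
sample = (
    '{"eventStatus":"Scheduled",'
    '"lastUpdateTime":"2025-03-08T02:00:00Z",'
    '"eventDetails":"placeholder"}'
)

print(extract_value(sample, "eventStatus"))     # Scheduled
print(extract_value(sample, "lastUpdateTime"))  # 2025-03-08T02:00:00Z
```

The fixed 50-character window means a value is only recovered when its trailing comma falls inside that window, which is why the queries follow the extraction with `where status != ""`.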
- Configure the alert conditions with the following settings:
- Measurement: Select "Table rows"
- Aggregation: Select "Count"
- Aggregation granularity: Select "30 minutes"
- Threshold value: Keep at 0
- Split by dimensions: Select "status" and choose "Include all future values"
When selecting "status" in the Split by dimensions dropdown, the available values are: Scheduled, Started, Completed, Canceled, and Failed.
Note
These status values will only appear if your cluster has previously executed auto upgrade operations. For new clusters or clusters that haven't undergone auto upgrades yet, the dropdown may appear empty or show no available dimensions. Once your cluster performs its first auto upgrade, these status values will become available for selection.
- Make sure that an action group with the correct email address exists so that you can receive the notifications.
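The alert condition settings above (count of table rows, split by `status`, threshold 0) can be pictured with the following Python sketch. It is not how Azure Monitor is implemented; the function `fired_dimensions` and the sample rows are hypothetical, and the comparison operator is assumed to be "greater than", which the settings list does not state.

```python
from collections import Counter

THRESHOLD = 0  # "Threshold value: Keep at 0"


def fired_dimensions(rows: list[dict], threshold: int = THRESHOLD) -> dict[str, int]:
    """Count rows per status and keep the statuses whose count exceeds the threshold."""
    counts = Counter(row["status"] for row in rows)
    # Assumption: the alert fires when the row count is greater than the threshold.
    return {status: n for status, n in counts.items() if n > threshold}


# Hypothetical rows returned by the query within one 30-minute evaluation window.
window_rows = [
    {"status": "Started", "upgradeType": "NodeOSUpgrade"},
    {"status": "Completed", "upgradeType": "NodeOSUpgrade"},
]

print(fired_dimensions(window_rows))  # {'Started': 1, 'Completed': 1}
print(fired_dimensions([]))           # {}
```

Under these assumptions, any single event row for a status is enough to trigger a notification for that status, and an empty result produces none.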
Assign a system-assigned managed identity: After you create the alert rule, assign a managed identity so that it can access the necessary resources. This step is performed after the alert rule is created, not during initial setup. To assign a managed identity:
- In the Azure portal, go to Monitor > Alerts > Alert rules, then select your alert rule.
- In the alert rule pane, under Settings, select Identity.
- Set System assigned managed identity to On.
- Click Save to enable the managed identity for the alert rule.
Tip
If you don't see the Identity option, make sure your alert rule has been created and you have the necessary permissions. Assigning the managed identity is always a separate step after alert rule creation.
Make sure to assign the appropriate Reader roles.
In the alert rule, go to Settings > Identity > System assigned managed identity > Azure role assignments > Add role assignment.
Choose the Reader role and assign it to the resource group. Repeat "Add role assignment" for the subscription if needed.
Set up Communication Manager
Go to the resource, select Monitoring, select Alerts, and then select Alert Rules.
On the Condition tab, for Signal name, select Custom log search.
In the Search query box, paste one of the following custom queries and then select the Review+Create button.
The following query is for cluster autoupgrade notifications:
arg("").containerserviceeventresources
| where type == "microsoft.containerservice/managedclusters/scheduledevents"
| where id contains "/subscriptions/<subid>/resourcegroups/<rgname>/providers/Microsoft.ContainerService/managedClusters/<clustername>"
| where properties has "eventStatus"
| extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
| extend status = substring(status, 0, indexof(status, ",") - 1)
| where status != ""
| where properties has "eventDetails"
| extend upgradeType = case(
    properties has "K8sVersionUpgrade",
    "K8sVersionUpgrade",
    properties has "NodeOSUpgrade",
    "NodeOSUpgrade",
    ""
)
| extend details = parse_json(tostring(properties.eventDetails))
| where properties has "lastUpdateTime"
| extend eventTime = substring(properties, indexof(properties, "lastUpdateTime") + strlen("lastUpdateTime") + 3, 50)
| extend eventTime = substring(eventTime, 0, indexof(eventTime, ",") - 1)
| extend eventTime = todatetime(tostring(eventTime))
| where eventTime >= ago(2h)
| where upgradeType == "K8sVersionUpgrade"
| project
    eventTime,
    upgradeType,
    status,
    properties,
    name,
    details
| order by eventTime asc
The following query is for Node OS autoupgrade notifications:
arg("").containerserviceeventresources
| where type == "microsoft.containerservice/managedclusters/scheduledevents"
| where id contains "/subscriptions/<subid>/resourcegroups/<rgname>/providers/Microsoft.ContainerService/managedClusters/<clustername>"
| where properties has "eventStatus"
| extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
| extend status = substring(status, 0, indexof(status, ",") - 1)
| where status != ""
| where properties has "eventDetails"
| extend upgradeType = case(
    properties has "K8sVersionUpgrade",
    "K8sVersionUpgrade",
    properties has "NodeOSUpgrade",
    "NodeOSUpgrade",
    ""
)
| extend details = parse_json(tostring(properties.eventDetails))
| where properties has "lastUpdateTime"
| extend eventTime = substring(properties, indexof(properties, "lastUpdateTime") + strlen("lastUpdateTime") + 3, 50)
| extend eventTime = substring(eventTime, 0, indexof(eventTime, ",") - 1)
| extend eventTime = todatetime(tostring(eventTime))
| where eventTime >= ago(2h)
| where upgradeType == "NodeOSUpgrade"
| project
    eventTime,
    upgradeType,
    status,
    properties,
    name,
    details
| order by eventTime asc
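The two queries above differ only in the final `upgradeType` filter, and both need the `<subid>`, `<rgname>`, and `<clustername>` placeholders replaced before they are pasted into the alert rule. The following Python sketch shows one way to fill in those values. The function `build_query` and the sample identifiers are hypothetical; the query text itself is copied from the queries above.

```python
QUERY_TEMPLATE = '''arg("").containerserviceeventresources
| where type == "microsoft.containerservice/managedclusters/scheduledevents"
| where id contains "{cluster_id}"
| where properties has "eventStatus"
| extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
| extend status = substring(status, 0, indexof(status, ",") - 1)
| where status != ""
| where properties has "eventDetails"
| extend upgradeType = case(
    properties has "K8sVersionUpgrade",
    "K8sVersionUpgrade",
    properties has "NodeOSUpgrade",
    "NodeOSUpgrade",
    ""
)
| extend details = parse_json(tostring(properties.eventDetails))
| where properties has "lastUpdateTime"
| extend eventTime = substring(properties, indexof(properties, "lastUpdateTime") + strlen("lastUpdateTime") + 3, 50)
| extend eventTime = substring(eventTime, 0, indexof(eventTime, ",") - 1)
| extend eventTime = todatetime(tostring(eventTime))
| where eventTime >= ago(2h)
| where upgradeType == "{upgrade_type}"
| project eventTime, upgradeType, status, properties, name, details
| order by eventTime asc'''

# The two upgrade types that the queries in this article filter on.
UPGRADE_TYPES = ("K8sVersionUpgrade", "NodeOSUpgrade")


def build_query(sub_id: str, rg_name: str, cluster_name: str, upgrade_type: str) -> str:
    """Fill the placeholders in the notification query for one cluster."""
    if upgrade_type not in UPGRADE_TYPES:
        raise ValueError(f"upgrade_type must be one of {UPGRADE_TYPES}")
    cluster_id = (
        f"/subscriptions/{sub_id}/resourcegroups/{rg_name}"
        f"/providers/Microsoft.ContainerService/managedClusters/{cluster_name}"
    )
    return QUERY_TEMPLATE.format(cluster_id=cluster_id, upgrade_type=upgrade_type)


# Hypothetical identifiers, for illustration only.
query = build_query(
    "00000000-0000-0000-0000-000000000000", "my-rg", "my-aks", "NodeOSUpgrade"
)
print(query)
```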
The interval should be 30 minutes, and the threshold should be 1.
Make sure that an action group with the correct email address exists, so that you can receive the notifications.
Make sure to assign the Reader role on the resource group and on the subscription to the managed identity of the log search alert rule.
Go to the alert rule: Settings > Identity > System assigned managed identity > Azure role assignments > Add role assignment.
Select the Reader role and assign it to the resource group. Repeat Add role assignment for the subscription.
Verification
Wait for the autoupgrader to start upgrading the cluster. Then verify that notices arrive promptly at the email address configured to receive them.
Check the Azure Resource Graph database for the scheduled notification record. Each scheduled event notification should be listed as one record in the containerserviceeventresources table.
Related content
- See how you can set up a planned maintenance window for your upgrades.
- See how you can optimize your upgrades.