For Platform Engineers

Monitor Container Health with Azure Metrics

On-call engineers are racing against time: pod restarts, memory leaks, CPU spikes, and networking anomalies all need to be caught fast. The data is already in ContainerLogV2 and AzureMetrics, but correlating signals across tables and time windows consumes valuable minutes.

Below are real-world KQL patterns for infrastructure monitoring that cut through the noise and surface anomalies your team needs to act on immediately.

Track Pod Restart Trends

Count restart-related log lines per pod, container, and image tag over the past seven days to spot unstable workloads and the image versions behind them.

ContainerLogV2

KQL Query

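ContainerLogV2
| where TimeGenerated > ago(7d)
// LogMessage is a dynamic column, so cast it to a string before matching.
| where tostring(LogMessage) has_any ("restart", "back-off", "crash")
// imageTag is read from KubernetesMetadata, which is populated only when Kubernetes metadata collection is enabled.
| extend imageTag = tostring(KubernetesMetadata.imageTag)
| summarize RestartCount = count(), LatestRestart = max(TimeGenerated) by PodName, ContainerName, imageTag
| order by RestartCount desc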
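
To see how restart activity changes over time rather than as a single 7-day total, the same filter can be bucketed by day. This is a minimal sketch, not a definitive restart counter: it counts log lines that mention a restart, and the column name RestartSignals is an illustrative choice.

```kql
// Sketch: daily count of restart-related log lines per pod.
// Assumes the standard ContainerLogV2 columns TimeGenerated, LogMessage, and PodName.
ContainerLogV2
| where TimeGenerated > ago(7d)
| where tostring(LogMessage) has_any ("restart", "back-off", "crash")
| summarize RestartSignals = count() by Day = bin(TimeGenerated, 1d), PodName
| order by PodName asc, Day asc
```

A pod whose daily count keeps climbing is a stronger sign of instability than one with a single noisy day.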

Correlate CPU and Memory Spikes

Find nodes under simultaneous CPU and memory pressure by counting the 5-minute windows in which average CPU exceeds 80% and available memory falls below roughly 1 GB.

AzureMetrics

KQL Query

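AzureMetrics
| where TimeGenerated > ago(1d)
| where MetricName in ("Percentage CPU", "Available Memory Bytes")
// Build one column per metric with avgif() so both can be compared in the same row.
// AzureMetrics identifies the source by Resource and stores per-interval averages in Average.
| summarize AvgCpu = avgif(Average, MetricName == "Percentage CPU"),
            AvgAvailableMemory = avgif(Average, MetricName == "Available Memory Bytes")
    by bin(TimeGenerated, 5m), Resource
| where AvgCpu > 80 and AvgAvailableMemory < 1000000000
| summarize SpikeCount = count() by Resource
| order by SpikeCount desc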
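
A per-resource total hides when the pressure occurred. The sketch below groups the same 5-minute pressure windows by hour, which can help line spikes up with deployments or traffic peaks. It assumes the standard AzureMetrics columns Resource and Average; the names Window, Hour, and SpikeWindows are illustrative.

```kql
// Sketch: count 5-minute pressure windows per hour to see when spikes cluster.
// Thresholds (80% CPU, 1,000,000,000 bytes available memory) are the ones used in this section.
AzureMetrics
| where TimeGenerated > ago(1d)
| where MetricName in ("Percentage CPU", "Available Memory Bytes")
| summarize AvgCpu = avgif(Average, MetricName == "Percentage CPU"),
            AvgAvailableMemory = avgif(Average, MetricName == "Available Memory Bytes")
    by Window = bin(TimeGenerated, 5m), Resource
| where AvgCpu > 80 and AvgAvailableMemory < 1000000000
| summarize SpikeWindows = count() by Hour = bin(Window, 1h), Resource
| order by Hour asc
```

Each hour holds at most 12 windows per resource, so a value near 12 indicates sustained pressure rather than a brief spike.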

Stay on top of your infrastructure.

KQL Remix translates your monitoring intent into production-ready queries instantly. Spend less time writing KQL, more time keeping systems healthy.
