Hi jasmeet ghai,
For the high CPU load you are seeing, I recommend the article "Debug high CPU usage in .NET Core": https://learn.microsoft.com/en-us/dotnet/core/diagnostics/debug-highcpu?tabs=windows
Here are steps you can follow to diagnose your problem:
Step 1: Initial Setup and Process Identification
First, ensure you have the necessary tools and identify your application's process.
- Install .NET Diagnostic Tools: If you don't have them already, install dotnet-counters and dotnet-trace globally. These are command-line tools that work across Windows, Linux, and macOS.
dotnet tool install --global dotnet-counters
dotnet tool install --global dotnet-trace
- Start Your Application: Ensure your ASP.NET Core Web API is running.
- Identify the Process ID (PID): You need the Process ID of your running .NET application.
dotnet-trace ps
This command lists active .NET processes. Locate your web API's process and note its PID.
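If several .NET processes are running, a small filter can pull the PID out of that list. This is only a sketch: it assumes dotnet-trace ps prints the PID in the first column and the process name in the second, and MyWebApi is a hypothetical process name that you would replace with your own.

```shell
# Print the PID of every listed process whose name column matches exactly.
# Assumption: input has the PID in column 1 and the process name in column 2.
pids_by_name() {
  awk -v name="$1" '$2 == name { print $1 }'
}

# Usage (MyWebApi is a hypothetical process name):
#   dotnet-trace ps | pids_by_name MyWebApi
```

If your app is started with "dotnet MyWebApi.dll", the name column may show dotnet instead, so check the full list once before relying on the filter.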
Step 2: Real-Time CPU Monitoring
Use dotnet-counters to observe CPU usage live and confirm the spike when you access the problematic page.
- Monitor CPU Usage: Execute the following command, replacing <YOUR_PID> with the actual PID you found.
dotnet-counters monitor --refresh-interval 1 --process-id <YOUR_PID> --counters System.Runtime[cpu-usage]
- --refresh-interval 1: Updates the display every second.
- --counters System.Runtime[cpu-usage]: Restricts the output to the CPU usage counter.
- Action: While dotnet-counters is running, open your browser and navigate to the problematic "mpn" page. Observe the "CPU Usage (%)" counter. You should see a noticeable jump.
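If a single page load is too short to read the counter, you can request the page repeatedly from a second terminal to keep the load up. A minimal sketch, assuming curl is installed; the URL and the request count are placeholders, not values taken from your app.

```shell
# Request a URL a fixed number of times, discarding the response bodies.
# Arguments: URL, then an optional request count (defaults to 20).
hit_page() {
  url=$1
  count=${2:-20}
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null "$url"
    i=$((i + 1))
  done
}

# Usage (hypothetical local address for the "mpn" page):
#   hit_page "http://localhost:5000/mpn" 50
```

Requests are sent one after another, so this reproduces the load of one busy user rather than many concurrent ones.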
Step 3: Collect a CPU Usage Trace
This is the core diagnostic step, using dotnet-trace to capture a detailed profile of your application's execution.
- Start Trace Collection: Execute the command to begin tracing. Provide a meaningful output filename.
dotnet-trace collect --process-id <YOUR_PID> --providers Microsoft-DotNETCore-SampleProfiler --output <OUTPUT_FILE_NAME>.nettrace
- --providers Microsoft-DotNETCore-SampleProfiler: This is the provider for CPU sampling. It samples managed call stacks at a fixed interval (roughly every millisecond by default), so you can see which code is executing.
- Action: Immediately after running this command, access your "mpn" page in the browser to trigger the high CPU scenario. Let the page load and the CPU spike for about 10-30 seconds (the duration depends on how long the problem persists). Then press Ctrl+C in the terminal where dotnet-trace is running to stop the collection.
- Output: A .nettrace file will be generated in the current directory. This file contains the collected performance data.
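If you would rather not time Ctrl+C by hand, dotnet-trace collect also accepts a --duration option in dd:hh:mm:ss format. The wrapper below is a sketch, not part of the tool: the function name, the label argument, and the DRY_RUN switch are my own additions for illustration.

```shell
# Collect a fixed-length CPU sampling trace for one process.
# Arguments: PID, output label, optional length in seconds (default 30, must be below 60).
# Set DRY_RUN=1 to print the command instead of running it.
collect_cpu_trace() {
  pid=$1
  label=$2
  seconds=${3:-30}
  # --duration expects dd:hh:mm:ss
  duration=$(printf '00:00:00:%02d' "$seconds")
  set -- dotnet-trace collect --process-id "$pid" \
    --providers Microsoft-DotNETCore-SampleProfiler \
    --duration "$duration" --output "$label.nettrace"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage (1234 is a placeholder PID):
#   collect_cpu_trace 1234 mpn 20
```

Using the page name as the label gives you mpn.nettrace and mpn1.nettrace, which keeps the two traces easy to tell apart later.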
Step 4: Analyze the Performance Trace
The .nettrace file is not directly human-readable. You'll need a specialized tool to visualize and analyze the data.
Analyze with Visual Studio Performance Profiler (Windows): This is the most user-friendly way for Windows developers.
- Open Visual Studio.
- Go to Debug > Performance Profiler (or press Alt+F2).
- In the Performance Profiler window, click "Open Report..." and select the .nettrace file you generated.
- Focus on the "CPU Usage" report:
- Call Tree: Explore the hierarchical call tree to see which functions are consuming time, including their children.
- Hot Path: Visual Studio often highlights the "hot path" – the sequence of calls that consumed the most CPU.
- Functions List: Sort by "Total CPU" or "Self CPU" to identify the most expensive individual methods. Look for methods unique to "mpn" or those showing significantly higher CPU time than expected.
Step 5: Compare and Identify the Bottleneck
The crucial step for your scenario is comparing the problematic page's trace with the well-performing copy.
- Repeat for "mpn1": Perform Steps 2-4 again, but this time, access "mpn1" to collect its performance trace.
- Side-by-Side Analysis: Load both the "mpn" trace and the "mpn1" trace in your chosen analysis tool (Visual Studio or PerfView).
- Key Comparison Points:
- Call Stack Differences: Are there entirely new or different call stacks present in "mpn" that are absent or minor in "mpn1"?
- Method Time Differences: For shared methods, does a particular method consume significantly more "Self CPU" or "Total CPU" time in "mpn" compared to "mpn1"? This often points to inefficient loops, larger data processing, or more complex logic within that method.
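- External Calls: Is "mpn" making more frequent calls to databases, external APIs, or the file system than "mpn1"? Time spent waiting on I/O does not itself consume CPU, but the work around each call (serialization, mapping, processing larger result sets) does, and that work will show up in the trace.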
- Garbage Collection (GC): While less common for pure CPU spikes (unless it's an extreme allocation rate), check if GC activity is disproportionately higher in "mpn," indicating excessive object creation.
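If you copy the per-function numbers from both reports into text files, a short script can rank the differences for you. This is a sketch under assumptions of my own: each file is tab-separated with the function name in the first column and self CPU in milliseconds (whole numbers) in the second, with no header row. Your analysis tool may not export exactly this layout.

```shell
# Print functions that used more self CPU in the slow page than in the baseline,
# largest increase first. Output columns: increase in ms, function name.
# Arguments: slow-page file (mpn), baseline file (mpn1).
compare_cpu() {
  awk -F '\t' '
    NR == FNR { base[$1] = $2; next }   # first file read: the baseline
    { slow[$1] = $2 }                   # second file read: the slow page
    END {
      for (f in slow) {
        delta = slow[f] - base[f]       # functions missing from the baseline count as 0
        if (delta > 0) printf "%d\t%s\n", delta, f
      }
    }
  ' "$2" "$1" | sort -k1,1nr
}

# Usage (hypothetical file names):
#   compare_cpu mpn.tsv mpn1.tsv
```

Functions that appear only in the "mpn" file are listed with their full cost, which makes code paths unique to the slow page easy to spot.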
Working through these steps and comparing the "mpn" and "mpn1" traces should narrow the high CPU usage down to specific methods or external interactions in your code.