Are you wondering why your application is taking so long to run? Well, did you know in Visual Studio, there's a tool that can help you investigate the issue? Learn more on this episode of Visual Studio Toolbox. Hey, everyone. Welcome to Visual Studio Toolbox. I'm your host, Leslie Richardson, and today I'm joined by Esteban Herrera, who is a PM on the Visual Studio profiling team. Hey, Esteban. How's it going? It's going well. Glad to have you back. Thanks. Glad to be here. Sweet. So this is a larger part of our overarching performance profiling series, and today we're actually going to kick it off and go in-depth on every profiling tool that you should be using in Visual Studio. So what are we talking about today? Today we're going to do a deep dive into the CPU usage tool. A reminder, you can reach these tools by hitting that Alt-F2 shortcut on your keyboard or going to the Debug top menu and selecting Performance Profiler. Sounds good. Sweet. So what is the CPU usage tool? It sounds important. Yeah. So let me just jump right in. The CPU usage tool is one of the many diagnostic tools that we offer in the suite of tools. It's one of the most powerful, and the easiest way to talk about it is to lay a little bit of groundwork first by answering some questions. The first is: what does it actually measure? It measures two things specifically. One is the percentage of CPU utilization over time. That's a percentage of your total CPU usage, the same thing that comes up when you look at your task manager. After you're done recording a trace, it'll show up in the graph that you see on your screen right now. The interesting thing about this view is you can see peaks and valleys of CPU utilization and hopefully narrow in on an interesting time during your trace where your CPU was either spiking when you weren't expecting it, or maybe it was in a valley or plateauing when you were expecting it to be working harder. 
So it's just a really easy way to quickly narrow in on what part of your program you want to be looking at. The next thing it measures is how many times we saw a call to a function on the call stack over time. For this explanation, we're going to imagine a really simple program like you see on your screen now, where a main function calls function A, function A calls functions B and D, and function B ends up calling function C. A pretty straightforward execution of code. Now, if you can imagine the call stack during the execution of this code, if you were to sample it at different moments in time, that first sample would have just your main function. That's that first tick on the graph. Then when it calls function A, you would get a frame on the call stack on top of main that says function A, and so on and so forth as the program executes. What we do is we record two things. One is how many times total did we see a certain function over that sample? So in this case, main was at the bottom of every single call stack in this sample. This is way oversimplified of course, but you see it every time, so that's a total of eight times. Then we also count how many times we see a function by itself, meaning it's at the top of the call stack. So function A, for example, we see it seven times total, but only at the second, sixth, and eighth intervals do you see function A at the top of the call stack by itself. So we keep track of that number as well. Now, because we know how often we're gathering these samples, by default it's a thousand times per second, by seeing how many times we saw a certain frame on the call stack and multiplying it by the sample rate, we can estimate that function A took seven total milliseconds of CPU time in this case, and for three of those milliseconds, function A was the only thing executing at that time. 
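To make that sampling math concrete, here's a small Python sketch (hypothetical, not part of the episode's demo) that counts "total" and "self" appearances of each function across sampled call stacks for the slide's example program, then converts those counts into estimated CPU time using the sample rate:

```python
from collections import Counter

# Hypothetical sampled call stacks for the slide's example program,
# bottom of stack first. Eight samples, taken at the default rate of
# 1000 samples per second, i.e. one sample per millisecond.
samples = [
    ["main"],
    ["main", "A"],
    ["main", "A", "B"],
    ["main", "A", "B", "C"],
    ["main", "A", "B", "C"],
    ["main", "A"],
    ["main", "A", "D"],
    ["main", "A"],
]

SAMPLE_RATE_HZ = 1000  # the default; configurable via the gear icon

total = Counter()  # "Total": frame seen anywhere on the stack
self_ = Counter()  # "Self": frame at the top of the stack
for stack in samples:
    for frame in set(stack):  # count each frame once per sample
        total[frame] += 1
    self_[stack[-1]] += 1  # top of the stack gets the "self" credit

ms_per_sample = 1000 / SAMPLE_RATE_HZ
print(f"A: total ≈ {total['A'] * ms_per_sample:.0f} ms, "
      f"self ≈ {self_['A'] * ms_per_sample:.0f} ms")
# → A: total ≈ 7 ms, self ≈ 3 ms, matching the walkthrough above
```

Function A appears in seven of the eight stacks and sits on top in three of them (the second, sixth, and eighth samples), which at one sample per millisecond gives the 7 ms total / 3 ms self estimate described above.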
So with the combination of those two things, we should be able to narrow in on what parts of your program and what specific functions your program is spending a lot of time or work on. One quick thing is we can actually customize that sampling rate. Like I mentioned, by default it's a thousand times per second, but there's a little gear icon, which I'll point out when I switch over to Visual Studio, that allows you to change the sampling rate to any custom rate. So with that, unless you have any questions, we can just jump into some code. Yeah, that's cool. So this is one of those profiling tools that's technically available in the diagnostics hub window, which we talked about in an earlier part, right? Yeah, that's true. The CPU usage graph, you'll actually see it's on by default when you're debugging. I know some people tend to hide that, but again, it's a really useful way to narrow in on interesting times during your program's execution. I agree. I close it a lot less often these days. Yeah, I mean, it really depends on your scenario. If you're doing a performance investigation, I think that's a great place to start. As we mentioned in previous episodes, this tool can be run in parallel with a bunch of other tools in the diagnostics suite. So sometimes you want to see how CPU and your database are behaving during a certain time, and a certain CPU spike might point to an interesting time in your database or vice versa. Again, it's all about how quickly we can get you to the spot in code where you might be able to make some changes to make your program more performant. Sounds good. Well, I'd love to see it in action. Yeah. So for this demo, I just wrote up a really simple application where I start 10 timers, each in its own thread, that are just timing from one to 10 seconds. Ten timers? Yeah, that's a lot of timers, and this is purposely bad code just to kind of demonstrate high CPU usage. I like that it knows it's bad. 
I'm a bad timer. Exactly. So we'll just jump into how to launch the tool. I'm going to press Alt-F2, and you'll see this window where you can select what tools you're going to run during this trace. So I'm going to make sure that the CPU usage tool is selected. And here's that gear icon that I was talking about earlier, where you can change the sampling rate of the tool. So by bumping it to a higher sample rate, is the cost potentially how long it's going to take to get all that info, or what's the tradeoff? The tradeoff is the size of the trace. It's just a lot more data. I mean, if you change it from a thousand samples per second to 2,000, you're doubling the amount of data that we're gathering. The overhead shouldn't be too high during the execution of your program, but when the trace ends and we're trying to build all these call stacks, it might take a little longer. But you might need that granularity. We've found that a thousand samples a second is almost always enough resolution to figure out what's going on, but as you might see in this trace, sometimes it isn't. I'll point out where you might want more granularity in the tool if it comes up. Gotcha. That makes sense. So to launch the tool, one other thing to point out is you'll see this warning here. When you try to start any profiling tool and you're in debug mode, it'll warn you about that and recommend that you switch to release mode. And the reason for that is when you're running in debug mode, we incur some additional overhead in order to make the debugger and some of the profiling tools have more information available to them and be able to step forward and back and do some of those clever things. But that's not how it runs in production. So to get an accurate performance trace, you really want to make sure that you're running in release mode. And so when I switch to release mode, you'll notice that warning changes. 
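For reference, the "purposely bad" timer demo could look something like this sketch. This is a hypothetical Python analogue of the C# app shown on screen, not the actual demo code, with the 1-to-10-second timers scaled way down so it finishes quickly. The key point is that each thread busy-waits instead of sleeping, which is exactly the kind of wasted work that shows up as a CPU spike in the trace:

```python
import threading
import time

def bad_timer(seconds):
    """Purposely bad: busy-waits instead of sleeping, so the thread
    burns CPU for the whole duration instead of yielding it."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass  # spin; this loop is what the profiler would flag as hot

# The on-screen demo starts 10 timers, each in its own thread, timing
# 1 to 10 seconds; the durations here are scaled down (n/100 seconds)
# so this sketch completes in well under a second.
threads = [threading.Thread(target=bad_timer, args=(n / 100,))
           for n in range(1, 11)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all of the timers to finish, like the demo does
print("all timers finished")
```

A fixed version would simply call `time.sleep(seconds)` instead of spinning, letting the OS wake the thread up, which is why the spin loop dominates a CPU trace while a sleep barely registers.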
I'll make sure that my startup project is set to this demo, and then I just hit start. So when I hit start, my program is just going to create 10 timers and wait for all of them to finish. The whole time, the CPU usage tool will be sampling the call stacks, and when it's over, you'll see it build a report on that. So we're running the program. It is nice that it gives you the warning about changing to release mode, though I would definitely be that person who forgets, and then I'd be sitting through the whole profiling run at like 4,000 samples a second. That's not a short sample, you know. Yeah, it's all right. And we land on this, what we call the summary page. And you'll notice a few things right away, quite a lot of information. There's that CPU graph, and you see it uses a ton of CPU here at the beginning, and then as timers start finishing, those resources are winding down. And you'll notice as I highlight different time selections on this graph, the things on the bottom are updating. I am really filtering the data to just the things that happened between these two timestamps. So again, if you see a really big spike, perhaps like right here, that's interesting, then I can narrow in on that and see information that relates just to that time period. That's pretty neat. And yep, I'm also digging the pie chart on the side there. That always makes things look more official, but it looks like it's breaking down where the usage is being spent. Yeah, well, I love a good pie chart. What this pie chart does is, when we know the module that a function is in, we know what that module is related to. So for example, if it's something having to do with I/O, or something that interacts with the kernel or with the file system, we tag each of those calls. And then we can give you the top five categories of how your time is spent in your program. So for some more complex things, you might see that networking is taking up a third of your time, or I/O 
is taking up three fourths, in this example. And so that's another way that you might be able to focus your work to get a more performant application. One other thing, just going back to this timeline real quick, is when you run another tool in conjunction with the CPU usage tool, for example that database tool, if you highlight a time selection on this timeline, it will also filter the other tools. So again, that's a very useful way to quickly get to interesting spots in your code. Cool. So if you had a database tool tab open as well, then it would just automatically pivot over to that interval. Exactly. In that case, it would tell you what database queries were happening during that time. Sweet. Yeah, that saves a lot of time. It's not fun going back and forth and changing a slider. In general, with profiling investigations, we get so much data that it's all about narrowing in on the important data as quickly as possible. And that's what this summary page is attempting to do by showing you the graph, the pie chart, and then these two tables that I'm about to go through. The first one is the top functions. So if you recall that PowerPoint slide, that list of functions where it tells you how many times each one happened in total over a certain amount of time and how many times that function was by itself, these are those same two numbers right here. And we'll show you the top five for both of those. So for example, in this time sample, this external call to System.DateTime.get_Now happened 99.5% of the time, and it was by itself almost every single time. So that's a pretty good indication that whatever is calling this system function is probably a place to go tweak the performance of your code. Great. And then below that, you'll see the hot path. And the hot path, similarly, tries to get you to your most common function call and the most common path by which you arrive there. So code isn't always as simple as this example. 
There are several ways that you can arrive at a certain method call. But what we try to do is identify the one that is the most common. And that's defined by: if the function that we're looking at is responsible for at least roughly a third of its caller function's time, then we tag it as a hot path, and we just drill down from there. And so again, this might be a way to see the part of your code that's most responsible for work or time. Cool. And clicking one will redirect you to that view? Yeah. So depending on which one you click, if you click on something in this functions table, it'll take you to something called the functions view, which I'll show you right now. The functions view is, again, that same table that I talked about in the PowerPoint. It's just a flat list. It doesn't tell you exactly how you arrived somewhere, but it tells you that in this CPUDemo.exe, these are all the functions that were called. Self is how many times a function showed up by itself, and total is how many times it showed up overall, and it'll give you a percentage. So you can sort by these columns to find the function that you find interesting. And then we also tell you what module the function is in. So if you have a complex project, that's another way that you can focus your investigation. Cool. Now going back to that summary page, if you click on a hot path, or another way to get there is to change the view to the call tree, you will get to a call tree view, which when I click on a function, will tell me exactly how I arrived at that function. And you'll notice that this time it's not a flat table. You'll actually be able to expand, and you can see everything that happened during CPUDemo.exe's execution. It looks like it called some native code, which then calls some external code. And then it looks like this is the first call to my actual code. We can see how many times the call frame occurred in that sample. 
And again, you can follow that path down to a call that you might find interesting. And then wherever we can, if we have the symbols, we will show you a preview of your code down in this pane, and it'll show you exactly what line of code was making that call. So in this example, it looks like this call to get a new timer was happening a bunch of times. Oh, in that sample. So I think this is still narrowed to just that timeline. So if I clear the selection, yeah, I get a much higher number of samples. And again, these percentages show you what percentage of time was spent on that line or on the calls that that line makes. Great. That's a really nifty tool, if I do say so myself. I think that can save people a lot of time trying to narrow things down and play trial and error as to what's taking up the most time. Hopefully. One other great feature in this call tree view is that you can expand the hot path. And that's that same concept that I talked about on the summary page, except there can be several hot paths within a call and several hot paths within an application. So you can highlight a node and then expand the hot path under that node, and it'll step into the calls that it makes until it reaches a point where no one thing is responsible for roughly a third or more of the time spent there. So in this case, it steps you all the way down to some external code. And so if I were to investigate here, I would probably just start by looking at its caller function, double-click on it, and it'll take me exactly to where in my code that's happening. Great. I like that fast shortcut, instead of having to go find it yourself. Yeah. 
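The expand-hot-path rule described here, keep stepping into a child call as long as one child accounts for at least roughly a third of its caller's time, can be sketched in Python like this. The call tree and sample counts are made up for illustration, not the demo's actual data:

```python
# Toy call tree: function name -> (inclusive sample count, children).
# Made-up numbers, loosely echoing the slide's main/A/B/C/D example.
tree = {
    "main":        (900, ["native_init", "A"]),
    "native_init": (50,  []),
    "A":           (800, ["B", "D"]),
    "B":           (600, ["C"]),
    "C":           (150, []),
    "D":           (150, []),
}

THRESHOLD = 1 / 3  # "at least roughly a third of its caller's time"

def hot_path(root):
    """Follow the dominant child until no child clears the threshold."""
    path = [root]
    node = root
    while True:
        samples, children = tree[node]
        # The most expensive child, if the node has any children at all.
        best = max(children, key=lambda c: tree[c][0], default=None)
        if best is None or tree[best][0] < THRESHOLD * samples:
            return path  # no single child dominates; stop expanding here
        path.append(best)
        node = best

print(" -> ".join(hot_path("main")))
# Expansion stops at B because C's 150 samples are under a third of B's 600.
```

Starting from main (900 samples), A carries 800 of them, so it's on the hot path; B carries 600 of A's 800, so it is too; but neither of B's or the remaining candidates clears a third, so the expansion stops, just like the tool stepping down until no one call dominates.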
Another way to navigate that same kind of nested tree is in this caller/callee view. If you select any method and say view in caller/callee, you'll get almost the same data, but in a view that shows you the function that you're currently looking at, the functions that it calls here in this box on the right with the red boxes, and you'll also see its parent function that called it. And so you can step backwards and forwards through code until you get somewhere like the bottom of the call stack, where you can hopefully find some code to change. Awesome. Yeah, so many different views. Yeah, there's quite a few. And the nice thing is that you can switch between them pretty seamlessly. This last view I want to highlight is the modules view, which will show you every module that your program is utilizing. And again, it'll try to attribute time spent during that trace to those DLLs and tell you specifically what functions within those are responsible. So external code I probably don't have much power to change in this scenario, same with this System.Private.CoreLib. But if I step into my DLL, I can see the functions that my program is utilizing. And again, here something might pop out, like for example a bad timer that I can then preview when I double-click on it. And if I click on this link above the preview, I can actually go to my code to change it. Awesome. Yeah, I like that there's so much flexibility for the user. If they want one view over another, they can have that option, and if they're a more visual learner, they can have that option too. Yeah, and another awesome thing, which I think we talked about during the higher-level overview episode, is that you can save these .diagsession files. That actually saves the trace that I collected, and I can share it with somebody on my team, or I can look at it later to compare runs. 
And so that's a really good way to measure performance improvements or to try to diagnose things with the team. Absolutely. There is one last thing I wanted to call out, and that is on this summary page, these filters in the top right corner allow you to filter your sample to any specific thread. So if there's a thread that you're particularly trying to diagnose, you can narrow to just that. It'll also tell you the percentage of utilization of those threads, so there might be one that's at like 95%, and you might want to narrow to just that thread. It also allows you to hide native code, to hide external code or show it, and to stitch asynchronous code so it kind of gets squashed into one single line for each asynchronous method. And again, those categories that the pie chart represents, you can also filter by those. So if I see that I have a bunch of work happening in I/O, then I can just look at the I/O top functions and hot path and filter that way. That's awesome. It's also cool that you can save this file for later and still get that same functionality, so that you don't have to retread and do the whole trace again. The only tricky, well, I wouldn't even call it tricky, the only thing there is that in order to show those code previews and take you to source code, you're going to need access to the same symbols. But within a team, it can be a really powerful way to help diagnose an issue, or to compare traces and see if something runs faster than before. Sweet. Well, I learned a lot more about the CPU usage tool than I thought I would. I feel like the CPU usage tool has hidden depths that most people, even ones who do use it, are unaware of. So it was really cool to see the whole expanded tool. Yeah, and I think it's true that the CPU usage tool can run alongside the most other tools out of the ones that we offer. I think it may be able to run with every other tool. 
But I would just recommend leaving it on, even if you're doing a database investigation or some memory investigation. It's just a really good starting point, because if an application is performing slower than you expect, the CPU is a pretty good place to look, whether it's peaking and doesn't have the resources to do more, or it's not being utilized as much as you would expect. It can be a really good way to start your investigation. Yeah, absolutely. Sweet. So anything else before we close out? I think that's it. In the description, we'll have a link to the docs for this tool, and I'll make sure that my contact information is there as well if you have any questions. I would encourage everyone to try it. You might see an issue with your program that you didn't know was there, and there might be some low-hanging fruit to make your application more performant. So I would just encourage you to try it and reach out if you have any questions. Sounds great. Well, thanks for being here, Esteban, and I'm sure we'll see you in another profiling video in the near future. Yeah, happy to be here. I look forward to it. Great. So until then, happy profiling. Happy profiling.