Hi everyone, in this video we will learn about perf. Perf is a tool that offers a rich set of commands to collect and analyze performance data. If we want to know which parts of a program are taking a lot of CPU time, we can use perf. Let's open a terminal and go through some of its very basic commands.
I'll start with the stat command. perf stat collects statistics about any program. We can give it any command, let's say ls, and it will show us the statistics for that run of ls. So I'll use sudo: sudo perf stat ls. Okay. First it executed ls, and then it shows the counter statistics for ls: how many context switches there were, how many CPU migrations, page faults, cycles, instructions, etc. This is the most basic use of perf, collecting statistics about a single program.
Now let's say we want to collect stats for the complete system, that is, for every process running on any processor. We can use the -a flag for that, and let's say we want to collect for one second: perf stat -a and then the command sleep 1. This counts for one second. The output says the statistics are system wide, so we had this many context switches in one second, this many page faults, etc. We can also use the -d flag to get more detailed statistics. If I run it again with -d, it shows some additional statistics such as L1 cache loads, L1 cache load misses, etc.
Another important command is perf list. If I run sudo perf list, it lists all the events that we can count using perf. It shows some hardware events such as branch instructions and branch misses. Then there are some software events, for which the kernel maintains a counter, such as CPU migrations or context switches. And then we have some cache events. The PMU shown here stands for performance monitoring unit.
The PMU is a separate unit within the processor that counts various hardware events, including these branch instructions, cache misses, etc. There are a lot of events, so let's say we want to measure one particular event, for instance branch instructions. We can use the -e flag for that. Let's say we want to measure branch instructions for one second across all the processors. I can use -e branch-instructions, and it gives me the count of branch instructions for that one second.
The next command we will see is perf record. perf record collects profile data about a program into a separate file. Let's say I want to record data for the complete system for one second. I need to use sudo. If we do ls, we can see there is this perf.data file. It is a binary file that stores the profiling data for our complete system. So how can we read this perf.data file? We need to use perf report. Let's use perf report with the input file perf.data. This shows what percentage of CPU cycles was used by each command, along with the shared object and so on. Let's quit this using q.
Now let's try to profile a C program. I have written this perf_demo.c file. Let's first have a look at its contents. It is a very simple program with two functions: a compute function and a compute intensive function. The main function just calls these two functions. In the compute intensive function, we iterate 10^7 times, adding one to sum, a dummy variable, and return that sum. In the compute function, we iterate 10^6 times. So the compute intensive function runs ten times as many iterations as the compute function. Let's compile it with gcc perf_demo.c, then run the executable and profile it using perf. So I'll use sudo perf record with ./a.out.
This generates the perf.data file. Now let's look at the profile using sudo perf report -i perf.data. Here it shows the percentage of CPU used by each function, the command, which is a.out, and then the function name. We can see that almost 88% of the CPU time is spent in the compute intensive function alone, and almost 9% in the compute function. So the compute intensive function is almost 10 times more expensive than the compute function, which matches its ten times larger iteration count. This also shows us the bottleneck of our program, namely the compute intensive function. That's how we can profile C programs and find the bottlenecks to optimize.
If you want to see more details about perf, you can have a look at the perf man page. There is also a very nice tutorial on perf that you can find online; it explains the various perf commands and what else you can do with perf. So that's it for this video. Thanks and have a nice day.