All right. So welcome back. Today we're going to cover debugging and profiling. Before I get into it, another reminder to fill in the survey. One of the main things we want to get from you is questions, because the last day is going to be questions from you guys about things that we haven't covered, or that you want us to talk about more in depth. And the more questions we get, the more interesting we can make that section. So please go and fill in the survey.

Today's lecture covers a lot of topics, and they all revolve around the question of what you do when you have a program that has some bugs, which is most of the time. When you're programming, you're mostly thinking about how to implement something, but a large part of the work is fixing all the issues the program has. And even if your program behaves like you want, it might be really slow, or it might be taking a lot of resources in the process. So today we're going to see a lot of different approaches for dealing with these problems.

The first section is on debugging. Debugging can be done in many different ways. The most simple approach, which pretty much every CS student will go through, is: you have some code, and it's not behaving like you want, so you probe the code by adding print statements. This is called printf debugging, and it works pretty well. I have to be honest, I use it a lot of the time, because of how simple it is to set up and how quick the feedback can be. One of the issues with printf debugging is that you can get a lot of output, and maybe more output than you want. So people have developed slightly more sophisticated ways of doing printf debugging, and one of these ways is what is usually referred to as logging.
The advantage of doing logging versus printf debugging is that when you're writing logs, you're not necessarily writing them because there's a specific issue you want to fix. It's mostly because you have built a more complex software system and you want to record when certain events happen. And one of the core advantages of using a logging library is that you can define severity levels, and you can filter based on those.

Let's see an example of how we can do something like that. This is a really silly example: we're just going to sample random numbers, and depending on the value of the number, which we can interpret as a measure of how wrong things are going, we're going to log it, and then we can see what is going on. If we just execute the code as it is, we get more and more output, but you have to stare at it to make sense of what is going on. We don't know the relative timing between the prints, and we don't really know whether a line is just an informational message or a message that something went wrong. If we instead set the formatter, the output looks something more like this. For example, if you have several different modules that you're programming, you can identify them, and each message gets a severity level: we have debug, info, critical, and other levels. That can be handy, because here we might only care about the error messages: we have been working with our code, so far so good, and suddenly we get some error. We can log that to identify where it's happening. Maybe there are a lot of info messages too, but we can deal with that by just changing the logging level to the error level.
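As a rough sketch of the kind of setup being shown here (the module name and messages are mine, not the exact demo code):

```python
import logging
import random

# Configure the root logger with a format that shows severity and module name.
logging.basicConfig(
    format="%(levelname)s:%(name)s: %(message)s",
    level=logging.DEBUG,  # show everything for now
)
log = logging.getLogger("demo")

for _ in range(5):
    n = random.random()
    if n < 0.5:
        log.info("everything is fine (n=%.2f)", n)
    elif n < 0.9:
        log.warning("something looks off (n=%.2f)", n)
    else:
        log.error("something went wrong (n=%.2f)", n)

# To see only errors and above, raise the threshold:
log.setLevel(logging.ERROR)
```

After that last line, the `info` and `warning` calls become no-ops and only errors and criticals get through.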
And now if we run this again, we're only going to get those errors in the output, and we can look through just those to make sense of what is going on.

Another really useful technique when you're dealing with logs: things have already become easier, because now we have these critical and error levels that we can quickly identify, but since humans are fairly visual creatures, one thing you can do is use the colors of your terminal to highlight these things. So by changing the formatter slightly, now whenever I get a warning message it's color coded in yellow, whenever I get an error it's in red, and when it's critical I have a bold red indicating something went wrong. Here it's a really short output, but when you start having thousands and thousands of lines of logs, which is not unrealistic and happens every single day in a lot of apps, quickly browsing through them and spotting where the errors, the red patches, are can be really useful.

A quick aside: you might be curious about how the terminal is displaying these colors. At the end of the day, the terminal is only outputting characters. So how is this program, or other programs like ls that have all these fancy colors, telling the terminal that it should use these different colors? It's nothing extremely fancy; what these tools do is something along these lines. Here we have a command (I'll clear the rest of the output so we can focus on this): there are some special escape characters, then we have some text, and then some other special characters. And if we execute this line, we get, in red, "This is red". And you might have picked up on the fact that we have a 255;0;0 here.
And this is just telling the terminal the RGB values of the color we want. You can pretty much do this in any piece of code you have to color code the output, and if your terminal is fairly fancy, it supports a lot of different colors in the output; this isn't even all of them, this is just 16 of them. I think it can be fairly useful to know about that.

Another thing: maybe you're not convinced that logs are a good fit for you. The thing is, a lot of other systems that you might start using will use logs. As you start building larger and larger systems, you might rely on other dependencies; common dependencies are web servers, or databases, which are a really common one. And those will be logging their errors or exceptions into their own logs. Of course, you will get some client-side error, but those are sometimes not informative enough for you to figure out what is going on. In most Unix systems, the logs are usually placed in a folder called /var/log. If we list it, we can see there's a bunch of logs in here; let me scroll a little bit. We have various monitoring logs, and things related to the Wi-Fi, for example. And if we output the system log, which contains a lot of information about the system, we can get information about what's going on. There are also tools that will let you go through this output more sanely. But here, looking at the system log, I can say: oh, there's some service that is exiting with some abnormal code, and based on that information I can go and try to figure out what's going wrong.

One thing to know when you're working with logs: more traditionally, every piece of software had its own log, but it has become increasingly popular to have a unified system log where everything is placed.
And pretty much any application can log into this system log, but instead of being in a plain-text format, it will be compressed in some special format. An example of this is what we covered in the data wrangling lecture: there we were using journalctl, which accesses that log and outputs all of it. Here on a Mac, the command is log show, which will display a lot of information. I'm going to display just the last 10 seconds, because these logs are really, really verbose, and even the last 10 seconds is still going to output a fairly large number of lines. If we look through what's going on, we see that a lot of Apple things are happening, since this is a MacBook, and maybe we could find errors about some system issue here. Again, they are fairly verbose, so you might want to practice your data wrangling techniques here: 10 seconds is something like 500 lines of logs, so you can get an idea of how many per second you're getting.

And they're not only useful for figuring out other programs; they're also useful for you, if you want to log there instead of into your own file. Using the logger command, on both Linux and macOS, you can say: I'm going to write this "hello logs" message into the system log. We execute the command, and then we can check by going through the last minute of logs, since it's going to be fairly recent, and grepping for that hello. And we find the entry, the fairly recent entry we just created, that says hello logs. As you become more and more familiar with these tools, you will find yourself using the logs more and more often, since even if you have some bug that you haven't detected and the program has been running for a while, maybe the information that is already in the log can tell you enough to figure out what is going on. However, printf debugging is not everything.
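For your own scripts, you don't have to shell out to logger either; for example, on Unix systems Python can write into the system log directly through its standard syslog module (a minimal sketch; the tag name is made up):

```python
import syslog

# Open a connection to the system logger, tagging our entries with a name.
syslog.openlog("mydemo")

# Severity levels mirror the ones we saw: info, warning, err, crit, ...
syslog.syslog(syslog.LOG_INFO, "hello logs from Python")
syslog.syslog(syslog.LOG_ERR, "something went wrong")

syslog.closelog()
```

You could then find these entries the same way as before, by grepping the recent output of journalctl or log show.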
So next I'm going to be covering debuggers, but first, any questions on logs so far?

"So this hello logs entry, it says that you did something with hello at that time?" Yeah. Say, for example, I write a bash script that checks every so often which Wi-Fi network I'm connected to, and every time it detects that that has changed, it makes an entry in the logs saying: oh, now we have changed Wi-Fi networks. Then later you might want to go back and parse through the logs and check: okay, when did my computer change from one Wi-Fi network to another? That's just a simple example, but there are many, many types of information that you could be logging here. More commonly, you would probably want to check, say, if your computer is entering sleep for some unknown reason, or suddenly going into hibernation mode; there's probably some information in the logs about who asked for that to happen, or why it is happening. Any other questions?

Okay, so when printf debugging is not enough, the best alternative after that is using a debugger. A debugger is a tool that wraps around your code and lets you run it while keeping control over the execution: it will let you step through the code as it executes, and set breakpoints. You have probably seen debuggers in some form if you have ever used an IDE, because IDEs have these fancy interfaces where you set a breakpoint here and execute. But at the end of the day, what those tools are using is just these command-line debuggers, presented in a fancier format.

Here we have a completely broken bubble sort. Don't worry about the details; we just want to sort this array that we have here, and we can try doing that by just running it with Python.
And when we do that: oh, there's an IndexError, list index out of range. We could start adding prints, but with a really long program we would get a lot of output. So how about we go to the moment that we crash? We can go to that moment and examine what the current state of the program was. To do that, I'm going to run the program using the Python debugger. Technically, here I'm using the IPython debugger, just because it has nice syntax coloring, so it's probably easier for both of us to understand what's going on in the output, but they are pretty much identical otherwise.

We execute this, and now we are given a prompt. We are being told that we are at the very first line of our program, and we can type l, which stands for list. As with many of these tools, there's a little language of operations that you can do, and they are often mnemonic, as was the case with vim or with tmux. So here l is for listing the code, and we can see the entire program. s is for step, which lets us go through the execution one line at a time. The thing is, we only trigger the error some time later. So we can restart the program, and instead of stepping until we get to the issue, we can just ask the program to continue, which is the c command. And hey, we reach the issue: we get to the line where everything crashed, with this list index out of range. And now that we are here, we can investigate. First, let's print the value of the array; so this is the current value of the array variable, and we have six items. Okay, what is the value of j here? We look at the value of j; j is five here, which would be the last element, but j plus one is going to be six. Oh, so that's what's triggering the out-of-bounds error. So what we have to do is: this n has to be n minus one instead. We have identified that the error lies there, so we can quit, which is q.
This works because pdb acts as a post-mortem debugger. So we go back to the code and say: okay, we need to change this n to n minus one; that will prevent the list index out of range. And if we run this again without the debugger: okay, no errors now, but this is not our sorted list. It is sorted, but it's not our list; we are missing entries from our list. So there's some behavioral issue we're hitting here. Again, we could start using printf debugging, but I have a hunch now that the way we're swapping entries in the bubble sort program is wrong. And we can use the debugger for this: we can go to the moment we're doing a swap and check how the swap is being performed.

So, a quick overview: we have two for loops, and in the innermost loop, we check whether one element of the array is larger than the next. The thing is, if we set a breakpoint on the swap line, it's only going to trigger whenever we actually make a swap. So we create a breakpoint on that line; then the program will execute, and the moment we try to swap our variables, the program is going to stop. We create the breakpoint and then continue the execution of the program. The program halts and says: hey, I have executed and I have reached this line. Now I can use locals(), which is a Python built-in that returns a dictionary with all the local values, to quickly see the entire context. Okay, the array is fine, n is six, I'm just at the beginning; I step to the next line, and... I've identified the issue: I'm swapping one item at a time instead of simultaneously. That's what causes us to lose values as we go through.

That's a very simple example, but debuggers are really powerful. Most programming languages will give you some sort of debugger.
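Putting both fixes together, a corrected bubble sort would look something like this (a reconstruction; the exact demo file may differ):

```python
def bubble_sort(arr):
    n = len(arr)
    for _ in range(n):
        for j in range(n - 1):  # n - 1, not n: j + 1 must stay in bounds
            if arr[j] > arr[j + 1]:
                # Swap simultaneously, not one assignment at a time,
                # so we don't overwrite a value before reading it.
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

print(bubble_sort([4, 2, 1, 8, 7, 6]))  # → [1, 2, 4, 6, 7, 8]
```

In recent Python versions you can also drop into the debugger at any point by calling the built-in breakpoint(), instead of running the whole file under python -m pdb.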
And when you go to more low-level debugging, you might run into tools like GDB. GDB has one nice property: it works really well with C, C++, and all these C-family languages, but it actually lets you work with pretty much any binary that you can execute. So, for example, here we have sleep, which is just a program that's going to sleep for 20 seconds. It's loaded, and then we can run it, and then interrupt it by sending an interrupt signal. And GDB displays for us very low-level information about what's going on in the program: we're getting the stack trace, we're seeing that we're in this nanosleep function, and we can see the values of all the hardware registers in our machine. So you can get a lot of low-level detail using these tools. And I think that's all I want to cover for debuggers. Any questions related to that?

Another interesting approach when you're trying to debug: sometimes you want to debug your program as if it were a black box. You may not know what the internals of the program are, but at the same time, your computer knows whenever your program is trying to do certain operations. In Unix systems there's this notion of user-level code and kernel-level code, and when you try to do operations like reading a file or reading from a network connection, you have to make what are called system calls. And you can take a program, trace those operations, and ask: what operations did this software do? So, for example, if you have a Python function that is only supposed to do a mathematical operation, and you run it through this tool and it's actually reading files; why is it reading files? It shouldn't be reading files. So let's see; this tool is strace, and we can do something like this.
So here we're going to run ls -l, and we're ignoring the output of ls, but we're not ignoring the output of strace. If we execute that, we're going to get a lot of output, and these are all the different system calls that ls has executed. You will see a bunch of open calls, you will see fstat. And, for example, since it has to list all the properties of the files that are in this folder, we can grep for the lstat call. The lstat call checks the properties of a file, and we can see that, effectively, all the files and folders in this directory are being accessed by ls through a system call.

Interestingly, sometimes you don't even need to run your code to figure out that there's something wrong with it. So far we have seen ways of identifying issues by running the code, but you can look at a piece of code like the one I have on screen right now and spot an issue. Here we have a really silly piece of code: it defines a function, prints a few variables, multiplies some variables, sleeps for a while, and then tries to print baz. You could look at this and say: hey, baz has never been defined anywhere; this is a new variable; you probably meant to say bar but you just mistyped it. The thing is, if we try to run this program, it's going to take 60 seconds, because we have to wait until this time.sleep call finishes. Here the sleep is just to motivate the example, but in general you might be loading a dataset that takes really long because you have to copy everything into memory. And the thing is, there are programs that will take source code as input, process it, and say: this is probably wrong about this piece of code. In general, these are called static analysis tools.
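To make the idea concrete before looking at the real tools: a static analyzer inspects the source without running it. Here is a toy version of that check, written with Python's ast module (undefined_names is a made-up helper, nowhere near as thorough as a real analyzer; it only handles simple cases):

```python
import ast
import builtins

def undefined_names(source):
    """Toy static check: report names that are read but never bound."""
    tree = ast.parse(source)
    bound, used = set(), []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                bound.add(node.id)   # assignment target, loop variable, ...
            else:
                used.append(node.id)  # the name is being read
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            bound.add(node.name)
            bound.update(a.arg for a in node.args.args)
    known = bound | set(dir(builtins))
    return sorted(set(n for n in used if n not in known))

code = "foo = 1\nbar = foo * 2\nprint(baz)\n"
print(undefined_names(code))  # → ['baz']  (flagged without running the code)
```

The point is that no code was executed: no sleeps, no dataset loading, just a walk over the parsed syntax tree.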
In Python we have, for example, pyflakes, and if we take this piece of code and run it through pyflakes, it's going to give us a couple of issues. The second one is the one we identified: undefined name 'baz'; you probably should do something about that. And the other one says: oh, you're redefining the name foo on that line. Here we have a foo function, and then we're shadowing that function by using foo as a loop variable; so now the foo function that we defined is not accessible anymore, and if we try to call it afterwards, we will get errors.

And there are other types of static analysis tools. mypy is a different one: mypy is going to report the same two errors, but it's also going to complain about type checking. It will say: oh, here you're multiplying an int by a float, and if you care about the type checking of your code, you should not be mixing those up.

It can be kind of inconvenient to run this, look at the reported line, go back to your vim, and figure out what the error maps to in your editor. And there are solutions for that: you can integrate most editors with these tools. Here you can see there's some red highlighting on baz, and if we read the last line here, it says: undefined name 'baz'. So as I'm editing this piece of Python code, my editor is giving me feedback about what's going wrong with it. Or here, I have another one saying: redefinition of unused 'foo'. And there are even some stylistic complaints, like: oh, I would expect two empty lines here, since in Python you should have two empty lines around a function definition. There are resources in the lecture notes about static analyzers for a lot of different programming languages. And there are even static analyzers for English.
So I have my notes for the class here, and if I run them through this static analyzer for English, write-good, it's going to complain about some stylistic properties. Like: oh, I'm using "very", which is a weasel word and I shouldn't be using it, or "quickly" can weaken meaning. And you can have this for spell checking or for a lot of different types of stylistic analysis. Any questions so far?

Oh, I forgot to mention: depending on the task that you are performing, there will be different types of debuggers. For example, if you're doing web development, both Firefox and Chrome have a really, really good set of tools for debugging websites. So here we go to the page and say "inspect element". I don't really know how to make this larger, now that I think of it. What? Ctrl-plus? Oh yeah, there we go; is that better? So we're getting the entire source code for the web page of the class, and we can actually go and change properties of the page. We can edit the title and say: this is not a class on debugging and profiling. And now the code for the website has changed. This is one of the reasons why you should never trust any screenshots of websites, because they can be completely modified. And you can also modify the style: here I have things using the dark mode preference, but we can alter that, because at the end of the day the browser is rendering this for us. And we can check the cookies; there are a lot of different operations. There's also a built-in debugger for JavaScript, so you can step through JavaScript code. So the takeaway is: depending on what you are doing, you will probably want to search for the tools programmers have built for that domain.
And now I'm going to switch gears and stop talking about debugging, which is about finding issues with the behavior of the code, and start talking about profiling, which is about how you optimize the code. It might be because you want to optimize for CPU, for memory, for the network; there are many different resources you may want to optimize for.

As was the case with debugging, the first-order approach that a lot of people have already experienced is, so to say, printf profiling. Let me make this larger. We can take the current time here, then do some execution, and then take the time again and subtract the original time from it. By doing this, you can narrow things down, fencing off different parts of your code and trying to figure out how much time is spent between those two points. And that's fine, but sometimes the results can be surprising. Here we are sleeping for 0.5 seconds, and the output says it took 0.5 seconds plus some extra time, which is kind of interesting; and if we keep running it, we see there's always some small error. The thing is, what we're actually measuring here is what is usually referred to as real time. Real time is as if you took a stopwatch, started it when your program starts, and stopped it when your program ends. But in your computer, it's not only your program that is running; there are many other programs running at the same time, and those might be the ones occupying the CPU. To make sense of that, you will see a lot of programs using the terminology of real time, user time, and system time. Real time is what I just explained: the entire length of time from start to finish. Then there's user time, which is the amount of time your program spent on the CPU executing user-level code.
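You can see this same distinction from inside Python: time.time measures wall-clock (real) time, while time.process_time measures the CPU time your process actually used, user plus system (a small sketch):

```python
import time

start_wall = time.time()
start_cpu = time.process_time()

time.sleep(0.5)  # waiting: the wall clock advances, but we burn almost no CPU

wall = time.time() - start_wall
cpu = time.process_time() - start_cpu
print(f"real ~{wall:.2f}s, cpu ~{cpu:.3f}s")  # cpu stays far below 0.5
```

The half second of sleeping shows up in full on the wall clock, but barely at all in CPU time, because the process spent that time off the CPU.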
So, as I was mentioning, in Unix you can be running user-level code or kernel-level code, and system time is the counterpart: it is the amount of time that your program spent on the CPU executing kernel-mode instructions.

Let's see this with an example. I'm going to use time, which is a shell command that's going to report these three metrics for the command that follows, and then I'm just grabbing a URL from a website that is hosted in Spain, so it's going to take some extra time to go there and come back. If we just had two prints between the beginning and the end of the program, we could think that this program takes about 600 milliseconds to execute. But actually, most of that time was spent just waiting for the response from the other side of the network: we only spent about 16 milliseconds at the user level, and around 25 milliseconds in total actually executing curl code. Everything else was just waiting. Any questions related to timing?

Okay. So timing can still be tricky; it's also kind of a black-box approach, and if you start adding print statements, it's hard to have print statements with times everywhere. So programmers have built tools for this, usually referred to as profilers. One quick note: when people talk about profilers, they usually mean CPU profilers, because those are the most common, identifying where time is being spent on the CPU. And profilers usually come in two flavors: tracing profilers and sampling profilers. It's good to know the difference, because the output can be different. Tracing profilers instrument your code: they execute along with your code, and every time your code enters a function call, they take a note of it.
Something like: oh, we're entering this function call at this moment in time. They keep going, and once they finish, they can report: you spent this much time executing this function, and this much time in this other function, and so on and so forth. That's the example we're going to see now.

The other kind of tool is, sorry, sampling profilers. The issue with tracing profilers is that they add a lot of overhead: you're running your code with this profiler alongside it making all these counts, which hinders the performance of your program, so the numbers you get may be slightly off. A sampling profiler, instead, is going to execute your program and, every 100 milliseconds, or 10 milliseconds, or some defined period, it's going to halt it, look at the stack trace, see where you are in the call hierarchy, and identify which function is executing at that point. And the idea is that as long as you run this for long enough, you're going to get enough statistics to know where most of the time is being spent.

So let's see an example of a tracing profiler. Here we have a piece of code that is a really simple reimplementation of grep in Python, and what we want to check is: what is the bottleneck of this program? We are just opening a bunch of files, trying to match a pattern, and printing whenever we find a match. Maybe it's the regex, maybe it's the print; we don't really know. To do this in Python, we have cProfile. Here I'm just invoking that module, saying I want to sort by the total amount of time, which we'll see briefly; I'm calling the program we just saw in the editor and executing it a thousand times; and the grep arguments here are that I want to match this regex against all the Python files in here.
And this is going to produce some output that we can look at. First is all the output from the greps, but at the very end, we're getting output from the profiler itself. If we go up, we can see the total number of calls; so 8,000 calls, because we executed this a thousand times. And this is the total time, the amount of time we spent in each function, and the cumulative time. And here we can start to identify where the bottleneck is. This built-in method io.open is saying that, oh, we are spending a lot of the time just reading from disk. And we can also check: hey, a lot of time is also being spent trying to match the regex, which is something you would expect.

One of the caveats of using this tracing profiler is that, as you can see here, we're seeing our own functions, but we're also seeing a lot of functions that correspond to built-ins, or third-party functions from libraries. And as you start building more and more complex code, this gets much harder to read. So here is another piece of Python code; don't bother reading through it. What it's doing is grabbing the course website, parsing it, and then printing all the hyperlinks it found. So there are these two operations: going there and grabbing the website, and then parsing it and printing the links. And we might want to get a sense of how those two operations compare to each other. We try to execute cProfile here, and we're going to do the same; this is not going to print anything, and I'm using a tool that we haven't seen so far that I think is pretty nice: tac, which is cat in reverse. It reverses the output, so I don't have to scroll up to look. So we do this, and hey, we get some interesting output: we're spending a bunch of time in these built-in methods.
Things like socket.getaddrinfo, some create and connect methods, posix calls; nothing in my code is directly calling these functions. So you don't really know the split between the operation of making a web request and parsing the output of that web request. For that, we can use a different type of profiler, which is a line profiler. The line profiler presents the same results in a more humanly readable way: for this line of code, this is the amount of time it took. Ah, yes, of course: for it to know it has to do that, we have to add a decorator to the Python function. We do that, and now we get slightly cropped output, but the main idea is that we can look at the percentage of time, and we can see that making this requests.get operation took 88% of the time, whereas parsing the response took only 10.9% of the time. This can be really informative, and a lot of different programming languages support this type of line profiler.

Sometimes you might not care about CPU; maybe you care about the memory or some other resource. Similarly, there are memory profilers. In Python, there is memory-profiler; for C, you have Valgrind. So here's a fairly simple example: we create a list with a million elements, which is going to consume some megabytes of space, and we do the same thing with another one with 20 million elements. To check how the memory allocation happens and what the consumption is, we can run it through the memory profiler, and it tells us the total memory usage and the increments. We can see that we have some overhead because Python is an interpreted language; then when we create this list with a million entries, we get this many megabytes, then we get another 150 megabytes; and when we free an entry, that decreases the total amount.
Here we are not getting a negative increment, which is probably a bug in the profiler. But if you know that your program is taking a huge amount of memory and you don't know why, maybe because you are copying objects where you should be doing things in place, then using a memory profiler can be really useful. In fact, there is an exercise that will walk you through that, comparing an in-place version of Quicksort with a non-in-place one that keeps making new copies. And if you use the memory profiler, you can get a really good comparison between the two of them. Any questions so far with profiling? Yeah? So, for this code at least, you might be able to figure that out just by looking at the code, but that gets harder as you get more and more complex programs. What this is doing is running through the program, and for every line it looks at the heap and says, oh, where are the objects that I have allocated? Oh, I have seven megabytes of objects. Then it goes to the next line, looks again, and says, oh, now I have 50, so I have added 43 there. Again, you could do this yourself by asking for those measurements in your code at every single line, but that's not how you should be doing things, since people have already written these tools for you to use. As was the case with, let me see, as was the case with strace, you can do something similar in profiling. You might not care about the specific lines of code that you have, but maybe you want to check for outside events. Maybe you want to check how many CPU cycles your program is using, or how many page faults it's creating. Maybe it has bad cache locality, and that's being manifested somehow. For that, there is the perf command.
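The Quicksort exercise mentioned above uses the memory-profiler package, which is third-party; as a self-contained sketch of the same comparison, the stdlib tracemalloc module can measure peak allocations of a copying quicksort against Python's in-place-on-a-copy `sorted`:

```python
import random
import tracemalloc

def quicksort_copy(xs):
    # Non-in-place quicksort: allocates fresh sublists at every level
    if len(xs) <= 1:
        return xs
    pivot = xs[0]
    return (quicksort_copy([x for x in xs[1:] if x < pivot])
            + [pivot]
            + quicksort_copy([x for x in xs[1:] if x >= pivot]))

data = [random.random() for _ in range(10000)]

tracemalloc.start()
quicksort_copy(list(data))
_, peak_copy = tracemalloc.get_traced_memory()  # peak bytes allocated
tracemalloc.stop()

tracemalloc.start()
sorted(list(data))  # one extra copy, sorted without per-level allocations
_, peak_inplace = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"copying quicksort peak: {peak_copy // 1024} KiB")
print(f"in-place-style sort peak: {peak_inplace // 1024} KiB")
```

The copying version's peak is noticeably higher because every recursion level holds its own sublists.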
And the perf command does exactly that: it runs your program, keeps track of all these statistics, and reports them back to you. This can be really helpful if you are working at a lower level. So we execute this command, which I'm going to explain briefly. This program, stress, just runs on the CPU; it's a program to hog one CPU, so you can test what happens when a CPU is saturated. And now if we Ctrl-C, we can go back, and we get some information about the number of page faults that we've had, the number of CPU cycles that we utilized, and other useful metrics about our code. And for some programs, you can even look at which functions were being used where. So we can record what this program is doing, which we don't know much about because it's a program someone else has written. And we can report what it was doing by looking at the stack trace, and we can say, oh, it's spending a bunch of time in this random_r standard library function. And that's mainly because its way of hogging a CPU is just generating more and more pseudo-random numbers. There are some other functions that have not been mapped because they belong to the program, but if it is your own program, you can display that information using more flags there, and there are really good tutorials online about how to use this tool. Oh, one more thing regarding profilers. So far these profilers are really good at aggregating all this information and giving you a lot of numbers so you can optimize your code or reason about what is happening. But the thing is, humans are not really good at making sense of lots of numbers, and many of us are visual creatures, so it's probably much easier to have some sort of visualization. Again, programmers have already thought about this and have come up with solutions. A couple of popular ones: the first is a flame graph.
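perf reads hardware and kernel counters from outside the program; from inside a Python process you can get a small subset of the same kernel-side numbers through the stdlib resource module (Unix-only). A minimal sketch:

```python
import resource

# Kernel-maintained counters for this process (Unix only).
# Note: ru_maxrss is in KiB on Linux but in bytes on macOS.
usage = resource.getrusage(resource.RUSAGE_SELF)
print("max resident set size:", usage.ru_maxrss)
print("minor page faults (serviced without disk IO):", usage.ru_minflt)
print("major page faults (required disk IO):", usage.ru_majflt)
```

For hardware events like cycle counts or cache misses, you still need perf itself (e.g. `perf stat ./program`); getrusage only exposes what the kernel accounts per process.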
A flame graph is the output of a sampling profiler. This just runs your code and takes samples. On the y-axis here, we have the depth of the stack, so we know that the base function called this other function, which called this other function, and so on and so forth. And the x-axis is not time; it's not timestamps, it doesn't mean this function ran before that one. It's just proportional to time taken, because again, this is a sampling profiler: we're just getting small glimpses of what was going on in the program. But we know that, for example, this main program took the most time, because the x-axis width is proportional to that. And they are interactive, and they can be really useful to identify the hotspots in your program. Another way of displaying the information, and there is also an exercise on how to do this, is using a call graph. A call graph displays the information by building a graph of which function called which other function. And then you get information like, oh, we know that main called this parse function 10 times, and it took this much time. And as you have larger and larger programs, looking at one of these call graphs can be useful to identify what piece of your code is calling some really expensive IO operation, for example. With that, I'm going to cover the last part of the lecture, which is that sometimes you might not even know what exact resource is constraining your program. Like, how do I know how much CPU my program is using, or how do I quickly look at how much memory it's taking? There are a bunch of really nifty tools for doing that. One of them is htop. htop is an interactive command-line tool, and here it is displaying all the CPUs this machine has, which is 12, and displaying the amount of memory: I'm consuming almost a gigabyte of the 32 gigabytes this machine has. And then I'm getting all the different processes.
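To make the sampling idea concrete, here is a toy sampling profiler sketch (not any real tool's implementation): it periodically snapshots a worker thread's stack and tallies each stack in the semicolon-joined "folded stack" format that flame-graph tools such as flamegraph.pl consume. The `busy_*` functions are hypothetical workloads:

```python
import collections
import sys
import threading
import time

def sample_stacks(target_tid, duration=0.5, interval=0.005):
    # Sampling profiler sketch: every `interval` seconds, record the
    # target thread's current call stack, root-first.
    counts = collections.Counter()
    end = time.time() + duration
    while time.time() < end:
        frame = sys._current_frames().get(target_tid)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)
            frame = frame.f_back
        counts[";".join(reversed(stack))] += 1  # folded-stack line
        time.sleep(interval)
    return counts

def busy_inner():
    x = 0
    for i in range(200000):
        x += i * i
    return x

def busy_outer():
    deadline = time.time() + 0.6
    while time.time() < deadline:
        busy_inner()

worker = threading.Thread(target=busy_outer)
worker.start()
samples = sample_stacks(worker.ident)
worker.join()
for stack, n in samples.most_common(3):
    print(n, stack)
```

Each output line is "root;child;leaf count"; the x-axis width in the resulting flame graph is exactly that count, which is why it reflects time share rather than ordering.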
So for example, we have the SQL server and all the other processes that are running on this machine, and I can sort by the amount of CPU they're consuming, or by the priority they're running at. We can check this: for example, here we have the stress command again, and we're going to run it to take over four CPUs and check that we can see that in htop. So we did spawn those four CPU jobs, and now we can see that, beyond the processes we had before, we have these four, there we are, these four stress -c commands running and hogging the CPU. And even though you could use a profiler to get similar information, the way htop displays this in a live, interactive fashion can be much quicker and much easier to parse. In the notes, there's a really long list of different tools for evaluating different parts of your system. There are tools for analyzing network performance. There are tools for looking at the number of IO operations, so you can tell whether you are saturating the reads from your disks. You can also look at disk space usage. For that, there's a tool called du, which stands for disk usage, and we have the -h flag for human-readable output, and if we run that, we get output about the size of all the files in this folder. There we are. And there are also interactive versions: like htop was an interactive version, ncdu is an interactive version that lets me navigate through the folders, and I can quickly see that, oh, this is one of the folders for the video lectures, and there are these four files of almost nine gigabytes each, and I could quickly delete them through this interface. Another neat tool is lsof, which stands for list of open files.
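As a rough illustration of what du is doing under the hood, a few lines of Python can compute the same recursive total (real du also accounts for hard links, sparse files, and block sizes, which this sketch ignores):

```python
import os

def dir_size(path):
    # du-like sketch: sum the sizes in bytes of all regular files under path
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished or is unreadable; skip it, as du warns and continues
    return total

size = dir_size(".")
print(f"{size / 1024:.1f} KiB under the current directory")
```

The "-h" human-readable formatting is then just a matter of dividing by 1024 until the number is small.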
Another pattern that you might encounter is that you know some process is using a file, but you don't know exactly which process it is. Or similarly, some process is listening on a port, but again, how do you find out which one? To set up an example, we just run a Python HTTP server on port 444. It's running there; maybe we don't know that it's running, but we can use lsof. The thing is, lsof is going to print a lot of information, and you need sudo permissions because it's going to ask who holds all these file handles. Since we only care about whoever is listening on this 444 port, we can grep for that, and we can see, oh, there's this Python process with this identifier that is using the port, and then we can kill it, and that terminates the process. And again, there are a lot of different tools; there are even tools for doing what is called benchmarking. In the shell tools and scripting lecture, I said, oh, for some tasks, fd is much faster than find. But how would you check that? Well, I can test that with hyperfine. I have here two commands, one with fd that is just searching for JPEG files, and the same one with find, and if I execute them, it's going to benchmark these commands and give me some output about how much faster fd is compared to find. So, yeah, about 23 times faster for this task. So that kind of concludes the whole overview. I know that there are a lot of different topics here, and a lot of perspectives on doing these things, but again, I want to reinforce the idea that you don't need to be a master of all these topics. It's more that you should be aware that these things exist, so if you run into these issues, you don't reinvent the wheel and you reuse other programmers' work. Given that, I'm happy to take any questions related to this last section, or anything in the lecture.
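hyperfine benchmarks whole shell commands; for comparing two snippets inside Python, the stdlib timeit module does a similar job of repeated timed runs. The regex workload here is a made-up example, not the fd-versus-find comparison itself:

```python
import timeit

# Compare two equivalent approaches, hyperfine-style but in-process.
# Workload (extracting digit runs from a string) is purely illustrative.
setup = r"import re; pat = re.compile(r'\d+'); s = 'abc 123 def 456 ' * 100"

t_compiled = timeit.timeit("pat.findall(s)", setup=setup, number=2000)
t_direct = timeit.timeit(r"re.findall(r'\d+', s)", setup=setup, number=2000)

print(f"precompiled pattern: {t_compiled:.4f}s for 2000 runs")
print(f"re.findall direct:   {t_direct:.4f}s for 2000 runs")
```

Like hyperfine's warmup runs, timeit deliberately repeats the statement many times so that one-off noise averages out.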
Is there any way to think about how long a program should take? You know, if it's taking a while to run, should you be worried, or, depending on what you're doing, start looking into why it's taking so long? Okay, so the task of knowing how long a program should run, I think it's pretty infeasible to figure out in general. It will depend on the type of program, on whether you're making HTTP requests or reading data. One thing that you can do: if you know you have to read, for example, two gigabytes from disk and load that into memory, you can make a back-of-the-envelope calculation, like, oh, that shouldn't take longer than X seconds given how things are set up. Or if you are reading some files over the network and you know roughly what the network link is, and things are taking, say, five times longer than you would expect, then you could start investigating. Otherwise, if you don't really know, say you're doing some mathematical operation in your code and you're not really sure how long it should take, you can use something like logging and print intermediate stages to get a sense of, oh, I need to do a thousand of these operations, and three iterations took 10 seconds, so this is going to take much longer than I can handle in my case. So I think there are ways; it will again depend on the task, but with all the tools we've seen, you have a couple of really good ways of tackling that. Any other questions? You can also do things like run htop and see if anything is running; if your CPU is at zero percent, something is probably wrong. Okay, there are a lot of exercises for all the topics that we have covered in today's class, so feel free to do the ones that are most interesting to you. We're going to be holding office hours again today.
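The back-of-the-envelope estimate mentioned above can literally be a few lines; note that the throughput figure below is an assumption (a typical SATA SSD sequential read), not a measured value:

```python
# Rough estimate: how long should reading 2 GB from disk take?
file_size_mb = 2 * 1024
assumed_throughput_mb_s = 500   # assumed SATA SSD sequential read speed

expected_seconds = file_size_mb / assumed_throughput_mb_s
print(f"expect roughly {expected_seconds:.1f}s; "
      f"several times slower than this is worth investigating")
```

The same style of estimate works for network transfers (size divided by link bandwidth) or for loops (iterations times measured per-iteration cost).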
Just a reminder about office hours: you can come and ask questions about any lecture. We don't expect you to get through the exercises in a couple of minutes; they take a long while to work through, but we're going to be there to answer any questions from previous classes, or even questions not related to the exercises. Like, if you want to know more about how you would use tmux to quickly switch between panes, anything that comes to your mind.