OK, so let's get started. Welcome to lecture five of Hacker Tools. Today we're going to talk about program introspection followed by package management. And after a break, we'll talk about OS customization followed by remote machines, like SSH and things like that. So, to start off with program introspection: it's kind of an opaque title, but what we're talking about here is understanding a program, like debugging a program. One technique you can use for debugging programs is printf-style debugging. You have a program, something's going wrong somewhere, and so you stick a bunch of print statements all over the place just to know what's going on. You print out values of variables, things like that. And that's actually an excellent debugging technique. But sometimes it isn't quite good enough, and that's when you bring out more powerful tools, like debuggers. Debuggers let you interact with the execution of a program. Don't worry about reading this code for now, but this is the program we're debugging. Debuggers let you do things like let a program run and then halt its execution when it reaches a certain line in the code. At that point, once the program's frozen, you can poke around in it: look at what's in memory, print out values of variables, and things like that. You can even do more sophisticated things, like execute the program one line of code at a time, getting it to stop after every line so you can inspect the state of the program. And there are all these other advanced features that debuggers offer. So we're going to go through a demo of GDB, the GNU Debugger, which is a debugger that supports many C-like languages. This is just to show you what kinds of things debuggers are used for.
And then after that, I'll briefly show you some other debuggers, one for Python, and then show you graphical debuggers in the context of debugging JavaScript programs in the web browser. Okay, so to start off with, we're going to be looking at this program over here on the left. You may be familiar with C, you may not be familiar with C. It doesn't really matter. This is a really simple program that, when executed, runs main. What this does is loop through the integers i equals one to 10 and call this function say with the integer i as an argument. I've compiled this program already, and if I run it, it will just print out the integers one to 10, just in text form. This is kind of an uninteresting program. It doesn't really have any bugs, but we're going to demonstrate debugger features in the context of this program. So the way you use GDB with C programs is, when you compile, you give a special flag to the compiler to get it to include extra information in the binary that the debugger will use to present you more information as the program is running. That's done with the -g flag for GCC, for example. Depending on your environment and what compiler you're using, this might be slightly different, but the general idea is the same. Normally, when producing the final binary, you want to strip out all this extra information, like what line number corresponds to what instruction in the code. That's just extra stuff that makes the binary larger. But when you're debugging a program, you want all that stuff there so the debugger can make use of it. So here we pass the -g flag, then -o to output to the file example, and then compile example.c, and we'll run this. Okay, now we have a binary, example, that we're going to debug. The way we invoke the debugger is just to call gdb with the name of that file, and it prints out a bunch of stuff.
Basically we get a gdb prompt. And now we can do some things like run the program. The command for that is just run, and we see, okay, it does the thing and prints the output we expect. So now to show you some of the more advanced functionality of the debugger, which gets you a little bit more than just printf-style debugging. One thing you can do is get execution to halt when it reaches a certain line. You can do that in terms of a line number, or you can do it by specifying names of functions. So you can say, okay, when I run this program I want it to start running, and as soon as it hits main I want it to stop. I can do b, for breakpoint, main. And now I see, okay, I've set a breakpoint. This is the address of the function, in the file example.c, on line 24. Now if I run this program, what will happen? It'll start running, and as soon as execution reaches the point where I set a breakpoint, the program halts and the debugger returns me to this debugger prompt, where I can now interact with the program. So now I can do things like print out the state of variables. I can try to print i, and it says i is zero right now because, well, in this case it's referring to a location on the stack that has not been set to one yet. I'll show you a little bit more of printing in a moment, but first I'll show you one other type of breakpoint. Here we set a breakpoint based on the name of the function. We can also say that when we reach a certain line number in the code, we want to stop execution and get our debugger prompt. So we can say b example.c:18, which means, okay, when you reach line 18, stop running. And I'll show you another debugger command. If I type c, for continue (I can also type out the full word), this means resume execution when I've stopped or paused execution. So now the program keeps running until it hits any breakpoint I've set.
If I do info breakpoints, it'll list all the different breakpoints I've set. So here I have breakpoint one, the first one I set, in main. That's not the one that triggered this time. This time what triggered is the breakpoint in say, the one I set by giving a line number, example.c:18. So now my program has run until this point, the first time it hit that line, and the program is now halted and I get this debugger prompt again. Now I can do things like inspect the arguments to say: I can do print i, and since this is the first time it's called, it'll say, okay, i is equal to one, and so on. And at this point I can do stuff beyond just inspecting values of variables. This much I could have done by sticking actual printf statements in here, right? I could have put in a printf of i or something like that, compiled that, and run my program, and it would give me some information about the values of variables. But here I can do other stuff too. I can actually step through execution line by line, and there are a couple of different commands for that. There's step, to continue execution by a single line of C code. There's the command next, to go to the next line of C code, and this one steps over function calls. So if I was on this line of code, for example, and I don't want to go into the code of printf but want to skip over that and keep going through the lines of my program, I can use the command next. And then there's finish, which is to step out of a program. Oh, sorry, step out of a function. So if I've called a function and I'm going through it line by line, and I don't care about this function anymore but want to go back to the caller and keep stepping through line by line there, I can use the finish command. So in this context, for example, if I type in step, it'll say, okay, now it's on this line of code. All right, if I do step again, it'll go... oh, it doesn't have the source code for this, so it's giving some kind of garbage output.
But it would have been showing me the code for printf. I don't really want to debug printf, because it's part of the library and that code is probably correct. I want to go back to my code, so here I can use the finish command and it goes back to the calling frame, so now I'm back to debugging say. And I can, for example, keep typing next again and again and again. If I just press enter in the debugger, it does the same thing as repeating the last command. You can see it just prints out which line of the program I'm on. So okay, I'm in the for loop, I'm about to call say; now I'm inside say and I'm accessing this array, and so on. I can just go through this program line by line like that, and at any point I can inspect the value of these different variables. So now I'm on the fourth iteration of this loop. Okay, so that's some debugger basics. Some other neat things you can do with debuggers: you can set up this tool to look at a variable and stop execution when the value of that variable changes. So I can say, okay, I want you to run this program, and whenever i changes, for example, stop execution. Another thing you can do is set it up so that whenever you read from a certain memory location, execution halts. So let me quit this debugger, just to get rid of the breakpoints I've set (there are other ways of doing that), and start debugging again. Here I can use the command rwatch, which says stop execution whenever I read from this value, and I can give it, say, the third element of this numbers array. Now I can run my program and it will run to that point, where I can print i. So I'm inside say, the argument to say is four, and I'm about to do this memory access that's right here. So this is pretty powerful stuff that you can't do with just printf-style debugging, right? I could probably give you a many-hour-long tutorial on GDB commands, but we don't really have time to go through all of that.
So this is just to show you that debuggers are a thing and they can do a lot more sophisticated stuff than printf-style debugging. Just one last thing I'll show you with GDB. GDB has this command line interface that I've been working with right here, where I can do a single command at a time, print values of variables, and things like that, but it also has a slightly nicer user interface. There's the layout command, which can take a couple of different arguments, and you can look at the documentation to see what exactly it can do. But you can put GDB into this text user interface mode where it has a split window and can show you different things in the different panes. So you can, for example, get it to show you the C code in one window and the corresponding assembly code in another window, and then have this prompt in a third window. Let me maximize this so it's a little bit easier to see. And here, instead of stepping through this program with step, going one line of C code at a time (see, here it's even highlighting which line of C code I'm on and what assembly that corresponds to), I can do things like step by individual instructions of the assembly. So if I do si, for example, that's for step instruction, I can go through this one line of assembly at a time. This still corresponds to this for loop here, but I can see that I'm going through this assembly line by line. And there are other types of layouts here that I can look at. If I want to see what's in the registers, for example, while this program's executing, I can switch to a different layout that shows me that. So here I can see what's in some of the general purpose registers and so on. If I make this bigger, I'll see even more registers. So basically it's a pretty powerful interface that'll let you inspect a lot of things as the program's running. It will even highlight whenever a value in a register changes, just so it's a little bit easier to tell what's going on.
Like here I'm about to subtract a value from EAX, and if I run that instruction, I can see my EAX register's value has been decremented by one, and GDB highlights that for me just so I can see what's going on. So any questions so far? This is a rapid-fire introduction to what a debugger is and what kinds of things it can do beyond maybe whatever you were using for debugging before this. Yes? What other languages does GDB support? It's mostly C-style languages. For example? Sorry, what? Could you give me an example? Oh, so like Go, for example, or C++ or Objective-C are all supported by GDB. So technically it supports anything that compiles down to assembly, which is all binaries, but to get the mapping to source code lines you basically need the language to produce a binary that has that mapping. What the debug symbols tell you is, for any given address in the binary, what the corresponding source file and line are, and as long as you can produce that information, the language will be supported by GDB. Of course, GDB has a particular notion of variables and functions, so if you have a language that's very different, like Haskell, for example, then GDB will not work very well, because it doesn't have the same concept of how a program is running. But if you're using C++, Go, Rust (not quite Java, though), basically anything that doesn't have a very heavy runtime, with the exception of Go, it will work very well with GDB. Any other questions? Yeah. When you use rwatch, does it scan through the full C file for the variable of that name? You're asking about the argument to rwatch? Yeah. It's in the current scope. In the current scope. So it understands the C; basically it looks at the scope in that C file, or whatever function you're in, using the debug symbols. Any other questions? Also, a random cool side note: you might wonder, how is something like rwatch implemented?
Do you interpret the C program and go through it instruction by instruction, checking for access to that memory location? Things like that would be very slow, and there's actually hardware support in your processor for watching specific memory addresses in this way. There's specific support for this kind of debugging. It's kind of cool. There's also a poor man's version of watch called display: if you set a display, then every time you step to another instruction, it will print whatever you set to display. That way you can monitor whether something changes over time. This is handy for monitoring registers, for example: for every instruction I execute, display the following registers. Are you gonna mention rr? I was not going to, but you can talk about it. Okay, then I will mention rr pretty briefly. Notice that here the commands you have are things like next and step. rr is a really, really cool debugging tool that lets you step backwards through the execution of your program. So it gives you reverse versions of those operations, like reverse-step and reverse-next. It does this by basically keeping track of all the system calls and memory operations that you do and then giving you a way to step backwards through your execution. So for example, you can set a breakpoint for when a particular value reaches something or when a particular bug triggers, let your program run however long it takes until it hits that, and then step backwards through the execution to figure out why a particular value became that value. So you can do an inverse watchpoint, for example: give me the last point at which a variable changed before you hit this bug. It's a very heavy debugging tool. It's especially useful for programs where you have non-determinism. Say you have a multi-threaded program and there's a race condition that triggers very rarely.
Well, it's really nice to be able to let your program just run, maybe run multiple times, until that race condition hits, and then go backwards and figure out exactly why that happened. Well, any more questions about GDB? So I also want to show some other debuggers that offer slightly different interfaces. GDB is for programs that are compiled to a binary, a sequence of assembly instructions, like C-like languages. But what about, say, interpreted programs, like Python programs? Well, most programming languages have pretty decent debuggers, and they more or less work like GDB, but just to give you a little demonstration of Python's debugger, which is called pdb: this is in the context of some random program, and you don't really need to worry about the program itself. I just want to show you what you can do with this tool that you can't exactly do with GDB. The way you use the Python debugger is you import pdb, which is the debugger, and then you call pdb.set_trace(). You can basically stick this line anywhere in a Python program, and whenever execution gets to that point, it'll give you a GDB-like prompt. So don't worry about all the surrounding stuff, but anyway, I run this program, and once execution reaches that point, I get this prompt. And what's really cool here is that it's kind of like GDB, where I can step through execution of my Python program, but it also lets me do other things. This is basically a hybrid of a debugger command line and a Python shell, so I can write Python code and it just runs in this context, and I can go and manipulate values of variables in my Python program, things like that, in addition to doing the usual things like printing values of variables. So this can be a really powerful way of debugging, which is really nice for these dynamic, interpreted languages.
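To make the pdb workflow just described a bit more concrete, here is a minimal sketch. It is not the program from the demo: the function running_total and the helper debug_scripted are made up for illustration. It also feeds pdb a fixed list of commands instead of using an interactive prompt, so that it can run unattended. In normal use you would simply put import pdb; pdb.set_trace() at the point of interest and type the same commands by hand.

```python
import io
import pdb


def running_total(values):
    """Toy function to debug (hypothetical, not the lecture's program)."""
    total = 0
    for v in values:
        total += v
    return total


def debug_scripted(commands):
    """Run running_total under pdb, feeding it debugger commands from a string.

    Returns the function's result and everything pdb printed.
    """
    output = io.StringIO()
    debugger = pdb.Pdb(
        stdin=io.StringIO(commands),  # scripted stand-in for typing at the prompt
        stdout=output,
        nosigint=True,   # do not install a Ctrl-C handler
        readrc=False,    # ignore any personal .pdbrc file
    )
    # runcall stops at the first line of the function, like a breakpoint there.
    result = debugger.runcall(running_total, [1, 2, 3])
    return result, output.getvalue()


if __name__ == "__main__":
    script = "\n".join([
        "p values",      # debugger command: print a variable
        "n",             # debugger command: run the current line, stop at the next
        "!total = 100",  # plain Python: overwrite a local in the paused program
        "c",             # debugger command: continue to the end
    ]) + "\n"
    result, transcript = debug_scripted(script)
    print(transcript)
    print("result:", result)
```

The third command is the part GDB-style debuggers do not give you as directly: it is ordinary Python executed inside the paused frame, so the loop then continues from the modified value of total.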
Yeah, basically I just wanted to mention this tool and say that it's kind of more than GDB: it's a hybrid of an interpreter for the language and a debugger. I'm not going to go into the details of pdb, but any questions about this? It's just the concept that this exists. So, quick mention: if you like interactive Python, IPython, there's also ipdb, which has the IPython shell inside. Oh. It's really convenient because it has autocompletion, it knows a lot of the docstrings of functions, stuff like that. Yeah, so basically for every language you'll have some debugger, and it will give you this pretty sophisticated way of understanding what's going on with the program. And now, beyond just these command line interfaces, you also have graphical debuggers, and one really common one to look at is in the web browser. So if you go into your web browser (and I think this will be covered more in Thursday's lecture, so I'm not actually going to go through the details of how to use this), basically here I have a graphical tool that'll let me do more or less the same set of things, but maybe it's a little bit nicer to use in some ways. Using this tool, if I was looking at a piece of JavaScript, there's the JavaScript code here, and then there are buttons for, say, halting execution. Don't worry about what exactly is on the screen, but there are buttons for doing the equivalents of next and step and finish. And there are these panes which show the values of lots of different variables, so I don't have to type out the print command and things like that. So if you prefer graphical user interfaces, there are GUI debuggers too for a lot of languages. Your IDE might have a built-in debugger that'll let you set breakpoints on lines of code just by double clicking on the side of that line of code, rather than typing breakpoint, name of file, colon, line number, or things like that. So that's what a graphical debugger is.
So that's all we have to say about debuggers. And now we're going to mention a couple of other useful things. The next thing is particularly useful if you're doing any kind of systems programming. This is a command called strace, and what it does is run a program, and while the program is running, for every system call that the program makes, it prints out what system call was made and what arguments were passed to it. So this is another kind of tool that can be useful for debugging. For example, what happens if I just run the echo command, like echo hi? Well, at a high level, the real guts of this is a single system call: it calls write with an argument that's the string that's just the word hi. And if I run this with strace, you can see all the other stuff that happens when my shell executes this program and it gets set up. Eventually it gets around to issuing that single system call, a write system call to file descriptor one, which is standard out, containing the actual contents of what I want to print. I just wanted to point out this tool as something we can use to debug system calls, or programs that make system calls. And after that quick mention, I think the next main topic we want to cover is something called profiling. So GDB and debuggers like it are useful for, well, debugging a program: if something goes wrong, like it's producing the wrong output, I want to go and understand why it's doing that. But another way a program can be wrong is that it can be too slow, or use too many resources, or something like that, and these kinds of debuggers aren't really good for understanding that. Here I'm stepping through my code line by line, but it's not going to say, oh, this piece of your program should be optimized, it's really slow. And so that's what profilers are for.
And so again, this is a general kind of tool that can be applied to any language, and you'll find specific programs that are used for profiling code written in specific languages. But this stuff we're going to demonstrate to you in the context of programs in the Go programming language. A couple of types of profiling we'll show you are CPU profiling and memory profiling. So one thing I might want to know is, okay, my code's running for a long time, what is it spending most of its time doing? Here, this is just some random program, you don't need to worry about what's being profiled, but I can instruct my tool to record a CPU profile, and it'll basically instrument the process that's running and figure out where it's spending all of its time. So again, this runs for some time and it produces some output that contains the CPU profile information, and then there'll be some other program that's used to visualize or analyze the output of my profiler. In the context of the Go programming language and the Go tooling, it can actually produce this kind of interactive webpage where you can see a call graph along with annotations for where the program is spending most of its time. So here, I ran some program. And it says, okay, here's a function that took a bunch of time, and it highlights it in red. It actually took up the majority of the CPU time spent running my program. And I can see the breakdown: okay, this calls this function, this calls a bunch of different functions, and some of them actually don't end up taking too much time in terms of total program execution time, but this function here, for example, does use up a lot of the execution time. So I can follow this call graph down and figure out which is my code, which is runtime code, and so on. Based on this, I can go and read my source code and try to understand why it's slow. Basically this will tell me what to focus on when I'm trying to optimize my code.
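The same idea can be sketched in Python rather than Go, as a stand-in for the tooling shown in the demo. This uses the standard library's cProfile and pstats modules, which produce a text report rather than an interactive call graph. The functions slow_square_sum, cheap_setup, workload, and profile_workload are made up for illustration, with one deliberately slow function so there is an obvious hot spot to find.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Deliberately slow loop: the hot spot we expect the profiler to surface."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_setup():
    """Cheap helper that should barely register in the profile."""
    return list(range(10))


def workload():
    cheap_setup()
    return slow_square_sum(200_000)


def profile_workload():
    """Record a CPU profile of workload() and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = workload()
    profiler.disable()

    # A separate step analyzes the recorded profile, much like the separate
    # visualization tool in the lecture. Sort by cumulative time so functions
    # that dominate total runtime come first.
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(5)
    return result, report.getvalue()


if __name__ == "__main__":
    _, text = profile_workload()
    print(text)
```

Reading the report top-down plays the same role as following the call graph: the entries with the largest cumulative time are the ones worth reading the source for first.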
Another thing I might want to look at, besides CPU usage and how many cycles my CPU's using, is how much memory my program's using. Maybe my program is super bloated in terms of memory usage and I want to optimize that. Well, then I should know where I do a lot of allocations or what kinds of objects take up a lot of memory. The tool to use there is a memory profiler. For Go, this is one way I can do it. So I'm running all my test code, but also recording a memory profile while that runs. A couple of seconds later, I'll have the results of that. Then I can run a similar command to what I ran before, except it analyzes that memory profile. And here it'll show me, okay, here's a function that allocates a whole bunch of memory, and this is actually a breakdown of where in that function it's allocating memory. And okay, here I'm copying a bunch of objects, and that's actually the majority of memory usage in my program. So again, similar to the CPU profile, looking at the breakdown of where I'm allocating memory will help me figure out where I should spend my time reading and understanding my code and trying to optimize it. And again, for every programming language, the exact commands we use and the exact way the interface looks will be a little bit different, but the general idea is the same: profilers instrument the execution of a program and let you understand those results in some useful way. One last thing we'll show you, just to show that you can do this, of course, for different programming languages. Here I'm using a different tool, something called perf, to do a CPU profile of a different program called ag. This searches files in the current directory for files that contain the character X and will produce a count of them, and there's whatever the program produces as output. But this tool has instrumented the run of this program. And now I can go and analyze this.
And here I can see a list of where I spent most of my time and what exactly was the function being run where I spent that much time. I can go in here and look at more details and so on. This is just to show you that there are different tools for doing these sorts of things. You can also get just high-level summary information if that's all you want. The perf stat command, for example, can tell you, oh, I spent this many cycles running this program, or how many instructions I executed in the run of this program, or things like that. That can also help you understand the behavior of a program and figure out where you should spend your effort in terms of optimizing it. Also, one thing I forgot to mention is kind of the simplest profiler there is. This is a program that's already installed on your machine, and it's called time. It will give you basic information about the execution of a certain program. So if I go back to this program I ran earlier and run it under time, it'll tell me how many seconds I spent running this program, and how much of that time was spent running code in user space versus how much was spent running system calls. It's kind of useful for debugging certain things. So that is a very quick introduction to what profilers look like. Any questions about that? If you're curious about learning more about profiling, there are a lot of good guides and tools online. I also recently did a video where I ported a tool called FlameGraph, which shows you really nice performance profiling graphs, to Rust. In that video, we also do a lot of just discussing how profiling works. So if you just search on YouTube for my name (it's on the website, which is easier than spelling it out), you'll find the video, and then you can have a look through that if you're interested. That goes into a lot more detail than this, which may or may not be what you want.
But there are really cool things to do with profiling. It really is useful for figuring out what is worth optimizing in your program and what is not. And we can also add a link to that in the resources section of this page so you can find it more easily. Okay, so that's what we have for program introspection. Hopefully that's a quick introduction to how you can understand more about what's going on with a program beyond just reading the code. So any questions about debuggers or profiling or anything else in general in this area? Okay.
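Going back to the time command mentioned earlier, the real, user, and system breakdown it reports can be approximated from inside a Python process. This is only a rough sketch of the idea, not how time itself is implemented. The helper name measure and the two sample workloads are made up for illustration, and it relies on os.times() from the standard library, which reports the user and system CPU time consumed by the current process.

```python
import os
import time


def measure(func):
    """Run func once and report wall-clock, user-space, and kernel time in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    func()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "real": wall_end - wall_start,               # elapsed wall-clock time
        "user": cpu_end.user - cpu_start.user,       # CPU time running our own code
        "sys": cpu_end.system - cpu_start.system,    # CPU time inside system calls
    }


def cpu_bound():
    """Mostly user-space work: arithmetic in a loop."""
    return sum(i * i for i in range(300_000))


def sleepy():
    """Mostly waiting: wall-clock time passes but little CPU time is used."""
    time.sleep(0.2)


if __name__ == "__main__":
    for workload in (cpu_bound, sleepy):
        print(workload.__name__, measure(workload))
```

Comparing the two workloads shows why the split is useful: a large gap between real time and user plus system time suggests the program is mostly waiting rather than computing.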