Hi. I have renamed my talk for today to "spying on your programs", in which we're all going to become spies and wizards.

So I wanted to start out by saying that this talk is not really about Python. I really love Python, and I wrote Python all day today, but this talk is not about Python in any serious way. What it is about is a kind of fundamental problem when you're programming, which is: you have a program and you want to know what your program is doing. A lot of the time you don't know, right? You have some ideas about what you think your program might be doing, and often you're wrong.

This talk is about treating your program as a black box, abandoning the notion that you might actually know what it's doing, and just observing it: observing its inputs and outputs. And you're like, well, I thought it was going to do this, but that's what it did, so I guess that's the reality. When you treat your program as a black box like this, you don't care what programming language it was written in, because you have no idea what it was doing anyway, and it might as well have been written by an enemy. And your past self was basically an enemy, right?

Normally when you debug, you look at your source code or you add print statements. You know the programming language. We're not going to be doing that. We're just going to be wizards. And I'm going to explain what I mean by that, because that's not a very helpful description.

One really basic example of a tool that lets you see what your programs are doing without caring about what they are is top, right? Or any kind of system monitor, where it's like, well, Xorg is using 10% of your CPU, and Chrome is using... you might have like a million of them. You can see I have like five Chrome processes, which are slowly eating up all my memory, right?
And it doesn't matter whether a Python program or a C++ program is taking all your CPU. It's taking all your CPU, and that's it. So this is a very broad view of what a program is doing: how much CPU and memory it takes up. We're going to see how to learn much, much more specific things about our programs, and I'll make that more clear as I go on.

So we're going to start out with wizard school, where we talk a little bit about operating systems. Then we're going to figure out how to find configuration files. And then we're going to investigate the case of the slow program, in which we're going to have three mystery programs, and we're going to figure out why they're slow without reading them, just by running them.

So let's talk about operating systems for like five seconds. What is an operating system for? Let's say you go to google.com. Your operating system does a million things for you while you go to google.com, right? You start typing in the address on your computer and you press keys, and it knows how your keyboard works. It runs code every time you press a key. You send data over the network and it handles all your network packets, right? It knows about network protocols like TCP/IP, and it's like: okay, I got a packet. Okay, I'm sending you to Google. Okay, I got a packet from Google. Okay. And it interprets all of that so that you don't need to know how these network protocols work.

It does things like writing files to disk, like Chrome writing some kind of cache to disk. This means that you don't have to know how your hard drive works and how file systems work, right? Because something needs to know how the file system works, but that person is not you. You're just like, hey, operating system, I just want to open this file, okay? And it's like, yeah, no problem, don't worry about it. It's got you covered.

It does things for you like allocating memory, right?
If you're out of memory, it's like, no, sorry, it's over. This happened to me earlier today: I was running this Java program which was trying to draw a really big graph, and it was going and going and going, and at some point the operating system was like "no", and Java was like "no", and then it crashed and I couldn't see my graph anymore. Anyway, that's not relevant. The point is that your operating system manages your memory for you so that you don't have to. It communicates with your graphics card so that you can see stuff on the screen. Like right now, apparently my operating system knows how to project, because I don't know how that works, right?

Anyway, so there's all this stuff that it does for you, right? But then there's this question: how do you ask the operating system to do stuff for you? How does that even work? This was confusing to me. I didn't know how this worked for the first nine years or so after I started learning how to program. But then one day I learned it: I learned about system calls. Who knows about system calls? Lots of you. System calls are the greatest.

So system calls are the interface to your operating system, right? You want to open a file, and you're like, hey, can you open me a file, please? You want to start a program, and you're like, hey, can you start me a program, with the execve system call? You want to change a file's permissions? You want to send data over a network?
Lots and lots and lots of stuff happens through system calls. And system calls are going to be one of the keys to spying on our programs, because we're going to use strace, if you know what that is. And if you don't, that's the whole point: I'm going to tell you about it.

So we've learned everything we need to know about operating systems for this talk. One is that your operating system does tons of stuff for you, and you should love it, and it's on your side. Except when... anyway, it's overall on your side. And two is that programs tell it what to do using system calls. So we are now operating systems experts.

Now we're going to use our newfound magical operating systems knowledge to debug some stuff, and we are going to solve the case of the missing configuration file. Who has ever run a program, either your program or someone else's program, where you don't know what it's using to configure itself and you can't find the file? Is that really annoying, or is it really annoying? It's just really annoying. And you don't want to go read the documentation. I hate reading documentation. It sucks. It takes too much time. You have to Google it, and what if your internet isn't working? Or maybe you could try to read the code, and that sucks too. So no, we're not going to do any of that. We're just going to find out what the configuration file is, like a wizard.

So strace is the program that lets you be a wizard. By that I mean it lets you trace what system calls your program is calling. strace is Linux only. There are similar programs for BSD, OS X, and so on, but I'm just talking about Linux. That's what I have on my computer, so that's it. Right.
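To make "programs tell the operating system what to do using system calls" concrete, here is a minimal Python sketch. It is not from the talk: the function name write_and_read_back and the file name demo.txt are made up for illustration. It assumes a Unix-like system, where the os.open, os.write, os.read, and os.close functions are thin wrappers around the system calls with the same names (on Linux the open may show up as openat), so these are the kinds of calls a system call tracer would show.

```python
import os
import tempfile


def write_and_read_back(path, payload):
    """Write payload to path, then read it back, using low-level os calls.

    Each os.* call below asks the kernel to do the work through a system
    call (open/openat, write, read, close). The function name is
    hypothetical and only exists for this illustration.
    """
    # Ask the operating system to create (or truncate) and open the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)  # hand the bytes to the kernel
    finally:
        os.close(fd)

    # Ask the operating system to open the same file for reading.
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical file name
        print(write_and_read_back(path, b"hello, kernel\n"))
```

The program never touches the disk or the file system format itself. It only asks, and the operating system does the rest.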
So what strace does is it tells you every single system call your program calls, which is pretty much the greatest thing in the universe. I have a kind of obsession with strace. I've written probably nine blog posts about strace. I strace on the train, and also on the plane. You should also strace on the plane. It's really fun. Right.

Anyway, here's how you strace. You take your program, like Google Chrome, or any program you want, in any programming language. It could even be in Fortran. So you start Google Chrome. When you start Google Chrome, the first thing that happens is it execs Google Chrome, which is maybe not a big surprise, but it was a surprise to me, and then I was like, oh yeah, of course, I get it. The first system call that happens is the exec system call. So you run strace on Google Chrome, and it starts Google Chrome. Great. We want to find the config files that Google Chrome is using.

Now, this is what it looks like when you strace something. I mostly wanted to show you this because it's really confusing, and I want to tell you not to be too afraid, and that nobody will hurt you, and that strace won't hurt you. You maybe see some stuff that you recognize. You're like, oh, libvorbis.so.0. Oh, something about some sound libraries. Great. Who knows?

So if we are solving the case of the missing configuration file, my favorite thing to do is be like: okay, there's all this output that I don't understand, and we're going to ignore all of it and just look at the times when it's opening files. So it's opened a bunch of files, like .cache.
I have no idea what that is. One important thing with strace is ignoring things when you don't know what they are. But you can see there's something like .config/google-chrome/Consent To Send Stats. It's like, what did I consent to? But that was a configuration file that Google Chrome opens, right? Which is fine. But you can run this on your own program, right? Like, I use Hadoop, and sometimes I'm like, oh, where's the Hadoop configuration? I already have to edit an XML file, and I don't want to also have to Google for where that XML file is. So I just strace it, and it tells me where the XML file is, and then I can open it. Okay. So we found our opens. If I look in it, there's like a key.

You can strace for other things, like execve. Who here ever writes wrapper scripts that call other programs? Do you ever have bugs in those scripts? You have bugs, I have bugs. If you strace for execve and look at all of the child processes, you can see every process that your program is starting, and then you know all of them. I'm going to skip these. These are about sending data over the network, which is also awesome. You can do tons of stuff with strace. If your program is writing to some kind of log file and you forgot where it is, or you never knew, you can look at the write system call and that'll tell you where it's writing. Amazing.

So we have strace. One really important thing to know about strace: if you have the urge to strace a really important program, like a production database, never do that. That's a huge mistake, because strace can make your program like 200 times slower, because it stops it every time it runs a system call, which is a lot. So only strace programs which you're comfortable with being made 60 times slower. Okay.

Now we're going to solve the case of the slow program. I wrote three slow programs for this talk. One of them is slow because it spends
all of its time doing CPU stuff, like doing calculations. One of them is slow because it's writing too much stuff to disk or reading too much stuff from disk. And the third one is slow because it's waiting for a slow server to reply: it sent out a request to some server, and the server is like, I'm just going to wait for a long time before I reply to you, instead of actually replying on time. This is kind of a cool thing about performance, actually, right? When your program is slow, there are so many possible reasons. It's such an exciting world of possibilities, and frustration, to figure out why it's slow. So I've written three, and we're going to do a mystery hunt to find out which one is which. And we're not going to use strace. We've already used strace. strace is over. We're going to use new tools.

So let's start with mystery program number one. I ran time on it. time is a super great tool for programs that start and finish, because it tells you how long it took. It took two seconds. It spent 0.09 plus 0.01 seconds on the CPU: five percent. Five percent of the time is not a lot of time. So what is it doing if it's not computing? What? Waiting on I/O? Well, I/O could be correct, but what it's really definitely doing is waiting for something. It could be waiting for I/O, it could be waiting for something else, but we know it's waiting for something.

So how do we know what it's waiting for? Does anyone have any ideas? We could use strace, but we're not going to, partly because imagine that you actually need your program to keep running at the same speed and it's not okay to slow it down, and partly because we already know strace. So we're going to do something else, because that's more fun. Let's find out what it's waiting for. I'm going to do a demo. Live demos always work well. Yeah, great.
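The arithmetic behind reading time's output for mystery program one can be sketched in a few lines of Python. This is an illustration, not the tool used in the talk: the helper names cpu_fraction and measure are made up, and measure only approximates what time reports by comparing wall-clock time with the CPU time from os.times for the current process.

```python
import os
import time


def cpu_fraction(real, user, system):
    """Fraction of wall-clock time spent on the CPU (user + system)."""
    return (user + system) / real


def measure(func):
    """Run func and return (real, user, system) seconds for this process.

    A rough stand-in for the time command: wall-clock time comes from
    time.perf_counter, CPU time from os.times.
    """
    start_wall = time.perf_counter()
    start_cpu = os.times()
    func()
    end_cpu = os.times()
    real = time.perf_counter() - start_wall
    user = end_cpu.user - start_cpu.user
    system = end_cpu.system - start_cpu.system
    return real, user, system


if __name__ == "__main__":
    # Numbers quoted in the talk: 2 seconds total, 0.09 + 0.01 on the CPU.
    print(f"mystery one: {cpu_fraction(2.0, 0.09, 0.01):.0%} CPU")

    # A function that only waits should also show a low CPU fraction.
    real, user, system = measure(lambda: time.sleep(0.5))
    print(f"sleeping:    {cpu_fraction(real, user, system):.0%} CPU")
```

A low fraction says the program is mostly waiting. It does not say what the program is waiting for.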
Mystery one. So we run it. It ran, it says hi, and it took two seconds. Great. So we're going to use something called wchan, or the wait channel. This is something I learned like yesterday, which is why I'm telling you about it in the talk today, because it's the best thing I learned yesterday. The wait channel is this: your operating system keeps track of what a program is waiting for any time that it's waiting. I had no idea.

Also, I had a command already in place, and I lost it when I restarted my computer, so you're going to get to wait with me while I type it out. Oh, look. So this is ps. This is a little small, sorry. ps normally will tell you things about your processes, right? Like, blah blah blah, whatever. So we're going to print, for each process, the PID, the wait channel (whatever that means, which is what it's waiting for), and the name of the process, and then we're going to grep for mystery one. Does that sound good? Great. We're also going to try to rearrange our windows. Great. And we're going to do that in a loop, because we haven't started our program. It's not running yet, right? So we're going to do it like a hundred times, and grep for mystery one. We're also going to sleep. Great. So now it's doing that, but our program isn't running yet, so we also need to run our program.

Okay, amazing. It's happening. I found it. It's here: sk_wait_data. Do you know what that means? Because I didn't really know. The way I found out was I Googled it. I Googled sk_wait_data, and it was like: wait for some data on a network socket. It was waiting for some data on a network socket, guys!

I can show you what this program does. It's not that interesting. Mystery one: requests.get, localhost, 5000. That's the program I wrote. I can show it to you. It gets, and then it sleeps for two seconds. That's all. So it's not a big surprise that it was slow. Great.
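The demo uses ps in a loop. As a rough Python equivalent, the sketch below reads the wait channel straight from /proc, assuming a Linux system where /proc/<pid>/wchan is readable. The function names read_wchan and poll_wchan, the sample count, and the sleep interval are all made up for illustration, since the talk does not give the exact command.

```python
import os
import sys
import time


def read_wchan(pid):
    """Return the wait channel name for pid, or None if it is unavailable.

    Assumes Linux with /proc mounted. A value of "0" generally means the
    process is not currently waiting inside the kernel.
    """
    try:
        with open(f"/proc/{pid}/wchan") as f:
            return f.read().strip()
    except OSError:
        # No such process, no /proc, or not permitted to read it.
        return None


def poll_wchan(pid, times=100, interval=0.1):
    """Sample the wait channel repeatedly, like the ps loop in the demo."""
    samples = []
    for _ in range(times):
        value = read_wchan(pid)
        if value is None:  # process exited or /proc is not available
            break
        samples.append(value)
        time.sleep(interval)
    return samples


if __name__ == "__main__":
    # Pass the PID of the program you want to spy on; defaults to this script.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for sample in poll_wchan(pid, times=5):
        print(pid, sample)
```

Depending on the kernel version and your permissions, the file may only ever show 0, in which case the ps approach from the demo or running as root may be needed.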
So we debugged mystery program one, and we are like wizards. Now we're going to debug mystery program number two. We won. It was the network. This is the best.

So, mystery program two. We time it again, because as investigators you want to take the same first step every time. It spends 2.74 seconds total. All of that time is in user space. None of that is the operating system's work. We're done.

So I said that we're not going to use programming-language-specific tools. But if your program is at 100% on the CPU and it's all in user space, you should just use a programming-language-specific profiler at this point, right? It's just kind of slow, and there's not any operating system magic that I think is really going to help us out here. I can tell you what this program does. That's what it does. This one is not a huge mystery. I don't have any special magical tricks for this one. I think that if you were going to profile this program, knowing this, you should probably just profile it using your regular tools.

Mystery program number three. There are only three possibilities, and we've eliminated two of them. But this one is really fun, and it has some surprises, even though you probably know what it is if you were paying attention. There are still surprises. I started debugging this, and I wrote the program, and then I was like, what happened? I didn't understand. So here we go.

We ran time again. It took 10 seconds, 10% CPU. What does that mean? Again: waiting. Someone said I/O, and you're probably right this time, because of elimination. But what we definitely know is that it's waiting. But there's more here.
There's more to know. So we're going to run python mystery three. Oh, look. Let's run it again. Let's run it again. So I ran it three times. And let's keep running it. It's not taking the same amount of time every time. Normally it takes 10 seconds to run. Now it's not taking 10 seconds to run. Why? What? No. But we're going to find out what's going on anyway. So it takes a varying amount of time to run, from like 0.3 seconds to like three seconds. You whispered something, which was correct, and you're going to be quiet.

We're going to look at something called dstat, which I think is really cool. dstat is a little bit like top, and it just keeps on printing what's going on on your system. So it's like: here's how much CPU is being used, here's how much you're reading and writing, here's how much your network is receiving and sending. This is pretty cool because it gives you an overall picture of what's going on on the system. So we just care about disk, let's say.

Oh no, we were going to do that thing. What was that thing? We lost our command, so we're going to type it out again. We run ps, and then we look at pid, wchan, command. Oh no, I forgot to sleep, everyone. This is what happens when you forget to sleep. And the grep. Thank you. I think the good thing about live demos is that they're just way more fun. Okay, great.

So you see some stuff here. I am seeing some sleep_on_buffer and sleep_on_page, and I'm taking those to be about buffers and pages, and this is I/O, which we already knew because of spoilers.
But we could have also learned that through logic and not spoilers, which is good to know. So now that we know it's I/O, I'm just going to start looking at dstat and get it to tell me how much it's reading and writing. So let's start it.

So that finished pretty much right away. But you'll notice that even though the program was already done, the system spent some time writing data after it was done. That's kind of cool. Let's do it again. So it finishes, but then it's still writing data. It's not over. So the kernel, the operating system, is still doing work for our program even though the program already finished.

Could this be why the program doesn't always take the same amount of time? The answer is yes, right? This is because the file system caches, because your operating system loves you and it wants you to be happy and it wants your programs to finish. So it's like: okay, I wrote that data, no problem, don't worry about it. But it's not true. It didn't write the data. It's just waiting until later, when it will write the data. And this means that if you keep on trying to write more data, at some point it's like: no, you have to stop, I don't have space to remember what you wanted to write anymore. And what this means is that it takes a different amount of time to write the data every time.

What this also means is that we can write a different version of the program where we do something called fsync. fsync tells the operating system: yo, I actually need you to write this data right now. No pretending. Don't try to trick me. And if we try the version with fsync, which is why there was some whispering about flushing your caches, then it always takes the same amount of time, which is longer.

So that was the demo. We won again. We always win, unsurprisingly, because we constructed these examples ourselves. But this was pretty fun, right?
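Going back to the fsync version from the demo: the talk does not show the source of mystery program three, so the following is only a minimal sketch of the idea, with a made-up function name (timed_write), a made-up file name, and an arbitrary 16 MiB payload. It times the same write with and without os.fsync.

```python
import os
import tempfile
import time


def timed_write(path, data, sync):
    """Write data to path and return the elapsed seconds.

    With sync=True, os.fsync asks the kernel to push the file's data to
    the storage device before we stop the clock, instead of leaving it in
    the cache to be written out later.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        if sync:
            f.flush()             # move Python's own buffer into the kernel
            os.fsync(f.fileno())  # ask the kernel to write it to disk now
    return time.perf_counter() - start


if __name__ == "__main__":
    data = os.urandom(16 * 1024 * 1024)  # 16 MiB, an arbitrary size
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.bin")  # hypothetical file name
        print("without fsync:", timed_write(path, data, sync=False))
        print("with fsync:   ", timed_write(path, data, sync=True))
```

How big the difference is depends on the machine. If the temporary directory lives on an in-memory file system, for example, fsync has very little to do and the two timings may look almost the same.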
We got to treat all of our programs as black boxes. They were written in Python, but that was an accident. It didn't need to be that way. They could have been any programs. We're basically wizards.

One thing I want to suggest, even though I kind of hate giving people advice in talks, is that it's pretty fun to learn more about your operating system and to learn more about solving these kinds of performance problems. Instead of learning a new programming language, you can learn the operating system that's already on your computer, that all of your programs run on. And it's so fun, and it's the greatest.

That's all I have to say. Thank you.