We're here with Anjana Vakil, who is going to speak to us about Python bytecode. So, thanks to the speaker. I feel like we need, you know, stand-up comics here to open up the crowd or something. Hi, I'm Anjana Vakil, and yeah, I hope you're excited about bytecode, because I am. Can everybody hear me okay? Great. So, who am I? Well, my name is Anjana, and I'm a Pythonoholic. I've been addicted to Python for probably about three years. Right now, I use Python as an outreach intern at Mozilla, doing some testing work for them. But what I want to talk to you about today is some explorations into the core of Python that I started while I was a participant at the Recurse Center, which is a really cool programming community in New York City, where you're encouraged to just follow whatever excites you about programming. So, today I'd like to tell you a little bit about an adventure I had getting started with Python bytecode. I'm by no means an expert in it, but I want to bring you along on my first encounters with it and show you why I think it's really cool. While I was at the Recurse Center, I came across this puzzle; I think of it as a Python puzzle. It turns out that Python code runs faster if you stick it inside of a function and then call that function. Maybe you're already familiar with this; I was not. For example, say we have a rather lengthy for loop that does nothing useful: it just evaluates a variable i for each i in a rather long range. If we run that loop at the top level of a Python module, it takes quite a bit longer than if we stick it inside this run_loop function and then call that function once. And to me, this was puzzling, because looking at this source code, I don't see any meaningful difference. In fact, all I see in the inside-function version on the right is that, if anything, Python should have more work to do, because it's got to create a function and then call it.
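The puzzle can be reproduced with a small timing sketch. This is my own reconstruction, not the talk's slides: the loop bound (10**7) and the use of exec to mimic module-level execution are assumptions.

```python
import time

# The same useless loop, once as module-level code and once inside a function.
src = "for i in range(10**7):\n    i\n"

def run_loop():
    for i in range(10**7):
        i

t0 = time.perf_counter()
exec(src, {})                 # run at "module" (global) scope: i is a global name
t_global = time.perf_counter() - t0

t0 = time.perf_counter()
run_loop()                    # run inside a function: i is a local
t_inside = time.perf_counter() - t0

# The function version is typically noticeably faster.
print(t_global, t_inside)
```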
So, I couldn't really understand, from looking at the source code, why the right-hand side would be so much faster. It turns out that looking at the bytecode can give us a little more insight than looking at the source code for certain types of Python puzzles like this one. And that all has to do with what happens when we run Python code. This was something I hadn't really thought much about before: what happens when I actually execute a Python program? Today, I'm just talking about CPython. A lot of this is an implementation detail specific to the CPython interpreter, but hopefully that's what a lot of you are using. The differences between CPython and other interpreters are also really fascinating, but they're not the topic today. So, when we're using CPython to run some Python code, we start out with our beautiful, Pythonic, easy-to-read, nicely indented source code, which looks fantastic. That gets compiled by the part of CPython called the compiler. It gets turned into a parse tree, an abstract syntax tree, a control flow graph; it doesn't really matter for our purposes right now, they're all just different abstractions of what we want our code to do. The important part is that it ultimately gets compiled down to bytecode, which, obviously, we'll be talking more about in a moment. And that bytecode, whatever it is for now, gets passed to the interpreter and is what the interpreter actually runs, the interpreter being a virtual machine that performs operations on a stack of objects. The interpreter executes that bytecode, and then you get out whatever awesome stuff your Python program is designed to do. Great. Okay, so this bytecode, what is it? Well, as we saw, it comes at an in-between place between your source code and the effects of your program. So, in one sense, it's an intermediate representation of your program. And in fact, it's the representation that the interpreter itself sees.
The interpreter, unfortunately, doesn't get to look at your beautiful, readable, Pythonic source code. It only gets to see this bytecode. So, if we think about the interpreter as a virtual machine, we can think about the bytecode as the machine code for that virtual machine. When we think of languages that are more traditionally considered compiled, we think of taking source code and translating it into machine instructions for an actual physical machine. In this case, it's pretty much the same idea; it's just that the machine is virtual: it's the Python interpreter instead of a physical machine. And since the virtual machine we're dealing with, the Python interpreter, is basically a stack machine, the bytecode we give it is a series of instructions for what to do: which objects to push onto that stack, which operations to perform on objects already on it, how to pop things off and return them back to us. So the bytecode is a series of instructions. Another interesting thing: if you've ever wondered about those .pyc files that pop up all over the place when you're importing Python modules, these are actually caches of the bytecode. This is bytecode that the compiler has already spit out. And the nice thing about this caching mechanism is that, since we saw that from source code to execution we have those two steps, compilation and then interpretation, if we haven't updated the source code since the last time we ran the program, we can skip the first part: we can reuse the bytecode we already compiled before. So that's what those .pyc files are. And if you've ever tried to open one of them, they're gobbledygook; they're not meant for us measly humans to understand. So how can we humans read this bytecode that's intended to be read by Python? Well, there's a really handy module called dis, which has a fun name that stands for disassembly, as in disassembling the bytecode.
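As a quick illustration of the caching step, we can ask Python to byte-compile a module by hand with the standard py_compile module. This is my own sketch, not from the talk; the file name night.py echoes an example used later, and writing it to a temporary directory is my own choice.

```python
import pathlib
import py_compile
import tempfile

# Write a tiny one-line module, then byte-compile it without importing it.
tmp = pathlib.Path(tempfile.mkdtemp())
source = tmp / "night.py"
source.write_text('print("Ni")\n')

# py_compile.compile returns the path of the cached bytecode file,
# which lands under __pycache__/ with an interpreter tag in its name.
pyc_path = py_compile.compile(str(source))
print(pyc_path)
```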
The documentation is linked right up there. dis allows us to analyze certain types of Python objects and read the bytecode for those objects in a way that we humans can understand, instead of looking at the raw bytes themselves, which aren't that helpful to us. For example, say I have a really simple function called hello that returns a greeting (can somebody help me pronounce this one?). If we dis this hello function, we get our first peek at disassembled bytecode. Cool: these two lines at the bottom here are our really, really simple bytecode. We have just two instructions here, and without really knowing what all these numbers are, what the columns are, what we're looking at, we can already get a sense of what's happening: we're getting some kind of constant, a string, onto the stack, and then we're returning it. Sweet. So let's break it down. What exactly are we looking at here? What does it mean when we see the output of dis? We have a series of rows, where each row in the output is an instruction to the interpreter. On the left-hand side, a lot of the time we'll see a line number; the 2 here is the line in our source code. This is just to help us see how the source code lines up with the bytecode. Not every instruction will have a line number; as you can see here, the RETURN_VALUE line doesn't have one. That's because sometimes more than one instruction corresponds to one source code line, so we only see the line number on the instruction that starts the line. Next to that, we can see an offset in bytes: how far into this string of bytes this particular operation is. Not super interesting for us humans, in my view. But what is interesting is the next thing, which is this string, LOAD_CONST, short for "load constant". That's the name of the operation.
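A sketch of that first example. The original greeting string isn't recoverable from the recording, so a plain English one stands in; note also that the exact opcodes vary by Python version (LOAD_CONST plus RETURN_VALUE on older interpreters, folded into a single RETURN_CONST on 3.12+).

```python
import dis
import io

def hello():
    return "Hello!"   # stand-in for the talk's greeting string

# Capture the disassembly as text so we can inspect it programmatically.
out = io.StringIO()
dis.dis(hello, file=out)
print(out.getvalue())
```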
In a minute, we'll look at some more of those and see what we can find out about all the different operations you might encounter when reading disassembled bytecode. If the operation in question takes an argument, which not all of them do, you'll see some information about the argument on the right-hand side. In the last two columns on the right, we see the argument index. How to interpret that, an index into what object, depends on the operation: there are a few different places where Python keeps track of the values, like constants or variable names, needed to carry out a particular operation, and that's all something you can look up in the documentation. What's more interesting for our purposes now is the value of that argument, which you can see to the right, in parentheses. This is Python giving you, silly human, a little hint about what it is that this operation is operating on. So, some operations: we've already seen LOAD_CONST, which takes an argument c and pushes c onto the top of the stack, the TOS. Then there are things like BINARY_ADD, which takes the top two items already on the stack, adds them together, and puts the result on top of the stack. And then there are things like CALL_FUNCTION, whose argument is a bit strange: it tells it how many positional and keyword arguments the function is expecting, so that it knows how many objects to take off the top of the stack, and in which order, to pass to that function. There are a ton of these. I wouldn't be able to cover them all even if I had an hour, or a whole day. But they're all conveniently documented in the documentation for the dis module, which is linked at the top of the page here. And the names that we see for each of these operations are just for us humans. Python doesn't care.
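Those columns are also available programmatically: dis.get_instructions yields one Instruction object per row, with the offset, operation name, and argument value as attributes. This sketch uses the spam-and-eggs add function that gets dissed next in the talk.

```python
import dis

def add(spam, eggs):
    return spam + eggs

# Each Instruction carries the fields that dis.dis prints as columns.
for ins in dis.get_instructions(add):
    print(ins.offset, ins.opname, ins.argval)
```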
It has a number for each of them, of course. That's called the opcode, the operation code. And if you're curious about the correspondence between a name and a code for a given operation, you can use the attributes dis.opmap and dis.opname. opmap is a dictionary where you can look up a particular operation name and find out its code. And if you happen to already know the code, opname is an indexed sequence of all the operation names, so you can find out which code corresponds to which name. Just some convenience there. So, now we have a basic idea of how the dis function works, how we can disassemble some bytecode. What can we use it on? Let's try to dis some things. Let's find out what we can dis. I love this name. We already saw we can dis a function. Here's a nice little example: we're adding spam and eggs. And if we dis add, we see we have an ever-so-slightly more complex thing going on here: we're loading two names, spam and eggs, and then doing a BINARY_ADD on them. Cool, starting to get comfortable with this. What else can we dis? How about a class? Here's a really simple class: it's a parrot, it's got one attribute called kind, it's a Norwegian Blue. This is Monty Python humor, for anyone who's not familiar. And it has a method, isDead, which always returns True. When we pass that Parrot class to dis, it disassembles each of the methods on that class, including the constructor. And so here, in the disassembly of __init__, we've got a new operation name: STORE_ATTR. Cool, starting to get familiar with some of these new operation names. In my experience, a lot of the time they're self-explanatory. But if you're ever curious what some operation name does, just go to the dis documentation; it's all laid out.
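That round trip between names and opcodes can be sketched in two lines:

```python
import dis

# dis.opmap: operation name -> numeric opcode.
code = dis.opmap["LOAD_CONST"]

# dis.opname: a sequence indexed by opcode, mapping back to the name.
print(code, dis.opname[code])
```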
Another thing we can disassemble, if we're using Python 3.2 or newer, is a string containing valid Python code. We don't have to actually put that code in a module; we can disassemble the string directly. It gets compiled to a code object and that gets disassembled. So, here we're just assigning spam and eggs on one line, which is a cool thing Python lets us do, and we see a new thing, UNPACK_SEQUENCE. Also a pretty self-explanatory operation name. Okay, what about an entire module? Let's say I have a really simple module called night.py; it has one line, which prints the string "Ni". I can actually disassemble that straight from the command line by passing the -m flag with the dis module and then the name of night.py. Cool. So now we see we're calling this function print, and we see the argument to CALL_FUNCTION is some number of positional and keyword arguments, which is what I was talking about before. What we can gather from this is that we're loading this constant and then calling the function print on it. Cool. I think it's cool, anyway. All right, what about another way to dis a module? Well, as we saw, we can dis code strings, so what if we read in the module as a string using open().read()? Now we have the whole contents of the module as a string, and we can dis that. Cool: it's basically the same as last time. There's one less kind of return at the end, but essentially we're getting the same thing. Good to know. And another way we can dis a module is by importing it and then dissing the imported object.
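The string example can be sketched like this; capturing the output into a string so we can look at it as text is my own addition.

```python
import dis
import io

# dis accepts a string of source code (Python 3.2+): it compiles the
# string to a code object and disassembles that.
out = io.StringIO()
dis.dis("spam, eggs = 'spam', 'eggs'", file=out)
print(out.getvalue())   # includes an UNPACK_SEQUENCE instruction
```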
In this case, night.py got a little more complicated: we added this function, isFleshWound, which always returns True. And as you'll notice, when I import night, the whole module gets executed; it prints "Ni". But in the disassembled bytecode, we don't see any mention of the printing part; all we see is isFleshWound. So when you try to dis a module this way, by importing it, it's only going to disassemble the functions in that module. Anything else that's there, just as a script, is not going to show up in the output of dis. That's just something to know about the different ways of using dis. Okay, is there anything else we can dis? How about nothing? What if we pass no arguments to it? In that case, we're not dissing nothing; we're dissing the last traceback, the last error. Which is a cool thing, because let's say I tried to print this variable spam, which I had forgotten to assign, so I get a NameError. Of course, if I then do dis.dis with no arguments, I can see the bytecode that tells me exactly where that error came from. You see the arrow to the left of the operation names there; that indicates that when I loaded print, that was fine, I found print, but when I loaded spam, I had a problem. So, these are some different things that we can dis, which, if you're like me, is just fun: you can spend lots of time dissing everything you can get your hands on just to see what it does. Apparently it can also help you in solving some of the puzzle challenges that one of the sponsors has out there. But other than that, why do we care about doing this? Why do we want to do this if we're not at a conference where we get free USB power packs for solving puzzles?
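One wrinkle when reproducing the no-arguments trick outside the REPL: dis.dis() with no arguments reads the last traceback, which only exists in an interactive session. dis.distb accepts an explicit traceback object, so a script version might look like this sketch:

```python
import dis
import io
import sys

try:
    print(spam)               # NameError: spam was never assigned
except NameError:
    tb = sys.exc_info()[2]
    out = io.StringIO()
    dis.distb(tb, file=out)   # same listing dis.dis() gives in a REPL
    listing = out.getvalue()

print(listing)                # the --> arrow marks the failing instruction
```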
Well, as we saw, dis.dis with no arguments is a really useful debugging tool, because sometimes the error messages we get from Python, although they're usually wonderful, don't tell us everything we need to know. For example, say I had a line in some really complicated mathematical code with two division operations on the same line: ham divided by eggs, plus ham divided by spam. That gives me a ZeroDivisionError, and it tells me which line in my code the error came from, but it doesn't tell me whether it was eggs or spam that gave me the error. If I dis the traceback, I can actually see that: okay, we loaded ham, we loaded eggs, we did a true divide, and there was no problem. Ah, okay, so eggs was fine. Then we loaded ham again, and we loaded spam, and when we did that divide, the little arrow says that's where the problem was. So I know that the problem in my complex mathematical computation is spam, and that's what I have to go back and fix. This can be a really cool debugging tool for certain situations. And it can also be a helpful tool for solving puzzles, not just the kind the sponsor has, but also the kind I mentioned at the beginning, where we have this for loop that takes a lot longer outside of a function than inside, and yet in the source code, the two look pretty much identical. So, let's try to get a little more insight by dissing the outside-function module and the run_loop function from the inside-function module, and see how they compare. Okay, so we have outside_function.py. We now know a few different ways of dissing a module; I'm going to choose the open().read() method, get a string called outside, and then dis that. So this is what Python sees when we run outside_function.py. Okay, I don't understand all of this, and I don't necessarily need to.
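A sketch of that debugging session, using dis.distb on the caught traceback. The variable values are my own; any values where spam is zero reproduce the error.

```python
import dis
import io
import sys

ham, eggs, spam = 1, 2, 0

try:
    result = ham / eggs + ham / spam   # which divide raised ZeroDivisionError?
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    out = io.StringIO()
    dis.distb(tb, file=out)
    listing = out.getvalue()

# The --> arrow sits on the second divide, the one involving spam.
print(listing)
```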
I can get a general sense of what's going on. We're loading this range function. We've got a somewhat big number that we're loading in. Then we have these new things, GET_ITER and FOR_ITER. FOR_ITER, that's our for loop there. So that's what that looks like to Python. Cool. And then inside of that, we're storing i, I guess, for each time we go through the for loop, and then we're loading i, because we had that really, really useful loop body in the code we just saw. Okay, that seems to make some sense. Let's see how it compares with the inside version. From the inside_function.py file, what we care about is this run_loop function. So I'm going to import that, call it inside just for convenience and symmetry, and then dis inside. At first glance, this looks pretty much the same as what we just saw. So let's see if switching back and forth really fast will tell us anything. Outside, inside, outside, inside. Okay. So what do we notice? Differences. Well, first of all, on the left-hand side, we notice that some of the line numbers are different. That's because we had one extra line in the inside version: the function definition. That's probably not important. What else have we got? Aha. With the range function, in one case it's loading it as a name, and in the other as a global. All right, maybe there's some difference there, but we're only doing that once, so that's probably not that big of a deal. What we probably care more about is what happens inside the iteration, after that FOR_ITER. And here we see that when we're inside the function, we're using something called STORE_FAST and LOAD_FAST, and outside it's STORE_NAME and LOAD_NAME. See offsets 16 and 19 there? I don't know what those mean. STORE_FAST sounds like it would be faster, and LOAD_FAST sounds like it would be faster, but I don't know why, or what these do. So how can I find out?
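The back-and-forth comparison can also be done side by side in one script. This sketch checks for the NAME/FAST opcodes in the two disassemblies; a shorter range keeps the listing small and doesn't change which opcodes appear.

```python
import dis
import io

src = "for i in range(10):\n    i\n"

def run_loop():
    for i in range(10):
        i

# Module-level code stores and loads i with the *_NAME opcodes...
out_global = io.StringIO()
dis.dis(compile(src, "<module>", "exec"), file=out_global)

# ...while the same loop inside a function uses the *_FAST opcodes.
out_inside = io.StringIO()
dis.dis(run_loop, file=out_inside)

print("STORE_NAME" in out_global.getvalue())   # True
print("STORE_FAST" in out_inside.getvalue())   # True
```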
Well, I can investigate by going into the dis documentation, which has a list of all of the different operation codes and tells you what they do. I've just copied those over here. Okay. STORE_NAME stores the top of the stack under a name in the current namespace, whereas STORE_FAST stores it into a local variable slot; likewise, LOAD_NAME looks a name up in the namespace, while LOAD_FAST pushes a reference to a local variable. That still doesn't tell us why one is faster, so we can dig deeper into ceval.c, which is where the CPython interpreter processes all of these different codes. There's a really cool talk by Allison Kaptur called "A 1,500 Line Switch Statement Powers Your Python". This is true: there's a huge switch statement telling CPython what to do with all the different operation codes it might encounter. So if we look at the actual code for those operations, LOAD_FAST and LOAD_NAME, we see that LOAD_FAST is a little bitty thing, about ten lines, and it involves a lookup into an array called fastlocals, which sounds fast, because it is fast. LOAD_NAME, on the other hand, is more code: it's longer, more complicated, about 50 lines, and it involves a dictionary lookup, which is quite a bit slower. So it turns out that one of the main speed differences here, which is a little tangential to the bytecode discussion, is that when you have code inside a function, Python knows when the function is defined how many local variables it needs, so it can allocate a fixed-length array, and when it needs to look something up in that function, it can just index into that array and pull it out really quickly. Whereas at the global scope, it doesn't know; you might assign new variables at any time, so it keeps things in a dictionary, and looking things up in that dictionary is a bit slower.
Anyway, then there's another thing called opcode prediction, which makes things even faster when certain operations come in pairs, because CPython can predict what's coming next: it can save some ticks on common operations that always go together by predicting them in advance. And the combination of FOR_ITER and STORE_FAST happens to be one of these predicted pairs, so it moves a lot faster than the combination of FOR_ITER and STORE_NAME. So if you're curious, I strongly suggest you check out the really cool Stack Overflow conversation "Why does Python code run faster in a function?", and Allison Kaptur's talk, which talks a bit more about how we can start exploring this giant switch statement that tells Python how to interpret all of these different operation codes.