I'm playing with Python bytecode. Please welcome Scott and Joe. Hello. All right, so hey, everyone. Thank you all for coming out here today. I know you were all really excited to see Scott and Joe come here and talk about playing with Python bytecode. Unfortunately, I have a little bit of bad news for everyone: we just received word from Scott and Joe that they're not actually going to be able to make it today. But fortunately, they sent me their outline. They said they were going to talk about CPython's internal code representation. They were going to talk a little bit about creating a code object from scratch, whatever that means. And they were going to talk about techniques for manipulating and updating code objects to do interesting things. So I don't know what any of that means, but they sent me their outline, and I was thinking: we're here, Python's a great interactive language, maybe we can work through what some of that stuff means together. So if we're going to be creating and mutating functions and code objects from scratch, we probably need some functions to work with. Why don't we start with just a simple add function? So we'll do def add(a, b), and we'll return a + b. And let's just call add(1, 2) and make sure that works. OK, so far, so good. One of the great things about Python is everything's an object, everything's introspectable. It gives us all these great tools for working with code. Everything in Python just sort of carries around all of its state, so I'm guessing maybe the code is somewhere down here. Everything secret and interesting in Python starts with a double underscore, so maybe we can find something there. __annotations__, that's not it. __call__, __class__, __closure__, __defaults__... Bytecode, I don't see any bytecode, but there is this __code__ attribute. This is a code object at some memory address, and it's from file <ipython-input-8>.
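For reference, the little session above amounts to something like this (run in a plain interpreter rather than IPython, so the file shown on the code object will differ):

```python
def add(a, b):
    return a + b

print(add(1, 2))             # 3
print(add.__code__)          # a code object, e.g. <code object add at 0x...>
print(add.__code__.co_name)  # 'add'
```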
So this is probably where this bytecode thing lives. Maybe we should see what else lives on this object. So we've got a whole bunch of attributes. We've got co_argcount, which is two. We've got co_cellvars, an empty tuple; I guess we don't have any cellvars. We've got co_consts as a tuple containing None. So maybe None is somehow a const, or maybe this signifies that we don't have any consts. Bytecode, I don't see any bytecode, but again, we've got this co_code attribute, and it's a bytes object. So we've got co_code, it's a bytes object, and I'm guessing this is the bytecode. So I guess the bytecode for add is b'|\x00\x00|\x01\x00\x17S' — a pipe, some control characters, another pipe, a capital S. Clearly that makes sense to everyone, right? All right, you know what, I've got an idea. We've got this string, and it's full of non-printing characters. This probably isn't meant to be interpreted as a string. Maybe a better thing for us to do would be to look at the raw values of the bytes in that byte string. So I can do print(list(add.__code__.co_code)). All right, this is definitely a little more structured. We've got 124, 0, 0, then 124, 1, 0. So there's kind of a repeating pattern here. Maybe this is somehow the same thing happening twice, but with different values. And then we've got a 23 and an 83, which definitely means something. I was really hoping this would be easier. All right, you know what, I've got an idea. We're here at PyGotham, surrounded by some of the best, most knowledgeable Python programmers around. Surely there's someone here in the audience who knows about bytecode, who's worked with bytecode, maybe can come teach us how bytecode works. So is there anyone here who knows about bytecode? Anybody? Well, I'm actually a PSF-certified bytecode expert. Well, ladies and gentlemen, we have a certified bytecode expert. Can we get a microphone for him? No need, I brought my own microphone.
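Looking at the raw byte values, as above, is just:

```python
def add(a, b):
    return a + b

raw = list(add.__code__.co_code)
print(raw)
# On the Python 3.5 interpreter used in the talk this shows
# [124, 0, 0, 124, 1, 0, 23, 83]; the exact bytes differ on other versions,
# since opcodes and the instruction format change between releases.
```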
Wait, you brought your own microphone? Let's get back on track. You had the right idea looking at that bytes object as a list of ints, but you're not gonna get very far looking at it like that. Luckily, Python provides a module for inspecting this. Why don't you try import dis? Okay, import this. All right, the Zen of Python, by Tim Peters. No, no, import dis with a D. It's the disassembly module. Okay, all right, import dis. All right, I've imported dis, what do I do with dis? Oh, we'll call dis.dis(add). All right, dis.dis(add). All right, well that's definitely better than just a list of byte values. Can you tell us a little bit more about what this table means? Sure, so while we have eight bytes in our bytecode, we actually only have four instructions. We have a LOAD_FAST, another LOAD_FAST, a BINARY_ADD, and a RETURN_VALUE. So the first three bytes, the 124, 0, 0, represent the first LOAD_FAST. The 124, 1, 0 represents the second LOAD_FAST, then the 23 and 83 are the BINARY_ADD and RETURN_VALUE respectively. Okay, so 124, 0, 0 corresponds to a single LOAD_FAST instruction, but 23 and 83 correspond to BINARY_ADD and RETURN_VALUE. Why does LOAD_FAST take up three bytes in the bytecode when BINARY_ADD and RETURN_VALUE only take up one? LOAD_FAST says to load a local variable, but it needs to know which local variable to load. So in the 124, 0, 0, the first byte, 124, is the opcode for LOAD_FAST. This tells us which operation we're going to perform. Then the 0, 0 is the argument, which says to load local variable zero. Wait, local variable zero? Didn't we wanna load a? The argument is a 16-bit little-endian integer which is an index into an array of local variables. dis helps us out by showing the numerical value of the argument on the right side there, and it shows the meaning of that numerical argument in parentheses. So while it is a LOAD_FAST of zero, that actually means load a.
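Rather than memorizing numbers like 124, the dis module carries the opcode tables for whatever interpreter you're running — dis.opmap maps names to opcode numbers and dis.opname maps back (the numbers themselves are version-specific, which is exactly why looking them up beats hardcoding them):

```python
import dis

lf = dis.opmap['LOAD_FAST']     # the opcode number on *this* interpreter
rv = dis.opmap['RETURN_VALUE']
print(lf, dis.opname[lf])
print(rv, dis.opname[rv])
```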
Okay, so 124, 0, 0 is a single LOAD_FAST instruction, and then a LOAD_FAST of zero actually means load the local variable a. Where exactly are we loading a to? Load instructions push values onto a stack which is shared between instructions. This allows other instructions to manipulate those values later. If you'll notice, there's no argument in the bytecode for BINARY_ADD, because it will just pop the top two values off the stack, add them together, and push the result onto the stack. Okay, let me make sure I understand this. At the start of this function, we're gonna have an empty stack. Then we're gonna do a LOAD_FAST of zero, which will push a onto the stack. Then a LOAD_FAST of one, which will push b onto the stack. Then we're gonna do a BINARY_ADD, which will pop both values off the stack, add them together, and push the result back onto the stack. And then finally, we're gonna execute a RETURN_VALUE instruction, which pops the top value off the stack and returns it to the calling stack frame. Exactly. Okay, I think I understand the right-hand side of this table. How about these numbers to the left of the instruction names? Those are the bytecode offsets where those instructions appear. So of course, the first instruction starts at index zero. However, the second instruction starts at index three, because indices one and two are occupied by the argument to the first LOAD_FAST. Okay, and then BINARY_ADD appears at index six because indices four and five hold the argument to the second LOAD_FAST instruction. Okay, I think I understand this side and I think I understand this column. What's this two in the top left corner for? That's the line in the source file where these instructions appear. This would be a little more apparent if we try a function with more than one line. Okay, how about maybe, you know, def add_assign(a, b), and then we'll just do x = a + b, and then we'll return x, and then we'll do dis.dis(add_assign).
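The table dis.dis prints is also available programmatically: dis.get_instructions yields an Instruction per entry, with the offset, opname, and argument the table shows (the exact instruction sequence varies by CPython version, but the offsets and names line up with the printed disassembly):

```python
import dis

def add_assign(a, b):
    x = a + b
    return x

# One row of the dis.dis table per Instruction.
for ins in dis.get_instructions(add_assign):
    print(ins.offset, ins.opname, ins.argrepr)
```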
Okay, so what you're saying is that this two here says that the first four instructions of this code correspond to the second line of the source, where we're doing x = a + b, and then this three says that the last two instructions correspond to the third line of the source, where we're doing return x. And here we can also see that we're doing a STORE_FAST instead of a LOAD_FAST, to assign to x. I think you're getting the hang of this. Why don't we try a function that's a little more complicated? Okay, maybe, I don't know, like an absolute value function. So how about we do def abs, taking a single argument x, and then we'll do if x > 0: return x, else: return -x, and then we'll do dis.dis(abs). All right, let me see if I can do this one. We've got a LOAD_FAST of zero, which is gonna push x onto the stack. Then we're gonna do a LOAD_CONST — so I guess there's multiple kinds of load instructions in CPython. We're gonna do a LOAD_CONST of one, but that actually means load the value zero. Then we're gonna do a COMPARE_OP of four, and that means greater than. How does CPython know that a COMPARE_OP of four means greater than? Not all arguments are just indices into some array. Here the argument to COMPARE_OP is an enum which says which relational operator to use. So there are entries for greater than, less than, equal to, and so on. Okay, and then after that we're gonna do a POP_JUMP — hold on, this one might be a little more complicated for you. Luckily, POP_JUMP_IF_FALSE does exactly what it says it does. It's gonna pop the top value off the stack. If it's truthy, it will continue execution like normal, but if it's falsy, it will jump to the bytecode offset specified in its argument, which in this case is 16.
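That enum is exposed as dis.cmp_op: COMPARE_OP's argument selects a comparison from this table. On the interpreter in the talk the raw oparg is a direct index; newer CPythons pack extra bits into the argument, but the table is still there and dis still decodes it for you:

```python
import dis

print(dis.cmp_op)     # the comparison table for this interpreter
print(dis.cmp_op[4])  # '>' — why COMPARE_OP of four means greater-than
```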
Okay, so when we get to POP_JUMP_IF_FALSE, if the top value of the stack is truthy, we're just gonna continue on executing the LOAD_FAST and the RETURN_VALUE, but if it's falsy, we're gonna jump to the instruction at index 16, because 16 is the argument to POP_JUMP_IF_FALSE. If you'll notice, those arrows there are dis's hint for the non-bytecode-experts that this is a jump target. Okay, wait a second. So if x is greater than zero, we're gonna go here, and we're just gonna go straight to these two instructions. If it's less than or equal to zero, we're gonna jump here and do a LOAD_FAST, a UNARY_NEGATIVE, and a RETURN_VALUE. There's two instructions at the bottom of this function that are never gonna be executed. Why are those even there? Those instructions are just dead code. So CPython has a fairly simple code generation algorithm, and one of its rules is that if a function does not end in a return statement, a LOAD_CONST of None and a RETURN_VALUE will be emitted for you, even if they can never be executed. That seems kind of wasteful, don't you think? It's only an extra four bytes at the end of a function, which is half a pointer. The CPython developers decided it's not worth the extra complexity to remove it. Okay, but say we really cared about those four bytes. Is there some way we could remove them? Well, you don't have to use the CPython compiler. You can just instantiate a code object like any other object. Okay, well, any other Python object I instantiate by calling its constructor, right? So I need to find the constructor, or the type, for a code object. Where do I find the type for a code object? That would be in the types module. Okay, from types import — and what am I importing? CodeType; it is the type of code. Okay, from types import CodeType. And I guess we should look at the docs to see what we do with this. So do print(CodeType.__doc__). All right, we've got a billion arguments to this thing.
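The billion arguments are worth reading off your own interpreter, because the constructor's signature has changed repeatedly across versions (3.8 added posonlyargcount, and 3.11 added a qualname and an exception table, for example):

```python
from types import CodeType

# The docstring is the authoritative argument list for *this* interpreter.
print(CodeType.__doc__)
```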
And the docs say "Create a code object. Not for the faint of heart." All right, well, fortunately for us we've got a bytecode expert here to help us out. All right, maybe we should see what we wanna do here. So I guess we should just get started, right? So we'll do my_code = CodeType( — all right, the first argument is argcount. Well, I guess we should probably figure out what it is that we wanna write here first, right? Yeah, I guess we can write our own abs function, right? How about we start off with something a little more suited for you. Maybe like add_one? I feel like we could have done something a little bit harder than add_one, but fine, okay. So add_one, I guess we'll just have def add_one(x), and this will return x + 1. You got it. All right. Whoops, let me put this back: add_one, x, return x + 1. Okay, and so I guess we probably need those code docs again, so print(CodeType.__doc__). There we go, cool. All right, so we've got CodeType here, and we're gonna do my_code = CodeType(. Okay, well, argcount is gonna be one, since we just have one argument. Then we've got kwonlyargcount. We don't have any keyword-only arguments, so this is just gonna be zero. Next up, we've got nlocals. Well, we only have x, there's only one local variable, so that's probably just gonna be one. Next up, we've got stacksize. What's the stack size gonna be here? The stack size tells Python how much space to reserve for values on the stack. These are the slots that will be used when we push values onto the stack, and it needs to be able to hold the maximum number of elements that will ever appear on the stack at any one time. Okay, well, in this function, the largest the stack is ever gonna be is right before we execute the BINARY_ADD instruction, when we have both x and 1 on the stack. So the stack size here needs to be two.
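That stack-depth reasoning matches what CPython itself computes and stores on the code object:

```python
def add_one(x):
    return x + 1

# The deepest the value stack ever gets is 2: x and the constant 1
# are on the stack together, right before the add.
print(add_one.__code__.co_stacksize)  # 2
```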
Next up, we've got flags. What are the flags about? The flags are a bit mask holding various options that a function could have. There are many of these, so I've taken the liberty of preparing a little material ahead of time. Wait, you prepared ahead of time? Would you be so kind as to hit the down arrow on the keyboard? How did you get this here? Let's get back on track here. So the first flag that we care about is CO_OPTIMIZED. CO_OPTIMIZED tells the interpreter that it's free to make certain optimizations when executing this code object. In practice, this means that this code object comes from a function, as opposed to a module or a class body. The next flag that we care about is CO_NEWLOCALS. CO_NEWLOCALS says that a new locals dictionary should be allocated every time we execute this code object. Again, this normally means that it comes from a function. Okay, I'm guessing that CO_VARARGS and CO_VARKEYWORDS tell us whether the function takes *args or **kwargs. Exactly. The next flag that we are interested in is CO_NOFREE. CO_NOFREE says that this code object does not share any variables with any other code objects through a cell variable or a free variable. This means that this function is not a closure. Okay, and then after that, we've got CO_COROUTINE and CO_ITERABLE_COROUTINE. What's the difference between a coroutine and an iterable coroutine? These two flags were added in Python 3.5 to support the new async keyword. So CO_COROUTINE says that this code object comes from an async def function, but CO_ITERABLE_COROUTINE marks an old-style coroutine which has been decorated with types.coroutine. Okay, well, fortunately, I think that's all the flags, right? Oh, we've got more flags. These are the flags that are enabled when you use a from __future__ import statement. For example, from __future__ import division. Okay, I've seen division, I've seen absolute_import, with_statement, print_function. What's CO_FUTURE_BARRY_AS_BDFL?
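You can decode the bit mask for any function with dis.COMPILER_FLAG_NAMES. On the 3.5 interpreter in the talk, the value the expert lands on later, 67, is OPTIMIZED | NEWLOCALS | NOFREE (1 | 2 | 64); which bits your interpreter sets may differ, but a plain function always gets OPTIMIZED and NEWLOCALS:

```python
import dis

def add_one(x):
    return x + 1

flags = add_one.__code__.co_flags
# COMPILER_FLAG_NAMES maps each bit to its CO_* name.
set_flags = sorted(name for bit, name in dis.COMPILER_FLAG_NAMES.items()
                   if flags & bit)
print(flags, set_flags)
```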
This flag says that you've enabled enhanced inequality syntax. Obviously, okay. I guess, can we just get back to that code I was writing before? Why did you rewrite my code? You'll see that I've selected the flags we need: 67. So let's keep going. I'm so changing all my passwords when this is over. Okay, so we did argcount, we did kwonlyargcount, nlocals, stacksize, and flags. Next up is codestring. So, I don't see bytecode anywhere else, so I'm guessing codestring is our main event here. So we're gonna need a bytes object with all of the instructions that we need. All right, Mr. Bytecode Expert, what are the instructions we're gonna need here? 124, 0, 0; 100, 0, 0; 23; 83. You got that down? Care to explain any of that for the rest of us? Sure, of course. What we want to do is load our local variable onto the stack, then load the constant one onto the stack. Then we will add these values together and return the value. So first, we're gonna start with 124, which is the opcode for LOAD_FAST. Since we only have one local variable, it lives at index zero. Next, we are going to load a constant, so we need the opcode 100, for LOAD_CONST. Again, we only have one constant, so it lives at index zero. Finally, we're gonna use 23, which is the opcode for BINARY_ADD, followed by 83, which is the opcode for RETURN_VALUE. All right, I guess that's not that bad. Next up, we've got constants. So you said we're only gonna have one const there, right? So I guess this is just gonna be a tuple containing the value one. Next up, we've got names and varnames. What's the difference between a name and a varname? Names are the names of any global variables or attributes which are referenced in this function. Since we don't have any, we can just use an empty tuple. All right, one empty tuple coming right up. Varnames, on the other hand, are the names of any local variables in this function, so we can just use the tuple containing the string 'x'. Okay, we've got an ('x',) tuple.
After this, we've got... The next four don't really have much meaning for a handwritten code object, so we're gonna start with filename, which is the file where this code object comes from. We don't really have one, so just pick your favorite string. Next, we have name, which is the name of this code object — the name of this function. I guess that should just be 'add_one'. Yep. Then we have firstlineno, which is the first line in our source file where this code appears. Since we don't really have a file, we don't have a first line, so just pick your favorite positive integer. Next, we have lnotab, which stands for the line number table. This is a bytes object which encodes a mapping between bytecode offsets and line offsets. Since we don't care about the line information, we can just use an empty bytes object. All right, one empty bytes object. And then last but not least, we've got freevars and cellvars. What's the deal with those? Those are the names of any variables that we share through cell variables or free variables. Since we don't have any, and we also set CO_NOFREE, these had both better be empty. All right, two empty tuples. I guess, moment of truth: let's see if that worked. All right, well, it didn't crash, so I guess we've got a code object. Let's try calling it, so I should be able to do my_code(5) and I should get — hey, what gives? I thought you said you were some kind of bytecode expert. Well, we don't normally just work with code objects, now do we? No, we normally work with function objects. And I bet I know what you're gonna say next, which is that I can make a function object just like any other type in Python, which means I need to import FunctionType from the types module. So I can do from types import FunctionType, and let's see how we call FunctionType. Hopefully it's not as bad as CodeType. All right: print(FunctionType.__doc__). All right: create a function object from a code object and a dictionary.
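Jumping slightly ahead to where this lands: wrapping a code object in FunctionType looks like this. (This sketch uses a compiled function's code object rather than the handwritten one, so it runs unchanged on any Python version — hand-assembled bytecode is tied to the exact interpreter it was written for.)

```python
from types import FunctionType

def add_one(x):
    return x + 1

# A function is essentially a code object plus a globals dict.
clone = FunctionType(add_one.__code__, {})
print(clone(5))  # 6
```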
All right, well, I've got a code object and I know how to make a dictionary, so I should be able to do my_add_one = FunctionType(my_code, {}). All right, well, that didn't crash. All right, my_add_one(5) gives me six. This bytecode thing's not so bad. Yeah, well, why don't you see how close you were to the one that Python gave us? Okay, well, let's do a dis.dis(add_one), and then we'll just print a separator so we can see the difference, and then dis.dis(my_add_one). All right, well, we've got some nonsense line numbers over here, but other than that, we've got LOAD_FAST, LOAD_CONST, BINARY_ADD, RETURN_VALUE, and LOAD_FAST, LOAD_CONST, BINARY_ADD, RETURN_VALUE. I think we got it exactly correct. Not quite exact. You'll notice CPython uses a LOAD_CONST of one; however, you are using a LOAD_CONST of zero. Okay, that's a little interesting. Why is CPython generating a LOAD_CONST of one? What does CPython have in the consts there? So, print(add_one.__code__.co_consts), and maybe we'll just put that next to my_add_one's. What the heck — CPython just has a None in the co_consts there. Why is that there? Nothing in this function uses None. That's a little quirk of the CPython compiler. None will always appear at index zero in the consts, even if it's not referenced. Wait, so are you saying that our handcrafted, artisanal, non-GMO bytecode is even more sleek and well optimized than what CPython itself generates? In a way that does not matter at all. I don't know, that None might be the difference, you know. Wait a second, so if CPython is just reading values out of this co_consts tuple, what happens if I just update it in place? What if I do my_add_one.co_consts = (2,)? Sorry, .__code__.co_consts — that first one would just have set an attribute on the function. All right, what gives? Why can't I do that? Well, I mean, let's imagine for a second you did do that.
Anyone who had a reference to my_add_one now just got a reference to my_add_two, which is probably not what they wanted. I don't know, my_add_two sounds like a great function. Though I guess I can imagine how you might want your consts to actually be constant. So does that mean there's no way for us to create or update or change around a code object that we already have? Well, we can't mutate it in place, but we can always just make a new one, by copying all the attributes from another code object and swapping out any values we wish to change. Okay, so you're saying that what we need is like a function that does a functional update on a function. Okay, I think I can write that. I went ahead and wrote this one for you. It's pretty long, a little complicated. Honestly, you'd probably get it wrong. Fine, ignoring that. Okay, so this says: a function that performs a functional update on a function. So what update is gonna do is take a function f, grab its __code__, and then create a new code object, passing all the attributes of the old code object but swapping out any that were passed by keyword directly to this function. So now we're gonna have a new code object that's swapped out any attributes that were provided, and then we're gonna pass that into FunctionType and copy the globals and the name and all the other attributes of the code object. Of the function object. Right. Okay, so what you're saying is that if I do update(my_add_one, co_consts=(2,)), then that should give me my_add_two. And then if I do my_add_two(5), I get seven. Hey, this bytecode thing's not that bad. I think we're all well on our way to becoming certified bytecode experts. That's cute and all, but you can only get so far updating the metadata. The real meat of the code object is in the bytecode itself. Well, co_code is just another attribute of the code object, right?
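A sketch of an update helper in this spirit: the talk predates it, but since Python 3.8 code objects have a replace() method that does the copy-with-swaps part, so we don't have to spell out every constructor argument by hand. The const-rebuilding line is our own addition to keep the example version-portable, since whether None occupies index zero of co_consts varies by compiler version:

```python
from types import FunctionType

def update(f, **kwargs):
    """Functional update: a new function like f, with code attributes swapped."""
    new_code = f.__code__.replace(**kwargs)  # Python 3.8+
    return FunctionType(new_code, f.__globals__, f.__name__,
                        f.__defaults__, f.__closure__)

def my_add_one(x):
    return x + 1

# Swap the constant 1 for 2 wherever it sits in co_consts, rather than
# hard-coding its index.
new_consts = tuple(2 if c == 1 else c for c in my_add_one.__code__.co_consts)
my_add_two = update(my_add_one, co_consts=new_consts)
print(my_add_two(5))  # 7
```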
So if I can swap out co_consts with a new tuple, I can swap out co_code with some new code. Now you're cooking with gas. What if we wrote a function that swapped out all the 23s with 20s? Wait, 23s and 20s? Binary adds with binary multiplies. I forget we're not all bytecode experts. Okay, so what you're saying is what I need is some sort of def add_to_mul, which is gonna take a function f. And then I wanna grab the co_code off it, so it'll be old = f.__code__.co_code. And then what I wanna do is replace all the bytes in this code object with a value of 23 with new bytes with a value of 20. So I can do new = old.replace(bytes([23]), bytes([20])). And then what I wanna do is invoke update to swap out that code on our function: return update(f, co_code=new). And now if I do add_to_mul(my_add_two), I'm gonna get my_mul_two. And if I do my_mul_two(5), I get 10. This bytecode hacking stuff isn't so hard when you know how everything works. You know, I think there's actually a bug in this code generation algorithm you gave me. No, I don't write bugs. How could there be a bug here? We just swapped out all the binary adds with binary multiplies. Well, no. We swapped out all the bytes with a value of 23 with bytes with a value of 20. And you told me mere moments ago that not every byte in the bytecode is an instruction. Some of them are arguments. I mean, 23 means BINARY_ADD. There's no way a 23 byte would ever appear as an argument, right? Well, what if we had a function with 23 local variables? No one's gonna write a function with 23 local variables. Well, now that you mention it, I actually have a function right here that has 26 local variables. So this is my getX function. I use this all the time at work. And this function, you pass it the alphabet and it gives you x back, right? So if I do getX and I star-unpack ascii_lowercase, which is just a string containing all the letters of the alphabet.
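A version-tolerant sketch of the add_to_mul idea: on interpreters up to 3.10 it swaps the BINARY_ADD opcode for BINARY_MULTIPLY; on 3.11+ there is a single BINARY_OP opcode whose argument selects the operator (the NB_ADD=0 / NB_MULTIPLY=5 codes here are assumptions about that table). Since 3.6 every instruction is exactly two bytes, so we can step through opcode positions instead of blindly replacing bytes — which also sidesteps the 23-as-an-argument bug being discussed. Needs 3.8+ for code.replace():

```python
import dis
from types import FunctionType

def add_to_mul(f):
    code = bytearray(f.__code__.co_code)
    if 'BINARY_ADD' in dis.opmap:                  # CPython <= 3.10
        old, new = dis.opmap['BINARY_ADD'], dis.opmap['BINARY_MULTIPLY']
        for i in range(0, len(code), 2):           # 2-byte instructions (3.6+)
            if code[i] == old:
                code[i] = new
    else:                                          # CPython 3.11+
        binop = dis.opmap['BINARY_OP']
        NB_ADD, NB_MULTIPLY = 0, 5                 # assumed operator codes
        for i in range(0, len(code), 2):
            if code[i] == binop and code[i + 1] == NB_ADD:
                code[i + 1] = NB_MULTIPLY
    return FunctionType(f.__code__.replace(co_code=bytes(code)),
                        f.__globals__)

def my_add_two(x):
    return x + 2

my_mul_two = add_to_mul(my_add_two)
print(my_mul_two(5))
```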
If I do that, then I get x. And this function's not doing any fancy addition or multiplication or anything crazy like that. I'm just returning a local variable. So add_to_mul shouldn't have any effect on getX, right? That should be a no-op. But x is the variable at index 23 in the alphabet. So that means that if I do add_to_mul(getX), it's gonna turn it into getU. And you know, this actually could have been a lot worse, I think. I mean, at least here, 20 was still a valid index for us to load. What even would have happened if we had loaded a local variable at an index that didn't even exist? Or a constant, right? Like, what if I did update(my_add_one, co_consts=())? That would turn add_one into some sort of, like, my_bad_one. And if I call my_bad_one(5) — ooh. Yeah, I think you may have segfaulted the interpreter there. You know, now that you mention it, there may be some issues with that jump code I showed you before. Like, if we're just jumping to the bytecode offset specified in the argument, what if that's not a valid instruction? Or what if that's out of bounds entirely? Who knows what would happen? Hey, yeah, and since jump offsets are just some offset into the bytecode, that means they'll always be wrong if we ever insert or delete an instruction. They'll all be the wrong indices, right? Yeah, we would need some way to, like, recalculate them. That seems like a lot of work. Yeah. This bytecode hacking thing seems harder than I thought. Hey, didn't you say that Joe and Scott had worked on a library that's supposed to help transform code objects like this? Oh yeah, codetransformer. I actually downloaded it right before the talk. I was hoping, you know, maybe we'd get to look at it a little bit. Maybe they have some ideas for solving some of these problems. So we can do from codetransformer import — okay, let's see what they've got in here. They've got a Code class. They've got a CodeTransformer class.
Makes sense. code, core, decompiler, match, patterns. Tests — they've got tests. All right, transformers. That's probably where the meat is, right? So let's see, import — oh, Joe and Scott wrote add_to_mul. I guess great minds think alike. I'm sure this is one of the most useful transformers in codetransformer, right? All right, so let's see how we use add_to_mul. Oh, I guess it's a module where they're defining the actual add_to_mul transformer. So maybe we should go look at the source and see how that works. Okay, so add_to_mul: a transformer that replaces BINARY_ADD instructions with BINARY_MULTIPLY instructions. "This isn't useful, but it's a good introductory example." You know what? I'm gonna agree to disagree there. Okay, so we're doing from codetransformer import CodeTransformer and pattern, and then from codetransformer.instructions we're importing BINARY_ADD and BINARY_MULTIPLY. So we've got some sort of CodeTransformer base class. We've got some notion of pattern matching. And then we have some sort of objects that represent instructions. That seems a lot nicer than just memorizing 23 and 20 all over the place. For some. Okay, and then in this add_to_mul class, we're subclassing CodeTransformer, and then we're decorating a replace-add-with-multiply method with a pattern of BINARY_ADD, and it's yielding a BINARY_MULTIPLY. So it looks like what's happening here is we're registering methods that match certain patterns of instructions, and then we're writing generators that yield replacements for those instructions. So this is gonna match a pattern of a single BINARY_ADD and yield a replacement of a single BINARY_MULTIPLY. What do you think that steal method does? Well, through the magic of Emacs, we can see. Okay: steal — steal the jump index off of instr. This makes anything that would have jumped to instr jump to this instruction instead.
So this looks like it's some sort of technique for dealing with that jump-offset resolution problem that we talked about a second ago. Yeah. Okay, well, if they don't think add_to_mul is a useful transformer, maybe we should see what kinds of transformers they do have here. Okay, well, we've got asconstants, bytearray literals, decimal literals, pascal strings, ordereddict literals — that sounds kind of interesting. So I think I read in the docs that you're supposed to use these as decorators. So I'll do @ordereddict_literals, and then we'll say def make_dict(a, b, c), and then we'll just return a dict mapping 'a' to a, 'b' to b, and 'c' to c. All right, and then we'll call make_dict(1, 2, 3). And hey, look at that: I get an OrderedDict instead of a regular dict there. Yeah. Okay, I think we probably have time for maybe one or two more. Let's see what else we've got here. asconstants... maybe interpolated strings? Oh, well, everyone's super excited about the f-strings that are coming out in Python 3.6. Maybe this is something like that. So let's see: we do @interpolated_strings, and we'll do def make_str(a, b, c), and we'll just return a string referencing a, b, and c. And normally this would just give me the string back, right? But then I have to actually call that. So I do make_str(1, 2, 3). Hey, look at that: it's interpolated those values into the string magically for me. I don't know, I still kind of like add_to_mul, but that seems pretty cool. Well, I think that's just about all the time we have here today. So I want to just sort of, I guess, recap what we talked about, right? We talked about the code object and all the attributes of the code object, and sort of saw CPython's internal code representation. I guess we talked about how to create a code object from scratch.
We talked about some techniques for modifying code objects, but we also saw sort of the dangers of playing God with the CPython compiler in that way. And then at the end here we saw some techniques for mitigating some of those dangers. So I want to thank my assistant here, who came up with no notice of any kind and just sort of gracefully helped me through this presentation. I know you guys were excited to see Joe and Scott, but I hope you all learned something here and we all sort of became better programmers together. So thank you all for coming out. So in case you haven't figured it out by now, I'm Scott, and I'm Joe. We wrote a library called codetransformer for doing these kinds of terrible bytecode manipulation hacks. We think it's sort of a fun and whimsical topic, and so we wanted to do a fun and whimsical talk that sort of reflected the spirit of the project. So a little bit more about us. We work at a startup in Boston called Quantopian that builds tools for anyone to do algorithmic trading in the browser, in Python. We do not use any of the techniques you just saw to trade other people's real money. Or anywhere else in the stack, for that matter. Yeah. You can find us both on GitHub. I'm github.com/ssanderson. And I'm a barcode, which some people dislike. What Joe means by that is that he's github.com/llllllllll — that's ten lowercase L's. And then on Twitter, you can find me at the again very reasonable name of Scott B. Sanderson. And mine is a dunder-claw name. So we actually do have a little bit more time to talk about some of the transformers we just showed. So I was just gonna run briefly through a couple more interesting transformers, out of character this time. Yeah. So we saw the sort of simplest possible transformer, and the idea here is, again, that it's sort of unsafe and dangerous, with those various pitfalls, to try to arbitrarily manipulate and swap out and mutate code objects.
So a better idiom is often to have a notion of pattern matching and replacements. The idea here is we want to match against certain patterns of instructions and yield replacements for them, or do whatever other kinds of transformations. So the simplest case is just: match a single instruction, yield a single replacement — but we can do more interesting things. An example of that is the asconstants transformer. What this transformer does is make it so that certain name lookups that would be globals are instead read as constants. So the reason you might want to do this: if you're just in normal Python, right — oh, sorry, yeah. Is that okay for everybody? Cool, right? — in normal Python, if you do — actually, I'll do this with dis — def f, return len of... Four spaces. Thank you, Mr. Bytecode Expert. Right, if I just return the length of an empty tuple or something like that, you'll notice that the instruction that's loading len here is a LOAD_GLOBAL. And LOAD_GLOBAL amounts to at least one dictionary lookup, and sometimes two: if the name's not in the globals, it falls back to the builtins. Which is not an enormous cost, and almost all Python code should not care about this fact, but if you're truly in a deeply nested loop at the bottom of your stack, you might actually care about the difference between an array index and a hash lookup. And so a thing that you'll actually see in the CPython source, in very deep inner loops, is hacks that look like this: we'll say _len=len as a default argument, and then the function will call _len instead. And the reason you do this is that default arguments are evaluated at function creation time, and so this will actually capture len as a local variable instead of a global.
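That default-argument trick is plain stdlib Python, so here's a minimal sketch of it — the function names `slow_len` and `fast_len` are just for illustration:

```python
import dis

def slow_len(x):
    return len(x)           # len is a global here: looked up on every call

def fast_len(x, _len=len):  # default argument captures len at definition time
    return _len(x)          # _len is now a local variable of the function

def opnames(func):
    """Return the list of opcode names in a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]
```

Disassembling both shows the difference: `slow_len` loads `len` with LOAD_GLOBAL, while `fast_len` loads `_len` with a LOAD_FAST (on very recent CPythons some adjacent LOAD_FAST pairs are fused into combined opcodes, but the global/local split itself is stable across versions).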
And so now if we look at the disassembly for this, you'll notice that we're loading len with a LOAD_FAST instruction instead of a LOAD_GLOBAL, and that's a slightly faster operation — we're doing an array index instead of a hash lookup. But this is sort of a gross hack, right? We've added an argument to this function that we never actually want to use, we don't want it to be passed, we're just doing it because we care about performance in this weird way. So an alternative thing that you might want to do, if that's too hacky for you, is something like this: you can decorate a function with asconstants, and you can either give it a string which is the name of a builtin, or you can pass keywords — say, asconstants(a=1). And then if we use a, even if it's not in scope, it'll still resolve to one. And if we actually rebind it in the outer scope, it'll still resolve to one — as far as the decorated function is concerned, it's just a constant there. And so the thing that we want to do for this transformer is replace various kinds of load instructions. There are actually about five different kinds of loads: there's LOAD_NAME, LOAD_GLOBAL, LOAD_DEREF, LOAD_CLASSDEREF — and there's another one, LOAD_FAST, which by construction we don't have to care about here, but for completeness, it exists. And so here's a slightly more interesting pattern, where we're using pipe to mean "or". We're saying: match the pattern of a LOAD_NAME or a LOAD_GLOBAL or a LOAD_DEREF, and then what we're doing is saying: if the instruction isn't loading one of our constants, we just yield it, and if it is, then we replace the load that CPython emitted with a load of our constant value. So this is an example of a slightly more complex pattern, and then Joe is gonna talk about one more complete example. So this is the dict literals one, right? Yeah.
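Those load variants are all visible with the stdlib dis module — which one the compiler emits depends purely on scoping. A sketch (the function names are illustrative):

```python
import dis

def opnames(code):
    """Opcode names for a function or code object."""
    return [ins.opname for ins in dis.get_instructions(code)]

def local_load(y):
    return y + 1             # y is a local: LOAD_FAST

def global_load():
    return len               # len resolves at runtime: LOAD_GLOBAL

def outer():
    x = 1
    def inner():
        return x             # x comes from the enclosing scope: LOAD_DEREF
    return inner

# module- and class-level code uses LOAD_NAME instead
module_level = compile("x", "<example>", "eval")
```

LOAD_CLASSDEREF is the odd one out — it only shows up for closed-over names referenced inside a class body.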
So this is the more general case of that ordereddict_literals one we saw — this is a transformer that lets you essentially register a function to be called whenever a dict literal or a dict comprehension appears in source. And the interesting piece of this is how it handles comprehensions. So while we only showed a static dictionary display, or literal, we also handle comprehensions, and we need to do that in an interesting way. So here we've got a much more complicated pattern than the ones we've seen before. The important note is that we're actually matching more than a single instruction at a time. Before, we only ever cared about "when I see this one instruction, I emit a new instruction." But this one says: when I see a BUILD_MAP followed by matchany[var] — where matchany says match any type of instruction, and the subscript var means zero or more of those — and then a MAP_ADD, that tells me that I'm in the construct for a dictionary comprehension. Because dictionary comprehensions look like: a BUILD_MAP, which creates an empty dictionary on the stack, then basically a for loop where I iterate over my sequence, and then at the end a MAP_ADD to write each value into the dictionary. We're not gonna talk about all the stuff that's happening here to make that work, but the interesting thing is that we're yielding a bunch of replacement instructions, and then at the end here we call self.begin(in_comprehension). So we don't always just care about a single pattern, because sometimes we need to know what happened before. This is based kind of on flex or alex, if anyone's used those, where you can enter a new start code, which says that some patterns are only valid given some context. So here, when we've matched this construct, we know we are now in a dictionary comprehension, so we begin the in_comprehension state.
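You can see that BUILD_MAP ... MAP_ADD shape for yourself with dis. On some Python versions the comprehension body lives in a nested code object (on newer ones it's inlined), so this sketch walks co_consts recursively to find the opcodes either way:

```python
import dis
import types

def all_opnames(code):
    """Opcode names from a code object and any code objects nested in its constants."""
    names = [ins.opname for ins in dis.get_instructions(code)]
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names.extend(all_opnames(const))
    return names

code = compile("{k: k * 2 for k in range(3)}", "<example>", "eval")
ops = all_opnames(code)
# BUILD_MAP creates the (empty) dict; MAP_ADD writes each key/value pair in the loop
```

Wherever the comprehension body ends up, both opcodes appear in the collected list.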
And then you'll notice the next pattern is a simple pattern again — it says I only match RETURN_VALUE — however, there's this extra piece of information that says I only match RETURN_VALUE if the current state is in_comprehension. And this is a simple thing: it just alters how we return from a comprehension, based on how we rewrote the comprehension above. But what this means is that we can concisely express that we want to alter return values in the inner code object of a comprehension, but we don't want that transformation to happen to return values of functions which merely use comprehensions. Yeah, so I think we still have a couple minutes left. Okay, so that's the end of what we actually had prepared to talk about, so for real now: thank you all for coming out, and if there are any questions, I'd be happy to answer them. Are there any questions? So the question is: if you are to do bytecode hacking, how would you go about testing it? So the big concern when you're testing your code is that failure is often not an exception — failure is often an invalid refcount somewhere, which will lead to a segfault in the interpreter, or an invalid memory access. So in terms of testing — I mean, codetransformer itself is tested, often, by writing out basically what the transformation is: either we say these are the new instructions we expect, or, more frequently, this is just the behavior we expect to happen. I don't have a good answer for how you should test in the face of "how do I make sure that the refcounts are still the same?" Maybe you use ctypes to grab things — well, actually, you can get the refcount right from Python with sys.getrefcount — but you have different failure modes, and I don't really have a good answer for that.
But I mean, I think the short answer is that you test it the same way that you would test anything else: a transformer here is just saying, all right, if I decorate this function with interpolated_strings, then the strings should get interpolated in — and you can go through some more interesting ones — but it's not fundamentally different, except that when it's bad, it fails in a nastier way. Yeah, so the big one is: make sure you're jumping to real things, that sort of thing. The silent one that you won't notice for a long time is if you do a load when your stack is empty. The way the frame's memory is actually laid out is the local variables and then the stack, so if your stack is empty, it will read a local variable, pop it as if it were on the stack, and decref it. And if that value happens to have enough refcounts, you'll get it back anyway — but it will have a deficit of one refcount, and you won't notice until maybe even interpreter shutdown, where you'll get a segfault tearing down some totally unrelated module, and that is very painful to debug. I don't know how to test for that. What are the practical uses of this, do you feel? Like, can you use it for optimizing code, or is it just sort of a whimsical thing, like you said? So it's definitely — do you want to talk about the debugger case? So I have seen a particularly interesting use case of bytecode hacking. It didn't use codetransformer — we have a version of it in codetransformer — but the idea was: they used pyrasite to inject code into a running server process, and then they set a breakpoint in a function by injecting a LOAD_CONST of pdb.set_trace and then a CALL_FUNCTION. So that's pretty cool, because it meant that they could just go onto their server box, set a breakpoint in their route, hit the endpoint, get the breakpoint, and then when they're done they can just remove the breakpoint — the bytecode's fine, your app is still running as normal.
The reason I think that's a reasonable use case is because it's debugging, and it should go away. The risk with all of this is that it's extremely version-dependent: 3.4 to 3.5 changes things, and there's not even a guarantee that, say, 3.4.3 to 3.4.4 stays the same. So for debugging, again, it's only ephemeral — once your process is done, it doesn't matter — but in terms of actual production code, you have to really worry about the very specific version you're on. So one thing to think about there: there was a joke earlier where we talked about how everyone who had add-one suddenly got add-two. That's actually a case where that's what you want, because one of the issues with monkey patching is that if people don't refer to something as an attribute of an object — they've closed over it — you can't patch it out. But if you patch the bytecode, there is really no escape for them, for better or worse: you have monkey patched the only thing that they have. Another example — not quite in this vein, but sort of in the same theme — is a project from Continuum called Numba. Here, we were taking our bytecode, swapping out instructions, doing other things with those instructions. Numba sort of takes that to the logical extreme and says: take the Python bytecode, throw it in the trash, and replace it with LLVM intermediate representation — and it uses that in service of doing optimized numerical code. So that's not obviously using this, but it's another example where you can use Python to write code that you then compile not into Python but into something else; Python's sort of a nice description language for another actual execution language. If you're worried about performance, Numba is really nice, so take a look. Just to repeat the question I'm hearing: is there a limit of two to the sixteenth minus one for local variables or global variables?
So for global variables, the LOAD_GLOBAL instruction takes an index into the tuple containing all of the names which are referenced, and then it does a hash lookup. So you could have more global variables than a small argument can reference — except there's an instruction we did not talk about called EXTENDED_ARG. What EXTENDED_ARG is, is an opcode followed by two argument bytes, but all it does is tell the interpreter to accumulate those two bytes — hold them, like, in an accumulator register — and then when it reads the next instruction, it will prepend those bytes to create a four-byte integer, and you can chain those so that you can reference more and more. So there's actually a test in codetransformer for a function not with that many arguments, but with that many lines, because of the way the lnotab works. Yeah, if you wanna see some truly strange corner cases of Python, look at the codetransformer test suite. We've got five minutes still? Okay. Any other questions? Where do we go to learn more about this? So codetransformer has pretty decent docs. You can go — what is it? I guess we didn't say: the project is hosted on GitHub, it is available as free software, and it's on my GitHub, llllllllll/codetransformer. There's also codetransformer.readthedocs.io. We talk about many of the problems we talked about today, and the specifics of the instructions themselves, so that's a good place. A wonderful resource is the dis module's docs — that will actually tell you what all the instructions do, so for any given version you go to the official Python docs, /dis or whatever, and there's just a list of all the instructions. Some of them are okay, some of them are really good, some of them are incorrect, but that's a pretty good resource.
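Back on that EXTENDED_ARG question: you can force one to appear by compiling a function that refers to more distinct global names than a one-byte argument can index. The exact encoding has changed across versions (two-byte args historically, wordcode since 3.6), so this sketch just checks that the opcode shows up in the raw bytecode:

```python
import opcode

# Build "def f(): return g0 + g1 + ... + g259" — 260 names is more than a
# one-byte oparg can index, so the compiler must emit EXTENDED_ARG somewhere.
names = [f"g{i}" for i in range(260)]
src = "def f():\n    return " + " + ".join(names)
ns = {}
exec(src, ns)

code = ns["f"].__code__
# On wordcode Pythons (3.6+), every even byte of co_code is an opcode.
uses_extended_arg = opcode.opmap["EXTENDED_ARG"] in set(code.co_code[::2])
```

All 260 names end up in `co_names`, and the loads past index 255 (earlier on versions that pack flags into the argument) carry an EXTENDED_ARG prefix.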
There's a bunch of comments in the interpreter loop itself — if you really care, that's the source of truth also — that's in the CPython code base, Python/ceval.c, and there's a function called PyEval_EvalFrameEx, and there's a big switch-case that uses computed gotos that's really quite clear to read, actually. You would think it'd be really scary; it's just, like, TARGET(BINARY_ADD), and then there's a brace, and it's like: pop a value off the stack, pop a value off the stack, push the sum. So that's a pretty good resource also. Okay, so codetransformer came about because I wrote a project called lazy_python, in which I wrote a wrapper object called thunk, which takes a lambda and *args and **kwargs. Is it lazy python? Yeah, lazy underscore python. And so that project would allow me to defer computations and then parse them back out as expressions, or just evaluate them in a non-eager way. So in doing that, I had it working just fine: the thunk object was pervasive, and it overrode every single attribute and the buffer protocol and every slot that you can put in a type in C — but I couldn't change is. So if I said a is b — a very common pattern is to say, like, if a is None, do something — but None was going to be the actual value None, and a is a computation which will evaluate to None, so they're not at the same memory address. So what I needed to do was find a way to change that, and that's an operator you can't overload. So in looking at that, I asked: how do is expressions get compiled? And it's actually a COMPARE_OP instruction. So I said, I need to swap that out with a function call, and then I started doing just, like, byte swapping, and it segfaulted constantly. So I abstracted that, and then Scott said: that's useful — pull that out of the project and make it its own thing. Yeah, and then I got involved after that.
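That compilation of `is` is easy to check with dis. The exact opcode has moved around between versions (COMPARE_OP with an "is" argument in older CPythons, a dedicated IS_OP since 3.9), so this sketch accepts either:

```python
import dis

code = compile("a is None", "<example>", "eval")
ops = [ins.opname for ins in dis.get_instructions(code)]

# older CPython: COMPARE_OP with cmp_op 'is'; 3.9 and later: IS_OP
has_is_test = bool({"COMPARE_OP", "IS_OP"} & set(ops))
```

Either way, there's a single dedicated comparison instruction where `is` appeared in the source — which is what lazy_python had to find and swap out for a function call.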
The main thing that I've worked on a bunch, that we didn't talk about at all, is a decompiler that is attempting to add support for taking arbitrary Python bytecode and inferring what the AST — and ultimately what the source — was. One possible application of that would be in service of allowing runtime access to the AST of your running programs; you could do sort of macro-style things. You can do that if you have source right now: if you have a function, you can go find your file, you can go find your line number, and go parse it. But if you're in a dynamically generated function, or a dynamically generated piece of something else, there's no way for you to reliably get access to the AST of your code at runtime. Or you might chroot and then you can't load the file anymore. Yeah, I was just gonna say, another example of that: one of the comparison operators is actually "exception match" — it turns out except clauses are compiled essentially the same way as greater-than or less-than-or-equal. So there's a transformer that overrides that, so you can pattern match against exception values instead of exception types. So if anyone's ever tried to catch an OSError, for example, but you only wanted one errno, this lets you do that. So you can catch a ValueError('buzz') instead of just ValueError. All right, well, thank you guys all for coming out. Thank you.