So if you follow sites like Hacker News or Reddit, you might have seen that version 1.2 of Pyston was released recently. This is a new Python implementation written by folks at Dropbox. Guido himself does not work on it, but some of his colleagues do, and I imagine they have a kind of privileged relationship with him, like having lunch together, so they can get some insight on Python. Anyway, my first question when I saw that was: why the hell a new Python implementation? I mean, there are lots of them already. Why would somebody take the time to write a completely new one from scratch?

To answer this question, I'm going to start with a small survey of the existing implementations. There's CPython, which we all love and which is the canonical implementation, the one written by Guido himself. There's PyPy, which is a JIT compiler you might have heard of; it's written in Python itself. For Ruby folks, it's kind of like Rubinius. There's Jython, running on the JVM, and IronPython on .NET. There's Stackless Python; I don't know if it's still active. Basically, it tried to address the concurrency issues with the GIL in CPython. And finally, well, there are lots more of them, but a few years ago there was Unladen Swallow, which tried to bring a different kind of JIT compiler to the Python world.

So where does Pyston fit in there? All the implementations try to scratch the two main itches of Python. One of them is performance, because CPython is not that slow, but not that great either. The second one is concurrency, which Stackless addresses. Pyston specifically tries to address the performance issues. How it does it: it's also a JIT compiler.
I'm going to explain it in more detail later. Like PyPy, it's a JIT, but it uses the more recent techniques that were introduced in virtual machines such as V8 by Google, which is the JavaScript engine in Chrome, and also, what's it called, JägerMonkey in Firefox. It uses these new techniques; well, actually they're older, but they've become more refined over the years. This is method-by-method JIT compilation, as opposed to tracing JIT compilation. I'm going to explain the difference between the two later, but first, oh god, I'm going to explain what an interpreter is and how a JIT compiler makes things faster.

So what's an interpreter? If we look at ceval.c, which is in the CPython source code, we can see it's just a huge switch statement. What Python does is it reads your Python file, chews on it for a while, and then outputs in memory a list of opcodes to execute. So you get this flat list of instructions that are fed into the switch statement, and Python is going to run them in sequence. Obviously there's a lot of overhead, because it's not the machine itself that executes our code, but this C code.

So the first thing we can do is compile this code. Basically, the simplest technique is to take the small fragment we have in each case, match them with the instructions of the Python VM, and just emit a linear sequence of machine instructions that do exactly what the Python interpreter would have executed. We save a bit of overhead; we avoid jumping and branching. But this is just a small part of it.
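To make the dispatch overhead concrete, here is a toy bytecode interpreter written in Python. The opcode names and the little stack machine are invented for illustration, not CPython's actual ceval.c, but the shape is the same idea: one big dispatch over a flat list of instructions.

```python
def interpret(code):
    """Run a flat list of (opcode, arg) pairs on a value stack.

    This mimics the structure of CPython's evaluation loop: fetch
    the next instruction, branch on its opcode, repeat. Every
    iteration pays the cost of the fetch and the branch, which is
    the overhead a JIT compiler tries to eliminate.
    """
    stack = []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "LOAD_CONST":       # push a constant onto the stack
            stack.append(arg)
        elif op == "BINARY_ADD":     # pop two values, push their sum
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "RETURN_VALUE":   # return the top of the stack
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % op)

# A tiny program: compute 1 + 2 and return it.
program = [
    ("LOAD_CONST", 1),
    ("LOAD_CONST", 2),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
```

The simplest form of compilation mentioned above would stitch the bodies of these `if`/`elif` branches together, in program order, into one straight-line sequence of machine instructions, so the fetch-and-branch loop disappears.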
The main reason we want to do that is because we can now perform some pretty good optimizations. So we'll take an example. I mentioned two kinds of JIT compilers; we'll take this small Python program and look at how each of them could optimize it, so we can see how they try to achieve this higher performance.

I'm just going to walk through it quickly for those who cannot read it. Basically, you have a foo function, and it has an accumulator and a counter, which is x. Usually you would write something like that with range, but I use a while loop here to avoid making a function call, to keep things simple. So you take x from zero to a hundred thousand and pass it to the function bar, which is just going to branch on it: if it's less than 50,000, it adds one to it; otherwise, it adds two to it. Then foo adds the result to the accumulator, increments the counter, and at the end returns the accumulator.

So we're going to see the first technique: one method at a time. This is what Pyston is implementing, and what V8 and all the other fast virtual machines are implementing. Basically, I've put some assembler on the slide. You don't have to read it or understand it. The main point I want to make is that for each method you get one small piece of assembler. It cannot be compiled to this in one shot; this is a quite optimized version.
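The program on the slide isn't reproduced in the transcript, but from the description it would look something like this (the exact bodies are a reconstruction based on what's said above):

```python
def bar(x):
    # Type-stable hot path: for the first 50,000 iterations the
    # argument is a small int and we take the first branch; after
    # that, the branch flips to the second one.
    if x < 50000:
        return x + 1
    else:
        return x + 2

def foo():
    acc = 0  # accumulator
    x = 0    # counter
    # A while loop instead of range() to avoid an extra function
    # call, as mentioned in the talk.
    while x < 100000:
        acc += bar(x)
        x += 1
    return acc
```

The point of the example is that everything here is integer arithmetic, so a JIT that can prove (or observe) the types has a lot to gain over generic interpreter dispatch.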
You can see at the bottom of the first one that it just uses addq, which is the machine instruction to add two fixed-size numbers. To achieve this, it has to collect type information over multiple iterations. The thresholds for that are roughly: after maybe three iterations, it adds instrumentation code to record the types seen in the function, and after a few thousand iterations it generates the optimized code, because that's quite expensive and you don't want to do it for code that's only going to be executed a few times.

The other technique is tracing JIT compilation. The first one generated assembly snippets based on the static structure of the code, one for each method. This one follows the dynamic structure of the code: a trace is the sequence of instructions that actually get executed. We can see the assembly snippet generated for the first 50,000 iterations. In the middle you can see the add-one is embedded in it, but when it gets to the add-two, there's a guard statement, an assertion that fails, which triggers the recompilation of the snippet. So we fall back to the interpreter, and eventually it recompiles to an optimized form.

In theory, it seems like tracing compilation should be more effective, but in practice we see that method-based virtual machines like V8 have better performance. So the question is: will the same performance gains appear for Python code? Because it really depends on the type of programs and on the language features, it's still an open question. We don't know yet which one will be faster, between PyPy, which is a tracing JIT compiler, and Pyston, which is a method-by-method compiler. The current status of Pyston is that they've implemented maybe 80% of the language. The benchmarks are promising, but it still does less than the other implementations of Python.
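The warm-up process for a method-at-a-time JIT described above can be sketched in pure Python. This is only a toy model: the thresholds, names, and the "specialized" fast path are all invented for illustration, and a real JIT emits machine code (the addq mentioned above) rather than a Python lambda. But the phases are the ones from the talk: count calls, instrument to record types past a first threshold, specialize past a second, and guard the specialized code so it can deoptimize if the assumption breaks.

```python
def jit(warmup=3, optimize_after=1000):
    """Toy decorator modeling method-JIT warm-up (illustrative only)."""
    def decorator(fn):
        state = {"calls": 0, "types": set(), "fast": None}

        def wrapper(x):
            fast = state["fast"]
            if fast is not None:
                if isinstance(x, int):   # guard: specialization assumed int
                    return fast(x)
                state["fast"] = None     # guard failed: deoptimize

            state["calls"] += 1
            if state["calls"] >= warmup:
                # Instrumentation phase: record observed argument types.
                state["types"].add(type(x))
            if state["calls"] >= optimize_after and state["types"] == {int}:
                # Every observed argument was an int, so "compile" a
                # specialized version that skips generic dispatch
                # (a stand-in for emitting straight-line machine code).
                state["fast"] = lambda x: x + 1
            return fn(x)

        return wrapper
    return decorator

@jit()
def incr(x):
    return x + 1
```

The guard in the fast path plays the same role as the failing assertion in the tracing-JIT example: when the type assumption no longer holds, execution falls back to the generic path.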
So it's hard to say if it's going to deliver on the promise. It has a few other nice things: they want to implement a better garbage collector, for example. But I think the most interesting thing about this implementation is the performance that we may see in the future. So, in conclusion, I think it's a nice thing that people can come up with new implementations. It shows how the Python community is alive and trying new ideas. Thank you.