Can everyone hear me? I'm a bit sick, so I'll try to speak slowly and clearly, sorry about that. So, I'm going to talk about PyPy and the future of the Python ecosystem. This is kind of a utopian talk, I would say: it's my perfect future, where there are unicorns everywhere and things like that. You can find me on social media if you have questions and so on. Day to day, I'm a software engineer; we're hiring, so come talk to me or to the people wearing our t-shirts. I've been a PyPy contributor for about four years, I think. The first thing I worked on was a Google Summer of Code project, a backend for PyPy written in Python. That was a failure, which doesn't mean that kind of backend is impossible for PyPy, but that particular approach wasn't good. I was also paid to work on PyPy thanks to the community, so that was really awesome; thank you to the community. And I have a pet project called PyMetabiosis, and you'll find out what PyMetabiosis is if you stay until the end of the talk and don't run away. So, this talk is about how we can get better implementations. We all love Python, so we don't want to throw away our libraries and language features. Let's take a snapshot of the current situation. Basically, everybody uses CPython, but CPython has quite a few deficiencies: performance is one massive deficiency, and the infamous GIL is another. A few people use PyPy quite a bit, and PyPy has great performance, and PyPy-STM is getting there, so that's cool. And according to PyPI download stats, at least, other implementations are virtually unused: hundreds of downloads every month or something like that. So, let's take a quick poll. Who has used CPython before? Okay. Who has used PyPy, even just typing pypy at a prompt? Okay, quite a few. Who has used Jython or IronPython? Quite a few people.
In terms of other languages, we have fast languages that weren't designed to be fast, like JavaScript or PHP, and they are fast now. There are also languages like Go that don't have a GIL and are getting real use, and languages like Lua and Julia as well. PHP, for example, wasn't designed for performance at all, and they managed to get good performance with HHVM, which Facebook and others use. If PHP can do it, we can do it. So, what's specific about Python? Everyone uses a fast JavaScript runtime these days, so why can't you use PyPy today? It's pretty hard to switch between implementations because you have tons of C extensions, and C extensions don't work that well on PyPy. And why can't we improve CPython? C extensions make assumptions about how the Python runtime behaves, and that's not good, because the assumptions they make basically describe a slow implementation. And on PyPy, well, there are no C extensions, or kind of: you can use some of them. So, if you look at the Python ecosystem with a market view, you would say that C extensions are lock-in, as it's called in market theory: you use CPython, you use C extensions, and then you're stuck. More competition would benefit everybody; even if something beats PyPy tomorrow, I'm not going to complain, I'm a Python user. If something better than PyPy shows up, that would be great. And basically, if you write a Python implementation tomorrow that covers 100% of the language but doesn't support C extensions, then your implementation has to be really, really attractive for anyone to switch. So that's not great. So, why can't PyPy, Jython, or IronPython just implement the C API?
First of all, even implementing the C API is not enough, because, for example, Cython bypasses the runtime and looks directly inside structures; it has a PyPy mode where it doesn't do that anymore, but it does that on CPython. And even if you implement the official API, you have to use or emulate reference counting. Reference counting is kind of slow, and if you were at Larry Hastings' talk yesterday, you'll have noticed that having reference counting makes removing the GIL practically impossible. There was an experiment, a patch that removed the GIL on Python 1.4; it was revived by David Beazley a few years ago, and it was much slower than regular CPython. So basically you have to choose between performance and concurrency on one side and the C API on the other side, which is pretty bad. So, I wondered: other languages have bindings to C libraries, right? They don't live in their own closed world; they use C libraries. I found two families of C APIs out there: the JNI/V8 family and the Lua/Julia family. The Lua/Julia one is more of a stack-based C API; if you've used Lua's C API, you probably know what I'm talking about. JNI and V8 use handles, which is also what CFFI uses. These don't make assumptions about the runtime because, for example, with JNI, the JVM is already fast, so the C API is designed on top of a fast runtime. Could we have something like that for Python? We could. You would still need to remove reference counting and things like that from the API. Those aren't many changes, but they are pretty massive changes for people who write C extensions. And you can basically have your own C API with CFFI, so you can already do it today. But it's a big political problem: good luck convincing people with massive extensions to use that API.
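To make the handle idea concrete: in a handle-based C API like JNI's or V8's, the embedder only ever holds opaque tokens, never raw pointers into the runtime's object structures, so the runtime stays free to move or collect objects. This is not code from the talk, just a toy sketch of the concept in Python (the class and method names are made up for illustration):

```python
# Toy sketch of a handle table, as used by JNI/V8-style C APIs.
# The "C side" would hold only the integer handles returned here,
# never raw pointers, so the VM can relocate objects freely.
class HandleTable:
    def __init__(self):
        self._objects = {}   # handle -> object
        self._next = 0

    def new_handle(self, obj):
        h = self._next
        self._next += 1
        self._objects[h] = obj
        return h

    def deref(self, h):
        # The only sanctioned way to reach the object.
        return self._objects[h]

    def release(self, h):
        # Explicit release replaces per-object reference counting.
        del self._objects[h]

table = HandleTable()
h = table.new_handle([1, 2, 3])
print(table.deref(h))  # [1, 2, 3]
table.release(h)
```

The key property is that nothing outside the table ever sees the object's address, which is exactly the assumption a CPython-style C API violates.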
And if we were to go down that path, we would have to keep both APIs on CPython so that people could slowly port their code; I think it would be possible to move to the new API incrementally. So, I've spoken about why everything is bad. Now, how does PyPy fit into this? PyPy is the most flexible implementation around, mostly because it's written in RPython. You can switch the garbage collector easily, for example, and you can make a lot of changes pretty quickly. PyPy has a JIT; you can go to speed.pypy.org, and it will show you that PyPy is seven times faster on our benchmarks, which doesn't really mean anything: it means it's faster on our benchmarks, it doesn't tell you how fast it will be on your code. The JIT allows PyPy to compete with other fast dynamic languages like JavaScript and friends. One of the principles of the JIT is that you only pay the cost of what you use. Python has a lot of overhead in a naive implementation like CPython, and the JIT can remove the overhead of features you don't use. For example, integers are objects, but most people don't use integers as objects, they just use them as plain integers, so PyPy is able to remove the cost of integers being objects. Frame introspection is another kind of overhead. I have a demo of this; we've shown it at quite a lot of conferences, so you may have seen it. It's a program written in pure Python that does edge detection, here running on some random video of someone skiing. You can see that it's fairly slow. Don't believe the negative average FPS, that's not possible; it's 0.41 FPS on average. It's even slow to start, so it's not very interactive. Let's try it on PyPy now: it does 50 FPS on my laptop. I can also show you that I'm not lying by using my webcam. That's real time. My webcam is slow, so it's 15 FPS for some reason.
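The "pay for what you use" point about boxed integers can be sketched with a tiny pure-Python loop (not from the talk, just an illustration): on CPython every iteration manipulates heap-allocated int objects, while PyPy's JIT specializes the hot loop to machine integers, so the identical source runs far faster.

```python
# A tight integer loop: on CPython each `i * i` and `+=` works on boxed
# int objects; PyPy's tracing JIT unboxes them into machine integers,
# removing the "integers are objects" overhead the talk describes.
def sum_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

# Same result everywhere; run it under both `python` and `pypy`
# and compare the wall-clock time.
print(sum_squares(10 ** 6))
```

Nothing here is PyPy-specific: that is the point. The speedup comes from the runtime, not from changing the code.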
It would be much, much faster if I had a decent webcam. So, I've mentioned CFFI already. We all have to interact with C code. C is not nice, but it's the standard: you have to be able to call things that expose a C interface. Most people who know CFFI know that you can call C code with it, but you can also expose Python functions to C. So, much like with the C API, you can embed PyPy and expose functions to the C world using CFFI. It's also very, very fast on PyPy; benchmark it, but it's really fast. And you can do basically everything you can do with the C API. It's a different way of thinking, but you can do the same things. STM is a work in progress: we're removing the GIL, which is kind of exciting, and you don't have to use threads and locks directly. Threads and locks are a mess, so that's good: you get all the benefits without the bad parts. And as opposed to using multiprocessing, you can share memory between threads. Okay, so I said that PyPy basically can't implement the C API fully, but there are workarounds. For example, we could have a bridge between PyPy and CPython, and tell CPython: here is the method I want to call, run it yourself against that extension. With that, we could basically bring the scientific stack to PyPy, and it would be great if we had that. So, I'm going to show you that we have that. I implemented something called PyMetabiosis. It's just a regular Python module that works on PyPy, and I'm going to prove to you that this is PyPy. Let's import a CPython module: matplotlib.pylab. Pylab is pretty slow to import in general; that has nothing to do with the bridge. Now let's plot some stuff, nothing fancy. Matplotlib usually doesn't work on PyPy, since it's a C extension that we don't support. And yet, here we get C extensions working.
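The talk doesn't show CFFI code, but a minimal sketch of both directions it mentions, calling C from Python and exposing a Python function to C as a callback, could look like this. It assumes a POSIX system, where `ffi.dlopen(None)` loads the standard C library, and uses libc's `strlen` and `qsort` as the C side:

```python
from cffi import FFI

ffi = FFI()

# Declare the C functions we want to call (libc here).
ffi.cdef("""
    size_t strlen(const char *s);
    void qsort(void *base, size_t nmemb, size_t size,
               int (*compar)(const void *, const void *));
""")
C = ffi.dlopen(None)  # POSIX: open the standard C library

# Direction 1: calling C from Python.
print(C.strlen(b"hello"))  # 5

# Direction 2: exposing a Python function to C.
# qsort calls back into this Python comparator for every comparison.
@ffi.callback("int(const void *, const void *)")
def compare(a, b):
    x = ffi.cast("const int *", a)[0]
    y = ffi.cast("const int *", b)[0]
    return (x > y) - (x < y)

arr = ffi.new("int[]", [3, 1, 2])
C.qsort(arr, 3, ffi.sizeof("int"), compare)
print(list(arr))  # [1, 2, 3]
```

Note that no CPython internals appear anywhere: CFFI works at the level of C declarations and handles, which is why the same code runs on CPython and PyPy.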
PyMetabiosis is kind of a workaround; we also have better alternatives that don't require you to use it. For NumPy, we have NumPyPy. For lxml, we have lxml-cffi, which is basically a port of lxml from the C API to CFFI. For psycopg2, we have psycopg2cffi, and people regularly write CFFI bindings for other libraries. So, as a summary: we can do a lot better than what we're doing right now, and PyPy is working on it. Making the ecosystem friendly to alternative implementations is pretty hard, because the C extension API is not really going away, but at the same time it would give us a much better ecosystem. So, thank you. Any questions?

Just one stupid, maybe kind of unrelated question. (There are no stupid questions.) In Python, every object dereference, like getting attributes from objects, costs something. When PyPy compiles it, if you have a chain of four dereferences, like from this object get this, then get this, then get this, does PyPy optimize that a lot, or does it just do the same chain of dereferencing, only in assembly?

You still need the dereferences, following those pointers, but you can do it much more efficiently. Python objects are implemented as dictionaries on CPython, so you have to hash your attribute name and then look it up in a dictionary and so on. We don't have to do all that in PyPy: it's just one array lookup, so it's much faster.

Okay. So, you said in the last slide "help needed". How can we help?

Well, for example, write CFFI bindings for your favorite library; that's a nice way to do it. Bringing over more libraries and lowering the barrier to entry: you don't have to be a PyPy expert to do that, so that's definitely the best way to contribute.
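To illustrate that answer about attribute lookups (this is an illustration, not code from the talk; PyPy's maps are internal to the VM and not visible from Python): on CPython, instance attributes normally live in a per-instance dict, so each step of a chain like `a.b.c.d` hashes a name and probes a hash table. PyPy instead shares one name-to-index "map" between all instances with the same attribute layout, so each step becomes a single indexed load.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
# CPython: attributes live in a per-instance dict, so reading p.x means
# hashing "x" and looking it up here:
print(p.__dict__)  # {'x': 1, 'y': 2}

# Every Point built this way has the same layout ('x' then 'y').
# That regularity is what PyPy's maps exploit: one shared
# name -> index table plus a flat per-instance values array,
# so p.x becomes a single array load in JIT-compiled code.
q = Point(3, 4)
print(list(p.__dict__) == list(q.__dict__))  # True: identical layout
```

The pointer chain itself remains, as the answer says; what disappears is the hashing and dict probing at every link.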
So, CFFI is compatible with PyPy; what's the catch with CFFI extensions? If it's so great and lets you do so much, why isn't everybody switching from the standard C API to CFFI?

Well, I don't know, ask the crowd. People already use C extensions a lot, and CFFI is kind of new-ish.

But isn't it slower or something like that?

There is no catch. On CPython it's slower, but it's CPython, what do you expect?

What's the state of Python 3 on PyPy? Apparently it's 3.2.5, right?

Yes, we have support for 3.2, and we're slowly reaching 3.3. It's not that hard, so people can easily come and contribute; I'm definitely making a call for people to help port to Python 3.3. You can come to the sprints, for example.

How long would you estimate it could take to reach Python 3.5?

It's a fully volunteer project, so: when it's done.

Thank you. Thank you for the talk. Do you think a bridge-like solution can work in the future? We have thousands of C extensions, and we cannot just get rid of them, but we would like to migrate to PyPy. We could wait until everything is migrated to CFFI, but that's not likely to happen soon.

Yes, that's why I work on that bridge. It's a lot simpler: you can probably get 90% of the libraries working directly. You won't get any speedup from those libraries themselves on PyPy; the rest of your code will get faster because it's PyPy, but the extension runs on CPython, so it's as slow as CPython, and there's some overhead in using the bridge. But if you want compatibility with C extensions right now, use the bridge. Well, it's not quite ready, but use it and report bugs. It's usable now, not in five years.

Thanks for the great talk. How does your bridge work internally? Do you have two interpreters running at the same time, or how does it work?
Yes, you have CPython and PyPy: PyPy basically runs CPython, and they run in the same process, so you can share memory efficiently. Well, that part isn't implemented yet, but in the future we could, for example, pass a NumPy array between CPython and PyPy without doing any data copy, so that's pretty efficient. We don't have to do RPC calls and things like that; it's all in the same process.

Anyone else? Otherwise, thanks again, Romain.