Thank you. So, yes, I've been a PyPy core developer for a long time, and I've had to think a lot about C extensions while working on PyPy. Dealing with extensions is a bit of a problem for us, and that's how the HPy project came about.

So, first, let's talk about C extensions. As you all know, they are central to the Python ecosystem. Of course we all love to program in Python, but we also love having powerful libraries available that enable all sorts of things, including sending a telescope to space and analyzing the vast amounts of data it produces. That is all based on extensions like NumPy, which uses the C API directly. I can also think of libraries like pandas or scikit-learn, which mainly use Cython and so don't interact with the C API directly but use a tool instead; Cython, though, clearly relies on the C API. You also have other possibilities for writing extensions in statically compiled languages: you can use C++, either directly or with pybind11, or you can write extensions in Rust. But all of that relies on the C API.

By the way, I'd like to know: in the room, how many people have used the C API directly? Quite a lot. And how many people have written extensions in general? About the same.

So, to go back to PyPy, and to the Python ecosystem in general: I guess there's one big drawback of Python, and it's performance. That's why we reach for different languages when we need performance, and there are a lot of attempts to get better performance. You have PyPy, but also GraalPython, which has a somewhat similar approach to PyPy: they completely re-implement the core of Python. Then the problem becomes how you deal with the C API, because you're not even using C in the core. In that case you have to emulate the C API, and that's quite difficult, as I know. You also have other alternative implementations of Python that started off CPython, the forks of CPython, so it should be easy to keep supporting
the C API, except that obviously they forked in order to change things, and they want to change the core of the interpreter. So they have to make a trade-off between continuing to support the C API and making the changes they want.

Another way to improve performance is inside CPython itself, where there is a lot of focus on performance nowadays. You have, of course, the Faster CPython project, which tries to change the internals, but it's CPython: they can't afford to break existing extensions. Another project to improve performance deals with concurrency this time, allowing several interpreters inside the same process, but that runs into issues with the C API as well.

So let's talk about the problems with the C API. First of all, the C API is quite old. It pretty much goes back to the origins of Python, and it kind of grew because people started using the internals of CPython to write interesting extensions, such as the ancestor of NumPy, which started, I think, in 1995. Over time the C API got formalized, but it remained close to its origins of directly exposing the internals of CPython. Because of that, the layout of the built-in objects is part of the C API: if you have a float object, you can directly access the member that contains the value. You're not supposed to do that, but you can't prevent extension writers from doing it.

Also, the C API started at a time when concurrency wasn't even an issue; it was before multi-threading. When you call a function from the C API, it often reaches into global interpreter state: you can do imports and touch the system modules, or you depend on the encoding. And because there are so many extensions out there, every time you make a change you break someone's code.

To go into a bit more technical detail, there's the fact that the API is based on pointers: everything is basically a PyObject* that you
pass around. Because of that, you have to be able to dereference this thing that you pass around. If you had a bit of indirection, you could do interesting things such as pointer tagging, where you use the low bits of a pointer to encode different information, or you could represent, say, lists of floats or integers by storing just the values instead of always having a full-blown object present.

And, as I said, it's hard to change. There have been attempts to alleviate that: there is a stable ABI, but it has some issues. It's not used a lot, and I guess that's because it exposes too little; people want things that aren't in the stable ABI. At the same time, it's not that stable, because there are actually a lot of things that shouldn't have been part of the stable ABI but got put in by accident.

The big issue, and the central motivation for HPy, is that the reference counting semantics of CPython are crucial to using the C API. So let me explain the issues with refcounting specifically, by contrasting it with the approach used by PyPy, and also GraalPython, which is to have a tracing garbage collector. With reference counting, every single object keeps track of the number of references to it that exist, so you know that when this refcount goes to zero you can delete the object. By contrast, a tracing GC just records the edges in the object graph, meaning which objects refer to which, and figures out that way which objects are alive and which are dead.

The issue with reference counting is that basically every time you want to read an object, you have to write to the refcount field. That turns every read into a write, which is bad for the cache and causes a lot of issues once you start to think about concurrency. That's basically the reason we have the GIL: because we need to change this refcount every time,
we need to have a lock, so instead of locking every single object, we have the global lock.

So what is HPy going to do about that? Let me start with a bit of history. The project started exactly three years ago, in Basel, at the last EuroPython. A few PyPy developers had gathered to discuss all these issues again, and after talking with some CPython and Cython devs we decided to create HPy. Since then we've had a lot of interest from the developers of GraalPython, who have exactly the same issues, and we agreed on this set of goals: get rid of reference counting, but also make sure that there is no implicit state, that is, make the global state explicit. At the same time, we want to make the transition as easy as it can possibly be, so the new API must not have any overhead on CPython. We also want to provide new possibilities, such as what we call universal binaries, meaning you compile your extension once and it works on all the interpreters and all the versions. And we also have tools to help with writing extensions with HPy.

To accomplish these goals on CPython, there are two modes; there are basically two versions of HPy on CPython. One is basically just a set of headers that translates calls to HPy functions into calls to CPython functions, so you only need HPy at compile time, and the result is an extension that behaves like something you've written against the CPython API. In the other mode, you have an HPy runtime that implements the same API on all implementations of Python, so the extension only needs to call into the runtime, and depending on the interpreter you'll get a different implementation. With this approach we can also provide a special runtime that offers improved debugging.

So here's what HPy looks like. It's very simple code: you just include hpy.h.
There's a macro to simplify the boilerplate, and in this API call you see it's almost like CPython: I think in CPython it's PyNumber_Absolute, and instead you write HPy_Absolute. All functions in HPy take an HPyContext argument, which represents the global state, and these HPy objects, which are the handles. Handles are individual references to an object, meaning that you can have two handles pointing to the same object, but they will be different: the two handles have different values at the C level. Now, instead of doing an incref on an object, you duplicate the handle. That means you can match things precisely: every time you create a handle or duplicate it, you need a matching HPy_Close. That actually makes programming a bit easier, because you can clearly match where the object is created and where you don't need it anymore.

Going a bit into details, there are three kinds of handles. The HPy handles are only for short-lived references, so for local variables. We use something different for storing objects, and that is really helpful for garbage collectors.

Now, the context looks like this. You have a place to store private data, and a version, which we don't use at the moment. The context stores a bunch of handles to built-in objects, and it stores the whole ABI of HPy: there's a whole bunch of function pointers, and all of HPy is implemented by calling this context. So when you call a function in HPy, it's really doing ctx->ctx_Module_Create. This context can also implement a debug mode, which, by the way, doesn't need a recompile: the debug mode is just a different context. So I just need to say HPY_DEBUG=1, that is, set an environment variable, and then you have debug information on leaks and on uses of invalid handles, and the same thing for memory that's returned by the interpreter. And because we have only one place where the handle is created, we can
know if there's a leak, and we can know precisely which handle leaked, so which function is responsible for the leak.

To convert existing extensions, there is one thing you have to do first while staying with the CPython API: HPy only supports heap types, so you have to convert to those first. Then it's a straightforward translation of type specs and module specs to HPy, and then you can actually do the port, which you can do function by function or method by method. We've built a few prototypes inside the HPy project: we have ujson, Matplotlib, the Kiwi solver, and an attempt at NumPy.

So, to conclude: HPy is not ready yet, but it will be soon, hopefully. We still need to implement all of the myriad functions from the C API. We also need to solve the packaging problem, because at the moment these universal binaries are loaded using a hack in distutils, and we need to be able to provide wheels. We're still on version 0.0.4, so we are not actually ABI-stable, but the mechanisms are in place. And finally, another thing we'd really like to have is allowing people to write HPy extensions without knowing it, through Cython and possibly other tools. So thank you for listening.

Yes, thank you very much. We have time for one or two questions. Please ask your questions at the microphone.

Yeah, hey, nice talk. You said on slide 5.1 that the problem with CPython is, for example, that you cannot really do pointer tagging. So my question is: what kind of information would you actually store in the pointer tag? That could be, for example, types, right? But you may have tons of types, and then you probably will not have that much...

Yes, well, it depends on the implementation. Actually, this pointer tagging is already implemented in the HPy version for GraalPython. When you do pointer tagging, you basically take a pointer and you take some of the bits that aren't really used.

Yes, you use
this to store some information.

Yes, but you can store a tag in the last few bits, say the last four, or, no, three, and if the low bits are not zero, then you interpret the rest of the pointer as something else. So you can store a lot of integer values; you can even store floats if you want.

My question is specifically about what kind of information you are storing there, because you said that this is actually used on GraalPython.

So I don't know all the details, but I think they can store integers, floats, and some well-chosen built-in objects like None, True, False, that kind of thing. And that already removes a check for those.

Yes, we have time for one small question, a short question. Maybe you can discuss the other question with him later.

Sorry, I didn't hear you.

Does it only target Python 3, or does it target Python 2?

Oh, it's only Python 3 at the moment. It supports, I think, from 3.6 to 3.10 on CPython. On PyPy and GraalPython you need the latest version.

Yes, well, thank you very much. Please give him another