Yeah, so good afternoon everyone. As we have already been introduced: I'm Florian and this is my colleague Stepan. We are software engineers at Oracle Labs, in the Graal team, and in particular we work on GraalPython, which is GraalPy, sorry, a Java-based implementation of Python. We are also HPy core developers. We weren't in the group of the HPy founders, but we joined very early, so we have been there almost since the beginning. This talk is about HPy: it tries to motivate HPy, introduces it, shows you its benefits, and in the end we want to convince you to use it. So what do we expect from the audience? It's not strictly necessary, but we think it's beneficial if you are an experienced Python programmer, maybe you have even written C extensions. You should know C a little bit, the memory model and so on, and have some rough understanding of the CPython internals and what CPython looks like inside. I think the previous talks gave a good impression of that, so it should be fine. So let me quickly try to motivate HPy.
As we already heard today, CPython is the reference implementation for Python, and it is a bytecode interpreter written in C. There are several alternative runtimes, for example GraalPy, PyPy, Pyston; you may know some of these. Most of them try to improve Python execution speed with optimizations like a JIT compiler, different data structures, or a moving GC, and it turns out some of those projects were pretty successful in doing so. For example, here is a chart where we run the pyperformance benchmarks on GraalPy, and we are on average about 4x faster than CPython. Note that this is compared to CPython 3.10, so the newest CPython optimizations are not included yet. And I need to say that the pyperformance suite only contains pure Python code, as far as I know.

So, the Python C API. Since Python became very popular, and numerical computation started to use it, it quickly turned out that pure Python performance is maybe not sufficient. So the first C extensions appeared, and since CPython is also written in C, that's a perfect fit, right? CPython kind of started to allow C extensions, and there was not really a design phase for that. What happened is that C extensions used the existing internal APIs, and there are some problematic points with the C API as it exists. Most notably, it exposes a lot of implementation details, as we heard already today: for example, it exposes data structures and the fields of data structures.
It exposes reference counting: the lifetime of objects is managed by reference counting. It exposes that objects are referred to by C pointers, which means that you know the memory location of objects and can derive further assumptions from that. All of this happens in C extensions, and that makes it very hard for alternative runtimes to implement and support C extensions. At this point I want to refer to Victor Stinner's PEP 620, which summarizes a lot of the problems really nicely.

Let's pick one example of a problem: reference counting. GraalPy is written in Java, and Java has some of the most advanced, mature GC implementations, so it's really bad that we can't let our GC do its proper work on C extensions. Reference counting basically prevents the Java GC from being used, because reference counting means that, to a certain degree, you do manual management of the object life cycle; the GC cannot just collect the garbage, since the reference count defines when an object is collectible. And the GC is not only about collecting garbage, it is basically also a memory manager. State-of-the-art garbage collectors can do super-fast allocations, have minimal to no pauses of the application, and can make use of multi-threading.

So the question is why you, as a user, should care about this, and for that I have one number for you. There is gcbench.py, which is a kind of benchmark for the GC, and GraalPy is 17.9 times faster than CPython on it; PyPy is also about 25x faster. So it does make sense to have alternative implementations that can do some things a little bit better.

OK, so let's now switch to HPy. HPy is a novel C API for writing C extensions: basically, instead of including Python.h, you include hpy.h. The project is funded, directly and indirectly via Open Collective, by IBM, Quansight, and Anaconda. HPy tries to be a more abstract API: it tries to hide the implementation details, it aims to be faster on alternative runtimes and easier to implement on alternative runtimes, and a dedicated goal is to be GC-friendly.

We, the HPy core developers, defined a list of goals. HPy should have zero overhead on CPython, because we know that if there is a performance penalty just from switching to HPy, you won't use it; so we try really hard to fulfill this goal and to reduce the burden of switching to HPy. There also needs to be an incremental migration path, because porting one big C extension to HPy all at once is almost impossible and very error-prone, so you would just give up on it, I guess. HPy tries to be fast on alternative runtimes, as mentioned. HPy wants to provide a better debugging experience. And a very important goal: HPy provides a universal ABI, which allows you to build one binary that runs on multiple interpreters. The other way round, we also provide backwards compatibility, which means you can run different HPy versions in the same interpreter.

OK, so how does HPy look? It's very simple, I hope you can read the examples. You start by just including hpy.h instead of Python.h, as I mentioned, and then here we write a very simple C function that just creates a Python Unicode string out of a C string and returns it. In HPy we use some, let's say, sugar to define the methods: a macro called HPyDef_METH, which defines that there should be a method whose Python attribute name is say_hello.
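From the HPy documentation, the example described here looks roughly like the following. HPy's macro signatures have changed between releases, so treat this as a sketch in the style of the 0.9-era docs rather than a verbatim copy; the module name `demo` and the greeting string are placeholders:

```c
#include <hpy.h>

/* HPyDef_METH declares a method: the Python-level name is "say_hello",
 * it takes no arguments (HPyFunc_NOARGS), and by convention the
 * implementation function is named say_hello_impl */
HPyDef_METH(say_hello, "say_hello", HPyFunc_NOARGS)
static HPy say_hello_impl(HPyContext *ctx, HPy self)
{
    /* build a Python str from a C string; note that every HPy call
     * goes through the HPyContext */
    return HPyUnicode_FromString(ctx, "hello!");
}

/* register the method in the module's list of defines */
static HPyDef *module_defines[] = {
    &say_hello,
    NULL
};

static HPyModuleDef moduledef = {
    .doc = "minimal HPy example",
    .size = 0,
    .defines = module_defines,
};

/* the second macro generates the module init boilerplate */
HPy_MODINIT(demo, moduledef)
```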
It takes no arguments, and the implementation is, by convention, say_hello_impl. Then we just need to register it in the list of methods and in the module, and in the end you use another macro to generate the module init. And that's it: very similar to the C API, and that's by intention, of course. On the other side you see a little setup.py showing how to build with HPy. The difference is basically that instead of registering your extension in ext_modules, you register it in hpy_ext_modules, and unfortunately you need to depend on the hpy package; if CPython at some point takes over HPy, then maybe you can just drop that. Then in the end you can just run this setup.py to build, and there is an additional option available now where you can choose the ABI mode you want to compile for.

OK, so how do we reach zero overhead on CPython? There are multiple compilation modes. The first, and most important one, is the universal ABI, which means, as I already mentioned, that you can build one binary for multiple interpreters. This works by having an HPyContext, which is kind of a function table, and you do all the calls indirectly through this context. So this would be the path: you take your extension, you compile it for the universal ABI, the binary gets our own ABI tag, and then you can run it on the different interpreters.

HPy also provides custom ABIs. A custom ABI means that we map HPy functions to interpreter-specific API functions. For example, we did that for the CPython API already, which means we map HPy calls to C API functions of CPython, so HPyUnicode_FromString becomes PyUnicode_FromString. There is no runtime overhead in that case; in the end,
it's just a compile-time dependency. Just to show you the path: you compile your extension in this mode, you get a CPython-specific shared library, and then you can usually only run it on that one interpreter. In theory, but that's not implemented yet, you could have such ABIs for other Pythons as well, like RustPython; there has been some work on that, but it's not finished yet.

Last but not least, there is the hybrid ABI, which is the mode you use when you do an incremental migration. It means that you already use HPy but still have some C API function calls, since you are incrementally migrating to HPy.

OK, so how does the incremental migration work? Just sketching the process you usually follow: you start by converting your module definition to an HPy module definition, and that's very similar; instead of PyModuleDef you use HPyModuleDef, you fill in your fields, and you keep your legacy functions, which just appear as legacy methods in the definition. That's the first step: you already create an HPy module, but still using the legacy API. Then you can continue with migrating the types to HPy types, which is very similar: you translate your type specs to HPy type specs, and again keep your functions as legacy methods, slots, and members. While doing so, you can always interoperate with the legacy code by using the conversion functions HPy_AsPyObject and HPy_FromPyObject. After each step you just build your extension in the hybrid ABI mode, and you can test your whole application after each step.

OK, so, a few performance numbers. We already ported kiwisolver and ran its only benchmark.
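The two call paths described above, indirect calls through the HPyContext function table in the universal ABI versus direct compile-time mapping in the CPython ABI, can be modeled in a few lines of plain C. This is only a toy illustration of the mechanism, not HPy code; every name in it is made up:

```c
#include <assert.h>

/* stand-in for some interpreter-specific implementation */
static int impl_from_string(const char *s) { return s[0]; }

/* universal ABI: the extension only sees a function table (the
 * "context") and calls through it, so one binary works with any
 * runtime that fills in the table */
typedef struct {
    int (*FromString)(const char *s);
} ToyContext;

static int universal_call(ToyContext *ctx, const char *s) {
    return ctx->FromString(s);          /* indirect call */
}

/* CPython ABI: the same API mapped directly at compile time
 * (toy analogue of HPyUnicode_FromString -> PyUnicode_FromString),
 * so there is no runtime indirection at all */
#define toy_FromString(s) impl_from_string(s)
```

The universal binary pays one indirect call per API function, while the mapped version compiles down to a direct call; that difference is how the zero-overhead goal on CPython is reached.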
I think that one benchmark is representative enough. The blue line is the C API version, so the original kiwisolver, and you see it's at about 0.135. In the CPython ABI mode we are almost there; it's even a bit faster, but I wouldn't give too much on that, because there is some measurement error going on, of course. So basically the difference is very low. Then, expectedly, the universal ABI is a bit slower, but that could also just be noise. This already convinced us, or at least gives us a good feeling, that the CPython ABI mode really gives you the best performance on CPython.

We also started to migrate NumPy and did some measurements there. It's not done yet, so there are still lots of benchmarks that use just the C API, but some of them already use HPy; it's an in-between state. We keep observing this while we migrate NumPy: we look at the benchmarks, and here you see that in the CPython ABI mode the median over all benchmarks is almost at 1, which means it's basically as fast as the C API version. In the hybrid ABI mode, where we do the calls through another indirection, we are a bit slower, around 2%, and the jitter is a bit higher. But that is already a good outcome for us, so we will continue to work on NumPy.

OK, so the debug mode is where we try to provide a better debugging experience; it could also be called a strict mode. It's an optional runtime mode, which you enable
at load time of the module, basically, so you don't need to recompile anything. It strictly enforces the HPy contract and does additional bookkeeping of resources. The goal here is to prevent unintentional misuse of the API that happens to work on some interpreter because of some optimizations. Our debug mode is right now able to detect problems like leaked handles; usage of a handle after it was closed; lifetime violations of data pointers, for example if you get the data pointer of a bytes object, which is also read-only, so it checks whether you are writing to it. You may also not store the HPy context somewhere and reuse it, and we check that as well. So that's the mode where you test whether your extension is ready for all interpreters.

The most useful function of the debug mode is the leak detector, I think. I wrote a simple example here where we just create a Python integer from a C long, and then we simply forget the close. You can use the leak detector by importing LeakDetector from hpy.debug, then you run it as a context manager and invoke your extension code inside it, and in the end you get an HPyLeakError, because it says: OK, you created a handle, but you never closed it, so that could be a problem. And if you set the stack trace limit, it will also show you the stack trace where this handle was created.

OK, so for the universal ABI I want to quickly give a demo. I hope this works nicely. So here, sorry, I don't have it on my screen: I built NumPy with CPython 3.11 in the hybrid mode, and here we have... that's already GraalPy. Sorry,
I wanted to start with... again, sorry. OK, so here I have CPython 3.9, and I just run the hybrid binary built for 3.11, with some example where we know that we don't trigger any problematic code path for now, since, you know, it's an intermediate step. So we can just run this binary on 3.9, and you also see that it really uses the hpy0 tag, which is basically the HPy ABI version, and cp311, so it uses the one we built for 3.11. We can do the same on Python 3.10: it also uses the same path and the same binary, and it works. Then we can do it on 3.11, which is the one I built it for; of course that's expected to work, right? And then, most notably, there is GraalPy, and we can also run it here, hopefully... yes, OK. That's already a bit impressive, because we can now build one binary and use it on very different interpreters; I mean, GraalPy is a completely different implementation.

So, some words about NumPy on HPy. This one is very hard to migrate, of course, and we chose to do NumPy because we think that if we can do NumPy, we can basically migrate any package to HPy. We have already invested almost one year of a full-time-equivalent employee, and, yeah, NumPy is just huge: it is 180,000 lines of C code, 80,000 lines of those use the C API, we have changed 40,000 lines of code already, and 15,000 lines are
So there's still a lot of work to do There is also numpy has its own C API so we can use numpy from other C extensions And there are 261 Functions or entries and we migrated already 118 so it's roughly half So still work to do but we think the hardest part has already been done We migrated most of the types or at least the hardest ones and for example We needed the metaclass support for heap types and we were in the group of initiating that for CPython Which is by the way already now merged So if you want to to see our progress, please just check it out. It's all public on on the HPEG project So what's the current status HPEG is currently at zero version zero dot nine we partially migrated Several different packages are like ultra chase and multiple PSU to give you solar pillow Pico numpy, which is an external contribution and Numper as I mentioned we also plan to have a side and back end We already created a proof of concept and we will have a person working on that in the next half year So I hope there is some some real progress going on So Now as we go to my colleague I will say a bit about our future plans and about the HPEG community so right now we want to concentrate on the numpy HPEG work and We want to do this step-by-step working with the numpy developers and After that and once we think we figured out everything hopefully we will release our first release Something about the community so Florian can you help me here? I'm not a native Mac user Right. Yeah, so I was gonna say HPEG lives on github and HPEG itself is basically a CPyton extension We first developed the functionality for CPyton So basically we are doing this translation from the HPEG API to the CPyton API So the work that people do on this in this repository is the design of the HPEG API and Then the implementation for CPyton. So if you know how to develop CPyton extensions, you can already Just find your way around this repository I would hope and you can already contribute. 
There are issues there, and some of them can be quite suitable as starter tasks. We also have a documentation and landing page, but we are VM developers and compiler developers and so on, so we are not really good at web design and things like that; that's also an area where we could use some contributions. Also, it would be great to extend the set of people who contribute to the design of HPy with people from other Python implementations and from other bindings like Cython, pybind11, and so on.

Yeah, right. OK, so that's basically it from our side. Thank you for your attention, and we're happy to take questions.

Thank you very much. We still have some minutes for questions, so please queue up; maybe the first one in the back.

Hello, I actually have two questions, sorry about that. First I want to say this was really impressive, so thank you for that; that was not a question. Now the questions. When you have a single extension module that you can load in various interpreters, is there a runtime dependency on a package that is different per interpreter, or is it compiled in? For example with sip, which is used by PyQt and things like that, you would have some extension modules built with sip, and then other extension modules would be universal and use a different kind of binding. Or is it a single thing with no dependency that includes all of the stuff?

So, the C extension itself is self-contained, usually. I mean, if the C extension links to some libraries, I don't know, a compression library for example, then of course that is a dependency, but there is no runtime dependency on HPy. Except, of course, on CPython you need to have the hpy package installed, because on CPython, HPy itself is another C extension. So you just need to install hpy on CPython, that's it. For GraalPy, for example, we have intrinsic support for HPy, so there is no dependency. I hope that answers the question.
I think it does, thank you. And the second question: you said you ported a large part of NumPy and other projects. Has this been merged into the projects themselves, or is it just a fork for now?

For most of them, for now, it's a fork. We plan to upstream that, but of course these package authors want to have confidence that HPy is something that stays, and we are in the process of convincing them and getting users. So yeah, right now it's in forks; it's public, but it's in forks. For NumPy we have really good support: Matti, who is one of the NumPy core developers, is on our side, basically, and Sebastian Berg, another core developer, is very interested. But there is still a lot of work to do, right? I mean, getting all the benchmarks to stay at a good level, and also getting the tests to pass. Yeah, there's still a lot of work to do.

Thank you. Thank you, now in the front, please.

To what extent would it be possible to build tools to port CPython extensions to HPy, assuming that the CPython extensions are reasonably well behaved?

That's a good question. We think that to a large extent you could automate the very simple, boilerplate tasks. We had already hired an intern who was supposed to build such a tool, but they cancelled at the last minute. We still plan to do it: a simple tool that would convert your module specification, convert your types, everything that can be done automatically, and then something is left over, of course, which you need to fix up manually. That would be the plan, and I think we can automate a lot.

Maybe I would add that the very first steps of the migration are things like migrating to heap types, for example, or migrating to multi-phase init of a module.
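Multi-phase module initialization is a standard CPython mechanism (PEP 489); for reference, a minimal sketch of the usual pattern, with a hypothetical module name `demo` (not compiled here, since it needs the CPython headers):

```c
#include <Python.h>

/* the exec slot runs after the module object is created;
 * per-module setup goes here instead of in the init function */
static int demo_exec(PyObject *mod) {
    return PyModule_AddIntConstant(mod, "answer", 42);
}

static PyModuleDef_Slot demo_slots[] = {
    {Py_mod_exec, demo_exec},
    {0, NULL}
};

static struct PyModuleDef demo_def = {
    PyModuleDef_HEAD_INIT,
    .m_name = "demo",
    .m_size = 0,          /* no per-module state in this sketch */
    .m_slots = demo_slots,
};

/* multi-phase init: return the def, not a ready-made module */
PyMODINIT_FUNC PyInit_demo(void) {
    return PyModuleDef_Init(&demo_def);
}
```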
So some of the steps You have to take anyway if you kind of want to Have your extension be based on a modern CPython API's and those steps are a bit harder to Like automate, but you would you kind of want to do them anyway? Thank you We have a question also in the back so I wanted to ask how safe is it for a maintainer of package with C extensions to migrate to HPI Right now. So should I wait until there's at least a 0.9 final? Probably yes, so yeah, we are going for stable release of course and I would wait for that But I mean you can you can start now and test So about the safety, it's hard to tell. I mean if if no one ever is using HPI It's hard to keep The project alive, of course, but I mean we are working on it for four years right now And it's still there. So I have good I have confidence that it will stay but And there is a that the risk is you could also switch back again because you can generate from HPI You could generate see a pack holds again. So you could just step back And yeah, maybe one thing I would add for the risks. There are like multiple projects involved in in HPI So I think that kind of should lower the risk factor factor It's open source. It's on github. So so like we're trying to do as much as we can to lower the risk and Yeah, so we're aiming for Some stable release and at that point I would say it's safe to use or should be safe to use And at this point it would be very useful to get feedback Thank you. Thank you. Now we have a question in the front Hey Thank you for your talk Maybe a very pragmatic question. So imagine You migrate numpy to HPI, right? And I have a library that depends on numpy Let's say something like pandas and it's not yet migrated Can I already switch out to the dependency and get benefits from your work? Yeah, how to answer I mean if you have if your module is still on CPI then Then you are kind of bound to to those restrictions, right? 
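Bridging between an HPy-ported dependency and C-API code works through the two conversion functions named in the talk, HPy_AsPyObject and HPy_FromPyObject. A sketch of such glue in hybrid mode; the surrounding function and the legacy callee are hypothetical, and this is not compiled here:

```c
#include <hpy.h>
#include <Python.h>

/* hypothetical legacy helper that still works on PyObject* */
PyObject *legacy_process(PyObject *obj);

/* a ported HPy function that needs to call into legacy code */
static HPy bridge_impl(HPyContext *ctx, HPy self, HPy arg)
{
    /* handle -> PyObject*: returns a new reference */
    PyObject *py_arg = HPy_AsPyObject(ctx, arg);
    PyObject *py_res = legacy_process(py_arg);
    Py_DECREF(py_arg);
    if (py_res == NULL)
        return HPy_NULL;        /* propagate the legacy error */
    /* PyObject* -> fresh handle */
    HPy result = HPy_FromPyObject(ctx, py_res);
    Py_DECREF(py_res);
    return result;
}
```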
You can use NumPy-on-HPy, because you can just have a little bit of glue code that does the conversion, as I showed: you can use HPy_AsPyObject and HPy_FromPyObject, which just convert between the two. But if pandas is not on HPy, it's hard to make use of all the HPy benefits, right? Some things are still possible: you can use the debug mode for the dependency that is on HPy, so in this case you can use the debug mode for NumPy and find some leaked handles, but you cannot use it, of course, for pandas. So partially, I would say: you can make use of the benefits partially.

OK, thank you. We have the last two questions. First I will read one that we got remotely: how is PEP 703, the no-GIL proposal, affecting HPy?

We discussed it a lot, and HPy always tries to be prepared for that. Since it's not yet implemented, it's hard for us to do anything concrete, but yes, HPy should be, and we try to ensure that it will be, ready for no-GIL. This is being recorded, you know... not sure!

Maybe I would just quickly add to that: because of the design of HPy, which is GC-friendly and abstracts things away, I think that if CPython were using only HPy right now, if all the extensions were on HPy, this PEP would be in a much, much better place, because they could do much better things, even implement a tracing GC instead of reference counting, which would be better for free-threading as well as for multiple interpreters. So, we now have the question in the back.

Yeah, hi, thanks for an interesting talk. I would like to ask, it might be a simple question: what does the H in the HPy name mean?

It's on the website, so I assume you didn't visit it so far! It stands for handle; it's "handle Py", basically. The core idea is that instead of PyObject* pointers to the actual memory where the objects live, you work with handles, and those are abstractions.
You shouldn't be able to see through them into the implementation details.

So, quickly, the last question.

Thanks. I think you've probably answered this before, but just to clarify: with the GraalPy implementation, are you able to leverage the Java GC mechanisms, or are you still locked into reference counting?

No, in HPy we are not locked into reference counting. I mean, we didn't go into too much detail here, but there is another kind of handle, the HPyField, which you use if you store, in your C structure, some other object whose lifetime is tied to the owner object; and that is something the GC knows about and can use. So yes, absolutely, we make use of it.

OK, thank you. Cool. So we are running out of time, I'm really sorry, but maybe you can catch the speakers afterwards; I guess they will be available during the conference, so if you have any other questions, maybe you can ask them then. So let's thank them again.