So after this very nice overview of PyScript, we're going to go into a bit more detail and see what is necessary to actually take a package and run it in the browser, for instance in PyScript or in other environments.

About myself: I'm an ML engineer and I do consulting work at Symerio, which is a company based in Paris. My background is in computational physics. I'm a maintainer of Pyodide, and previously I was working on scikit-learn.

In this talk, we're going to introduce Pyodide, which is a build of CPython for the browser, in a bit more detail, and we're going to explore the topic of how we can optimize Python packages or Python applications to be smaller and load faster on a web page.

Just a brief slide on WebAssembly. As you already saw, WebAssembly is a binary instruction format which allows you to run, for instance, C code inside the browser. It's portable, it doesn't depend on the architecture, it's secure, and you can run it for instance in the browser or in Node.js. Because it allows you to run arbitrary applications, in particular you can take the CPython interpreter and run it inside the browser with this technology. And of course, if you want more details about this, please stay for the follow-up talk by Antonio.

So what is Pyodide? We take the CPython interpreter and we compile it to WebAssembly using Emscripten, which is the build toolchain that allows you to do that. Since last year, CPython also has tier 3 WASM support, so it's now a target.
It's a build target which is, to some extent, supported by CPython.

In addition, what we have is a Python/JavaScript foreign function interface, which allows you to make calls between Python and JavaScript: from Python you can call JavaScript objects, and from JavaScript you can call Python objects. This is, for instance, what you saw in the PyScript examples, where you interact with the DOM or things like this.

We also build a number of packages for this target platform. Here I illustrated some of the scientific Python packages, but we have a fair number of those. They are distributed through the jsDelivr CDN, which is an open source CDN historically used mostly for JavaScript. We can also install things from PyPI, assuming they are pure-Python wheels.

Now, what is actually involved in building a Python package for the web? If you have a pure-Python package, the good news is that you can just use it as is, assuming it has a wheel. If it has a wheel, the file extension would look something like this. Then you can just load it with micropip and it should work, assuming it doesn't use functionality that is unsupported in the browser.

If you have a Python package with C extensions or Rust extensions, which is the case for instance for scientific computing packages and for a fair number of packages in Python, then you need to cross-compile it for this target. So, essentially as with any other platform, you need to compile it somehow. There are two ways of doing this. The first way is by adding it to the Pyodide distribution, which involves writing a meta.yaml file, fairly inspired by conda. If you do that, then you will be able to use it through Pyodide. The second way is using pyodide-build, which is a new tool that we have designed, which essentially wraps pypa/build, the standard way of building wheels in Python, and it allows you
to build wheels for this platform. Hopefully, very soon we'll be able to do this with cibuildwheel, which is a project that allows you to build wheels for different platforms in CI. And while it's not yet possible to upload such wheels to PyPI, there is work in progress to make it happen.

Of course, this is the ideal case. This allows you to have a wheel, but then the question is: will it work in the browser? If your code just does pure-Python calculations, for instance, it will work. If it uses some functionality which might not be supported in the browser, for instance if it tries to access sockets or to create subprocesses, that will not work. In that case you will have to apply some workaround for your package to make it work.

I would also like to mention that Pyodide is not the only project that does this. There is also emscripten-forge, which is an initiative to also build packages for WASM, but more inspired by conda-forge, and that aims to be eventually integrated there.

A few words about the foreign function interface. This is an interface that allows you to communicate between JavaScript and Python. For instance, you can use JavaScript objects from Python if you do "import js". From js you can import things, and those are essentially the JavaScript functions or objects that exist, and you can call them from Python. There are automatic conversions of simple native types, and other, more complex types are proxied. In the other direction, in JavaScript, you can essentially call Python functions this way. This allows you to essentially merge the two ecosystems together, by using things from JavaScript or from Python interchangeably. If you want to learn more about this, you can have a look at our documentation, linked below.

So there's now a growing
ecosystem of projects that use Pyodide. We saw PyScript this morning, a very nice presentation. There is also, for instance, JupyterLite, which is a build of Jupyter for WebAssembly running exclusively in the browser. This has very large implications, first for teaching: it is a really easy way to deploy notebooks for teaching without having to maintain any infrastructure. Recently, support has also been added to things like sphinx-gallery, which is a way of adding interactive examples to the documentation. Most scientific Python packages use this package to have examples, and with this approach you can essentially have live examples that you can run in the documentation. This is currently done with Binder, and it works well. However, the problem with Binder is that it is a hosted instance of Jupyter that somebody needs to pay for, while with this approach, since the computation is client-side, you don't need to do that.

There are also other interesting initiatives. For instance, there is the react-py project, which aims to integrate Python code in React apps. There are also a number of dashboard applications that are written in Python: for instance you have Streamlit, Plotly Dash, and Voilà in Jupyter, which allow you to have some kind of interactive dashboard written in Python. Traditionally this requires you to run a server that will interact with this dashboard, which is potentially a bit problematic to deploy: you need to keep paying for it, etc. Recently there was an effort to also make these dashboards available in a version that runs in WebAssembly. So you can just have a static file, and your dashboard will be interactive. It will still be in Python, but you will not have to host anything like a server that
runs permanently.

Now that we have seen a quick overview of Pyodide, let's look at the topic of size and load time, which is a fairly important subject on the web. The situation right now is that if you are running Python, you don't really care that much about how big packages are or how much is being downloaded to your computer. As long as it's not gigabytes, it's fine. So what we have is fairly large packages with a lot of code, sometimes including things that are not necessarily critical to run. For instance, some projects will include tests inside the package they distribute, because it's just convenient: you can run tests with your package, it's very nice. But then in the browser you end up loading a lot of things that you don't actually need, which will impact your load time and, essentially, your user experience.

For instance, on the right you have an example of loading pandas in Pyodide. You can see that it had to download 18 megabytes and it finished in 15 seconds. This might be okay for some kinds of applications. For instance, if you do teaching and you have a room full of students, that's fine; it allows you to do things. If you are on a high-traffic website and you just want a small dashboard that uses pandas, that's not okay, because you don't want to download such an amount of data just for a small dashboard. So the question is, of course: can we do something about it?
It's worth mentioning that the browser is not the only resource-constrained environment where Python can be run. You also have projects like MicroPython, which allow you to really shrink the size of Python and have Python code run with a much, much smaller footprint. But then the limitation is, of course, that you don't have access to the full ecosystem of Python packages that you might need to develop your application.

So what can we do about it? There are many directions that could be explored. One topic is compression: we can say, okay, we have all those files, let's just compress them better. We'll talk about this. The other thing is: can we just transform the source code somehow to make it smaller, or remove things, maybe minify the code? For instance, JavaScript will minify code; can we do that in Python? And the final topic is bundling: we take a full application and we only keep the things that we actually need. We only keep the files we actually need, and maybe only the functions we actually need. Again, in JavaScript this would be tree shaking. Is it possible in Python?
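The source-transformation idea can be sketched with the standard ast module. This is only an illustration under stated assumptions, not how Pyodide does it: the helper name strip_docstrings and the sample function are hypothetical, and ast.unparse needs Python 3.9 or later. Round-tripping through the syntax tree also drops comments as a side effect, because comments are not part of the tree.

```python
import ast

# Hypothetical sample input, used only for this illustration.
SAMPLE = '''
def area(r):
    """Return the area of a circle."""
    # comments are not part of the AST, so they disappear on unparse
    return 3.14159 * r * r
'''


def strip_docstrings(source: str) -> str:
    """Return `source` with module, class and function docstrings removed."""
    tree = ast.parse(source)
    owners = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(tree):
        if not isinstance(node, owners) or not node.body:
            continue
        first = node.body[0]
        is_docstring = (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        )
        if is_docstring:
            del node.body[0]
            # A class or function body must not be left empty.
            if not node.body and not isinstance(node, ast.Module):
                node.body.append(ast.Pass())
    return ast.unparse(tree)


if __name__ == "__main__":
    smaller = strip_docstrings(SAMPLE)
    print(smaller)
    print(len(SAMPLE), "->", len(smaller), "characters")
```

Anything that reads docstrings at run time would break after such a transformation, so it only makes sense for applications where nobody looks at them.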
That's the question. In this figure you can see some of the things that were already implemented in Pyodide. For instance, we already do CDN compression and unvendoring of tests, but other things still need exploration and more work.

So let's talk about compression a bit. All the Python packages on PyPI are wheels, or rather can be distributed as wheels, and those are by definition essentially zip files. In zip you have the deflate compression, which is okay, but it's not optimal either. On the other side, the peculiarity of this WASM subject is that we are essentially in the browser and we are getting files from the CDN, the content delivery network. The content delivery network can apply compression, which is transparently decompressed by the browser. The nice thing about this is that it allows you to do a bit better compression than what is provided in the zip files. In particular, Brotli is quite efficient. It's really good for Python source code and WASM files. For instance, on the right you have an example of compression ratios for the Python standard library: if you take the Python files, it's about four times with gzip and six times with Brotli. WASM is also optimized to compress well on a CDN.

For the Python bytecode the situation is a bit more complex, in the sense that it is compressible, but it was never really designed to work well with compression. That's not a use case that anybody had in mind. So it's possible that if we add some additional domain-specific compressor, or pre-compressor, before the CDN, it will improve things, but this will require going quite deep into the CPython internals.

Another thing to keep in mind if you do your deployment yourself is that if you take a wheel, which is a zip file which is compressed, and then you recompress it with the CDN
it will actually make things worse, because the CDN will then not be able to compress things very efficiently. So what we do in Pyodide, for instance, is disable the compression in zip files and only use the CDN.

Another thing you can do is try to reduce the code size, and this could be done with abstract syntax tree transformations. There are many things you can do. If you take simple things, you can just remove comments. There are a lot of comments in some projects; for instance, in the standard library 27% is comments. Then you can group imports; those are the kinds of things that linters do. You can also normalize parentheses and single versus double quotes. This doesn't really matter if you don't compress, but once you compress, it helps. Then there are other things you can do that will help, but they will also make your code a bit less Pythonic, or you will lose some usability. For instance, you can remove docstrings: for some applications that's fine; if you're doing teaching, that's not okay. And even more, if you really do want to do minification, you can start renaming locals to be shorter, or even globals. But this can potentially break things, and it's really not good for debuggability.

But is the package load time only about size? It turns out that no. If you look at what's involved, you first have the download time, and then, depending on what you distribute, many things happen. If you have just Python files, the source code, they need to be parsed and compiled by the Python interpreter into .pyc files before they are run. Normally, in your Python installation, this is done at install time. But in the browser you do both at the same time.
Well, essentially this has to happen, and it takes some time. Alternatively, you can distribute .pyc files, which are then faster to load and faster to import. But then, if you have errors, you will not have code snippets in the error messages, and we cannot distribute both because it's heavy. So currently we have separate builds for .py and .pyc files, but it's a choice. Then there is WebAssembly: while your application was already compiled to WebAssembly, it actually needs to be specialized in the browser, so compiled with a specialization to the target architecture. This is fast, but it still takes a bit of time. As a result, depending on the application, these different points might impact your runtime differently. For instance, for scientific applications point three is more often a bottleneck.

Going even further, look at import time in standard Python. You know some packages take a while to import, and there have been efforts to optimize this. But you have to keep in mind that Python in the browser is also a bit slower. It can be slower by a factor of two to five, depending on the code. For instance, pandas on your computer is going to take 0.5 seconds to import; you multiply that by a few times and you end up with a few seconds just due to import time. So optimizing import time in your package is actually important, because in the browser it will be worse.

Then you can do different things, such as trying to remove the modules that are not necessary. If you have a given application and you really want to optimize it, you just remove the modules, for instance by doing dependency parsing of modules. You have the standard library module called modulefinder, which essentially finds imports in your application code, computes the
dependency graph, and then you could include only those. But the problem, of course, is that there are a lot of things that are optional. For instance, in pandas, if you use the to_parquet method you will need the pyarrow dependency, but if you don't use it, you will not need it. So your dependency parser needs to be able to understand that. Worse, Python is very dynamic, so you can do a lot of things dynamically, including imports, and then this approach will break. Also, this doesn't really extend to non-Python code. For instance, some packages will need shared libraries, and there this will not work.

Of course, this is not a problem only for WASM. Any kind of bundler that produces a single application for Python has this issue. For instance, you have PyInstaller or py2app; those have a bit of the same challenges. The specificity of the browser, however, is that we are not only shipping Python, we're also shipping the OS. Emscripten provides all the OS layer, let's say, which we can relatively easily modify. So in particular, it's possible to say: okay, can we just intercept all the file opens, all the files that are being opened in an application, and only keep those? So this is what is done.
For instance, in the pyodide-pack package, you give it an example of code, it will try to run it inside the browser sandbox, and it will see which files are being accessed. In the example of pandas here, this will allow you to get some reduction, but it's still not a ten times smaller final application. The reason for that is that we often have top-level imports in the code. For instance, here is an example with pathlib: these are the imports that pathlib does, and you can see that some of those are probably very necessary, while others are very specific. For instance, regular expressions are probably only used if you do some globbing, ntpath is probably not used at all in the browser, stat is going to be used only if you check the status of files, and then urllib.parse, again. So we put all the imports at the top, and then a lot of things are not used. If you multiply that by hundreds of files, you end up just including everything and having large applications.

Of course, we cannot change this style, because it is essentially specified by the PEP and it's too late to change that. But what we can do is see if there are other approaches to mitigate this. For instance, you have this PEP which proposed lazy imports, which essentially proposed to make imports happen only when they're accessed. This would actually improve things, but unfortunately that PEP was rejected.

And we can go further, saying: well, in the browser, the installation time and the import time are a bit less separate.
There's a less strict boundary between them. For instance, if in Pyodide you do pyodide.runPython with, say, "import numpy", what will actually happen is that it will first install numpy and then import it. So could we just consider that we have some remote file system, and load files from it as needed? This is currently not possible because we cannot do this synchronously, but it will be possible with JS Promise Integration, once it's added to Pyodide and enabled in browsers. But there is probably still a problem of latency, because, again, "import pandas" accesses 450 files. If you add some latency on each request, it's going to be huge, even if you do things in parallel. However, this might give you some ideas of things you can try, like loading only a small subset of the files you need this way, or maybe only on errors, etc.

In conclusion, Pyodide is a project that was started five years ago by Michael Droettboom at Mozilla. It started as really experimental, and now a lot of things have stabilized and matured. We have official CPython support, and there is a growing ecosystem of things building on top of it. This new technology allows a lot of different use cases that were not possible before, in education, in privacy-preserving applications, etc., and it makes Python available to a new category of users.

In this particular case of reducing the size of Python applications, this is a fairly challenging topic. We are working on developing some tools, but some things might also need community effort to make the Python ecosystem more web-friendly, because this was never really a concern before, right? This was not something we thought about before. Of course, here
I presented only a single technical topic, reducing the size of Python applications for the web. But we are working on a lot of other things in Pyodide, and you can find them in our roadmap. We also welcome new contributors. Because Python with WASM is still a fairly new topic, there is a lot of low-hanging fruit, so there are a lot of things where it's easy to start.

I would like to acknowledge the maintainers of the Pyodide project, people who contribute to Pyodide, our sponsors, and then the broader community of projects that we depend on (Emscripten, CPython, the jsDelivr CDN) and downstream projects: JupyterLite, PyScript, Basthon, Irydium, Iodide in the past, etc. We also had very helpful discussions about some of these subjects at the WASM summit at EuroPython a few days ago. So thank you, and join us for the sprint this weekend.

Thank you so much, Roman. There was no question in Discord yet, just one statement. The statement was that the Panel package has been seen running in the browser. I will use the moderator's privilege and ask you one thing: what is the most interesting thing and the most complex thing that you have seen running in the browser?

Complex and interesting. Well, complex: I would say that, to some extent, even as a maintainer I'm still kind of surprised that all of this is even running and possible. And interesting: there's PyScript, there are a lot of exciting things on that side, and I also find that there are a lot of exciting things for education, making it possible to learn Python without all the complexity that's usually involved with packaging or installing things.

Thank you. I think we have one question here up front.

Is there an equivalent to subprocess.run? So if I have an executable that also runs in WASM, can I use something like subprocess to start it and get the output from it?
Yeah, I mean, basically it's web workers, so that's what we saw before. It's not exactly subprocess, but there's some idea that you could implement a multiprocessing API on top of web workers. But it hasn't been done yet.

Yeah, okay. Thanks.

Thank you so much, and we can continue with the Q&A in Discord and in the hallway track. Although, sorry, I overlooked...

Yeah, no, I just got it. So my question is: what do you see the future of Python in the browser to be? Given that Python is way behind JavaScript in many things running in a browser, how do you see the future of Python in a browser?

Well, it's not going to replace JavaScript, in my opinion, right? There are still use cases where clearly just the size is going to be a limiting factor. But it will allow some applications that were not possible before. For instance, if you have back-end code written in Python, and there's a lot of legacy code in Python, it will just make it much easier to make it accessible to users without having to develop some kind of web service, etc.

Right, even if it's, like, performance-wise not...

Yeah, and I mean, performance is going to improve. It's also just a matter of time, a matter of resources, etc.

Okay, thanks.

Next question, please.

Thank you for the presentation. I have the following question. You mentioned that we will be able to call JavaScript from Python and vice versa. Which limitations do you think we will have?
Okay, so I'm not the best person to talk about this; that would be another maintainer of Pyodide, who might be somewhere in the room. But essentially, the one limitation is that the memory management in Python and JavaScript is not exactly the same. Right now we have ways of managing it, but it requires a bit more manual effort to do that. So that's one of the limitations. That's probably the main one, I would say.

Thank you. Another question at the back, please.

Thank you for the nice talk. I was wondering about the integration with the graphical part, with graphics libraries like PyQt.

So yeah, I know that, for instance, the Qt people are working on building Qt for WASM. I don't think there is a Python build for that yet. For Qt, for instance, it's just a matter of build time, because building Qt is really challenging. I think we have the Pyxel integration in Pyodide now, so you can make some simple games with it. It's possible; it's just a matter of effort and resources and things like this.

Okay. Thank you. We'll need to conclude the session. Thank you so much once again.

Thank you.