Thanks, everyone. It's nice to see that so many people are interested in this talk. My name is Anna-Lena, and today I will be giving a presentation about an unbiased evaluation of environment management and packaging tools.
A few words about myself: I work as a machine learning engineer at a fantastic German company called inovex. I also like working on coding and machine learning projects in my free time, which I usually share with the community either on my personal web page or on my GitHub profile. Two of the most popular ones might be interesting for you as well. One is Machine Learning Basics, which teaches fundamental machine learning algorithms in plain Python, like support vector machines, neural networks, decision trees and so on, with a focus on understanding the algorithms rather than on the performance of the code. The other one is My Magical Universe, where I introduce Python features like decorators or context managers using a world of magic. So if you like Harry Potter, like I do, or other magical stuff, you will definitely enjoy this project.
Okay, let's start with the talk itself, first of all regarding the goals. When I started coding many years ago and created my first Python package, I was quite surprised that it was much harder than I expected. Over the last years lots of tools have come up, and some have been around for a while. It can be very hard to understand which tool to use in which situation for what you want to do, and the tools change very quickly; new ones appear. My goal for this talk is really to shed light on this jungle of packaging and environment management tools, and I will try to do that without introducing my personal preferences. I found a lot of blog posts and talks on the individual tools, but no good overview of the different ones that are out there.
For the purpose of this talk, I identified five main categories that we will look at in turn. For each one we will look at a short definition and motivation, and then at tools that you can use for this category: first single-purpose tools, and at the end multi-purpose tools. This categorization is the foundation of the whole talk. These are the five main areas that I identified: package management, Python version management, environment management (which can be done with virtual environments but doesn't have to be; we will talk about this later as well), and then package building and package publishing.
You will see this Venn diagram a lot, so don't worry if you can't take it all in. There is also a PDF version of the slides on the Discord server. I also published them to the website, but they don't show up there. So if you want to take a closer look, you can do that later on. You can already see here that there are tons of tools that can do different things, and some can do the same things. By the end of this talk, hopefully we will have had a chance to take a look at most of these tools.
Now, as a short introduction to each category: let's say you have a personal project that you want to work on besides your work projects. Your project supports several Python versions, so of course you have to be able to install them, but also switch between them. That's what the first category, Python version management, is about. There is one very popular single-purpose tool, pyenv, but also some other tools that can do this for you, which we will look at. Now, in your personal project you usually have dependencies.
So your code depends on other packages, and for this you need package management. Very popular is the tool pip, which you have probably all used before to install or upgrade packages. Okay, so you have your code and your little project, you can install other packages, and you can have different Python versions, but then you have a dependency conflict. For example, your work project uses a different version of pandas than your personal project, and you run into the problem that you need an isolated environment for your code. This is what environment management is about, usually using a virtual environment. Again, we have multi-purpose and single-purpose tools here, which we will look at.
Then, once you have worked on your code for a while, you might want to share it with a community or with other developers. So you first have to build your package and then publish it. These are the last two categories we will look at. Again, you can see here that both for building a package and for publishing it, that is, putting it on PyPI or some other index, we have lots of tools.
So let's start with the first category, which is Python version management. What I understand by this is that a tool can install Python versions and also lets you switch between them easily. Why is this important at all?
There are several reasons. For example, you might work on different projects that require different Python versions. You might have a project that supports several Python versions. Or you might just want to try out the newest or an older version of Python and see what features it has to offer.
This is our Venn diagram again. You can see that we have one single-purpose tool called pyenv, but also some multi-purpose tools that allow you to install and switch between Python versions, which are conda, Rye and pyflow. We will look at the single-purpose tool first, which is pyenv. It is very easy to use. The most important command is pyenv install, followed by the version that you need, and then you can easily switch between the versions. For example, pyenv shell selects a Python version for your current shell, pyenv local automatically selects a certain Python version for your current directory, and pyenv global sets the Python version for your user.
So this was already the first category, Python version management. Let's jump into the next one, about virtual environment management, or environment management in general. To me, a tool that can do this is able to create and manage environments, which are virtual environments most of the time. And why do we need virtual environments?
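As a side note on the three pyenv commands mentioned above: they differ in where the selected version is stored, and pyenv consults those places in a fixed order (shell, then local, then global). The following Python sketch models that order. It is a simplified illustration and not pyenv's actual implementation; the function name resolve_python_version and its parameters are made up for this example.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def _first_token(path: Path) -> Optional[str]:
    """Return the first whitespace-separated token of a version file, if any."""
    tokens = path.read_text().split()
    return tokens[0] if tokens else None


def resolve_python_version(
    cwd: Path,
    global_version_file: Path,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical, simplified model of pyenv's version lookup order."""
    env = os.environ if env is None else env

    # 1. `pyenv shell` works through the PYENV_VERSION environment variable.
    shell_version = env.get("PYENV_VERSION")
    if shell_version:
        return shell_version

    # 2. `pyenv local` writes a .python-version file; look in the current
    #    directory first and then in each parent directory.
    for directory in (cwd, *cwd.parents):
        local_file = directory / ".python-version"
        if local_file.is_file():
            version = _first_token(local_file)
            if version:
                return version

    # 3. `pyenv global` writes a user-wide version file.
    if global_version_file.is_file():
        version = _first_token(global_version_file)
        if version:
            return version

    # 4. With nothing configured, the system Python is used.
    return "system"


if __name__ == "__main__":
    # The location of the global file is an assumption about a default setup.
    print(resolve_python_version(Path.cwd(), Path.home() / ".pyenv" / "version"))
```

The point of the ordering is that the most specific setting wins: a version chosen for the current shell overrides the one for the directory, which in turn overrides the user-wide default.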
There is a big use case for this. As I mentioned, projects usually have requirements: they depend on other packages or other code. Often projects require different versions of the same package, and you want to avoid the dependency conflicts that can be caused by having multiple projects. Also, if you just use pip to install a package, this places it in your system-wide Python installation, which can cause different problems. I know that there are different solutions for this; I just wanted to mention it as another possible motivation for using virtual environments.
Python has two main single-purpose tools to work with virtual environments, which are venv and virtualenv. Again, we also have multi-purpose tools like pipenv and conda, which we will look at later, but also other ones like PDM, Poetry, Rye, pyflow and Hatch. Let's take a look at the single-purpose tools first.
venv is a built-in Python package, so you can use it out of the box with Python. It is very simple and very limited in what it can do, but it can be very helpful if all you want to do is create a virtual environment. It's easy to use: you call python with the -m flag, which is used for calling a module directly, followed by venv and the name of your virtual environment. Then you can activate and deactivate it easily: you activate it by calling the activate script, and you just type deactivate to deactivate your environment. Once you're in your environment, you can install packages there, in whatever versions you like.
There's one more, virtualenv, which is a bit more sophisticated and offers more features than venv. For example, venv is a bit slow and also not extendable, so some people prefer using virtualenv. This is a package you have to install first, but then the usage is very similar. Creating an environment with it is a bit easier.
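As a side note on venv: because it is part of the standard library, the same environment that python -m venv creates on the command line can also be created from Python code. The sketch below is a minimal illustration; the helper name create_virtual_environment is made up, and pip is deliberately left out of the new environment to keep the example fast, which differs from the command-line default.

```python
import venv
from pathlib import Path


def create_virtual_environment(env_dir: Path) -> Path:
    """Create a virtual environment, roughly like `python -m venv <env_dir>`.

    Hypothetical helper for illustration. Returns the path of the pyvenv.cfg
    file that marks the directory as a virtual environment.
    """
    # with_pip=False skips installing pip into the environment; the
    # command-line tool installs pip by default.
    builder = venv.EnvBuilder(with_pip=False)
    builder.create(env_dir)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        config_file = create_virtual_environment(Path(tmp) / ".venv")
        # pyvenv.cfg records, among other things, which Python the
        # environment was created from.
        print(config_file.read_text())
```

On the command line, the environment would then be activated with the activate script inside the environment directory (bin/activate on Unix-like systems, Scripts\activate on Windows) and left again with deactivate.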
With virtualenv, you just type virtualenv and whatever name you want to give to your virtual environment. The commands for activating and deactivating your environment are similar. Of course, these are just the most basic commands; there are lots of other commands that you can use with virtualenv.
Okay, so we have looked at Python version management and virtual environment management already. Before we get into packaging, I want to make sure that we are all on the same page regarding what packaging is about and what we need for it, and especially regarding two important files. The first one comes now, which is pyproject.toml. Packaging in Python has come a long way in the last years. Previously, when you created a package, you had to define a setup.py file and use setuptools as a build tool. But then PEP 518 came around, which introduced the usage of a so-called pyproject.toml file. This file is used to define all your project information: the name of the project, which dependencies it has, scripts, and for some tools also the setup of your environments. So whenever you create a package now, you need to define a pyproject.toml file. This might be too small for some of you to read, but this is an example file from the pandas library, where you have different sections. TOML is just a file format for configuration files, and a very simple one. Just follow the link here if you want to take a look at a proper example.
Okay, so now we can jump into package management. A package management tool is able to download and install libraries and also their dependencies for you. That means that if you want to use a certain package like pandas, you just use a command to install that package, and all its dependencies are installed for you as well. Why do we care about package management at all, or why do we care about packaging so much?
What is nice about packaging is that it allows you to define a hierarchy of modules, and then you can import code from these modules using the dot syntax: from x.y import something. I thought about a good example but couldn't come up with one, so something like from os.path import join. Packaging also allows you to share code easily with other developers, since you have your pyproject.toml file, which defines all the dependencies of your project. Other developers then don't have to install the requirements themselves. They just use your package code or the distribution file for that package, and then everything is done for them, which is really nice. This is just some motivation for using packaging in the first place.
Now, regarding package management, we have the standard package manager pip, which you are probably aware of. Then there are multi-purpose tools like pipenv and conda, but also PDM, Poetry, Rye and pyflow, which can do package management for you. Some of them use pip or pipx under the hood but have their own commands; we will see that later as well.
pip is the standard package manager for Python. It's shipped with Python, so you can use it out of the box, and it allows you to install packages from PyPI but also from other indexes. It's very easy to use: you just type pip install and then the name of the package, like pip install pandas, pip install requests or pip install flask. There are of course lots of flags you can use and so on, but this is out of scope.
I already mentioned in the beginning that package building is an important step if you want to share your code or your package easily with others. So what does it mean to build a package? Building a package means that you create so-called distribution files. These are usually a wheel file, which ends in .whl, and a tarball file, which ends in .tar.gz. If you're not aware of how this works, don't worry; it's not that important for this talk. You should just be aware that when you package your code, these distribution files are created for you, and these are also the ones that are downloaded by tools like pip if you share your code on PyPI.
There are lots of tools for building a package. There's for example setuptools, which has been around for a very long time, but then there are Flit, Maturin and enscons. There's Hatch, and (I think I have a laser pointer here as well) there are pyflow and Rye. These are very tiny here since they are in the middle of the Venn diagram. And there are PDM and Poetry.
As I mentioned, a very old build backend is setuptools. It has been around for a very long time: it was developed in 2004 as a replacement for, or improvement of, distutils, which was used beforehand for packaging. It can still be used. If you want to build your package using setuptools, you just specify it in your pyproject.toml file; there's a certain section for the build system you want to use. If you then want to create the distribution files, you actually have to use the build module: you type python -m build, again with the -m flag for calling a module. I haven't seen this a lot, and I've never done it myself. Usually I use one of the multi-purpose tools for building a package, but in theory you could also just use the build tool with setuptools to create the distribution files.
In the end, you have your code in a nice distribution format, and then you want to publish it, which means that you put it on PyPI or another index. Maybe your company has a private one. Again, we have the tools here.
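As a side note on the two kinds of distribution files mentioned above: they can be told apart by their file names, which follow fixed conventions. A wheel name carries the package name, the version and three compatibility tags, while a source tarball name carries only name and version. The sketch below splits such names into their parts. It is a simplified illustration: the function and class names are made up, and the optional build tag of wheels is accepted but not reported.

```python
from typing import NamedTuple, Tuple


class Distribution(NamedTuple):
    kind: str               # "wheel" or "sdist"
    name: str
    version: str
    tags: Tuple[str, ...]   # (python, abi, platform) for wheels, empty otherwise


def describe_distribution(filename: str) -> Distribution:
    """Split a distribution file name into its parts (hypothetical helper)."""
    if filename.endswith(".whl"):
        # name-version[-build]-python-abi-platform.whl
        parts = filename[: -len(".whl")].split("-")
        if len(parts) not in (5, 6):
            raise ValueError(f"unexpected wheel file name: {filename}")
        python_tag, abi_tag, platform_tag = parts[-3:]
        return Distribution(
            "wheel", parts[0], parts[1], (python_tag, abi_tag, platform_tag)
        )
    if filename.endswith(".tar.gz"):
        # name-version.tar.gz; the version is what follows the last hyphen.
        name, _, version = filename[: -len(".tar.gz")].rpartition("-")
        if not name:
            raise ValueError(f"unexpected sdist file name: {filename}")
        return Distribution("sdist", name, version, ())
    raise ValueError(f"not a wheel or source tarball: {filename}")


if __name__ == "__main__":
    # File names as a build might produce them for a made-up package.
    for example in (
        "my_example_package-0.1.0-py3-none-any.whl",
        "my_example_package-0.1.0.tar.gz",
    ):
        print(describe_distribution(example))
```

The tags in the wheel name are what allow an installer to pick a file that matches the local Python version and platform, whereas the tarball contains the source and works as a fallback.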
There's one single-purpose tool here, which is Twine. But the tools we have just seen, Flit, Maturin and enscons, can do that as well, and so can Hatch, PDM, Poetry, Rye and pyflow. Let's look at Twine. It's the official PyPI tool for uploading packages, and it also works with indexes other than PyPI. But it's a single-purpose tool, so all it can do is publish your distribution files. There's a simple command: you call twine upload and then the folder of the distribution files, which is usually called dist. This will then prompt you to provide your username and so on to get access to PyPI.
Okay, now we are at the promised second recap. I already introduced the pyproject.toml file, but before we go on to the multi-purpose tools, there's another important file for packaging that we have to talk about, which is the lock file. So what is a lock file? A lock file contains the very concrete dependencies of your project. On one side you have the pyproject.toml file, which has abstract dependencies. There you usually just define which dependencies your project has, maybe with a range of versions that are allowed, but you rarely pin the versions exactly.
Pinning exact versions is what the lock file is for. The lock file enables reproducibility of your project across multiple platforms. If you have several developers working on the same package or code, you want to make sure that they all work with the same setup, and this is what a lock file can be used for. Here's an example file from Poetry. A lock file can be very, very long; it depends on how many dependencies you have, including the sub-dependencies and so on, down to the exact versions. Some of the tools can use it to set up your package in exactly the same way that the other developers are working on it.
Okay, so then we can go on with the multi-purpose tools. We will take a look at pipenv first, which is able to do package management and environment management. The name already says that pipenv combines pip and virtualenv. pipenv has been around for a while, and it introduces two additional files when you use it. One is a Pipfile.
The Pipfile is a TOML file, so it is similar to pyproject.toml, but pipenv has been around for a longer time, which is why it uses this Pipfile. And there is a lock file, which you now know about. The Pipfile is intended to replace requirements.txt, which was heavily used when pipenv was introduced and still is. This was also part of the motivation for creating pipenv.
pipenv is easy to use. You can do package management: you use pipenv install with a package name, like pipenv install pandas, and this will add the package to the Pipfile and also install it for you. pipenv also creates and manages virtual environments for you, so you do not have to do that manually anymore. You just use pipenv run with a certain script name, and this will run the script within a virtual environment for you: it installs all the dependencies of your project from the Pipfile within that virtual environment and then runs the script in that environment. Of course, you can also activate a virtual environment yourself using pipenv shell, and then you don't need the pipenv run command anymore. These are the most important things.
Then there is conda. I know that some of you might think now, 'but conda can do packaging as well'. I will say a few words about that. First of all, conda can handle virtual environments for you, it can be used to install and work with packages, and it can also handle Python versions. It is intentional that package building and publishing are not on the Venn diagram for conda.
With conda you can also create conda packages, but packaging with conda works a bit differently compared to what all the other tools do with pyproject.toml, and what you get in the end is a conda package. In general, conda is a general-purpose package management system, which means that it does not only work with Python code but also with other languages. It also uses its own index, so the packages are not uploaded to PyPI but to conda's own index. I won't dive into conda much deeper here, since it's a huge tool that can do very many things, and there are lots of talks and more information about conda out there already.
And now maybe the most interesting part for some of you, the multi-purpose packaging tools. I promised an unbiased evaluation of the tools, so I thought about a good list of features that might be interesting to compare the tools on. First of all, can they manage dependencies for you? Then, do they have a locking functionality, so that they resolve dependency conflicts for you and lock the dependencies? Do they have a clean flow for building and publishing a package? Do they allow you to use plugins? And do they support two important PEPs on packaging? One is PEP 660, which is about editable installs.
If you use pip to install a package, you can use a certain flag, the -e flag, for editable mode. If you're developing a package and you want your changes during development to be reflected immediately within your environment, then this editable install is a very important feature to have. The other is PEP 621, which specifies how to write a project's core metadata into the pyproject.toml file. I put this in here because one tool does not follow this standard but has its own way of specifying metadata.
Okay, let's start. We won't have a chance to look at all of the tools, but I picked out a few which I think are interesting. So, Poetry. As you can see here, Poetry sits close to the center. It can do everything except for Python version management, so it allows you to manage packages and virtual environments, build a package and also publish it. This is our feature list, and as you can see here, Poetry is the tool that has its own way of specifying project metadata in the pyproject.toml file. They will support PEP 621 at some point in the future, but there has been an open issue for this for I think one and a half years already, and it's not implemented yet. So you should be aware of this when you use Poetry. But apart from that, it has a locking functionality. It allows you to manage your dependencies.
You can use plugins if you want to, and you also have the editable installs. Now the main commands. One is poetry new, which creates a directory structure for your project and also a pyproject.toml file. There is also an interactive mode named poetry init, which prompts you to provide basic information like the project name and so on. Then there is poetry install, which installs a package: it reads the pyproject.toml file from the current project, then resolves all the dependencies and installs them. If you want to do dependency management, there's poetry add, which adds a missing dependency to your pyproject.toml file and installs it for you. You can also take a look at all the installed dependencies in a nice tree format using poetry show with the --tree flag.
For running the code, you have different possibilities. You can use poetry shell, which will activate a virtual environment, and then you can run code within that environment. So again, similar to pipenv, the virtual environment is created and managed for you; you don't have to do that anymore. You can also run a script without activating the virtual environment first, using poetry run and then, for example, python and the script name, so what you would usually type into a shell. You will see that most of the tools have very similar commands. Some commands exist in some tools and not in others, but an init command or a run command is usually part of all of the tools.
Now, regarding the lock file: when you install a package for the first time, Poetry will resolve all the dependencies for you, so the ones that are listed in your pyproject.toml file. It downloads the latest versions of these packages, and once it has finished installing them, it writes all the packages with their exact versions to the poetry.lock file, so to the lock file. This is the way in which you can use Poetry to create a lock file. Usually it's good practice to upload your lock file with the rest of your code, for example to the Git repo, so that everyone working on the project has a chance to use the dependencies in exactly the same versions. Of course, the dependencies are not pinned to these versions for the rest of the project's life; you can use the poetry update command to update the package versions.
Poetry also has a clean building and publishing flow. There is poetry build, which will create the distribution files, the tarball file and the wheel file, for you. Then you can use poetry publish to publish these distribution files.
So this was Poetry. Next we will take a look at PDM. PDM sits right where Poetry sits here, so it can do everything that Poetry can do, which is everything except for Python version management. PDM is a relatively new tool. It was heavily inspired by Poetry and pyflow, and it requires Python 3.7 or higher. It hasn't been around for that long, but PDM is the only tool we will look at today which implements PEP 582. This is what I mentioned earlier: most of the tools use virtual environments for environment management, but there's a different way to do it, which is introduced in this PEP. It's currently under consideration, and it uses so-called local packages, so not a virtual environment but a different way to create an environment. It's quite interesting, so if you're interested, take a look at the PEP. This makes PDM a bit special. Another feature of PDM is that you as a user can decide which build tool you want to use. For example, you can say that you want to use setuptools or hatchling; you are not tied to a single build tool.
Now, regarding the feature list: PDM fulfills everything. It allows you to manage dependencies.
It resolves and locks them for you. It is easy to build and publish a package. You can use plugins, and lots of plugins already exist for PDM. And it supports the two PEPs on editable installs and on how to define your project metadata.
Now the main commands. You see here that it gets a bit repetitive, since it's always very similar. There's an init command, which will create your pyproject.toml file interactively. There's an install command, which installs a package by reading the pyproject.toml file, looking at the dependencies, installing them for you and so on. You can perform package management by using pdm add with a package name, which will add the package to your pyproject.toml file and resolve any dependency conflicts that might come up. In the end you can also look at which dependencies a project has, using pdm list with a graph structure.
There is no pdm shell command. This might be implemented in the future; it's a very active project. But right now, if you want to run a script, you use pdm run, and then you can say python and the script name, like python main.py. The script will be run within a virtual environment which is created for you, like with Poetry and pipenv. What you can also do, and if you have worked with packages before you might know this, is define scripts in your pyproject.toml file, which can be a shortcut to run certain things. You can run these scripts using the pdm run command as well.
The lock file functionality is also very similar to Poetry. When you install a package for the first time, PDM will resolve all the dependencies for you and create a lock file, in this case pdm.lock. You can also update the versions of these packages in the lock file using pdm update. And this is the same as with Poetry.
You have a build and a publish command to build your distribution files and publish them.
Next one is Hatch. Hatch sits down here, so it can do everything except for package management and Python version management. But Hatch is currently very actively developed, and the core developer already said that package management will definitely be supported at some point. I'm not sure about Python version management. If you want to be able to manage packages, this basically comes down to having locking functionality, and this is something he's working on. This is also shown here in our feature list: currently Hatch cannot be used to manage your dependencies, so there's no command like hatch add with a package name, and it also does not allow you to lock dependencies. But it has a very nice way to build and publish a package, you can use plugins, and it supports the two PEPs on editable installs and project metadata.
Again, the command structure is very similar. You can create a directory structure for your project, which includes a tests directory and the dunder init files, using hatch new with your project name. You also have an interactive mode for that using -i. Hatch is also able to migrate packages from the old format. When an old project uses a setup.py file, you can use hatch new with --init, and this will convert the setup.py file into a pyproject.toml file, which is nice. Then, dependency management.
I already mentioned that there is no command like hatch add. This might be implemented at some point, but currently it's not there. So when you want to add a package as a dependency, you have to do it manually by opening the pyproject.toml file and putting it in the dependency list. You can of course also display the dependencies nicely in a table format using hatch dep show table. For running the code, you can activate a virtual environment using hatch shell. Hatch also works with virtual environments, unlike PDM. And you can run a script, Python code or any other code within this virtual environment using hatch run and then the commands that you want to use. The rest is the same: you have hatch build and hatch publish.
Then there is a special feature of Hatch at the moment, I think, which is that it allows you to declare your virtual environments in a declarative manner in your pyproject.toml file, which can be very nice. For example, maybe we can open this here. Does this work? No, of course it doesn't. So let's say you work on a project with several developers. You usually have style checks; you might use flake8, or Black to format your code automatically. Okay, now it's here. What you can see down here is that you can create an environment for this, which is nice. You list the dependencies of this environment, which is called style, and these are flake8, Black and isort here. Then you can define scripts for it. You have the check script, which will just run the tools in a check mode and print potential errors, but you can also have a format script, which will actually format the code for you. I like very organized stuff, so I think this is a pretty... oh, sorry, this is bias now. Damn it. Okay. Maybe later. Sorry, it's always hard to do that. Okay.
I need to get back to my slides. Okay, so this is an example use case for this, the code formatting.
Last but not least, we have Rye. You can see already how quickly packaging changes, since Rye just appeared a few months ago as a new tool. It sits right in the middle of this Venn diagram with pyflow. I'm intentionally not presenting pyflow, since it's not actively developed anymore, and being actively developed is, I think, a very important feature in the Python packaging world. But Rye can do everything. It can also manage Python versions, which some of the other tools we have seen cannot do. Rye is very new; it was first released in May 2023. It is written in Rust and is strongly inspired by rustup and cargo, which is the way packaging works in Rust. It started as a personal project by the main developer of Flask, a library you probably know. He says that it's not production ready yet, but lots of people have adopted it already, so it has grown very quickly. We have the feature list here, and mostly everything is supported already. Regarding the plugin interface, I couldn't really find a proper plugin interface like the other tools have. But new releases show up basically every week.
So he's working on this a lot, and probably most things will be supported at some point. Rye is a bit different from the other tools, since it uses Rust. You have the rye init command, and I'm sure lots of other commands will pop up in the next weeks and months. This will also create the directory structure for you and the pyproject.toml file. You can pin the Python version that you want to have, like rye pin 3.10, but this command does not install the Python version yet. Also, if you want to add a package, this will not install it yet: rye add with a package name, like rye add pandas, will add it to your pyproject.toml file. You always need the next command, rye sync, which will then synchronize your virtual environment and the lock file for you, and it also installs the packages and the Python version. So this is a command that does not show up for the other tools. Running code is similar: you have a shell command, which will activate a virtual environment for you with all the dependencies installed, and you can use rye run to run scripts defined in pyproject.toml or just Python code. Building and publishing is the same as before.
Now, as a final overview of the multi-purpose tools that we looked at: you can see here that there is PDM, which already supports everything, then Hatch, which will be supporting the first two features at some point as well, and then we have Rye and also Poetry. So at some point this will probably all be green. Let's see. I think that's it, actually. Thank you very much. There is time for questions, I guess.
Thank you for the amazing talk. Now we have five minutes for questions. So who wants to ask a question? Please come to the microphones here in the middle.
Hello. I've used Poetry a lot, and it quite suits my needs. But there's a little problem when two dependencies each require a sub-dependency with a different version. There's a conflict in Poetry.
It doesn't want to do anything, and there's been a bit of a flame war on their tracker because people are searching for solutions. Do you know if some of these package managers, these tools, are more suitable for forcing a version, for example, to say: okay, I want that version, install that and don't try to be smart? Can we do that?
Yes, I think so. For example Rye, the new tool: this is exactly one of the criticisms the author had about Poetry, that lots of people have this issue. There's a huge discussion thread for the Rye tool where they also discuss the shortcomings of Poetry, and this is one of them. And that's definitely solved by some of the other tools.
Thank you.
Hi, thanks for the talk. Maybe a similar question: a negative experience that I have with Poetry is that I find it very slow when your project gets big and you have lots of dependencies, I guess because of the dependency resolution for the lock file. So I was wondering if Rye is better at that, whether it is faster on bigger projects, and how you found that with the other tools you considered.
I have never used Rye on such a big project yet, only on small example ones, but I assume it would be faster. You would probably have to try that out.
Just one quick note about PEP 582: you said it was under consideration, but in fact it was rejected.
Oh, okay, good. So no local package directories then. Okay, thank you very much.
My question is: did you have a look at pip-tools as well?
No.
Okay.
If you want to add something about pip-tools for the others, you can.
It stays closer to the requirements.txt file. It also manages the dependencies and locking, but it doesn't do a lot of what the other tools do.
Okay, thanks.
I will make sure to see if I can include it in the Venn diagram.
Hello. I'm not familiar with this project metadata stuff. What are the disadvantages of Poetry not supporting it, and how can this affect me as a user?
Okay, so since the PEP for the standard was introduced after Poetry was already around, they had already created their own way of writing the metadata into the file. You should be fine as long as you look at the Poetry documentation for how to do that. But of course, if you then want to switch to a different tool, or you look at a different tool, it might just look different. Python packaging is evolving so actively, and they are trying to define a standard for how to do this, so the newer tools all try to use the same standard. That's what I mentioned already: Poetry will also do that at some point. I'm not sure why it's so difficult for them to implement this. It will probably need other changes, and that's why they haven't done it yet. But as long as you refer to the Poetry documentation, you're fine.
Okay, so as long as I don't try to move to another packaging tool, it should be fine.
Yeah.
Thank you.
Thank you guys.
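Editor's sketch: the Rye workflow described in the talk, where rye pin and rye add only record what you want and rye sync is the step that actually installs it, can be pictured with a small toy model. This is not Rye's implementation. The class ToyProject and its methods are hypothetical names invented for illustration, and nothing here touches a real pyproject.toml or virtual environment.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyProject:
    """Hypothetical stand-in for a project; not Rye's real data model."""

    # Declared state: what pinning and adding write down.
    pinned_python: Optional[str] = None
    declared: list[str] = field(default_factory=list)

    # Installed state: what is actually available to run code.
    installed_python: Optional[str] = None
    installed: set[str] = field(default_factory=set)

    def pin(self, version: str) -> None:
        """Like `rye pin 3.10`: records the version, installs nothing."""
        self.pinned_python = version

    def add(self, package: str) -> None:
        """Like `rye add pandas`: records the dependency, installs nothing."""
        if package not in self.declared:
            self.declared.append(package)

    def sync(self) -> None:
        """Like `rye sync`: brings the installed state in line with the declared state."""
        self.installed_python = self.pinned_python
        self.installed = set(self.declared)


project = ToyProject()
project.pin("3.10")
project.add("pandas")
print(project.installed_python, sorted(project.installed))  # None []

project.sync()
print(project.installed_python, sorted(project.installed))  # 3.10 ['pandas']
```

The point of the split is that declaring and installing are separate steps, so forgetting the sync step leaves the environment unchanged even though the project file already lists the dependency.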