This is going to be a quick talk on managing Python source installations. My name is Clark Boylan. I work on the upstream OpenStack developer infrastructure, and in the context of this talk, that means I help perform thousands of Python package installations from source every day, mostly successfully. So I'm hoping I'll be able to teach this crowd, and those watching later, how we've managed to make this a bit more reliable than when we started out.

Quick disclaimer: this is the work of many people, including the Oslo team, Robert Collins, Monty Taylor, Doug Hellmann, and many others, so I don't want to take full credit for it; I'm just here to talk about it. I'm also not a member of the Python Packaging Authority, which means they may not agree with all of our opinions on these things or the steps we've taken to make these installations more reliable, so just be aware of that.

So there are quite a few problems with Python packaging if you're installing from source, and this is a small subset of them. A big one is that installing a Python package runs arbitrary code: your setup.py file is an actual Python executable, and it can do whatever it wants, so you're at the mercy of whatever package you're installing. There's also a ton of non-compatible libraries used to do this: distutils, setuptools, the now-defunct distribute, and a bunch of unofficial ones that all behave slightly differently, so you don't always get the behavior you expect based on other libraries you've used in the past. Dependency resolution in Python package management is really, really simple, which means it's not robust and often breaks. So you install some things today, and tomorrow you come back and install something new.
That new installation can break your old installation by pulling in something new that the old one doesn't like. You can't easily do reproducible installations, which means if you do an install today and want to reproduce it, say, in another data center or on your laptop, it's very difficult to actually do that. And then how do you handle system dependencies? Python packages often link against C libraries, Fortran libraries, and so on. How do you express your dependency on those and make sure they're available on your system before you install your Python packages?

So really quickly, I want to demonstrate some of these issues. We're going to do a pip install of this package called Rdolol, which is on PyPI, and we'll see what it does. Oh, it just opened a YouTube video in my web browser. That's legitimately a package on PyPI that I have nothing to do with, completely independent of me, and that's the sort of thing you can get out of a Python package install: it will happily talk to your browser, open tabs, and play music, and many other arbitrary things can happen. That's a quick example of what we're dealing with here, which can make these things quite frustrating when you're wondering why an install didn't work. Well, it's probably because it can do whatever it wants.

So here are some of the things we've done to address this. We've created a library called PBR, which is short for Python Build Reasonableness; I think Monty Taylor just wanted Keystone to have a friend in the OpenStack naming scheme. What this library does is reduce the copy and paste of setup.py files that we've had to do across OpenStack. It's configuration-based: rather than carrying around a bunch of arbitrary code, you express your Python package as configuration.
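As a sketch of what that looks like in practice: with PBR, the setup.py shrinks to a couple of lines that just call setuptools.setup(setup_requires=['pbr'], pbr=True), and everything else moves into setup.cfg. The package name and metadata below are hypothetical, just to illustrate the shape of the file:

```
# setup.cfg (hypothetical package; metadata lives here, not in code)
[metadata]
name = example-package
summary = A short one-line description of the package
author = Example Author

[entry_points]
console_scripts =
    example-cmd = example_package.cmd:main
```

Because this is plain INI-style configuration rather than executable code, tools can read and rewrite it mechanically, which is part of what makes it safe to copy across many repositories.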
That configuration is based on the format originally proposed for distutils2. Distutils2 has since died; that project is no longer alive, Python decided it had fulfilled its duties, and they've moved on. So PBR is kind of the current incarnation of that idea going forward, at least within OpenStack. The execution of PBR does depend on setuptools, which means we can't fix all of the problems we have with Python packaging; in particular, setup_requires, which installs the dependencies you need to build your Python package, is still a problem. PBR also standardizes quite a few common setup.py actions, so at least within OpenStack things are consistent for common actions, which is useful.

And this is the entirety of a PBR-based setup.py file. If you've ever seen other setup.py files, they're quite large; this is a very short and simple file that's easy to copy around, which is useful when you have, I don't know, hundreds of packages that we need to keep working. The bulk of the actual package configuration is in a setup.cfg file, which is not code; it's an INI file. Here you describe your package. This one is from the Zuul project: the name, the classifiers, some entry points, and some extra packages we want to install. It's very config-based, which is nice, because now we're no longer running a bunch of arbitrary code to make these things happen.

Probably one of the biggest things we've done to make source installations reliable has been pip constraints, a feature that Robert Collins added to pip a couple of years ago now, I think. What it does is bypass dependency resolution: you pass in a file with a list of packages and the versions you want them installed at, and regardless of what any other package wants in its dependency list, pip will use the version in the constraints file instead.
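To make that override behavior concrete, here is a toy model of it in Python. This is not pip's actual code, just a sketch of the rule: a constraints file maps package names to pinned versions, and whenever a pin exists it wins over whatever version a dependency asked for.

```python
# Toy illustration of pip constraints; NOT pip's real implementation.

def parse_constraints(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins

def resolve(requested, constraints):
    """Return the version to install: the constraint pin if one exists,
    otherwise whatever version was requested."""
    name, wanted = requested
    return constraints.get(name.lower(), wanted)

constraints = parse_constraints("""
# hypothetical constraints file
foo==1.0
docutils==0.14
""")

# Some other package asks for foo 0.9; the constraint pin wins.
print(resolve(("foo", "0.9"), constraints))       # 1.0
# Packages without a pin install the requested version.
print(resolve(("unpinned", "2.3"), constraints))  # 2.3
```

The real invocation is simply `pip install -c constraints.txt <package>`; the sketch above only models the version-selection rule.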
So if you say install package foo at version 1, and some other package wants version 0.9, pip will install version 1 regardless. This means you can do reproducible installations by collecting that information and passing it along to other installations. Unfortunately, this doesn't address the setup_requires issue, which happens before pip's constraints handling actually runs. The format is quite simple. This is the constraints file for the presentation I'm giving, which runs as a Python program: it's just package names mapped to versions. You then pass that file to pip when you install. So here I'm saying install docutils, and pip will pick the version in the constraints file, regardless of any other version that wants to be used.

We've also made a tool called bindep, which lets you specify system dependencies and then generates a list of the missing ones, which you can feed directly into your system package manager. This is useful for describing the system-level state you need before you can actually install your Python packages. It's fairly simple: you list out the packages you need, you can separate them by package manager, platform, or distro, and you can add profiles like test or production, whatever you want.

A common thing that comes up in this space is containers and virtualenvs. A lot of people think containers and virtualenvs fix this specific set of problems. They do make certain things better; containers in particular can help describe what system packages you need through your image builds. But really what they do is make the deployment of Python package installations easy by capturing a specific state and then copying it around. They don't solve the problem of building that initial state that you can then deploy everywhere.
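To illustrate the bindep format described a moment ago, a hypothetical bindep.txt might look like this (the package names are examples, not from the talk; one package per line, with bracketed selectors for platform and profile):

```
# bindep.txt (hypothetical contents)
# Debian/Ubuntu package name
libffi-dev [platform:dpkg]
# Fedora/RHEL package name
libffi-devel [platform:rpm]
gcc
# only needed when the "test" profile is requested
python3-dev [platform:dpkg test]
```

Running bindep against a file like this prints the packages that are missing on the current system, in a form you can pipe into apt-get, yum, or whatever package manager the platform uses.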
It's very easy to still have a broken install inside a container or virtualenv if you don't have constraints and bindep and similar tools to actually describe what you need.

So those are the things we've done downstream in OpenStack that have made our lives better. Upstream, Python and the PyPA, the Python Packaging Authority, are doing other things to make some of this work better. In particular, PEP 518 and the pyproject.toml file will let you describe the dependencies you need just to run a package's setup.py. And then there's the Pipfile project, which takes the idea of constraints and makes it richer and more robust: you can specify that this package comes from, say, a GitHub repo, this other package comes from PyPI, this one comes from your internal company repo, each at its own version. It then produces a lock file, which is kind of the new equivalent of a constraints file, and that file can be passed around to do reproducible installs. But all of this is still in progress: Pipfile is experimental, and pyproject.toml is just starting to make its way into the Python Packaging Authority's tools.

So I think we have a minute or two for quick questions, if anyone has any. No? That's all I've got then. Thank you for your time.