Thank you very much for the introduction; as the gracious man said, I am Justin Mayer. I won't be taking questions after the talk, but I would love to connect with people and talk with you about some of these topics, so by all means, please come up to me afterward. I'd be really excited to talk to you. I'll also post the slides on my site after the talk. You can see some links to Twitter and Mastodon, for the new kids on Mastodon, and I'll post the links to the slides there when I'm done. I'm originally from Los Angeles, California. Last year I moved to a small village in the Italian Alps. I work on software related to privacy and security, and I also write about it at JustinMayer.com. In my spare time, I maintain a few open source projects, including Pelican, which is a static site generator. Today I'm excited to talk to you about the Zen of Python dependency management, with a little sprinkle of package release automation tacked on. So why is dependency management important? It's important because we share code with one another. We incorporate open source software that other people wrote into our applications and libraries in order to speed up development time and to improve the quality of the things we ship. It doesn't make sense for most of us to write our own crypto libraries, so we go out and find someone who knows tons more about that stuff than we do, and we grab the bits and pieces that are useful and integrate them. The average package on PyPI relies on, someone once determined, two or three dependencies, which is not a big deal in and of itself, but that can have cascading effects, because each of those may depend on two or three other packages. As you take this to its logical conclusion, you can see how you end up with a rather significant dependency tree.
This is also important because of the notion of reproducible builds. Reproducible builds matter because, say you're a company with new employees, or an open source project with new users: you want them to be able to bootstrap your project as easily as possible, and they can't really do that if they're getting dependency conflicts. So this is a way of, among other things, making it easy for them to get your software up and running. You also have other environments. It's not just development; there's testing, staging, production. You want to make sure that the thing you're working on in one of those environments is going to behave as similarly as possible in the other environments. Reproducible builds are important for that reason too. Now, a related topic is packaging, and packaging is also important because I want to use code that someone else wrote, for the reasons I just described, and I want that to be as easy as possible; packaging is what facilitates that. At the same time, I also want to share code with other people so that they might benefit from the stuff I wrote, and I want to make it as easy as possible for them to use it. Consequently, there are a lot of talks about packaging, and there have been a number of them at this conference alone, which I think is really interesting, and they all overlap in interesting ways. So the packaging ecosystem in 2018 looks a little bit like this. I say 2018 because we'll talk about some of the newer tools. There are some tools over there on the left, and some files over there on the right, and this is kind of what things look like today. Anyone who knows anything about these tools and files can tell that over the years, packaging has morphed and accumulated new appendages, and as it has moved along, it has shed other things.
At some point, it starts to look like this conglomeration of bits accreted over time, and I think there's a tendency to look at that in a somewhat negative way. The cool thing is that some people are saying: no, actually, maybe we can look at this a little differently and celebrate all of these little bits. Does everyone know what a platypus is? I don't know the German or French words for platypus, so if you don't know, ask someone who does. A platypus is a creature that has a duck bill, lays eggs, and is a mammal; it's a really strange animal. Someone came up with this idea, and I wish I could give credit, but I can't recall where I saw it: we're going to celebrate the strange and continuing evolution of Python's packaging systems. I think the platypus is a great metaphor for Python packaging. Some of the points they made about the metaphor: it's a bit odd to start with, but then you realize it's the result of evolution in very unique circumstances. It's actually quite cute and friendly most of the time, and it can incapacitate a human with its venom. Just like packaging. It's really difficult to fully grasp the finer points of how these pieces fit together or how they have evolved to their present state, so I'm not going to try; I'm instead going to focus on the tooling that exists today and some of its practical applications. For some of the finer points, Dustin Ingram, who works on PyPI, gave a great talk last year on PyPI, packaging, and its history. Hynek has given great talks; he gave one yesterday on how to maintain a project when it's not your job. Mark Smith gave a great talk on packaging in general and how to get something onto PyPI. I want to talk a little bit about some of the new stuff. So: there's PEP 517 and PEP 518. PEP 518 defines a new configuration file called pyproject.toml.
This file has the potential, eventually, to replace setup.py, requirements files, setup.cfg, MANIFEST.in, and probably other configuration files I'm leaving out. The PEP is long, like most PEPs, but it actually specifies very little: a filename, the file format (which is TOML), a build-system table, and a tool table. It's a little bit like the Wild West right now, where every build system is allowed to put stuff in this file and can do the build however it wants; there's no real standard in terms of how they do that. Some folks feel the side effect of this could be a kind of vendor lock-in: you use one particular tool, and because there is no standard, you're kind of locked into it and it will be tough to migrate. I suppose that depends on how hard it is to convert how one tool defines its dependencies, say, in this file, and then to migrate that. We'll find out; it's still early stages. So the build system can be defined in this file. There's one way you would define it in a setuptools context; for Poetry, you define it differently, and that basically just tells the system that we're using Poetry to manage and build this project. You can add further configuration at the tool-namespace level, which is how Poetry keeps track of your dependencies. And you can only use a name in the tool namespace if you are the owner of that package name on PyPI. So that's the pyproject.toml file, which will be relevant when we talk about some of these newer dependency management tools. When I talk about them, I'm going to mention the dates of their most recent releases, because that relates back to the first slide: why are packaging and dependency management important? To distribute and share software.
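The slides themselves aren't reproduced in this transcript, but the two build-system declarations might look roughly like this. The project name and dependencies below are made-up examples, and the Poetry backend path reflects the Poetry releases of that era:

```toml
# With setuptools as the build backend, the build-system table
# might look like:
#
#   [build-system]
#   requires = ["setuptools>=40.8.0", "wheel"]
#   build-backend = "setuptools.build_meta"
#
# With Poetry, the same file would instead declare:

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

# Tool-specific settings live under the "tool" namespace, which is
# how Poetry keeps track of dependencies (values here are invented):
[tool.poetry]
name = "my-project"
version = "0.1.0"
description = "An example project"
authors = ["A Maintainer <maintainer@example.com>"]

[tool.poetry.dependencies]
python = "^3.6"
requests = "^2.21"
```

Note that only the `[build-system]` table is standardized by PEP 518; everything under `[tool.poetry]` is defined by Poetry alone, which is exactly the lock-in concern mentioned above.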
Software that's sitting in a version control system and not inside a shipped release is software that, for average users, may as well not exist. So to me, steady releases are an indicator of project health, and thus important. First, I'm going to talk about pip-tools. pip introduced requirements files, which allow you to pin your dependencies. You can include hashes of those dependencies, so you know that the thing you put in your requirements file, when it installs, will actually be that thing and not something else. This improved reproducible builds and security. The problem is that pinned requirements get outdated and need updating from time to time, and that's a bit of a hassle. pip-tools has some ways to make that easier. The pip-compile command compiles a requirements.txt file from your dependencies, which can be specified either in setup.py or in a requirements.in file. Then you can use pip-sync to take the compiled requirements file and update your virtual environment with the dependencies you've declared there. That makes sure your different environments, wherever they are, are fully up to date and reflect the requirements you've specified. So pip-tools is a very focused tool that does something very discrete. In contrast, Pipenv does a lot of things. It manages virtual environments so you don't have to. It audits packages for security vulnerabilities. It does dependency resolution, to handle cases where one package depends on a different version of something than another package needs. Like pip-tools, it keeps your dependencies updated and your virtual environment current. It does this in the context of a Pipfile instead of a requirements file or the newer pyproject.toml file. When I last used it, and I'm going to express opinions fairly freely here, it was relatively slow. I don't know if the dependency resolution has improved since then, but it was a bit slow.
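To make the pip-tools workflow described above concrete, the compile/sync cycle might look roughly like this (the package names and flags here are illustrative of typical usage, not taken from the talk):

```shell
# requirements.in holds your top-level, unpinned dependencies, e.g.:
#   requests
#   flask

# Compile a fully pinned requirements.txt, with hashes for integrity:
pip-compile --generate-hashes requirements.in

# Later, refresh the pins to the newest allowed versions:
pip-compile --upgrade requirements.in

# Make the active virtual environment exactly match requirements.txt,
# installing what's missing and removing what shouldn't be there:
pip-sync requirements.txt
```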
Lots of software is opinionated, but I feel like Pipenv is quite opinionated even on the spectrum of opinionated software. It replaces the requirements.txt file, but not setup.py or setup.cfg or MANIFEST.in or any of the other parts of the setuptools ecosystem, so you still have to manage all of that stuff. You still have to put your high-level dependencies in setup.py and your pinned dependencies in your Pipfile. In terms of how the project is managed, it vendors a lot of packages: it uses its own patched versions of pip, of pip-tools, and maybe other things. It's not really my style to wholesale sweep a bunch of software into my repo, but I'm sure they're doing it because they have so many different things they're trying to accomplish and it's the easiest way for them to manage it, so I can understand the benefits there. As for its virtual environment management: if you just want some other tool to manage your virtual environments, and you don't want to know where they are or what they're named, you just want something else to handle it, it's great for that. I prefer to manage them myself, so I felt like I was struggling with it. One of the things it does, I believe, is hash the path to your project and append that hash to your virtual environment's name, and when you depend on a predictable virtual environment name for other tooling, this can be problematic. A silly example: I use the fish shell, and I have tooling that shows me the virtual environment that's activated, and I don't want to see a big ugly hash in my prompt line every time. I had a really hard time getting around that. That's obviously minor, but we all have opinions.
Regarding this notion of unpredictable virtual environment names: apparently I wasn't the only one who had a problem with it. There are lots of open issues about it; the maintainers were kind of like, eh, we're not going to address this any time soon, and they just keep closing them. Another thing to know is that by default, when you run Pipenv's install command to add a new package, it updates your locked packages. You can disable that, but I found it to be a strange default. I just want to add this one package to my project, and all of a sudden it starts updating all of my locked packages. For me, that violates the principle of least surprise, but again, you can disable it, and I'm sure they had good reasons for making it the default. The uninstall command will remove a package, but not its dependencies, so you have to use pipenv clean to remove those. That seems to me like an extra step; maybe there are good reasons to separate those steps, but it just wasn't what I expected. It can also manage .env files, if you use environment variables and want them loaded into your environment. That's cool if you need it; I manage that at the shell level, so for me it was an extra feature I didn't need. The initial pace of Pipenv releases was really intense, like everything was shifting out from under your feet, and then it slowed down, as maturing projects tend to do. But the last release was about eight months ago, and at some point you start to wonder. They probably have good reasons; it's an important project for a lot of people, and they don't want to break things, but that's a long time to go without improvements. It's opinionated, which is fine, but it's opinionated in ways that don't fit my mental model or workflow. I feel like they took on a lot; they kind of overpromised.
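The Pipenv behaviors described above, expressed as concrete commands from the Pipenv releases of that period (package name illustrative):

```shell
# Adding a package also re-locks (and may upgrade) everything else,
# unless you opt out:
pipenv install requests                  # updates Pipfile.lock as a side effect
pipenv install --keep-outdated requests  # opt out of the wholesale re-lock

# Uninstalling removes the package itself but leaves its
# no-longer-needed dependencies behind...
pipenv uninstall requests
# ...which is why a separate step exists to sweep those out:
pipenv clean
```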
For me, they underdelivered a little bit, and it seems like they bit off a bit more than they could chew and are now under the weight of all of that. Again, that's my assessment; try it for yourself and reach your own conclusions. Poetry is a similar tool. It keeps your dependencies and virtual environments up to date, it's fast, and it had what was, for me, more reliable dependency resolution. It manages virtual environments, but only if you want it to; for me, it didn't get in the way. It uses the new pyproject.toml format, and unlike the other tools I mentioned, it does not rely on setuptools. Also unlike the other tools, it can build and publish packages to PyPI, so if you're managing a project and also publishing it to PyPI, you don't need an extra tool for that. I feel like the project is managed very well. Some PRs get rejected because the author is trying to keep the core manageable, and I really respect that. One thing some folks might run into: you cannot install into a specific virtual environment. You can't say, install this into the virtualenv named foo; you can only install into an activated one, or into the default location, wherever you've configured that to be. Another thing to note is that users will need pip 19 or higher to install packages that are built without setuptools. I also noticed that Poetry generates a setup.py file, but there's not much documentation as to why. I assume it's for backwards compatibility, but I'm curious to know a little more about what that's about. Only pure Python wheels are supported, so if you're trying to build anything with C code, this is not the tool for that just yet. When you build, it increments your version string in the pyproject.toml file, but if you have version strings in other files, you'll have to manually track those down and replace them.
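As a sketch of the Poetry workflow just described, covering dependency management through publishing (the dependency named here is just an example):

```shell
# Start a new project (or add a pyproject.toml to an existing one):
poetry init

# Add a dependency: this resolves it, pins it in poetry.lock,
# and installs it into the managed virtual environment:
poetry add requests

# Build sdist and wheel, then publish to PyPI, with no separate
# setuptools or twine tooling required:
poetry build
poetry publish
```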
There are only main and development dependency groups, so you can't say, I want a specific dependency section for testing, or for releasing. There are some really good features marked for the 1.0 release, including a plugin system, which would help give a home to the rejected pull requests while still keeping the core as lean as possible. I feel like that's a sensible way of managing things. There's per-project configuration coming, which could be used, for example, to specify a virtual environment on a per-project basis, giving you a bit more control over where your virtual environment lives. Incrementing version strings is also on the roadmap, as a nice helper when building things. So in summary, I feel like Poetry is managed well, it's updated regularly, and it fits my mental model and workflow. Next is a newer project with a sort of interesting choice of name: DepHell. Dependency hell, whee! It's like a celebration of the very thing it's fighting. They're trying to support a lot of things: setup.py, requirements.txt, Pipfile, Poetry. They really seem to be saying, we're going to support every piece of this ecosystem. It audits packages for security vulnerabilities. It has a pipx-like ability to run command-line tools in isolated virtual environments. It's trying to do a lot. It claims to be, quote, better than all other tools, end quote, which always gives me pause, but it's a new project and I haven't given it a full run-through, so it's something to have on your radar, perhaps. At this point, I want to talk about a related topic: release management. Once you've managed your dependencies, there is another step toward zen-like enlightenment, because getting your project onto PyPI involves simply too many steps; none of them are fun, and all of them are tedious. You generally go through your git commit history and start putting bullet points into your changelog.
You start manually updating version strings, sometimes in more than one place. You then commit and push those changes, then test, build your package, and publish it. There are just a lot of steps. Another thing people run into is that usually only a subset of the folks with commit bits have the ability to publish new packages to PyPI; sometimes it's only one or two people. Here is part of an open source project's README that documents how maintainers can release new versions of that particular project. The screenshot shows five steps; there are 14 in total. The funny part is that this project's purpose is to automate GitHub releases, so I thought that was really interesting irony. But this is totally normal; the list for some of the projects I maintain is even longer. At some point, well, did anyone see the Chernobyl TV series? This is how it feels: you have your clipboard and your list of steps, and you're just hoping that you don't miss a step, you don't push the wrong button, you don't end up breaking the package for potentially a lot of users. It's a little bit stressful, and a side effect of that, whether conscious or not, is that we don't do it as often as maybe we want to or should. As a personal anecdote, I once noticed that a year and a half had gone by since the last time I had published a release to PyPI, and I was horrified to realize how much time had passed, because at that point, I feel like I'm failing at my job as a maintainer. It's a volunteer, unpaid job, but it's still something where I feel responsibility to people, and I don't want to let them down. So it's not good for maintainers, because it's stressful, and it's also not good for users, because you have this slow release cadence, and bug fixes and new features are sitting in the master branch where hardly anyone benefits from them, because they aren't in a shipped release yet.
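The manual process described above, written out as a composite sketch (version numbers and file names are invented for illustration; real projects vary):

```shell
# A typical manual release checklist, roughly:
git log --oneline v1.4.0..HEAD      # mine the history for changelog bullets
$EDITOR CHANGELOG.rst               # write them up by hand
$EDITOR setup.py docs/conf.py       # bump the version string(s) by hand
git commit -am "Release 1.5.0"
git tag v1.5.0
git push && git push --tags
python setup.py sdist bdist_wheel   # build the package
twine upload dist/*                 # publish (requires PyPI credentials)
```

Every one of these steps is an opportunity to make the kind of mistake described above, which is exactly the motivation for automating them.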
The PyPI account owner is on vacation, and some critical bug fix gets merged by another maintainer? Well, it doesn't matter: you can't get it into a shipped release, so you now have this critical bug that people are running into. There are bespoke, custom ways of automating this. You can take continuous integration one step further: after it runs your tests, it can figure out how to publish the package, so that releasing is not such a manual and error-prone process. I want to explore one way that might work: auto-publishing releases upon a PR merge. In this scheme, the pull request has to include a release file with two bits inside. One is the release type: major, minor, or patch. The other is a changelog entry, a description of the changes in that pull request. The maintainer looks at the pull request and says: okay, tests are included, docs are included, code looks good, the release file is there, and merges it. At that point, the continuous integration system can look for the release file, grab the major/minor/patch designation, and increment the version. It can then take the description, prepend it to the changelog, run the equivalent of git add, git commit, git tag, and git push, and publish the release to PyPI. There are some real benefits to doing this. With almost no human input, every code contribution results in a new release in a matter of minutes. Every feature and bug fix gets its own release, without anyone having to remember to package and publish a new version. If a bug is found, it's now much easier to trace it to a specific release version. And of course, if you had this system in place, you could also still issue releases manually at any point. But my favorite part about this notion is that all contributors get to issue their own releases.
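The version-bumping core of that pipeline is simple enough to sketch in Python. This is my own illustration, not code from any particular tool; the release file format and function names here are hypothetical:

```python
import re

def parse_release_file(text):
    """Parse a RELEASE file: the first line declares the release type
    ('Release type: major|minor|patch'); the rest is the changelog entry."""
    lines = text.strip().splitlines()
    match = re.match(r"[Rr]elease type:\s*(major|minor|patch)", lines[0])
    if match is None:
        raise ValueError("first line must declare the release type")
    changelog_entry = "\n".join(lines[1:]).strip()
    return match.group(1), changelog_entry

def bump_version(version, release_type):
    """Increment a 'major.minor.patch' version string per the release type."""
    major, minor, patch = (int(part) for part in version.split("."))
    if release_type == "major":
        return f"{major + 1}.0.0"
    if release_type == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"

# A pull request's RELEASE file might contain:
release_type, entry = parse_release_file(
    "Release type: minor\n\nAdd support for configurable output paths."
)
print(bump_version("1.4.2", release_type))  # → 1.5.0
```

After computing the new version, the CI job would rewrite the version string, prepend the entry to the changelog, and run the git and publish steps described above.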
What better way is there to welcome new contributors than to reward them with a dedicated release composed entirely of their work? I'm not saying it's right for all projects; for some, it may not be a good fit. If you maintain a library that critical network infrastructure or services depend on, maybe this isn't a good fit for you, or maybe you can figure out ways of making it work. Some maintainers may think: well, I don't want to see the release-history clutter where every tiny little fix results in this long list of releases. And it's true: even something as minor as a typo fix gets its own release in this model. But I would encourage people who have that reaction to really think about it. Would that be so bad? Is that a serious problem? Is a tidier history really worth sacrificing all the other benefits? Around the time I was trying to solve this conundrum, I came across an article describing this type of solution from Hypothesis, a property-based testing library; they did a really nice write-up of how they arrived at it and how they solved it. Sometime afterward, I noticed that Patrick Arminio, who has a Python GraphQL library called Strawberry, was looking to do the same thing. He asked his friend Marco if he could figure out a way of adding that same capability for Strawberry, and so he did; he connected it up with CircleCI, and I really liked the simple, elegant approach that he took. Rather than taking those bits, customizing them, and copying and pasting them across the multiple repositories that I manage, I thought it would be great if I could just use one tool and pip-install it into these different projects; that way, other maintainers could benefit from it as well. So I called Marco and I said, hey, I'd like to generalize this a bit. Is this something you want to work on?
He said, sure, if I have time, I would be totally up for that. So I took his code, added some more, put it into its own GitHub and PyPI package, and just pushed it last night, as a matter of fact. To see how it could potentially work, here is a bit of configuration for CircleCI. It's obviously not the whole thing; this is just the deploy step. I know the type is on a black background and super small, so you may not be able to see it in the back. Essentially it runs through some of the normal steps you would take in CI, like modifying permissions and installing the packages you need. But it then uses this new tool to do the things you would normally have to write your own scripts for: checking for the release file, preparing the code for the new release, creating the commit, getting the commit into GitHub, getting the GitHub release created. It's still at a very, very early stage. Feel free to check it out; all of the good bits in it are Marco's, and all of the terrible broken bits are mine. There's lots of room for improvement, and I would be very interested in any input or contributions to make it more flexible. It has very limited utility in terms of scope at the moment: it's meant for people using CircleCI and for people using Poetry. That could be, at least somewhat easily, broadened to do a bit more. I'm interested to see what people can do with some of these new improvements to the overall dependency management and packaging ecosystem, because the overall goal is to make it easier to use helpful software that other people have written, to share the stuff that we make, and to do that as frequently as possible. With that goal in mind, I hope you found this overview of dependency and release management enlightening. If you have any questions about this at all, or just want to chat, please come up and say hello. I would love to talk to you.
Thanks very much for coming.