All right, thanks everyone. My name is Justin Mayer, as I was just introduced. I am originally from Los Angeles, California, but for the last few years I've spent the nicer months in the Italian Alps, where I have been working on a project called Fortressa, which I think of as the app store for open source. Essentially, the gist of it is that I got tired of a lot of proprietary SaaS products that didn't have great data portability, among other issues, and I see all these great open source projects and decided that I wanted to use more of them. So I figured out a way to automate the installation of those, and I'm now making it available to other people. So if you see a great open source project, chances are good that you now have a way to just open a web panel, tap on something, and have it installed for you. I'm really excited about it. If you wanna see how it works, please reach out; I'd be happy to demonstrate it for you.

Aside from working on Fortressa in my spare time, I maintain some open source projects such as Pelican, VirtualFish, and probably too many others. The only way that I can do that and stay sane is to automate some of the steps involved in that maintenance, which is why I'm here to talk to you today about Python packaging automation.

Before we dig into that, raise your hand if you've ever thought about publishing a Python package but haven't done it. Okay, yeah, there's a few. How many people here have published a package to PyPI? All right, a few more. How many of you sometimes feel that you should publish more frequently than you currently do? Yeah, okay, pretty much everyone. This, by the way, if you're not familiar, is the Python packaging platypus, a great little mascot that was introduced a few years ago. One or two more questions: how many folks use poetry to publish your packages? Okay, and how many of you use something other than poetry? All right, so more than poetry; that's good to know.
So one question that I would ask is: why package something? We package our code because we want to share it with one another. We incorporate open source libraries into our applications and other software in order to speed up our development time and improve the quality of our software. The average package in the Python Package Index, or PyPI, relies on two or three other packages, and each of those can in turn rely on two or three more, so it can have a cascading effect.

The problem with releases is that there are too many steps, and none of them are fun; all of them are tedious. You have to update the changelog. You might have version strings in more than one place that you need to increment. You then have to commit and push these changes, test, build, publish. Usually only a subset of people with commit bits have the ability to publish releases to PyPI, and sometimes that's one or two people, sometimes it's just a single person, and that can be a really big roadblock to getting releases published.

This is part of an open source project's README that documents how maintainers can release new versions of that particular project. This screenshot shows five steps; there are 14 in total, and this is totally normal. In the past, when I would prepare and publish releases for Pelican, my list was longer than 14 steps. The funny part about this particular project is that its purpose is to automate GitHub releases. So even people who are trying to automate things for other people don't always have the ability to do it for themselves. So you have this checklist of steps. You hope that you get it right. You hope that you don't mess it up. This can be kind of stressful, and so it's easy to defer this chore; it's easy to put it off. And when you put it off and you don't do it, there's this nagging sense of guilt: I really need to publish a release.
There are all of these new features that have been added, and there was that bug fix, and someone overhauled our documentation. I feel like I'm letting people down. So it's not good for maintainers for those reasons, and it's also not good for users, because you end up with a slow release cadence. Bug fixes and new features are sitting there in the main branch, but no one's really benefiting from them because they're not in a shipped release. If the maintainer or the person with the ability to publish releases is on vacation, it can mean that critical bug fixes and security fixes are not released very quickly. There are tools that exist to automate this, but I've never really found a great one for me.

So I think the solution is to take the concept of continuous integration a step further. We have CI, or continuous integration, that runs tests when people push code. So why not add a couple of extra steps and have a continuous release process that's fully automated? How would that work? Pull requests in this model can, and ideally should, include a release file that describes the changes inside. This file has two pieces of information. One is how big the change is: should we increment the major, minor, or patch version, depending on how big the change is perceived to be? The second is a description of the change, something that's going to be used for the changelog. When a maintainer reviews this, they see: okay, tests have passed, linters have passed, the code itself meets the project standards; it's time to merge it. The PR gets merged, and then this part of the continuous integration uses the release file to increment the version number based on whether it was major, minor, or patch, and the information about the change gets added to the changelog.
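The increment-and-changelog step just described can be sketched in a few lines. In AutoPub's case the release file is conventionally named RELEASE.md, with the bump type on the first line and the changelog entry below; what follows is an illustrative sketch, not AutoPub's actual implementation:

```python
# Illustrative sketch of the release-file mechanism described above;
# this is not AutoPub's actual code.
import re

def parse_release_file(text: str) -> tuple[str, str]:
    """Return (release_type, changelog_entry) from a release file."""
    match = re.match(r"[Rr]elease type:\s*(major|minor|patch)\s*\n", text)
    if match is None:
        raise ValueError("First line must be 'Release type: major|minor|patch'")
    return match.group(1), text[match.end():].strip()

def bump_version(version: str, release_type: str) -> str:
    """Increment a MAJOR.MINOR.PATCH version string per SemVer."""
    major, minor, patch = (int(part) for part in version.split("."))
    if release_type == "major":
        return f"{major + 1}.0.0"
    if release_type == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"

release_type, entry = parse_release_file(
    "Release type: minor\n\nAdd support for JSON output.\n"
)
print(bump_version("4.2.7", release_type))  # → 4.3.0
```

In practice AutoPub delegates the actual bump to poetry and edits pyproject.toml and the changelog on disk; this just shows the shape of the logic.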
This gets committed, tagged, built, published, all automatically and without any manual interaction. So with almost zero human input, every code contribution results in a new release in a matter of minutes. Every new feature, every bug fix can have its own release, without everyone having to remember to package and publish a new version. And this way, if a bug is found, it's now much easier to trace it to a specific release version, because in theory you have a new release for every bug fix.

And you don't have to use this automated system every time. You can also release new versions manually if you want, and that can be something as simple as creating a new text file, writing one line at the top saying major, minor, or patch, and a second line saying what the changes are. You can have bullets if there were a few changes. Commit, push, that's it. Your release is done. You don't have to do any of those aforementioned tedious steps. You can edit release files in a PR that someone else has contributed. You can say, okay, actually this is really more of a minor change, not a patch change, so I'm gonna change that part. You can tweak the description. So you have control over this; it's not something you're entirely leaving up to your contributors. But I think for me the best part about this is that all contributors get to issue their own releases. What better way to welcome new contributors than to reward them with a dedicated release composed entirely of their own work? There's something about that that I just think is really interesting and cool, and part of the spirit of what open source is supposed to be about. And I'm not saying that this concept is necessarily right for every single project. For some it may not be a good fit.
If you write a library that's a critical piece of internet infrastructure, it might make you feel squirrely, make you feel uncomfortable, to know that every time a PR containing a release file is merged, a new release gets published. Sometimes it might be easy to push the merge button and not realize that there's a release file within. So depending on your comfort level, you either need to take a little bit of extra care, or decide that you're going to manage this a little more manually and perhaps compose and push the release file yourself, while still getting all of the other benefits of that automation.

And some people might say, well, this is gonna result in a lot of entries in the release history: we're getting a new version for every little bug fix, every little change. And that's true. If you take it to its logical extreme, you could get releases for typos, when someone fixes a typo. Not that you have to do that, but you could. And I would say, even if you did that, would that be so bad? Is that really a big problem? But again, that's something you have control over.

So with all of these benefits in mind, I decided that I wanted to fix this for myself. As I said, I maintain Pelican, VirtualFish, and some Django plugins, and I really wanted the users of my open source projects to have the benefits that I just described. So with some help from Marco Acierno and Patrick Arminio, who work on Strawberry, which is a great GraphQL library for Python, I first added support for CircleCI, and then Travis. And right around that time is when GitHub Actions became generally available, and I added support for that as well. That is primarily how I use it now. The configuration is pretty easy. I use poetry for all of these things, so everything you see here is going to be poetry specific.
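For reference, the configuration about to be shown lives in a [tool.autopub] table in pyproject.toml. The key names in this sketch are assumed from AutoPub's README; check the current documentation for the exact spellings:

```toml
# Minimal AutoPub configuration (key names assumed; verify against
# AutoPub's current documentation).
[tool.autopub]
project-name = "MyProject"
git-username = "botpub"
git-email = "botpub@example.com"
```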
This is a pyproject.toml file, and we're just defining the minimum configuration required in this particular example: the name of the project, the name of the Git user that's going to be making the automated commits, and the email address that should be used for that user.

This is a slightly more involved example, where you can see some other configuration options. We are saying, okay, the changelog file is not the default; it's not CHANGELOG.md in the root of the project but in the docs folder, and it uses reStructuredText. The changelog header is not the expected string of equal signs; we're using a different header. So you can see some of the things that we can configure here, including extra version strings. This example is actually for Pelican, which is a legacy project in the sense that it was not published using poetry from the beginning, so there are a few things I do to support the way it was originally set up. One of the other nice features is that you can append GitHub contributor information to the pull request, which we'll see later, and that is enabled in this particular example.

In terms of the configuration on the CI system side, for GitHub Actions I've defined an environment called deployment, and in that deployment environment I've defined two environment secrets: one is called GH_TOKEN and the other is PYPI_PASSWORD. These are used to automate the GitHub release and the publication to PyPI, respectively.

The last and arguably most important part of the process is the actual configuration in your CI system. This is probably not very legible at this type size, my apologies for that. But the gist of it is that in the GitHub Actions workflow we've got a job called deploy, and we are saying that if test and lint pass, then this should run; otherwise, don't run it.
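That deploy job looks roughly like the following sketch. The step ordering, autopub subcommands, and secret names here are assumptions based on the description in this talk, not AutoPub's canonical workflow:

```yaml
# Sketch of a GitHub Actions deploy job along the lines described;
# subcommand and secret names are assumptions, not AutoPub's canonical example.
deploy:
  needs: [test, lint]
  if: github.event_name == 'pull_request' || github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  environment: deployment
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
    - name: Install dependencies
      run: pip install --upgrade pip autopub httpx githubrelease
    - name: Check for a release file
      run: autopub check
    - name: Publish the release
      run: autopub deploy
      env:
        GH_TOKEN: ${{ secrets.GH_TOKEN }}
        PYPI_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
```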
We are also requiring that this runs either on the main branch or in a pull request, not for pushes to other branches. Then we start installing some basic dependencies: we're upgrading pip, we're installing AutoPub itself, we're using HTTPX to make some network requests, and we're using the githubrelease Python package to automate the creation of GitHub releases. We're checking to see whether there's a release file, using the autopub check command, and if it sees a release file, then it knows it's game on and goes through the rest of the steps. To summarize, those are: check for the presence of the release file; increment the version in pyproject.toml and, optionally, in other files; prepend the new release information to the changelog, optionally including credit to the contributor; build the project, the Python wheel and tarball; commit and push the version-incremented files; create the release on GitHub; and publish the package to PyPI.

The release file, as I said, is very simple: two pieces of information. You have the release type, either patch, minor, or major, and then below that a description of what is in this change. That could be a bulleted list or a single line, whatever you decide you want to be in the changelog.

As this executes in GitHub Actions, you can see here that the tests have passed, the linters have passed, the docs have been built, and only when all of those things complete successfully do we see the deploy job run, which in this case also runs successfully. And when it does run, we can see the output on the GitHub side, where our Git user has used that authentication token to add everything that we want to this release: the assets, the name of the project, the new version number, and a description of what is in this particular release. And each contributor, as I said, gets their own release.
This one is Lukas. This is a description of a change that he made, and it also says that this was done in PR number one. All of this information was added there automatically by AutoPub. And I can't really explain in words how satisfying it is to do this compared to the way I used to do it before. There is a certain magical feeling: it should be possible to just merge a pull request, or just push a single commit, and have all of these things done. It's like a Rube Goldberg machine of dominoes falling and hitting a ball that rolls on to trigger the next step, and when it all works, it's incredibly fulfilling.

As I said, this is currently designed to work with poetry, but there are plans to support other workflows. So if you use Hatch or PDM or some other kind of workflow to publish your Python packages, please reach out and let me know, because it's with that input that we can improve AutoPub and add support for other use cases. One of the other things that we would like to do is to list reviewers and committers as well as the PR authors, because we often get pull requests that have commits from multiple people. Another thing that would be really great is to send some type of announcement on Twitter after a new version has been released, as a way of recognizing and thanking the contributor for the time and effort that they donated to the project. And I'd love to take that even a step further and do the same for, say, a release announcement blog post, and have that fully automated and published without having to lift a finger.

So as I said, this is incredibly satisfying for me, and it could be for you too. If you want to be able to merge a PR and have the way that you feel about publishing new releases completely changed, by all means please reach out and let me know. I'll announce on Twitter when I've published these slides.
So if that's something you're interested in, follow me and you'll know when they're available. If you have any ideas or suggestions, please reach out. Or if you're here at the conference in person, just come and chat with me. Thanks very much for coming. Thanks very much.

Any questions in the room? Yeah, please go ahead.

Hi. I would like to ask: how do you find the version strings to increment? Because I noticed you don't only support, you know, the setup.py file that you had there; you had other options. So how does that work, and are there any issues with that?

I'm sorry, I think I missed the first part. How do I find and increment the version strings? Okay. So if it's a poetry project, for example, that's handled automatically by AutoPub by running poetry; I think poetry version is the command, I forget the exact subcommand, but you run that and it will handle, based on patch, minor, or major, which version number to increment. And again, that follows a SemVer philosophy; for people who are more into CalVer, calendar versioning, that's something I think we'd like to support as well. And then the configuration in the pyproject file also allows you to specify other files. So we have an example where there's an __init__.py file that has a version number in it, and we just specify the path to that file; it then goes and looks for a key that contains the word version and does the same incrementing business that it did with the pyproject file.

Hi, I just wanted to say that this really resonated with me; my company actually figured it out, as you said, as a Rube Goldberg machine with Jenkins, Twine, and bumpversion. We probably wish we'd heard about your project before, but we still have a problem that probably also resonates with you.
So we strongly believe in SemVer, and we do these releases very often, and we treat every patch version as changing nothing important: you can always update to it. So we pin versions just to minors. But then, say you have libraries at your company: you maybe don't want to restrict the versions too much in the libraries, because you actually want to do the restricting in the higher-level repos; but then it can get messy with the libraries that you depend on, which may not be so strict with SemVer. So what's your gut feeling, what's your advice? How strong should the pinning be at the library level, not at the final project level? What do you think about that?

Sure, and that's an excellent question. I think everyone's gonna have a different viewpoint and strategy as it relates to how to treat dependency version specifiers. I do not work on projects that are critical pieces of internet infrastructure, so I take the YOLO approach, which is, to use a poetry term, I don't use the caret specifier that limits you to a certain range. I just say greater-than-or-equal-to, and if something breaks, it breaks and I will go and fix it. I have that luxury; a lot of people don't. Hynek, who's in the room, wrote a really great article called Semantic Versioning Will Not Save You. I highly recommend reading it. It's a great article, it touches on many of the things you're asking about, and he has explained this much better than I could in the time allotted. But the other thing I will say quickly is that, particularly in the last couple of years, I've seen this plague of restrictive upper bounds on version numbers that causes dependency resolution conflicts I cannot resolve, and that ends up creating friction between different projects. Not my favorite. So I would say that if you have the luxury of avoiding upper bounds on dependencies, then by all means do it.
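To illustrate that difference in poetry terms (package names and versions here are made up for the example):

```toml
[tool.poetry.dependencies]
# Caret specifier: shorthand for >=2.28.0,<3.0.0, i.e. an upper bound
# that can cause unresolvable conflicts between projects.
requests = "^2.28.0"
# Open-ended specifier: no upper bound; if something breaks, fix it then.
httpx = ">=0.23"
```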
Other packages and other people who use your software will thank you. I'm sure that could cause other issues, but as long as you're paying attention, hopefully those issues can be worked out relatively quickly.

Thanks, Justin. Quick question: you mentioned that you generate either a minor, patch, or major version based on the scale of the change. I'm curious to hear a little bit more about that. Is it just the number of lines changed, or maybe semantic commit messages?

Yeah, that is not something that is detected by AutoPub; that is something specified by the person who includes the release file. So if you are making a pull request on a repository, you are asked to do your best to come up with your own evaluation: is this a small bug fix? Have you added some feature that maybe warrants incrementing the minor version? Is this a wildly backwards-incompatible change that warrants a major version? These are things that you're asking your contributors to determine. Now, as a maintainer, you have the ability to review that and say, okay, I think you've gone too high, or you've gone too low, and I'm going to tweak that before I merge it. But ultimately it's up to the contributor at some level, and then the final arbiter is the maintainer doing the merging, to make that determination and decide how to increment it.

Right, any more questions in the room or online?

Thanks for your talk. I have one question: what happens to the release file after it's merged?

That's a good question; I forgot to mention that part, thank you for asking. It is deleted as part of the process. Once everything is done, the release file is removed; leaving it around would cause problems, as you can imagine, and you certainly don't want to have multiple releases happening with the exact same change.
So it is simply deleted, and you can usually tell when looking at the repository: okay, there's no release file there, so everything went as expected. I can't really think of any cases where it's been left behind due to some problem; it seems to be deleted pretty consistently.

If not, then let's thank Justin for this excellent talk.