I'll dive straight in — we've only got 30 minutes anyway — but I still have to talk about something first that's more important than Python packaging. That's me! So my name is not judy2k, but this is my handle more-or-less everywhere online. Feel free to follow me on Twitter. I tweet about Python, and occasionally, angrily, about Brexit. So, yes, my real name is Mark Smith. I'm a developer advocate for Nexmo, one of the conference sponsors. I would be remiss if I didn't at least briefly mention what Nexmo does: we build communications APIs that developers use to add things like messaging and voice to their mobile and web apps. But I'll save the rest of that for another time. So now I've talked about Nexmo, let me talk about JavaScript. In March 2016, a developer removed a library called left-pad from NPM, the Node.js equivalent of PyPI. NPM is a big web service that holds packages; when you need those packages in your JavaScript application, you run npm and it downloads and installs them so that you can access them from your program. When left-pad disappeared, it broke lots of packages that depended on it — it turns out that quite a lot of packages on NPM depended on left-pad. In fact it had been downloaded 2,486,696 times in the month before it was removed from the NPM repository. It was just one function, 11 lines of code, that padded a string to a certain length by adding characters to the start of that string. And lots of people thought that this made the JavaScript community look kind of silly: why would anybody publish a package that only consisted of 11 lines of code? Why would anybody use a package that only consisted of 11 lines of code? Now, I don't think it made the JavaScript community look silly.
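To make the story concrete, here is a rough Python sketch of what left-pad did (the original was 11 lines of JavaScript; this is an illustrative port, not the actual NPM code):

```python
def left_pad(value, length, ch=" "):
    """Pad `value` on the left with `ch` until it is at least `length` chars."""
    text = str(value)
    while len(text) < length:
        text = ch + text
    return text

left_pad("7", 3, "0")   # "007"
left_pad("abc", 2)      # "abc" -- already long enough, unchanged
```

That really is the whole library: one small, obviously-correct function that thousands of packages chose to depend on rather than rewrite.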
I think it actually made the JavaScript community look pretty awesome. Now, obviously there are some problems with the fact that one person can remove a package that everybody depends on and more or less break the entire JavaScript software ecosystem, but in general — why was this package published? (If you don't agree that this made the JavaScript community look awesome, feel free to fight me on Twitter.) Left-pad was not a problem; left-pad was a solution. Because the JavaScript standard library is relatively small and doesn't contain lots of the useful functions that, say, the Python standard library does contain — like the ability to left-pad a string with some characters — a developer wrote a solution for it himself. And because NPM makes it easy to publish small items of code, he did. He made that code available to other developers so that they didn't have to reinvent the wheel. And that means people can fix bugs in one location, they can submit PRs to improve those 11 lines of code, and then ultimately everyone can, ideally, depend upon it. Now, the alternative to this is copy and paste, and I think you can agree that copy and paste is not how you should share your code. Because it's really easy to share code with NPM, people do — even with really small libraries. I don't think that's the case with Python, because people find Python packaging slightly fiddly. There are some slightly odd things, which we'll go through, about pushing your first Python package to PyPI. (I made it difficult for myself to say some of these sentences as well.) So people are a bit afraid of setup.py. There is good documentation out there, though lots of it conflicts with itself, but there is, I think, a growing movement to bring some of this stuff together and make it easier to find current best practices. Hopefully this talk will help a little bit. So what I really want everybody in this room to feel at the end of this talk is confident to publish packages to PyPI.
Relatively small packages — ideally pure-Python packages, which is what the example is going to be. So I would like you all to make a package, and what we're going to do in this talk is I am going to show you how to make one. The first assumption is that you already have some code which is general enough, or close to general enough, that you think it would be useful for other people. So let's take some useful, general-purpose code: a function that prints hello world. Now, who in this room hasn't written this program in some form? Exactly. That is totally wasted time — everybody in this room has written this program. Wouldn't it be so much better if you could pip install it and call that code from somewhere else? So that's what we're going to do. Also note there's an f-string here. F-strings are awesome; you should use them. So the idea is you've written some code that you're proud of and you would like to share it with the world. The first thing to do is extract it from your code base so that it is independent of that code base. That's your problem — I'm not going to show you how to do that. In this case we're going to extract this code into a file called helloworld.py. This is a Python module. Next we're going to put that module in a src directory, and I will explain towards the end of the talk why we did that. For the moment, just assume that Hynek is excitedly clapping for the second time today. Awkward. I'll explain why later, but for now just take this as correct practice. In the same directory as the src directory — the directory containing the src directory — we are going to create our setup.py file. You can open that up in your favourite text editor or Python IDE and enter something like this into the file. The first thing to note here is that we're importing from setuptools, not from distutils. You will still find documentation online that recommends importing from distutils. Don't do that.
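As a concrete sketch of the module being described (the exact greeting text is my assumption — the talk only says it prints hello world and takes an optional parameter), src/helloworld.py might look like this; note the f-string, which is what pins us to Python 3.6+:

```python
# src/helloworld.py -- the module we're going to package
def say_hello(name="World"):
    """Print (and return) a greeting; `name` is optional."""
    greeting = f"Hello, {name}!"
    print(greeting)
    return greeting
```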
distutils is not that powerful compared to setuptools, and pip is already distributed with setuptools — if you're installing packages with pip, you have setuptools. It's not really a third-party dependency anymore. Then we have this function call underneath: we call setup, which we've imported from the setuptools module. It's a bit weird — for now, don't think of it as a function call; think of it as configuration. Each of those parameters is essentially a line of configuration that you are giving to pip to tell it how to install your package. This is pretty much the bare minimum setup information you need to provide. We start with a name. The name is what you pip install — it's the name the package will be uploaded under on PyPI. It doesn't have to be the name of the Python module that people will import; it's a separate thing. Usually they're the same, sometimes they're different. We need to pick a version number; here I've just started with 0.0.1. A 0.0.x version number implies that the package is unstable. There is a good chance that the first few times you upload this to PyPI, there will be a minor packaging mistake. This is a good stage to start uploading packages to PyPI — while you've still got this unstable version number, you're not worried about people seeing instability that isn't actually in your code base, just in your packaging configuration. Then we have a description. This is usually a one-liner at this point. "Say hello" is not a very useful description, but we'll leave it at that. Then we have py_modules, which is a list of the actual Python code modules. We have a file called helloworld.py, and in here we're saying this is the code that we want to distribute — that's what people import, not what they pip install. Then finally — and again, this is a cargo-cult, copy-and-paste piece of setup config — we have this package_dir line, which is a map from the empty string to "src". All that is doing is telling setuptools that our code is under a src directory. Don't worry.
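Putting those parameters together, the bare-minimum setup.py described here would look roughly like this (the name and description are the talk's examples; treat the whole file as a sketch):

```python
# setup.py -- minimal configuration, importing from setuptools (not distutils)
from setuptools import setup

setup(
    name="helloworld",            # the name people will `pip install`
    version="0.0.1",              # 0.0.x signals "still unstable"
    description="Say hello",
    py_modules=["helloworld"],    # the module people will `import`
    package_dir={"": "src"},      # our code lives under the src/ directory
)
```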
Put it in your code and forget about it after that. Now we've configured a package — let's build it, so that we can potentially distribute it. We run the setup.py file we just created with the bdist_wheel command. This tells it to create a wheel file, which is the format appropriate for uploading to PyPI. It will spit out a load of output, most of which I've deleted from here, but the line that I've highlighted in bold is the one that's important. It's saying that it has just copied our helloworld.py code file into the lib directory, which means it will end up in our wheel. If that line isn't there, then essentially there will be no code — you'll have an empty wheel file and it won't work. We can now look at what's been created as part of that bdist_wheel command. Here are the directories and files. Remember, we've only actually created two files so far: helloworld.py and setup.py. Everything else here was created by setuptools. A few things here. It's created an egg-info directory in our src directory. You'll want to gitignore this — I'll show you how to do that in a moment. This is horrible; I wish it didn't do this; I'm going to ignore it from now on. Then we have a build directory. This is where setuptools moved our files to in the process of building our wheel, and you will see the helloworld.py file in there — again, validation that our code is actually going to be in our wheel. Finally, we've got the actual wheel file, which it's put in our dist directory. That is our final distribution. Now we can install it locally. This is effectively testing our packaging — it's not testing our code, but it's testing our setup.py file. Here we are going to pip install with the -e flag and then the full stop — period, dot, whatever you want to call it. This can be confusing if you haven't seen it before. Just out of interest, who hasn't seen this before? Yeah, I thought so.
This is actually an essential command if you are building Python packages — or rather, those are essential flags. Normally when you install a Python package, it's installed into your site-packages folder, inside your Python distribution: the code is copied into your Python distribution. We don't really want that while we're working on our project; we want to work with the code that's in our src directory. That's what -e does: it essentially links to the code that you're working on instead of copying it into another location. Once we've installed the package this way, we can continue to work with it — keep running it, keep writing code against it — without having two copies of the code, which would just cause us problems further down the line. The full stop at the end means "install the package in the current directory": it looks at the setup.py file and installs the package by linking to the code I'm working on. You run this every so often — every time you change your setup.py file, you run it again to make sure that your package installs correctly and that your Python code is available to you. Bear in mind, our code is under a src directory at the moment, so if we run Python in our project directory, we can't import helloworld yet, because it's not on our path. So, in theory, let's test it. Here we run Python — the REPL — and then we do our from helloworld import say_hello. Because we've just installed our code into our current virtual environment, this works now: even though our code is under the src directory, Python has been told where our code is by the setup.py file. We can execute the say_hello function, and we can execute it passing the optional parameter. Everything works as we would hope. It's rough testing — we'll get on to better testing in a moment — but it's confirmation that our code is installing correctly.
At this point, we have a working package with some useful code in it, so we could upload it to PyPI immediately, but I would say there are a few things we really need to do before we get to that point. That's documentation and testing, but also a little bit of housekeeping, which I'll run through now. As I said, some files get created that you really don't want to add to your Git repository, so it's useful to have a .gitignore file. This website is fantastic: it makes it easy to get hold of GitHub's standard .gitignore files, which they publish for different language and operating-system communities. You type Python into the text box, hit create, and it spits out a text file in the browser that you can copy and paste into a .gitignore file. Now we're ignoring all the main files that Python creates, so it will stop you from committing .pyc files and a bunch of other artifacts of a Python project. If we're going to publish this code, we also need a licence. If we don't have a licence, we haven't given people permission to run our code. They can look at it, but they can't copy it or use it, which is not greatly useful. So we need a license.txt file. If you don't know the ins and outs of the different licences and the restrictions and freedoms they grant for the software you're publishing, this website, choosealicense.com, is incredibly useful. It essentially asks you lots of questions and then gives you your options and how they compare to each other. It's a good, human way for non-legal people to understand the differences between different software licences. We also need to add some classifiers to our setup.py file so that people can find our project on PyPI by filtering on common criteria. Here we say that this is Python 3 code and that it runs under Python 3.6 and 3.7.
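The classifiers described here go into setup.py as one more parameter. A sketch of what that addition might look like — check the exact strings against pypi.org/classifiers before copying:

```python
setup(
    # ...name, version, description, py_modules, package_dir as before...
    classifiers=[
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "License :: OSI Approved :: GNU General Public License v2 (GPLv2)",
    ],
)
```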
We haven't really tested that yet, but we know it only runs under those versions of Python because there was an f-string in the code, as I pointed out. I chose the GPLv2, so we put that in there — that was a bit of an arbitrary choice. These can all be looked up at the URL at the bottom, pypi.org/classifiers; there are a bunch of them. Try to apply all the useful classifiers to your project so that you're describing what this project is for and how it's used. Then you need some documentation — but before you write any documentation, you need to work out what format you're going to write it in. You basically have two choices at the moment. One is reStructuredText, whose tooling is written in Python. It's used widely in the Python community: all of the core Python documentation is written in it, and a whole bunch of the libraries you use have their documentation written in reStructuredText. But it is a Python solution, and if you're working on a project that has some Python code and maybe some Rust code or some C code or something like that, those people will probably not have encountered reStructuredText before — but they will probably have encountered Markdown. Markdown is a valid choice. It is simpler but also less powerful, so you're making some compromises here. Both of them allow you to use tools — Sphinx for reStructuredText, MkDocs for Markdown — to compile a directory of Markdown or reST files into a directory of documentation that's all linked together. Both are supported by Read the Docs, so you can publish either kind of documentation site to Read the Docs and then not have to worry about hosting it yourself. Once we've decided — and I've chosen Markdown, again arbitrarily — we need to write a README. That's pretty much essential for any modern project. Here we have a title for the project and a small paragraph describing what the project does.
We should have a section describing how to install the project, with a sample command line for pip installing it, and some sample code just to tell people how to use the useful code that we've published to PyPI. Once we've written this, it's nice to have it published on PyPI too — so as well as appearing on, say, GitHub or GitLab or wherever you publish your code, it becomes essentially the official description of our project. And we can do that. Even if you've published packages before and used reStructuredText to write your README, this is a relatively new feature: as of about a year ago, PyPI supports Markdown directly, so you don't need to convert your Markdown to reStructuredText before pushing it up to PyPI. Here we're taking advantage of the fact that the setup.py file is code, not configuration, by opening the README file, reading in that block of Markdown, and then applying the string to our setup call. We use the long_description parameter to provide the string value we put into a variable, and then — very importantly — we need to tell PyPI that this is Markdown and not reStructuredText, which we do by providing the MIME type in the content-type parameter at the end. I also wanted to talk about dependencies. I've cut this talk down a bit, so we won't actually show any code that uses blessings, but if, for example, we used this terminal-colouring library called blessings, this is how we would add it to our setup.py file: we have an install_requires parameter, which is a list of specifiers describing each library and the versions we're prepared to accept. I will talk a little bit more about those in a moment.
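Both of those steps rely on setup.py being code; a sketch of the README trick plus a runtime dependency (the blessings version specifier here is illustrative, not from the talk):

```python
from setuptools import setup

# setup.py is code, so we can read the README in at build time
with open("README.md", encoding="utf-8") as f:
    long_description = f.read()

setup(
    # ...other parameters as before...
    long_description=long_description,
    long_description_content_type="text/markdown",  # tell PyPI it's Markdown
    install_requires=[
        "blessings ~= 1.7",  # example runtime dependency, version illustrative
    ],
)
```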
If we change the library dependencies — or anything else — we should, as I said, run our pip install -e . command again, just to reinstall the package, make sure it actually pulls down the dependencies, and check that these things work together. Then we should run some tests. But we don't have tests, and we shouldn't just keep opening up the REPL and randomly calling functions to make sure things work. I would recommend you write your tests with pytest — pytest is awesome — but in order to write tests with pytest we again need more dependencies. This time we're not talking about a dependency of our library, like blessings; we're not saying "this is needed to run". We're talking about a development dependency: something people need to install in order to develop code with our library. And to declare dev dependencies, I recommend you add them as extras in your setup.py. A lot of people here, I suspect, are using requirements.txt for this, but if you have a setup.py file, I would argue you do not need a requirements.txt: you can do all of this within Python's standard packaging framework, and you get some advantages, because again this is code, not configuration.
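Declaring dev dependencies as an extra looks roughly like this in setup.py (a sketch; the pytest floor of 3.7 is the version mentioned in the talk):

```python
setup(
    # ...other parameters as before...
    extras_require={
        "dev": [              # installed with `pip install -e .[dev]`
            "pytest >= 3.7",
        ],
    },
)
```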
So the way this works, it looks a little bit like our install_requires, but with an extra layer of indirection: you'll see that it's a dictionary rather than a list, but you still see that list in there as the first value. The key is the name of your extra — in this case "dev" — and we will tell people that they need to install the dev extras in order to work with our project. After that it is just a list of dependencies; in this case we're saying pytest at version 3.7 or above. Then we can tell people how to use it, so again we update the README: we add a section saying "if you would like to help develop Hello World, this is how you install the development dependencies so that you can run the tests". It looks very similar to before, but you'll see we have the word dev in square brackets afterwards — it's saying we're installing our current module with the dev extras. You may have used this with other packages without seeing how it was specified. I stole this straight from attrs, which I think is why Hynek is here. So yes, if we install the extras, you'll see that it installs a whole bunch of other stuff — basically the dependencies of pytest. So, the difference between install_requires and extras_require: install_requires is for production dependencies — things like Flask, Click, NumPy, pandas — and those versions should be as relaxed as they possibly can be, and you should be testing against multiple versions of each dependency. That way you're not locking your users into a specific version of a shared dependency. If both you and your user are using attrs, you ideally need an overlap there: if you require version 3 and they're using version 4, they're not going to be able to use your package unless somebody makes some changes. extras_require is different: it's for optional requirements, for development or testing or whatever groups of extras you want to create, and the versions should be, in my opinion, as
specific as possible, because you're trying to get your fellow developers up and running as quickly as possible, and creating an environment identical to yours and that of the other developers who have been working with the code is just going to make everybody's life easier — rather than trying to debug minor variations in your development dependencies. requirements.txt still has a place, but I would argue it's for apps that you deploy onto machines that you control. In that case you're pinning every single production requirement to a specific version so that you're putting a well-tested collection of code onto a destination machine: use fixed version numbers with the double-equals operator, and use pip freeze to spit out everything that's currently installed straight into your requirements.txt. So here we write some tests — I'm going to zoom through this a bit because I'm running slower than I would like — but yes, once we've written them, we just need to run pytest each time to actually test our code, which is much easier than executing the code by hand. It will print a bunch of stuff saying what version of Python you're using and things like that, and then it will report, hopefully, that your tests passed. So here's what we've produced so far: a license file, a README file, a setup.py file, a src directory with our code, and a test — you can obviously stick your tests in a tests directory if you have more test files. It's good to distribute source distributions as well as binary distributions, for various reasons: people can check the code before they run it, they may not have access to GitHub to see the code, or they may just need to verify the code before they run it. When we run sdist against our setup.py file, we actually get some warnings saying it would like some more data — for some reason sdist really wants to know the maintainer and maintainer email, or the author and author email — so it's told us that, and we
can just add those in — that's three lines: we add the URL of the project (a link to GitHub in this case), my name, and my email address. Excuse me. So now we need to test that source distribution — make sure it contains all the files we want it to. When you run sdist, it creates, in this case, a gzipped tarball, and we can use the tar command to unpack it and have a look at the stuff inside. When we do, we notice that it hasn't got our license.txt file or our test_helloworld file. Ideally, our source distribution should contain everything in this snapshot of code — everything we're distributing, everything that gets built into the binary distribution. In order to add those missing files to our source distribution, we need to write a MANIFEST.in file. They are fiddly and annoying; fortunately there's a tool called check-manifest that does pretty much all of this for us, or at least gets it started quickly. You can pip install it — you can add it to your development dependencies if you like — and you run it for the first time with its create flag, and it will create you a MANIFEST.in. I recommend having a look at it: it's just include and exclude lines for the various files it's found in the project. It tries to make sure that everything you have in git ends up in your source distribution, so it finds these files and adds them to the MANIFEST.in file. Then, if we build our source distribution again and unpack it, we see that, just out of the box, check-manifest has created a manifest that includes the files we were missing. So now let's publish it. It's good to publish earlier rather than later — if you try to perfect everything, you will really never publish the package. So as soon as you have something useful, not necessarily perfect, try to get it up. Apart from anything else, it will register your package name on PyPI for your project, so you're not letting somebody else
just kind of come in and take it in the months while you're working on your project. (You used to be able to register a name before you uploaded code; now you need to actually upload code in order to register a name.) So here we run setup.py with the bdist_wheel and sdist commands, and in our dist directory we'll now have our wheel file and our source distribution. To push to PyPI you should use twine, for various reasons. It separates the build step from the upload step, which means you can do these manual checks on your distribution files before you upload to PyPI — otherwise it's a single command to build and push up your code, and if you get something wrong, that's going to mess things up for you. It also uses HTTPS, whereas for a while setup.py upload didn't, so it's safer. So here we install twine and use the twine upload command. If you get to the PyPI website quickly enough, you will see the name of your project on the home page as the most recently updated package, which is kind of cool. If you click on it, you get to the project page, where you can see our README file essentially reproduced; there's a GitHub link that I've just cut off at the bottom. I had to change the name of the project, by the way, because there is already a hello-world package on PyPI — obviously somebody has done that. So there's still some more stuff we need to do. I recommend using tox — I really am running out of time, so I apologise for racing through these — I recommend using tox for testing against different versions of Python and different versions of the libraries that you depend on. Here we're just testing against Python 3.6 and 3.7: you install tox, you have that tox configuration file, it spits out loads of output when you run tox, and hopefully at the end, for each one of your targets, you get "commands succeeded" and a little smiley face, which I always think is rather lovely. Here's why we use the src directory. Our root directory is the
directory I've been working in. If our code was in this directory, then when we import helloworld while running the tests, it will run the code in our current directory — but we don't want it to do that; we want to test installing the package and using the code from there. By having a src directory, you are forcing the tests to use the version that was just installed into the virtual environment. You should also build on clean machines. In the past I've used Travis for this; I am probably moving my stuff to Azure Pipelines, depending on when Hynek gets his stuff stabilised — yes, I won't talk about that anymore. For extra credit: you can add badges to your README for code coverage and quality metrics; you can manage versions with bumpversion, which is quite nice; you can test on different operating systems; you can write more documentation — you can always write more documentation and tests; you can add a contributing section to your README; you can implement a code of conduct. There's lots that you can do. But actually, I recommend that you don't do any of the stuff I've described in this talk by hand. There's a project called cookiecutter that generates sets of files from templates, and people have already created template projects for Python packages on PyPI. So if you install cookiecutter and then run this command to download ionel's cookiecutter library — there are a few of them out there; I quite like ionel's, it's similar to my own way of thinking about these things — it will download the template from GitHub and ask you lots of questions, because it's much more flexible than what I've shown: I've given you one option for each step, whereas he offers you lots of options, different testing libraries and things like that. And at the end of it, you're done. In theory you will probably have to go and tweak some of these files, because they won't be quite the way you want, but it took me five minutes to get up and running using this process, and then it created all
of this. You will recognise some of the stuff in here from the tutorial we've been through. There is extra stuff too: there's a Sphinx directory of documentation, with just boilerplate documentation in there at the moment, waiting for you to fill it out. You copy in your code, and then you're done. So that took me about five minutes — I could have cut this talk down to the last two slides if I'd wanted, instead of wasting all of your time for half an hour — but hopefully this gives you an overview of good packaging practice and all the things you need to do to build a well-rounded package. There are obviously different directions you can move in, but this is a really good core understanding of the things you should do for a professionally released package. There are other projects for distributing libraries these days that are very interesting, that don't use setup.py or use it in different ways. I would really recommend having a look at them if you're struggling with setup.py — Poetry is getting a lot of mindshare at the moment — but I haven't really used them, so I can't really recommend them; I'm trying to push the current most common best practice. If you are interested in the slides or the code for this talk, they are available at this bit.ly link. Follow me on Twitter; if you have any questions, feel free to grab me at the conference, come to the Nexmo booth, or tweet at me — preferably not abuse. Thank you very much for coming to my talk.