So, this is Tim talking about setup.py and making sense of it all. So, take it away, Tim.
Thanks, Lee. Yes. Thank you, everyone, for joining me this morning. Just to check that you're in the right talk, I wanted to point out what setup.py is. If you've ever looked inside a code directory, seen this file and wondered what on Earth is going on in there, this is the talk for you. So, where we're going: I'm going to have my touching story to begin with, and then we're going to talk about using this file for four different use cases. One is people who are just installing stuff and want to know what's actually happening with pip install package. Then people who are building stuff, and then people who are building stuff that might be more advanced: if you have to build big projects that combine multiple programming languages or build systems, it turns out that you can actually wrangle setup.py to help you out with that. So, the heartfelt personal story is that I was really intimidated by this file. I wanted to upload things to PyPI, I wanted to be able to wrap my software and enable it to be installed by my future self, and that became surprisingly difficult. I will admit some ignorance here. It got to the point where I would not just be abusing sys.path, because it's just a list, you can add to it. I would be including environment variables that held the locations of directories, and then walk the tree of my development directory, and then I would have this ad hoc versioning system, and then I realized I really should work out how this thing works. All right, so in this section we're going to be covering the definition of packages, because we really want to start right at the beginning, and where to go if you're having a lot of difficulty.
I think that's really important, because some of the people in the room won't be able to distinguish a module from a package from a distribution, and these terms appear a lot in the actual documentation. So a package can usually be defined as a folder or a directory that contains this double underscore init file, __init__.py. That's good enough for this talk. If you have a folder and there are .py files in there, it becomes a package when you have __init__.py. So it's something that's sort of special. Now, a module is really just a file that has a .py extension. Again, that's a good enough definition; it doesn't cover all use cases. And a distribution is a term that's used in the docs, but really when you look at it, it just talks about how one package relates to others, and effectively it's an organizational thing. I wouldn't worry too much. There is a whole bunch more terminology that can be really bewildering to someone who has just come to Python. You might encounter easy_install, ez_setup, distutils, site-packages, eggs, wheels, pkg_resources, a whole bunch of other stuff. This is only the beginning. For this talk, I just want to assure you that a lot of it is not necessary. And if you read anything that talks about distutils, which you probably won't encounter now, but you might: setuptools is the new distutils, and it will probably carry over seamlessly into the new stuff. So the old stuff is still good stuff, right? So just to cover all bases, you open up a terminal window, clone your code, and we're going to be talking about this hypercode thing a lot. We enter that directory with the cd command and then we type python setup.py install. That's all you need. And pip is doing a little bit more, because it's finding the file for you as well, but ultimately it's running something pretty similar. So just relax.
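The "good enough" definitions above can be sketched in a few lines of Python. This is only an illustration of the talk's working definitions (a package is a directory containing __init__.py, a module is a file ending in .py); the `classify` helper and the directory names are made up for the example, and as the talk says, the rule does not cover every case.

```python
import tempfile
from pathlib import Path


def classify(path: Path) -> str:
    """Label a path using the talk's simplified definitions."""
    if path.is_dir() and (path / "__init__.py").is_file():
        return "package"
    if path.is_file() and path.suffix == ".py":
        return "module"
    return "neither"


def build_demo_tree(root: Path) -> dict:
    """Create a throwaway layout and classify each entry in it."""
    package_dir = root / "hypercode"   # directory that will hold __init__.py
    plain_dir = root / "notes"         # directory without __init__.py
    package_dir.mkdir()
    plain_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "core.py").write_text("ANSWER = 42\n")
    return {
        "hypercode": classify(package_dir),
        "hypercode/core.py": classify(package_dir / "core.py"),
        "notes": classify(plain_dir),
    }


# Build the layout in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    RESULT = build_demo_tree(Path(tmp))
```

Here `RESULT` maps each demo path to the label the simplified rule gives it: the directory with __init__.py counts as a package, the .py file as a module, and the bare directory as neither.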
That's pretty much it when it comes to installing things, but you've also got help, --help as it happens. And you'll find that there are a whole bunch of commands in there, and you think, what are they doing? So you ask for the help on commands, and then you find there's help for install as well. It turns out, though, that the help is focused on being comprehensive rather than being accessible. Let's look through that. So if you do --help, you'll encounter this, and you'll probably only be able to see the bottom because it's gone right past your terminal window, but really this line here is what actually matters. So if you're a newcomer, this is really not that helpful in my view. The help for commands is even more difficult, because it's saying that as well as install, which appears here, you have everything else. And that's because distribution code has so many edge cases, and the core Python developers have learned by experience that there are a lot of things that people want to do with code. It's a very, very widely deployed language and it's used in a lot of places. Now, you can ask for help on a specific command, but again, you get lots of stuff. I would just encourage you to breathe through it. Really, there are only two things you need. One is install and the other is develop. I actually use develop quite a lot, because instead of copying everything into your site-packages directory, it just creates some links, which speeds things up, especially if you're using package data, which I do because I happen to use natural language processing and machine learning. So I've got a whole bunch of training data that I don't want to copy across to somewhere else in the file system. If I'm ever building Vagrantfiles and playing around with containers and so forth, it's a similar story: things will build faster if I just use develop, even though I'm kind of abusing that.
Everything should work fine with virtual environments, because that python keyword at the start changes when you activate a virtual environment. So instead of referring to your global package directory, it's talking about the virtual environment's package directory. Right, so we're going into that second stage: using setup.py for people who are creating code. And then how do we grow from a single file out to a whole project? So this is the hypercode thing we talked about. That's our amazing piece of new technology. And we've just plonked a setup.py file in there. And this is all we really need to do. We define a little bit of metadata, which is the name for the package. And then we want to say that our package contains one module, which is that single file. And remember that a module is really just a file with .py at the end. The system will growl at you and say, look, you're missing out URLs and things. But it will still do what it's told and comply. So you might build a little bit more code. If you're not from Python land, you would probably call this directory src. But in the Python community, it's become convention to have your project directory and then your package directory named the same thing, with an __init__.py file inside. What does our setup.py file look like now? We've got more metadata. We've included who's building this, how to contact them, the URL where it's found, and we can also start to specify dependencies. So hypercode depends on hypocode as well as Greek to be able to interpret everything. We can do more. So as we start to build our project up, we've got these two scripts, type and call down, that we want to include. And all we need to do is refer to them. And this path is relative to setup.py. It's not relative to the package's directory. So if we go back to here, it's relative to this file, not here.
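The two stages just described might look roughly like the sketch below. It is not the speaker's slide: the version, author details, URL and script file names are placeholders invented for illustration, and the dependency names simply follow the transcript. The keyword arguments are kept in plain dictionaries so they can be inspected without running an install.

```python
# Assumed contents of a setup.py sitting at the top of the hypercode project.
# Every value below is a placeholder chosen for illustration.

# Stage 1: a single-file project only needs a name and its one module.
MINIMAL = {
    "name": "hypercode",
    "py_modules": ["hypercode"],  # refers to hypercode.py next to setup.py
}

# Stage 2: the project has grown into a package with metadata,
# dependencies and scripts.
FULLER = {
    "name": "hypercode",
    "version": "0.1.0",
    "author": "A. Developer",
    "author_email": "dev@example.com",
    "url": "https://example.com/hypercode",
    "packages": ["hypercode"],                 # the hypercode/ directory
    "install_requires": ["hypocode", "greek"],  # names as heard in the talk
    # Script paths are relative to setup.py, not to the package directory.
    "scripts": ["bin/hyper-encode", "bin/hyper-decode"],
}


def main(options):
    """Hand the keyword arguments to setuptools, as a real setup.py would."""
    # Imported here so the dictionaries above can be read without setuptools.
    from setuptools import setup
    setup(**options)


# A real setup.py would end with a call such as:
#     main(FULLER)
```

In practice most setup.py files import `setup` at the top and call it directly with these keywords; the `main` wrapper here only exists so the example can be loaded without triggering a build.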
Okay, so this is where things become a little harder. If you read through the documentation, it's like: I want my, not configuration files, there's actually something separate for that, but some data that comes along with my code. So I might have serialized some big thing, and I want to include that and refer to it. In the documentation, this is referred to as package data. And there's another section which talks about additional files. I don't like additional files, for the following reason. It will copy things into sys.prefix, so effectively it's going to copy things into your site-packages directory. And I find that clutters things up. And it's like, well, where did this .pickle file come from? I don't know. Because if you look in that site-packages directory, there's no kind of asterisk to say which package it comes from. So I tend to steer clear. We're moving right through. So now you've got to the situation where you're actually sharing your code with others. You can install it, ship it around. You can compress things as zip files. If you look in that help for commands, two of the commands are sdist and bdist. sdist refers to building a source distribution, and bdist to a built one. Effectively you create a zip file, which you can give to other people inside your organization. Now we really want to share our code openly with others. Our setup.py file is becoming much larger. So all of the original metadata that we discussed is still there, except we've now got long description as well as description. What I often find people doing is just reading in their README file and using that, so they don't have to type things in twice. The reason why long description is useful is that on the Python Package Index, that long description field will be used as the description of your package on the website.
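A minimal sketch of the two ideas above, package data and reusing the README as the long description. The README file name, the data pattern and the helper names are assumptions made for the example; only the `package_data` and `long_description` keywords themselves come from setuptools.

```python
import tempfile
from pathlib import Path


def read_long_description(project_root: Path, filename: str = "README.rst") -> str:
    """Return the README text, or an empty string if the file is missing."""
    readme = project_root / filename
    return readme.read_text(encoding="utf-8") if readme.is_file() else ""


def build_options(project_root: Path) -> dict:
    """Assemble setup() keyword arguments; all values are placeholders."""
    return {
        "name": "hypercode",
        "description": "One-line summary of hypercode",
        "long_description": read_long_description(project_root),
        "packages": ["hypercode"],
        # Patterns are relative to the package directory, so the data is
        # installed alongside the package that owns it.
        "package_data": {"hypercode": ["data/*.pickle"]},
    }


# Demonstrate with a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "README.rst").write_text(
        "hypercode\n=========\n\nLonger text.\n", encoding="utf-8"
    )
    OPTIONS = build_options(root)
```

Because the data pattern lives under the package's own key, the installed files end up inside the hypercode package directory, which avoids the "where did this .pickle file come from?" problem described above.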
So if you've ever gone to pypi.org slash whatever project, the description of each package is often quite variable. That's because some people are lazy and they don't actually include long description. Or it's not very long. We can also add some more metadata. So I can start to describe how mature this package is and raise flags. And I've also said that hypercode is BeOS specific, because I thought that was kind of fun. And what I would strongly urge you to do, if you are considering venturing into this open source space, is describe your license quite specifically. If you're releasing code that you've produced at work, it probably is owned by your employer. And if you are a lone developer in a small shop, you will need to figure out the process of making sure that the copyright is assigned to you, or that you attribute the right person. Because this stuff matters, especially in a New Zealand context, where a lot of the licenses don't take into account, for example, that you can't simply disclaim everything. So if you ever open a GNU package, often it will say absolutely no warranty. That's not the law in New Zealand, because if your code is used by consumers here, they're protected by the Consumer Guarantees Act. And so it's really, really important to start to learn about the law a little bit. I mean, I think it would be unlikely that someone would take you to the Commerce Commission and demand liability. But who knows, you don't want to do things that are illegal. Again, there are two important commands. There's the distinction, which I found really confusing when I first came to it, between registering and uploading. I just thought that if I registered the code, it would somehow then do the uploading as well. Beginner's mistake, I guess.
So when you register a package, what you're doing inside the Python Package Index is claiming your space. So I've got a couple of packages up there, and no one else can use the names that I've used. But there's actually nothing that third parties can install until I've uploaded files. And if I run upload, setup.py, or setuptools underneath, will go to the effort of building distributions, packaging them up into the correct zip files, including all of the metadata, getting everything in the right place so that other people can install it and their Python versions will know if they can install it, and then send it up into the cloud. One thing that is especially useful is that you have the ability to sign your code. It's not going to provide a huge amount of protection for third parties, but it does prove that the person who controls the certificate is the person who uploaded the package, which is something. And probably something quite useful. Now, setup.py for people who want to do other stuff. I want to look at the way two projects have used setup.py and the setuptools infrastructure to enable other build systems within Python projects. So can I just quickly ask, I'm from the machine learning side, so who's used lxml? Okay, fewer people than I thought. I was going to say, well, you've all used Cython, but actually no, as it happens, you haven't. So Cython is another programming language, and its output is C. So it is basically writing C with Python syntax. And it's been built primarily by the numerical computing and scientific computing communities, to make crunching numbers really, really fast and to build extension modules. And the other project is Rust. They each go about the Python interoperability story in a different way. Oh, just as another question, who here has built a C extension before? Right, okay, well, maybe 10% of the room.
A C extension is a fancy word for something else you can import that happens to be written in C. It's actually a scary word for something that's not that scary. So Cython, again, builds C source code and then uses the normal infrastructure inside setuptools, which knows about C compilers, via this function called cythonize. So cythonize is great. You create these .pyx files, which are C with Python syntax. It creates a C source file for you, which you can inspect, and you'll probably get about 10,000 lines of C code from your four lines of Python, let's say. Because it covers every single case on every operating system. Cython is an amazing piece of software engineering. The one thing that I would note is that setup.py is run, oh sorry, this setup function is run every time you execute the file, right? And so is cythonize. So if you have a big project and you've just got a source distribution, and you go and inspect it with --help because you want to interrogate the package before you do anything further, it will build the C and the extension. It doesn't matter what you're doing with it; it happens without you asking. So it will do it every time. This rust-ext package does something different. It delegates work to the Rust compiler, and Rust can be asked to create things that look like C binaries. It's actually really, really easy to do. And so the tooling that they use is effectively this command class functionality inside setup.py. The cmdclass keyword enables you to swap out at runtime what happens later on inside setuptools, if that makes sense. So install lib, or install_lib, is one of those commands that can be executed from the command line, just like install or develop or upload that we've been chatting about. And here I say, I would like you instead, dear Python, to please use install lib including Rust, instead of your default.
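The command class idea (the keyword is spelled `cmdclass` in setuptools) can be sketched as follows. This is not the rust-ext implementation: the class name and its extra step are hypothetical, and the example only shows the general mechanism of subclassing a stock command and registering the subclass under the command's name. It assumes setuptools is installed.

```python
from setuptools.command.install_lib import install_lib


class InstallLibWithExtras(install_lib):
    """Hypothetical replacement for the stock install_lib command."""

    def run(self):
        # A real project would build or copy its extra artefacts here,
        # for example libraries produced by another compiler.
        self.announce("running custom pre-install step")
        super().run()  # then fall back to the default behaviour


SETUP_OPTIONS = {
    "name": "hypercode",
    "packages": ["hypercode"],
    # Maps a command name, as typed on the command line, to the class
    # setuptools should use when that command runs.
    "cmdclass": {"install_lib": InstallLibWithExtras},
}

# A real setup.py would finish with:
#     from setuptools import setup
#     setup(**SETUP_OPTIONS)
```

With this mapping in place, anything that triggers install_lib would run the subclass instead of the default, which is the swap-out-at-runtime behaviour described above.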
And, as it happens, all C extensions are zip unsafe, which means something to people who do this. So as you start to play around, go and explore what that means. But for now: okay, I can't put it in a zip file as is. So just a little bit more. This build Rust command class is effectively teaching Python how to build Rust files by delegating that to cargo, which is the pip of the Rust world. And the Cargo.toml file is the setup.py of the Rust world. So we're pressing on time; we're at the end of the talk, nearly. So I thought I would add these as bonus to-dos, or homework. I was really impressed with the setuptools source, and in fact the distutils source, and how they go about argument parsing. I noticed every invocation of that --help was different, and it all changes depending on which command line arguments I specify; as it happens, all of that happens at runtime on invocation of the script. And so if you want to create a command line utility which is literally used by millions of developers every day, which this one is, this is a great place to go and look, and it has a huge amount of functionality in there. It's one way to look at how to provide structured help for people using your package. I was really impressed by Graham Dumpleton's talk. I think I've just misspelled his last name. Hopefully he's not here. I think I just saw him leave. I would look up autowrapt. It turns out that there's a convention for .pth files on the Python path: they will execute arbitrary code on lines that start with import followed by a space. And that seems kind of fun. There's actually more: you can encode a lot of those keyword arguments in a file called setup.cfg. And pydistutils.cfg is something similar, for more global options such as where I would like things to be installed.
So rather than just polluting the system site-packages directory, you can have a little bit more autonomy as to where you would like things to go. Also, have a look at entry points. What I can see entry points being most useful for is creating command line utilities really easily: you just specify a function that takes no arguments, and this entry point thing will figure out how to turn that into an .exe file or, I think, a bash script, for whichever platform you're installing on. So it's a way to create scripts that can be run by other people, no matter where they are. So again, I just want to reiterate, the setuptools infrastructure that is provided to you is really powerful. And people give Python packaging a lot of grief and talk about how broken it is. There's actually a PEP that describes how we should make setup.py obsolete, which was never implemented. But there's been a lot of hate around this whole Python packaging thing. And I just think it should get a little bit of love. So there we are. Thank you.
Thank you very much. Is this even on? Yes, it's all done. Thank you very much. We have a lot of time for questions. So let's get started.
So, great talk, thank you. One other thing that I'd recommend people look at is a command line tool called twine. It's T-W-I-N-E. It basically serves the same function as python setup.py upload. But what it does is it actually verifies the SSL certificates. It does a secure, encrypted upload to PyPI. So twine is sort of the recommended way to upload things to the cheese shop.
Right, right. Certainly setup.py upload works, but twine is the safer route. Yeah, so in case anyone hasn't uploaded to PyPI, by default it's going to upload over HTTP, and there's a little note I just noticed today that there is an option to do SSL uploads, but it sounds like twine is just going to do that all for you.
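Going back to the entry points mentioned near the end of the talk, here is a minimal sketch of a console script declaration. The `hypercode.cli` module path and the command name are hypothetical; `parse_console_script` is a small illustrative helper, not part of setuptools, that just shows how the "name = module:function" string breaks down.

```python
def main():
    """Entry point function: takes no arguments, as described in the talk."""
    print("hello from hypercode")
    return 0


SETUP_OPTIONS = {
    "name": "hypercode",
    "packages": ["hypercode"],
    "entry_points": {
        # Each entry reads "command-name = importable.module:function".
        # hypercode.cli is an assumed module that would contain main().
        "console_scripts": ["hypercode = hypercode.cli:main"],
    },
}


def parse_console_script(spec: str) -> tuple:
    """Split 'name = module:function' into its three parts."""
    name, _, target = spec.partition("=")
    module, _, function = target.strip().partition(":")
    return name.strip(), module, function
```

On installation, the packaging tools generate a platform-appropriate launcher for each declared command, so users can type the command name rather than locating a script file.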
If I'm writing a setup.py for a package that I want to be compatible with Python version two and Python version three, are there any gotchas I need to worry about, any way it will behave differently?
Right, so a setup.py is just a Python file. So if you have a print statement in there, it's not going to be happy in Python three. If you have a Python three interpreter interpreting that script, it will blow up on you. So it's just Python. There's going to be nothing in there, I think, that would be 2/3 dependent.
And I believe you can also specify whether it's Python two or Python three.
Yeah, so you can definitely say that my package only supports 3.5 and up, for example.
Can I run upload multiple times as my version goes up? So, with the version identifier, register once and upload multiple times?
That's right, you can. The cheese shop, the Python Package Index, is pretty smart, and if you increment that version number, it will have the latest file available for public consumption. And when people use pip, they'll only pull down the latest version. It's going to rely on you using semantic versioning inside your setup.py file in order for that to work properly. And if you use some obscure versioning system, because you just wanted to use whatever you wanted, like this is my star version, this is my unicorn version, it will blow up on you and say, look, I really can't interpret this. So please just use one, two, three.
Any further questions? I did see one at the back. Any more for any more? This one now at the front. Sorry.
Oh yeah, I maintain a package which has to work from Python 2.6 up to 3.5. At what point does from setuptools import setup actually work? Does it work in the older versions? I ask because it isn't included in the standard distribution, I think, of 2.4.
So, in that extra raft of keywords, one of them is ez_setup, which is effectively another Python file that downloads setuptools and installs setuptools for you.
So... that's what I am using?
Right.
I just wanted to make sure that was still the correct thing to do.
I may be wrong on the specific versions, but it's round about 2.5, I think, that it came in.
Any more for any more? Going once, going twice. Thank you once again. Have a great day. So we have lunch now upstairs, and the next talks will be back down in here or next door at 1:10, if I remember correctly.