OK, so this talk is best practices for reusable Python. Why do we want to write reusable Python? We want to capture our efforts and capture our learning. It takes time to learn how to do things, and if we're not focused on the reusability of our code, we tend to do things in a way that's not easy to reuse. Maybe we save it in a file somewhere, and then it's kind of gone forever from our memory. Writing reusable Python helps us reduce the marginal cost of solving more complex problems. So this talk is about optimizing the reusability of your code; we're going to focus solely on that aspect. If I make recommendations that may be brash assertions, it's because I'm looking toward reusability rather than toward features of the language that may not be as reusable.

So what is reusable code? Reusable code is code that's easy to use. It's tested, it's compatible, and it's well documented, with docstrings and maybe online documentation, as well as demonstrated with tutorials, demo files, and other examples in the documentation.

To begin with, here's my first brash assertion: if you're focusing on reusability, write forward-compatible Python 2.7. If we're writing in 2.7, it makes sense to use `__future__` imports for features we expect to be in heavy use in Python 3. Explicitly, we do `from __future__ import division, print_function, absolute_import`. You can follow me on Twitter if you like; sometimes I tweet, and sometimes people tweet back. I got really lucky on this one: Guido van Rossum actually responded to me. He highly recommends these three. `unicode_literals`, he says, tends to break stuff; that's one of the reasons it's taken so long to get Twisted ported to Python 3. So I'll take his word for it.
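A minimal sketch of the recommended preamble (the trailing expression is just there to show the effect of `division`):

```python
# Top of every module, per the talk's recommendation.
# In Python 3 these imports are harmless no-ops.
from __future__ import division, print_function, absolute_import

# division:        3 / 2 is true division (1.5), as in Python 3
# print_function:  print("x") is a function call, not a statement
# absolute_import: a bare "import foo" inside a package no longer
#                  falls back to implicit relative imports

result = 3 / 2  # 1.5 even on Python 2.7, thanks to the import above
```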
OK, so some more brash assertions. We're going to try to minimize the difference between code that runs in Python 2 and code that runs in Python 3, so you want to write code that's compatible with both. When things really are different and you need to do something different in Python 2 versus Python 3, try the thing that's in Python 2 first, and if that fails, use the thing that's in Python 3. But you want to minimize that. And then maybe you'll need to use 2to3 if necessary, and it'll probably be necessary.

All right, so when we're writing our modules, I want you to run your code with the -m flag. Don't run `python script.py`; run `python -m package.module`. If we collect some Python code into a package, because we're trying to reuse it, and we keep running `script.py` directly, it gets a lot more difficult to keep our code running and doing the things we expect. So we run that kind of code with `python -m package.module`, and we do this from the parent directory that contains the package, or with that package already installed.

OK, so why do this? Here's the principle, which I got from Nick Coghlan's article on import traps: never add a package directory directly to the Python path. To violate this principle, we would add the package directory itself to the path, not the containing directory. That again is why we want to avoid running our code without -m, because adding the package directory to the path is exactly what happens when we run `python package/script.py`. The -m flag treats the module you give it as the program's entry point, the main, and it puts the current working directory on the path. So again, my tip is: have your top-level parent package inside your current working directory.
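The -m semantics can be sketched with the standard library's `runpy` module, which is what -m uses under the hood. The package name `pkg` below is a throwaway example built in a temp directory:

```python
import os
import runpy
import sys
import tempfile

# Build a throwaway package on disk to mimic "python -m pkg.mod".
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "pkg")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "mod.py"), "w") as f:
    f.write("RESULT = __name__\n")

# -m puts the directory *containing* the package on sys.path,
# never the package directory itself (Coghlan's rule):
sys.path.insert(0, root)

# run_name="__main__" makes the module the program's entry point,
# just as "python -m pkg.mod" would.
namespace = runpy.run_module("pkg.mod", run_name="__main__")
```

The returned `namespace` holds the module's globals, so `namespace["RESULT"]` is `"__main__"`: the module ran as the entry point.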
And as a result, you have the same internal package lookups when developing and in production, when you're trying to reuse your code. So -m allows you to use an importable module as the entry point for your program. Here are some examples from the standard library. We have, for example, `json.tool`: `python -m json.tool` is useful for validating your JSON, and it outputs a pretty-printed version of the validated JSON. The `timeit` module can be used as a command-line tool with the -m flag as well.

OK, so how many people are using `if __name__ == "__main__"`? How many people aren't? I did say this is a beginner-level talk. The reason we do this is we want to minimize the code that runs at the entry point of our code. We need a guard so that when `example` is a module and we import it into another thing, the main function doesn't actually run on import. That's why we do that.

Now, that was `example.py`. Here are basically the same semantics, but in a package. We create our package directory, create an `__init__.py`, add a main function into that, then create a `__main__.py`, and in it we import the package. When we use the package as the main entry point with `python -m package`, the package itself is not imported and run yet; it's just the code in `__main__.py` that's running. So we have to import the package in our `__main__.py` in order to access the code in `__init__.py`. And then we have the same semantics as in this basically hello-world example.

So when we write a module, we want to declare an `__all__`. The point of declaring `__all__` is to limit the public API. So what is `__all__`? It's basically a list at the module level, a dunder all.
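Here's a minimal sketch of the module form of this pattern; the package form is summarized in the trailing comments (the name `package` is a placeholder):

```python
# example.py -- the module form of the entry-point pattern
def main():
    """Keep the entry point tiny; importers shouldn't pay for it."""
    message = "Hello, world!"
    print(message)
    return message

# The guard: main() only runs when this file is the program's entry
# point (python -m example), never on a plain "import example".
if __name__ == "__main__":
    main()

# Package form, same semantics:
#   package/__init__.py   defines main()
#   package/__main__.py   contains:  import package; package.main()
#   run with:             python -m package
```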
`__all__` is a list of strings with the names of the things in your module that you want to export, that you want your users to use. So for example, in our package directory we have our `__init__.py` and our `__main__.py`. In the `__init__.py`, we can have `from .foo import *` and `from .bar import *`. These are explicit relative imports, and that's the good kind: when you've heard a prohibition against relative imports, that's about implicit relative imports, which is what one of our `__future__` imports, `absolute_import`, was about. With `from .foo import *` and `from .bar import *`, only the names in the `__all__` of `foo` and `bar` will be imported. So again, when you import `package`, the names available in `package` are only the names listed in the `__all__` of `foo` and `bar`.

And the reason for this is code completion. In an IDE, we import `mypackage`, type `mypackage.`, and then don't type anything else. Nice IDEs will try to help us out. If we had imported `sys` and `os` and we don't declare an `__all__`, most IDEs are going to say that `os` and `sys` are also available here, and they wouldn't be lying. If we declare an `__all__`, our IDEs will know not to show us those names. We don't want to see `os` and `sys` when we're trying to use a package.

OK, so when you're writing your scripts and executing them with -m, you do have modules. Factor out your shared dependencies into maybe a core lib package, put demos in a demo package, put utilities into a utils or tools package, and use module-level docstrings.

OK, write functions. Why write functions? I want you to put the logic and control flow into a single point. If you need to update it, you only update it in a single point: you don't repeat yourself. What's really bad is to have two places in your code where business logic is trying to accomplish the same thing, and then you realize that code is buggy.
You fix it in one place, but it's still broken in the other place, because you repeated yourself and neglected the second fix. Don't reuse code with copy and paste.

We want to give our functions good names. Functions do things; they are verbs. Concise is nice; descriptive is better. Time spent on this is time well spent. Give your functions good docstrings. The Google style, I think, looks nicest and most concise; maybe these are brash assertions again. The NumPy style seems like the most cross-compatible to me. I mostly use reStructuredText because I use pydoc.

Doctests allow you to test your documentation. If you put doctests in your docstrings, you can run unit tests on that code. So what shows up in your documentation as "this is how to use my code" is actually tested when you run your unit tests. So yes, you can reuse your doctests in your unit tests: you can call your doctests when you run your unit tests. And you can import your docstrings into Sphinx and, again, avoid repeating yourself.

OK, so practice information hiding. Shorter is better. That means factor out the things that make your code long and verbose. If you have a chunk that's semantically doing one thing, take that chunk out of the function, put it in another function, give it a good name for the thing it does, and call it from your original function. Then, when someone else looks at your code, they can immediately see: OK, I'm doing A, I'm doing B, I'm doing C, and then I'm returning. That's great. This code is readable and understandable. I know what's going on. I like that. And order them top-down to tell a story: A is this thing, that thing, and the other thing; B is this thing, that thing, and the other thing; et cetera.
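As a small sketch (the function `add` is a made-up example): put runnable examples in the docstring, and they become both documentation and tests.

```python
def add(a, b):
    """Return the sum of a and b.

    The examples below are doctests; they render as documentation
    and run as tests.

    >>> add(2, 3)
    5
    >>> add("py", "thon")
    'python'
    """
    return a + b

# Run the docstring examples from the command line:
#   python -m doctest thismodule.py -v
# or pull them into a unit test run with doctest.testmod()
# or doctest.DocTestSuite.
```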
So you want to try to place them in the general order in which they're used.

Write objects. Objects have state and functionality, and you use objects already. Is anybody here afraid of objects, or does anybody think maybe objects are a bad thing? I get that; you don't have to raise your hands too high. But try to get less afraid of them. When you're focused on the reusability of your code, you're writing not only for your future self but for other programmers, and other programmers are going to appreciate you making your code so that they can reuse the functionality.

An important time to write objects is when you need to encapsulate state and not worry about how it works; when you need to marry the state with the functionality; when you want to use a thing as an argument, call methods on it, and expect it to know what to do without you understanding how it's implemented. Classes are the definition of reusable code. Parent classes provide expected interfaces and operability. Mixins provide functionality. Abstract base classes provide interfaces. Python has no restrictions on any of the above.

Use expected protocols and inherit from the abstract base classes. You're not required to, but when you do, you've basically made a promise to users of your objects and users of your code that you have implemented the expected interface. Inheriting from the ABCs makes this easy, and you're reusing code from the standard library yourself. So for example, semantic lists of objects should be `Iterable` and `Container`. Here's the example: to use the abstract base classes `Iterable` and `Container`, all we do is inherit from them.
And they require that we implement a `__iter__` and a `__contains__`: `Iterable` requires `__iter__`, and `Container` requires `__contains__`. `__iter__` is a special method that is supposed to return an iterator, and a generator is an easy way to implement an iterator; by putting a `yield` in our `__iter__`, we create a function that returns an iterator. `__contains__` is a function that takes an argument and tells you whether it's in that object, in the instance, or the semantics thereof. In this case, we simply delegate to a list, or whatever we were given in our `things`: our `__iter__` iterates over our `things`, and our `__contains__` checks to see if the argument is in `self.things`. So by using these, we can pass an instance of our container to something that expects an iterable and it will iterate over it, and we can use it in a context where we test for membership. We have programmed to an interface, for reuse.

`__slots__` doesn't like multiple inheritance. Read up on slots; they're a very nice thing. Many optimizations trade off space for time, but slots actually save you space and make your objects run faster. They just don't like multiple inheritance, and that hurts their reuse.

All right, so we want to compose objects with mixins and delegation. When we used two abstract base classes, we composed an object with some interfaces. Mixins provide functionality. Some abstract base classes from the standard library also provide functionality; I highly recommend looking closer at those. To me, delegation, which is what we did in the prior example, adds a lot of redundant lines of code with little value created in exchange. If you're delegating a lot, maybe you should use a mixin. But maybe not.
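Here's a sketch of the container example described above (the class name `MyContainer` and attribute `things` are reconstructed from the description; in Python 2, these ABCs live in `collections` rather than `collections.abc`):

```python
from collections.abc import Container, Iterable  # "collections" on Python 2

class MyContainer(Iterable, Container):
    """Delegates iteration and membership tests to an internal list."""

    def __init__(self, things):
        self.things = list(things)

    def __iter__(self):
        # A generator function: calling iter() on an instance
        # returns an iterator over self.things.
        for thing in self.things:
            yield thing

    def __contains__(self, item):
        # Membership simply delegates to the underlying list.
        return item in self.things
```

An instance now works anywhere an iterable or a container is expected: `for x in MyContainer([1, 2, 3])` and `2 in MyContainer([1, 2, 3])` both do the right thing.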
For more on writing objects, see Aaron Hall's talk from the last PyGotham, "The Python Data Model: When and How to Write Objects," which is available on YouTube. Shameless self-advertising.

Give your functions, classes, modules, and objects good names. Say we call something `f`, `g`, `i`, `j`, `x`, `y`. What are these things? If we're doing algebra, that makes sense, maybe. But if it's `a`, `b`, and `c` at the beginning of the module, and the module grows to 500 lines, no one wants to deal with that. A generic "helper" module: I just have a personal bias against the word helper, so maybe these are kind of brash assertions, a little bit. "TupleResultObject": here we're encoding what we think the thing should be, and we're also being redundant by calling it an object, because in Python everything's an object. "Tuple" is saying, hey, it has to be a tuple. Well, what if it's a named tuple? What if we want to use a list instead, or a set? Then we have to change the name. So do be expressive. Avoid generic, miscellaneous, and redundant names. Don't try to encode the type in the name. Use names from the solution space, especially for abstractions, like if you have a consumer thread; use names from the problem space for the concretions, for example a publisher thread that is a consumer thread.

Environments, packaging, and deployment. What's the good of reuse if no one else can reuse it? All right, so we want to use virtual environments instead of installing directly with, say, pip or another package manager. Say we're using Ubuntu: we could do `sudo apt-get install virtualenv`, then create an environment with `virtualenv env_name`. If we're on Python 3, we can use the `venv` module with the -m flag to create that environment as well. Then we activate the environment, and then we can do our pip installs. We want to avoid modifying the Python path. So, a little more robust solution:
If we're developing a package, we have a `setup.py`, and we run `python setup.py develop` on that package, as opposed to modifying our Python path. We don't do `sudo pip install`, because we're worried that could break our system Python. I understand Ubuntu automatically adds the `--user` flag for you, but I'd rather keep good habits and encourage good habits.

Then we can publish to PyPI. This is not PyPy, the Python interpreter; this is PyPI, the Python Package Index. We'll use wheels and we'll use twine: work from a virtualenv, activate it, `pip install twine`, run `setup.py` to build the distribution, and then `twine upload`.

OK, conclusion, and I'm on time. I want you to write easily reusable Python. Write compatibly: develop on 2, test on 3, test on all platforms. Write functions that take a small number of arguments. Don't write comments; write docstrings. Give functions descriptive names. Demonstrate how to use your code in both tests and docstrings. Import your docstrings into your documentation. Write a full test suite: unit tests, acceptance tests, smoke tests. Write objects. Write and use mixins. Write and use abstract base classes. Publish your code on PyPI so that others can use it. And that's the end of the talk. Any questions? If you have a mic, please use it, and be sure to turn it off at the end; there's a red light if it's on. OK, thanks.

Yeah, I don't have that much experience in Python, but I have coded in Java, and I know in Java you generate documentation using Javadoc. So can you explain: are docstrings the equivalent of Javadoc, or are they a way to generate a document?

OK, so the question is to explain how docstrings work. There are a ton of examples online; just know that you need to use them, and look up how to do it when you're ready to write a docstring. Yeah, next question. Yes?

What are your thoughts on using mixins with testing?
Using mixins with what? With unit testing. I've heard that your tests should be kept simple so that they're easy to debug when something breaks, but at the same time, when you're running a lot of similar tests over very similar objects, it seems necessary to use mixins.

OK, my thoughts on using mixins with unit tests are: they're great. They are a really good example of code reuse, reusing functionality across a large set of test classes. Yes, I really like the idea. We do it; I did it; it's good. Any other questions? Yes.

Would you have any good references for somebody who's never really written unit tests before, to understand how to write a good unit test?

So "unit" is a semantic idea. You're going to have some kind of setup, probably: you're going to create some state that mimics what you would do in production, and then you're going to need to tear that down. You want to be really careful and think about the seams. If there are things you need to mock, mock at the seams, and test your code; you don't need to test well-tested third-party code. Other people's code you don't really need to test. You need to test your code, your logic, your control flow. So you want to set up things that cover all the branches in your logic. If you keep that simple, if you factor things out, that's going to reduce the branching in your logic, and that'll make your testing a lot easier, too. Any other questions? Yes. Thanks. I know it's a very high-level question.
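A sketch of what a test mixin might look like (the class and method names here are invented for illustration):

```python
import json
import unittest

class RoundTripAssertionsMixin:
    """Reusable assertion, shared across many TestCase classes."""

    def assert_json_round_trip(self, value):
        # One definition of the check, reused everywhere it's mixed in.
        self.assertEqual(json.loads(json.dumps(value)), value)

class TestListPayloads(RoundTripAssertionsMixin, unittest.TestCase):
    def test_list(self):
        self.assert_json_round_trip([1, 2, 3])

class TestDictPayloads(RoundTripAssertionsMixin, unittest.TestCase):
    def test_dict(self):
        self.assert_json_round_trip({"a": 1})
```

Note the mixin is listed before `unittest.TestCase` in the bases, and it carries no test methods of its own, so it never runs as a standalone test.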
But how do you determine when to use a class with a bunch of methods versus just a straight module with a bunch of functions that take an input and return an output?

So the question is: when do you go from using a module with a bunch of functions to a class definition with methods? To me: you can't instantiate a module, and people don't expect you to pass around modules as arguments. For the most part; it does happen. But when you need to pass around the thing that has these methods on it, and you want them accessed from the data itself, when you want the functionality attached to the data, again, marry the function with the data, that's where you write the objects. Does that make sense?

So I think we're really out of time. I hope everybody enjoys the rest of the conference. Feel free to come up and talk to me; I'm a lot less stressed out now that my talk is over. And sign up for the free iPad thing. Have a great conference. Cheers.