Hi, hello, everybody. So you want to be a rock star. Don't worry, everybody secretly wants to be a rock star. So what does it take to become a rock star? Well, step one, you need to master your instrument — or multiple instruments, if you play many of them. But there's a very important step number two: you need to learn to play in a band. As a rock musician, you're always on stage with other people, always playing in a team. So if you want to be a Python rock star, because that's your instrument, then you need to learn to play well with your band, which is your team of other programmers. And there are tools that can help you work together — play together — better. Those things are standards, things we can all agree on; best practices that we've found work best for us; and tools that can actually check how well we're following those standards and best practices.

So hello, my name is Michał Karzyński. I come to you from Poland, as was said, and I work at Intel on a very interesting open source project called nGraph, which is a deep learning graph compiler that I'm sure you'll be hearing a lot about in the coming years. But in my spare time I was playing around a little bit with the OpenAI Gym. I don't know if you know it — it's a reinforcement learning environment where you can teach an agent to solve puzzles or play games. And what I found was that there wasn't actually a very good way to find out information about the environments you have installed locally. The environments themselves have very good APIs that you can use to explore them, so I wrote up a little tool that helps you do just that. And I'm going to show you a little demo right now, if I can figure out how to get my mouse pointer over there. There you go.

So this is just a command line utility. You type something in, it shows you the environments you have installed on your computer, and it helps you pick the right environment for yourself. And you can watch a random agent play Space Invaders and die quite quickly. But you also get, in the background, some information about the rewards it's getting as it plays, which is helpful when you're developing your own algorithm in the Gym.
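Just to give you an idea of the API I was exploring, a random-agent loop looks roughly like this — a minimal sketch, assuming the classic pre-0.26 gym API; the environment name is illustrative:

```python
import gym

# Create an environment and let a random agent play one episode,
# printing the reward information it collects along the way.
env = gym.make("SpaceInvaders-v0")  # illustrative environment ID
env.reset()
done = False
total_reward = 0.0
while not done:
    env.render()  # show the game window
    action = env.action_space.sample()  # pick a random action
    _observation, reward, done, _info = env.step(action)
    total_reward += reward
print("Episode finished, total reward:", total_reward)
env.close()
```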
So, OK, I've done that. And I thought: this is a very small thing, but it could be useful to others, so maybe I can release it as a package. Maybe somebody will be able to use it to play around as they're preparing to write their own reinforcement learning algorithm for the Gym. So then I thought: all right, if I want to release this as an open source project, what will it take? I wanted to put all the tools and best practices I've learned from working on a larger open source project into this tiny open source project, and these are all the things I found I needed to use.

My slides will be tagged with these little bubbles. The first set of bubbles shows you stages: first you need to prepare your code, then you can automate some of the things I'll be talking about, and finally you can put all of it into a nice CI environment for continuous integration. The little tags with a book icon show you references for more information, things you can Google for. The tags with no icon at all are names of packages you can pip install. And the tags with a little external-link arrow are names of services that I'll be talking about, which will come up on some slides. So that's the legend for the next slides.

But OK, let's start. If you want to write a command line utility, you need to define your user interface — which, in this case, is your command line interface. The expectation a user brings to a utility is that the command line interface works like this: if you type in the name of the utility with no other options, you get a one-line reminder of what the syntax of the command is. If you want to find out more, you run it with --help and you get the longer description of the interface. And then you can provide values for the various options, using either long names or short names. That's what a good command line interface looks like, and it's actually written up in the GNU guidelines for command line interfaces.

So how can you do this? Very easily, actually. There's a tool I like very much, a package called docopt, which allows you to define your entire command line interface just by writing a docstring for your script. You write this documentation once, and it acts as the help text that comes up when the user passes --help. But it also becomes the input to the docopt function, provided by the library, which parses all the arguments, takes the values the user supplied on the command line, and gives you back a dictionary of argument values that you can use directly in your script. So that's the first tool I want to recommend to you: docopt. There are, of course, other ways to approach the same problem.
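A minimal sketch of what I mean — the utility name and options here are illustrative, but the docopt(__doc__) call is the whole trick:

```python
"""gym-browser - browse the Gym environments installed on your machine.

Usage:
  gym-browser [--env=<name>] [--steps=<n>]
  gym-browser (-h | --help)

Options:
  -h --help      Show this help text.
  --env=<name>   Environment to run [default: SpaceInvaders-v0].
  --steps=<n>    Number of steps to play [default: 1000].
"""
from docopt import docopt

if __name__ == "__main__":
    # docopt parses sys.argv against the usage pattern in the docstring
    # and returns a dictionary of argument values.
    arguments = docopt(__doc__)
    print(arguments["--env"], arguments["--steps"])
```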
OK, so now we have a script with a nice command line interface. What's the next step? The next step is to put all of this into a project, into a package. The standard we have in Python for laying out code in a directory that will go up on GitHub is this: the root directory is the directory for your entire package, and inside it you'll have a README file, a setup.py file, some source code — which can go either into a directory named the same as your module or, better yet, into a directory called src — and then you may have some tests and some docs. I'll sketch this layout below.

OK, so that's ready. Now, the next step I was thinking about: I have to refactor my code a little bit, so it's not just one big long function that does everything, but something more maintainable. Maybe I'll have some contributors coming in; maybe they'll want to add some features. It's good to prepare the code in a way that we can all work on together later. So I'll just mention the standard I think we should all be following, which is the clean code guidelines from the famous book by Uncle Bob — Robert C. Martin. The TL;DR of clean code is basically that you should write small, single-purpose functions with meaningful names, taking arguments with meaningful names. Each function serves a single responsibility, doesn't take many parameters — Uncle Bob says two is the most you should have — and preferably has no side effects. That lets you write things you can easily test, and you should write unit tests for each one of them. OK, so, some refactoring done. Let's get to the next step.

Now, a good practice we all follow is to use the construct where, in our module, we say: if __name__ equals "__main__", then we execute the main function. This actually does two things: it allows you to import the code from the module in another file, and refactoring the main code into a separate function will come in handy very soon, when I talk about entry points in a second. You'll find the construct in the sketch below.

The next step is to prepare a setup.py file. The one I wrote isn't perfect — there was a talk yesterday by Mark Smith about preparing a perfect PyPI package, so you should check that out on YouTube afterwards if you haven't seen it — but this is basically all you have to write to get setuptools to package up your code. And the arrow points to a little trick: if you have a README file written in Markdown, you can use it directly as the long description for your package, which will later show up on PyPI. I recommend doing that.

Then, once you have a setup.py file, you can use it. Its basic use is to prepare your packages — either a source package or a binary distribution package, a wheel — which you can then upload to PyPI. But you can, and should, also use setup.py during local development. Inside a virtual environment that you created for working on your project, you can run setup.py with the develop option to install the package locally. Another way to do this, maybe even better, is pip install -e followed by a dot; the dot indicates the current directory, where the setup.py file is. This effectively calls setup.py develop through pip, but it lets pip handle the dependencies as well.

Another thing setup.py allows you to do is define entry points. This is a very useful feature of setuptools that not everybody takes advantage of. Entry points let you combine multiple packages into systems of plugins: you can have a main package and other packages that are plugins for it, and these relationships are defined through entry points. But a very simple use case for entry points is the console_scripts entry point, which just creates a command — my-command, in this case — that your user will be able to call at the command line after they install your package. The syntax maps the command to a specific function in a specific module, with this notation. So if you're writing a command line utility, you should probably add a console_scripts entry to your setup.py file.
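The layout I described a moment ago, sketched out — the names are illustrative:

```
my-project/
├── README.md
├── setup.py
├── src/            # or a directory named after your module
│   └── my_module/
│       └── __init__.py
├── tests/
└── docs/
```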
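And the __main__ construct itself, for reference — a minimal sketch:

```python
def main() -> None:
    # The real work lives in a named function, so it can be imported
    # from another module and referenced by an entry point.
    print("Hello from my_module!")


if __name__ == "__main__":
    # Runs only when the file is executed as a script,
    # not when it is imported.
    main()
```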
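Putting the setup.py pieces together — the Markdown README trick and the console_scripts entry point — you end up with a file along these lines. This is a sketch, not my exact file, and the project and module names are illustrative:

```python
from setuptools import find_packages, setup

# Use the Markdown README as the long description shown on PyPI.
with open("README.md") as f:
    long_description = f.read()

setup(
    name="my-project",            # illustrative name
    version="0.1.0",
    packages=find_packages("src"),
    package_dir={"": "src"},
    install_requires=["docopt"],  # runtime dependencies
    long_description=long_description,
    long_description_content_type="text/markdown",
    entry_points={
        # Installs a `my-command` executable that calls
        # the main() function in my_module/cli.py.
        "console_scripts": ["my-command = my_module.cli:main"],
    },
)
```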
OK, so the next subject you need to take care of is requirements. This is a big subject, which I'll only be able to skim over due to time limitations, but the gist of it is that you need to give your users a way to set up an environment that resembles yours as closely as possible. The way I use requirements.txt is to provide a list of specific packages at specific versions that I've tested the package with. That's very useful for your users: if it's not working for them, maybe one of the dependencies is at a different version. The simplest way to create a requirements file is pip freeze, and the packages in it can then be installed with pip install. And I would recommend separating the requirements you need for actually running the installed package from the ones you only need for testing, because that will come in handy later when you're automating some CI processes. I'm not going to get into Pipenv or other approaches to handling requirements, but you should look into those if you're curious.

OK, next best practice. This is official now, I think: we should all use black. It's very simple to use — you just install it and run it on your source code, and it reformats the hell out of it, but it does it in a consistent way. You may not like the result, but the way it works is consistent, and we can all agree on it. And it's a huge value that we don't have to argue about how we're going to format commas at the ends of lines. There's a standardized way, so let's all just stick to it.

Black is just one formatter you can use — there are a number of them — and if you use them, a very good practice is to use them together with pre-commit. Pre-commit is a simple tool: you install it, run pre-commit install once, and it sets up a git pre-commit hook that runs all of your code formatters. If you want to use pre-commit with black, the configuration file on the left — which you store in a special YAML file called .pre-commit-config.yaml, sketched below — will download black from the internet and prepare it for running. Then, the next time you want to commit a change and type git commit, that will trigger black and run it on all your files. And if black changes anything, meaning the code had to be reformatted, it will prevent you from actually committing the change. So it's a good, useful tool for very quickly checking your formatting before you even commit.

Another good way to test whether you're actually following all the standards is to use a code linter. My favorite is Flake8, but there are, of course, many others. Why I like Flake8 is that it has the plugin architecture I described before: you have flake8 as the main module that you install, and then there are many, many other Flake8 packages you can add onto it. This list just has the ones I like to use; you can find others. They can test not just the compliance of your code with PEP 8 — which is, of course, the standard we should all be following — but also look for common bugs and mistakes, check the sorting of your imports with isort, and other things you'd like to have in your code. All of that can be tested with these Flake8 plugins. It's very easy to configure: you can put the configuration in tox.ini, in a [flake8] section (also sketched below), and define some values like the line length. Now, because we should all use black, the line length became 88 — the classic 80 with a 10% tolerance added. And you can exclude some checks from Flake8 if you want by adding an ignore entry in there. Now, if you run the flake8 command, it will load all of these plugins, run all of your source code through all of these tests, and tell you if you're missing something, if something is not formatted correctly, or if maybe you have a common bug or a security fault you didn't notice somewhere in your code. So this is very useful.
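The pre-commit configuration I mean looks roughly like this — a sketch of a .pre-commit-config.yaml; the pinned revision is illustrative:

```yaml
# .pre-commit-config.yaml - run black on every `git commit`
repos:
  - repo: https://github.com/psf/black
    rev: 19.3b0        # illustrative pinned version
    hooks:
      - id: black
```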
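And the Flake8 section can sit inside tox.ini like this — a sketch; the excluded paths are illustrative:

```ini
# tox.ini - Flake8 reads its settings from this section
[flake8]
max-line-length = 88
# E203 and W503 are the classic checks that conflict with black's style
ignore = E203, W503
exclude = .tox, .git, docs
```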
Another useful check is mypy, together with the type annotations that are now available in Python 3. This takes a bit more work, because you actually have to add type annotations to all of your code. But if you do it, it pays off, because you can do static type analysis of your code before you even commit it. Mypy will check whether anywhere in your entire code base you're calling something with the wrong type of argument. And this can find silly bugs that you really didn't mean to write — somehow you're calling a function with the wrong variable, for example. Normally you'd have to find that somewhere and fix it, but mypy can find these kinds of issues for you very quickly, without even running your code. There's a small example below. So use mypy for this purpose if you have the patience to put type annotations everywhere — I recommend it.

OK, so now we have some checking. How do we put it all together? Well, the tool everybody's recommending these days is tox. Tox is very simple to configure, and it can pull all of your tests together into one thing. A simple tox configuration is shown in the box on the left (and sketched below). It defines a list of environments that will be tested — in this case Python 3.5, 3.6 and 3.7 — and then the definition of the test environment: the dependencies and the commands we want to run. Even configuration sections for other tools can go into this one tox.ini file. With this set up, all you have to do is run the tox command — or, if you want to run just a single environment, run tox with the -e option and the name of an environment. It will start by creating a virtual environment for that specific version of Python, install all the dependencies into it, package up your code and install it into the virtual environment, and then run the commands. So all the tests you need — here I have Flake8 and pytest, but you can build on top of that — can be run with one call to tox. That's very useful, and it'll come in handy in a second when we put all of this into a CI system.

But if we're going to be testing things, well, we need to write unit tests. This is where refactoring the code into small functions comes in handy, because now you can write simple unit tests for each function. And the test tool that I think is becoming more and more popular all the time is pytest. It's really easy to use: you can write tests with minimal boilerplate — just import your function, run it, and put some assert statements into a test, and you have a test. That's all you have to do, so it's easy to get started. And then you run all the tests with a simple call to the pytest command.
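Here's the kind of silly bug I mean — a minimal sketch; the function is illustrative:

```python
from typing import List


def total_reward(rewards: List[float]) -> float:
    """Sum up the rewards collected during an episode."""
    return sum(rewards)


# Running `mypy` over this file flags the call below as passing the
# wrong argument type - without executing any code. (Executing it
# would also fail at runtime; mypy just catches it much earlier.)
total_reward("not a list")
```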
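A tox configuration along the lines of the one I showed — a sketch; the source and test paths are illustrative:

```ini
# tox.ini
[tox]
envlist = py35, py36, py37

[testenv]
deps =
    flake8
    pytest
commands =
    flake8 src tests
    pytest
```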
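And a pytest test really is this small — a minimal sketch, reusing the total_reward function from the mypy example; the import path is illustrative:

```python
# tests/test_rewards.py
from my_module.rewards import total_reward  # illustrative import path


def test_total_reward_sums_rewards():
    # Plain assert statements are all pytest needs.
    assert total_reward([1.0, 2.0, 3.5]) == 6.5


def test_total_reward_of_empty_episode_is_zero():
    assert total_reward([]) == 0
```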
OK, so now all the code preparation is done. Everything is ready, and we have a pretty robust project to share with the community. So what do we do? Well, of course, we put it up in a Git repository. These days GitHub is king, but GitLab is a popular alternative, and there's Bitbucket as well. So I'm not going to recommend only GitHub, but it is the one with the best integrations for all the tools I'll be talking about from now on. You just set up a Git repository, add all your code, and push it. When you're creating the repository, remember to put a .gitignore file into it, and a license. The license is the thing that's easy to forget, but it's critical: if you don't put a license on your code, no one can use it.

So put the license in, set up the Git repository, and then you can proceed to setting up a continuous integration environment. I like to use Travis but, as with everything, there are alternatives. Travis is easy to use because you just drop another simple YAML file into your repository (I'll sketch one below), and when you use one like this, which calls tox, that one call to tox will run all your tests. If you do that, set up an account on Travis, and add your repository to that account, the testing will start, and you'll start seeing those little checks on all the PRs you make to your repository. They're very useful even for yourself when you're writing code: you can take your own changes through the PR process and see whether they pass all the tests.

Another useful tool, available for free to anybody with an open source repository on GitHub, is a requirements updater. I like to use the pyup bot, specifically for Python requirements, but there's also Dependabot, which is free for other languages as well. There's no configuration required: you set it up by creating an account and giving it access to your repository. The bot will then scan your requirements files and figure out whether they're up to date with the versions on PyPI. If they're not, it will start creating pull requests with updates to specific versions of packages. And if you have a CI process in place, you'll know which ones you can merge and which ones you can't, because the ones you can merge have green check marks and the ones whose tests failed have a cross. So that's very useful.

OK, another useful thing is to check your test coverage. Pytest and other Python unit testing tools can check which lines of your source code were hit when running your test suite and give you a report. This is very easy to do with pytest: you just add the --cov option and specify your module, and you get a report for it. If you want more information, you can ask for an HTML report, which generates code coverage reports as HTML files showing exactly which lines of your source code are tested and which are not, so you can see where you still need to add tests. You can also integrate this with an online service that will track your test coverage over time and maybe even prevent you from merging changes that decrease the code coverage of your repository.

Another thing I want to mention is code review. If you're working as a team, the best thing you can do for each other is review each other's code — the people you work with get the chance, and you get the chance, to tell each other which parts of the music they're playing you really like and which parts you think should be a little better. The moment to do this is during code review. But there are also services now — and I think they're getting better, although they're far from perfect yet — that do automated code review. You can sign up for something like Codacy or Code Climate, and it will look at all the PRs in your repository, find things that may be wrong with the code, and leave a code review on your PRs.

Another bot you can employ is one that automatically merges PRs. Mergify.io is one that I recently set up. You configure rules that apply to your PRs, and if a PR matches the rules, it gets merged automatically. For example, I showed a configuration for automatically merging a PR that has passed CI and has at least one positive review; if your PR matches it, Mergify will merge it. And you can even set up a different rule that will delete the old branch — something like the sketch below.
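A minimal sketch of such a rules file — I'm assuming Mergify's .mergify.yml format and a Travis status-check name here, so treat the details as illustrative:

```yaml
# .mergify.yml
pull_request_rules:
  - name: merge when CI passes and there is a positive review
    conditions:
      - status-success=continuous-integration/travis-ci/pr
      - "#approved-reviews-by>=1"
    actions:
      merge:
        method: merge
  - name: delete the branch after merging
    conditions:
      - merged
    actions:
      delete_head_branch: {}
```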
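And going back to the Travis setup from a moment ago: the YAML file can be as small as this. This is a sketch, assuming the tox-travis helper package, which maps the Travis Python versions onto the corresponding tox environments:

```yaml
# .travis.yml - one call to tox runs all the tests
language: python
python:
  - "3.6"
  - "3.7"
install:
  - pip install tox-travis
script:
  - tox
```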
So if you have all of this in place, you actually have bots working for you: the pyup bot will find updates to packages on PyPI, Travis will test whether those updated packages still pass your tests, and if everything passes, Mergify can merge the PRs. So without you doing anything, your project can be kept up to date with its dependencies on PyPI.

So I'm getting to the end of my story. Now you're ready to publish your project on PyPI. This is very easy: we have a tool called Twine. Once your packages are built, you can upload them to PyPI with Twine — you just need to set up an account on PyPI first. And then your package is published, you're happy, and everybody can use it. I wrote up all the details — I know this went fast, but everything I said is in an article on my blog, so you can read it at your own pace. And with that, thank you very much.

[Host] Thank you so much, Michał, for this very interesting talk. Do we have questions for Michał?

[Audience] Could you continue on automated documentation as well? Which tool would you pick for that?

[Michał] So, yeah, I haven't set up automated documentation for this particular project, because there isn't much documentation. But I'm torn between Sphinx and MkDocs, because I'm a fan of Markdown, so I don't like reStructuredText, which makes me biased against Sphinx. But I think Sphinx is very powerful, and I've seen it used to good effect by people. So, in the right hands, I guess.

[Host] OK, do we have another question for Michał? Yes — let's use the microphone for the recording.

[Audience] Recently I worked on automated versioning, and I ended up with bump2version together with a script that I made myself. Do you know of some tool to do it in an automated way, or do we have to do it ourselves at the moment?

[Michał] To bump the version of the code that you're writing? That's a good question, and I don't think I have encountered a tool that actually does this. So far I've been doing it manually, but it would be a good thing to automate, that's true.

[Audience] Thank you.

[Host] We have one more question over there — I'll come to you.

[Audience] Do you know if there's a way to install pre-commit globally, or as part of a good repo template? Because in my experience, people forget to install pre-commit hooks when they start new projects.

[Michał] Well, if they forget to install pre-commit, they'll pay for it after they commit, because the PR will not pass the tests. So it benefits them to install it, and they'll find the motivation to do it at some point.

[Host] Do we have more questions? Raise your hands. OK, if we don't have further questions for Michał — another round of applause for Michał.