Luke, this is Mariano Anaya, and he's going to talk to us about clean code in Python. Let's hear a big applause.
Hello, everyone. Thank you for coming. Let's talk about clean code in Python: software quality in our favorite programming language. First, a bit about me. My name is Mariano, and I'm a software developer. I'm interested in open source technology, software architecture, and high-level design; Linux and Python, of course. Feel free to contact me by any of these means if you're interested in talking about some of these concepts after the talk.
Before we start with definitions, I'd like to make a few comments. First, the code in the slides is written in Python 3, but there shouldn't be any problem at all if you're using another version of Python, so don't worry about that. Second, what I'm about to tell you isn't meant as something strict or rigid that you must implement. These are ideas or guidelines, and some of them are opinions, so you might think otherwise, and that's perfectly okay.
To begin with, we can say that there's no single definition of clean code; instead you will find as many definitions as there are developers and authors out there. Let's try this one, which says that clean code is code in which every function does pretty much what you would expect, and you can call it beautiful code when it also makes it look like the language was made for the problem. The reason why I picked this quote is precisely that last statement, because that's what we call Pythonic code, or code that is idiomatic in Python. We'll see examples of that and how we can achieve it.
So, to have a common ground of understanding, we can say that clean code is focused, which means it does one thing well, and the thing it does should be pretty much what you would expect. The code should not be misleading, error-prone, or confusing; instead it should be clear. This is important for many reasons, because arguably the quality of the code will determine the quality of the software. We all know there's a strong correlation between a poor codebase and software that has a lot of errors and is hard to maintain, and the opposite is also true. If you can keep your code clean, readable, and understandable, it is much less likely that there are errors and problems in the code, and if there are, they will be easier to spot.
So readability counts, as we know from the Zen of Python, and it makes total sense, because if you think about it, as developers we spend much more time reading code than actually writing it. If we want to make a change or add a new feature, we first have to read all the surroundings of the code we're going to modify or extend. The extent to which you can read the code and actually tell what it's doing is what will ultimately determine how fast we can ship new changes or new features, so it's related to agile development. We all know that previous messes slow you down and keep you from shipping new functionality faster.
And last, we can mention that the code is like a blueprint. It's another model, where you should represent the business logic and the requirements of what you are trying to do. So it has to be readable as well in order to be useful. On the other hand, we have some scenarios that are exactly the opposite of what we want.
For example: complex, obfuscated code; code that is misleading or that has misdirections; complicated code, which is about the worst thing we can have in a project; and code that is not intention-revealing, which instead of revealing the business logic or the business requirement exposes implementation details that should be encapsulated or abstracted. This is all part of technical debt. There are many forms of technical debt, but having a poor codebase is arguably one of the worst. And to make things worse, technical debt is also invisible. It's not only negative for the project, it's also something that is sometimes hard to spot or identify.
So let's try to see some examples of this with Python code. The very first example is something really simple, which is about meaning in the code. Let's consider this function that, given a year, should print one line per day of the year. You think that it works, and it actually does work; it does what it's meant to do. But now suppose six months or a year have elapsed since you first wrote it, and you find yourself trying to figure out what it's trying to achieve. You see that it's doing some calculations to check whether the year is divisible by some numbers. You cannot actually spot what it's trying to achieve, but you find that if that condition is met, it adds an extra day. So you say, okay, maybe it's trying to figure out whether the year is a leap year or not. But that's the problem. The fact that you have to guess is the problem. You shouldn't be guessing. The code should actually be telling you what it's trying to achieve. So if I want to know whether a year is a leap year or not, I'd rather have a function, let's call it is_leap, that takes a year. Then I can read the code, and I can replace that condition with a statement, something meaningful in the code.
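A minimal sketch of the refactoring just described. Only the name `is_leap` comes from the talk; `days_in_year` and `lines_for_year` are hypothetical names chosen for illustration, and the sketch yields the lines instead of printing them so that it is easier to check.

```python
from datetime import date, timedelta


def is_leap(year: int) -> bool:
    """Return True if `year` is a leap year in the Gregorian calendar."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_year(year: int) -> int:
    """The divisibility check now has a name, so nobody has to guess."""
    return 366 if is_leap(year) else 365


def lines_for_year(year: int):
    """Yield one line (an ISO date string) per day of `year`."""
    first_day = date(year, 1, 1)
    for offset in range(days_in_year(year)):
        yield (first_day + timedelta(days=offset)).isoformat()
```

Calling `print("\n".join(lines_for_year(2016)))` would print the one-line-per-day output, with the leap-year rule readable at the call site.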
This is a very simple yet powerful thing you can do to increase the readability of your code, because it's not about reducing code; it's about separating concerns, putting different problems into different layers, and having an organization in the project. Remember that functions are the first line of organization in a project. Functions should do one thing, one thing only, and do it well.
Starting from this very simple example, we can say that it's actually related to code duplication, because if you think about it, most of the time code is duplicated because it didn't have a proper abstraction or a name. Let's say we have a validation in some part of the code, and I need to add a similar validation. Someone might say, okay, let's copy this line from here and paste it there, let's change this number from one to two, and I'm all set. But that's not quite right, because you may have inadvertently introduced duplication in the code. And the reason why that happened is that the code didn't have a name; it didn't have a proper abstraction.
We all know that we want to avoid code duplication at all costs, because duplicated code forces you to make parallel changes. You have to change things in many, many places in the software at the same time, and if you forget one of those, you have a bug. So we do not want duplication, and you can remember the DRY principle, the DRY acronym, which stands for "don't repeat yourself". Things must be defined once and only once in the project in order to be efficient in the work. This has been a sought-after principle in software development over the past few years, and in that regard there have been many enhancements and much progress: for example, libraries, frameworks, tools, and design patterns. Those are great, and it's usually a good idea to keep them in mind.
But on top of that, we can say we have a so-called extra tool when it comes to Python, which is decorators. I will not explain all the details about decorators, because it's too extensive a topic and it's worth a talk in itself. I will only mention what's relevant for the purpose of this talk, which is addressing code duplication. The general idea is that you can have some functionality abstracted in one place but reused many, many times.
Let's see this with an example. Let's say I have a database maintenance task that is part of a framework and is called like this. The high-level idea is that it is made out of a sequence of commands, and these are going to be executed. The first task is to update the index of a database in Postgres. So I have a sequence, which consists of one command, and the logic goes like this: I execute every command in the sequence with the cursor that is provided. If there is an error or something went wrong, I log the exception and return minus one. Otherwise, I log that it worked fine and return zero. So far so good. Let's assume these are the valid codes that the framework requires.
But then another requirement arises, which says that I need to move some records from a table to an archive table, to leave only the most recent rows in the table. So I say, okay, I can do that in two commands, two SQL statements: one for inserting the rows into the archive table, and a second one for deleting the affected rows from the main table. But as part of the framework, I have to preserve the exact same logic. So I can do something like this: for every command in the sequence, I execute it with the cursor that is provided. If an exception occurs, I log the exception and return minus one. Otherwise, I log that it worked fine and return zero. This actually also works, but now we see the problem. The problem is that I have to preserve the logic, but this is exactly the same code as before.
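The duplication just described might look roughly like the sketch below. The function names, table names, and SQL strings are assumptions made for illustration (the slides are not part of this transcript); the only thing taken from the talk is the shape of the logic: run each command with the cursor, return -1 on error, 0 otherwise.

```python
import logging

logger = logging.getLogger(__name__)


def update_db_indexes(cursor):
    """First task: one command that rebuilds an index (placeholder SQL)."""
    commands = ("REINDEX DATABASE transactional;",)
    try:
        for command in commands:
            cursor.execute(command)
    except Exception:
        logger.exception("update_db_indexes failed")
        return -1
    else:
        logger.info("update_db_indexes ran OK")
        return 0


def move_data_archives(cursor):
    """Second task: archive old rows, then delete them (placeholder SQL)."""
    commands = (
        "INSERT INTO archive_orders SELECT * FROM orders "
        "WHERE order_date < '2016-01-01';",
        "DELETE FROM orders WHERE order_date < '2016-01-01';",
    )
    # Everything from here down is a copy of the block in the first function.
    try:
        for command in commands:
            cursor.execute(command)
    except Exception:
        logger.exception("move_data_archives failed")
        return -1
    else:
        logger.info("move_data_archives ran OK")
        return 0
```

Only the `commands` tuple differs between the two functions; the try/except/else block is repeated verbatim.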
Starting from the try/except block, these are the exact same lines as before. And the reason why that might happen is that those lines, which were in charge of doing the error handling and logging, didn't have a proper abstraction, location, or name. So it's similar to the very first example with the leap year: there was some code that didn't have a name, didn't have an abstraction, so it was very easy to duplicate it by mistake.
Let's see if we can address that with a decorator. The idea is that I create a new function, which I'm going to call db_status_handler, which is going to be a decorator, so it receives the function that is going to be decorated. Inside, it defines a new function, and there I can put the logic for doing the error handling. I execute every command in the sequence. Again, if an exception occurs, I log the exception and return minus one; otherwise I log that the task completed well and return zero. This assumes that the function being decorated provides the sequence of commands.
There's an interesting thing here. First, we have a name for that logic. It's no longer anonymous code. Now it's called db_status_handler, and it's in charge of one thing, which is the error handling and executing the commands. So now the previous two functions can be changed to use the decorator. I can remove all that duplicated logic and instead only return the sequence of commands, which is the relevant part. But they still preserve the same logic, because they're being decorated by db_status_handler on top of the function definitions.
We actually did three things with this change. First, we assigned a name to the previously anonymous code, which is now called db_status_handler. Second, there is now a separation of concerns in the logic. Remember that the decorator now only handles the execution and the logging and knows nothing about the commands themselves, whereas this function has the opposite behavior.
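A sketch of the decorator version described above. `db_status_handler` is the name used in the talk; the inner function name, the task names, and the SQL strings are assumptions for illustration.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def db_status_handler(db_script_function):
    """Decorator: execute the commands the decorated function returns.

    Owns the shared logic: run every command with the cursor, log the
    outcome, and return -1 on error or 0 on success.
    """

    @functools.wraps(db_script_function)
    def run_db_script(cursor):
        # The decorated function only supplies the sequence of commands.
        commands = db_script_function(cursor)
        try:
            for command in commands:
                cursor.execute(command)
        except Exception:
            logger.exception("%s failed", db_script_function.__name__)
            return -1
        else:
            logger.info("%s ran OK", db_script_function.__name__)
            return 0

    return run_db_script


@db_status_handler
def update_db_indexes(cursor):
    return ("REINDEX DATABASE transactional;",)  # placeholder SQL


@db_status_handler
def move_data_archives(cursor):
    return (  # placeholder SQL
        "INSERT INTO archive_orders SELECT * FROM orders "
        "WHERE order_date < '2016-01-01';",
        "DELETE FROM orders WHERE order_date < '2016-01-01';",
    )
```

Callers still invoke `update_db_indexes(cursor)` and get back 0 or -1, so the contract with the framework is unchanged while the error handling lives in one place.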
This function only returns the sequence of relevant commands and knows nothing about the error handling. And last, we removed the duplicated code, and we have it defined only once. So this was a cool way to use decorators to address duplication.
This is somehow related to another topic, which is managing implementation details. The idea is that you have to run a task as part of your core functionality, but you also have to do some other things that are related to technology, which you cannot avoid or escape from. This is similar to the case before. We do not want to mix different things into the same problem. We still want the logic separated into smaller pieces. So let's see what Python has to offer for each scenario.
Let's consider a very simple example. Let's say I have a web application, an online game with players playing online, and I have a requirement that says that when a player finishes a game, I need to update the score with the new points that player has just earned in the match that just finished. So the idea is that, given a player status object, I call accumulate_points with the new points that I have to set for that player. You might think this works, but if you take a closer look you'll see that it's not very good, because it's mixing up implementation details with business logic. What I want to do is add new points for a given player, but instead I'm having to deal with a Redis connection with a key, a zero which seems to be a default value, and an integer conversion; then follows what I actually want to do, which is to add the new score for the player; and then again another implementation detail, another technical detail, with the set of the value.
So instead, what I would like to have is not the previous accumulate_points method but just points, like a Python variable, because it's easier to follow and understand. If I want to get the points of a player at any given time, I just type points, and the same goes for setting them.
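One way to get that `points` interface is Python's built-in `property` decorator, sketched below. The class name `PlayerStatus`, the key `"score_1"`, and the `InMemoryStore` stand-in are assumptions for illustration; the stand-in only mimics the `get`/`set` calls a Redis client offers, so the sketch runs without a server.

```python
class InMemoryStore:
    """Hypothetical stand-in for a Redis client: only get() and set()."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None when the key is missing

    def set(self, key, value):
        self._data[key] = value


class PlayerStatus:
    """Keeps a player's score in a key/value store behind a plain attribute."""

    def __init__(self, connection, key):
        self.connection = connection
        self.key = key

    @property
    def points(self):
        # Implementation detail: missing key means 0, stored value may not be int.
        return int(self.connection.get(self.key) or 0)

    @points.setter
    def points(self, new_points):
        self.connection.set(self.key, new_points)


player = PlayerStatus(InMemoryStore(), "score_1")
player.points += 20  # reads through the getter, writes through the setter
```

The calling code only knows about `points`; the key, the default value, and the integer conversion stay inside the class.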
So with the previous accumulate_points, there's nothing particular about it; it's just plus-equals some number, as I would do with any regular Python variable or attribute. In order to achieve that, we can use the property decorator, which is a built-in decorator in Python that is applied to a method of the class, and there we can move, encapsulate, and abstract the implementation details. The calling code doesn't know what's behind it, and that's a good thing to have: it only knows about the points, and we have two methods, or two properties, that are smaller in size and easier to understand.
So you can use the property decorator to do some computations based on other object attributes, and you should prefer this approach to custom getter and setter methods, because the code will be easier to read, to follow, and also to test. Now you can test your code with anything that has a points attribute; you don't have to have a Redis connection for running your tests, or mock the connection, or patch it, etc. So it's easier to understand and follow. It's much more Pythonic.
Now let's suppose I have a web application again, but in this case it's an online store that has a stock representing all the products that are in the store.
They are divided into categories, and I might have a view like this one, called something like request for customer, that is going to handle the scenario where some customer is trying to purchase a product online. But if you care about code quality, you will find that this is a bit hard to read. In particular, these lines are not very expressive, but you still go to the trouble of reading them and find a clue near the if statement, where it says "if product available in stock". So you say, okay, maybe the previous lines were actually trying to figure that out. Again, I'm guessing over anonymous code.
So if I want to figure out whether a product is available in stock, I would rather simply write just that: "if product in current stock". That makes perfect sense, it speaks in terms of the domain problem, and it is self-documenting; it doesn't need a comment. Whenever you write something like "if product in current stock", Python silently translates that by calling the __contains__ method, a so-called magic method because of the double underscores, and passes the product as a parameter. So now I know that I can implement the __contains__ method in the stock class and have the interface I wanted before, and it also makes sense, because the search algorithm can be encapsulated away in the class.
Another case of managing state or handling scenarios is when we have code that again runs a core functionality of your project but also requires certain other tasks. For example, you might have code that has preconditions or postconditions or both. For example, you're connecting to a server, and after processing the data you need, you want to make sure it's closing the connection or releasing the resources it allocated. The problem here is that we might again run the risk of mixing up those things, when instead they should be separated into different layers. So let's see if we can do that with
a context manager. The idea is very simple. Let's say I have to run an offline database backup. My backup requires that the database service is stopped before running, then I run the offline backup, and then of course I want to make sure I'm leaving the database service up and running again. Instead of putting the stop database service and start database service calls inside the run offline backup function, where they don't belong and where they make the code more coupled, we can separate that into a handler, which is going to be called like this. When I write "with" and an object that implements the context manager protocol, Python silently translates that and calls the __enter__ method automatically, which in this case will stop the database service. Then follows the with statement block, where I can run the core functionality, what I actually want to do, which in this case is running the backup. After the last statement completes, it automatically calls the __exit__ method, even if an exception occurred. So it's making things easier, because I don't have to handle that myself or manage edge cases or scenarios. I can be sure that even if this fails or something goes wrong, the database service will be left up and running no matter what.
You can slightly improve this by using ContextDecorator, which is an interface provided in contextlib. You inherit from and extend that interface, you implement the __enter__ and __exit__ methods, and once you have that you can use it as a decorator for the function. So whenever I call the function decorated like this, it is run automatically inside the context manager, calling the __enter__ and __exit__ methods.
From all of this we can draw the conclusion that there's always a much more Pythonic way to write things, and the best way to write Pythonic code is to actually take advantage of the features of the language, playing well with Python's reserved words, like
two pieces of a puzzle that fit together, in a way that makes sense, like the example of the stock. Sometimes, in order to achieve that, the most common answer is a magic method, but note that there are many other tools, many other magic methods or features of the language, such as descriptors, etc. So if you're just beginning in Python, I really encourage you to look for these in order to transform your code into a much more expressive one. And if you're an experienced developer, you might use these examples as ideas in order to provide feedback in a code review or in a pull request: try to see if the code is Pythonic.
To wrap up, we can say that the best way to write Pythonic code is to actually take advantage of the features of the language. Sometimes that means using a decorator, removing duplication, etc. Aside from that, you also have some standards that can help you write better code that can be understood by the team, for example PEP 8 or coding guidelines. But note that you can be 100% compliant with them and still not have Pythonic code, because they're actually looking for different things, although it's a good idea to keep them in mind. Try to put in docstrings and function annotations; you want to tell your readers what you expect from any function in the code.
What you hear about the so-called production code also applies to unit tests. You should also keep your unit tests clean and with good structure, so that they are useful. It's also a good idea to use tests and TDD, because you will naturally follow this logic of trying to define smaller pieces, because you will want to make your code testable.
In addition to that, we finally have some tools that we can use to provide metrics for the code we have, like pycodestyle, Pylint, and Radon. You can run these in your project and they will give you metrics such as cyclomatic complexity, maintainability index, et cetera, that you can use as a head start to know where the code needs improvement most urgently, and finally
coala, which works with the previous tools but also lets you define your own standards to be run or checked automatically as part of your continuous integration environment. If you're particularly interested in some of these topics, you can find more information in some of these sources, and then you can go to the board for the talk. That would be all, and if you have any questions I will gladly answer them. Thank you very much.
I think we have time for questions. We have time for a couple of questions. Anyone? No questions? So thanks to the speaker again. OK.
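The two protocol-based examples from the talk, `product in current_stock` and the offline-backup context manager, could be sketched roughly as follows. All class and function names, the data layout of the stock, and the `events` list (a stand-in for really stopping and starting a service) are assumptions made for illustration; `contextlib.ContextDecorator` is the standard-library class the talk refers to.

```python
import contextlib


class Stock:
    """Products grouped by category, e.g. {"books": {"clean-code"}}."""

    def __init__(self, categories):
        self.categories = categories

    def __contains__(self, product):
        # Called by Python for `product in stock`; the search stays in here.
        return any(product in items for items in self.categories.values())


def request_product_for_customer(customer, product, current_stock):
    if product in current_stock:
        return f"{product} reserved for {customer}"
    return f"{product} is out of stock"


events = []  # stand-in: records what would happen to the database service


class DBHandler(contextlib.ContextDecorator):
    """Stop the database service on entry, start it again on exit."""

    def __enter__(self):
        events.append("stop database service")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even if the block raised, so the service always comes back.
        events.append("start database service")
        return False  # do not swallow the exception


def run_offline_backup():
    events.append("run offline backup")


@DBHandler()  # same effect as wrapping the body in `with DBHandler():`
def offline_backup():
    run_offline_backup()
```

Writing `with DBHandler(): run_offline_backup()` is equivalent to calling the decorated `offline_backup()`; in both cases the backup function itself knows nothing about stopping or starting the service.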