Hi everyone, hope you had a refreshing lunch break. Now we have Julian with us, who is going to talk about building a clean, maintainable, and tested code base. He's a software engineer at HAKAR. Where are you streaming from, Julian?

I'm streaming from Walter, so a bit far off.

Alright, cool. Over to you.

Okay, so welcome. I hope you're having a nice day so far. Let's do this. For those who don't know me, I'm Julian. For those who do know me, I'm still Julian as well. I'm a senior software engineer at HAKAR. Let's kick this off.

The agenda for this talk is mainly around clean code, where we'll see some examples and comparisons, starting off with the imports, which usually are at the top of the module. Then we'll go into a function with some logic, and then we'll take that logic and try to decouple it. Let me go full screen for you guys so you can see a bit better. Then we'll go into a bit of an example of a project structure, with the same structure as the clean code examples, where we see a comparison and go through it a bit. Then we'll have a very short amount of time to go into testing, where we'll talk a bit about doctest, pytest, mocking, and test package structure, which is a bit like the project structure, where we see a comparison and discuss that a bit.

Before we jump into the first few slides on clean code, it's important to firstly understand some characteristics of clean code. Those are: readable and easy to understand, reusable, and DRY. By DRY we mean "don't repeat yourself", and that usually means you don't have duplication in your code. The benefit of having DRY code is this: imagine you have five or six duplications of the same piece of code. If you have to change something, you have to change it five or six times in your code base, rather than just changing it once in a function.
That's reusable. Then, of course, it should be documented. That can be docstrings or type hinting, which we'll see a bit of in the next few slides, and also third-party documentation if you have time for that. And it should be simple and easy to modify or extend. Those are, in my opinion, and I hope in a lot of people's opinions, what makes clean code, or at least a few characteristics that make clean code. So let's get started.

What we're seeing over here is an example. I'm sure that, like me, you saw this a few times in your career, where we have these types of imports that are a bit scrambled. That's me over there, a little bit sad, with a cat balloon. Let's start off by addressing a few things that we're seeing over here.

The first thing is that having imports on the same line using the import keyword is not something recommended by PEP 8. PEP 8 is the official Python style guide. It's recommended that these are separated onto different lines when you're using the import keyword. You can have multiple imports on the same line if you're using the from keyword, which we see underneath, although we don't have multiple ones on the same line there. Then we have another bit of an issue, where we have a standard library import that is a bit off, further down in the imports. We also have the collections import, which is standard library as well, even though it's using the from keyword.

Before I continue explaining over here, it's important to understand the recommended import grouping, the order that is recommended at least by PEP 8. That is: you should have the standard library imports at the top, which are csv, abc, math, and many others that are built in. Then you would have the related third-party imports, and then the application and library imports. The related third-party imports in this case are what's highlighted, the Django imports, and we also have the requests import over here.

A few other things that one should be a bit cautious about when using imports: avoid wildcard imports, because they are quite unclear in regards to what you're importing, and they tend to confuse the automated tools, besides the engineer looking at it, of course. Another note is that absolute imports are recommended. When you can choose between absolute and relative imports, you should go for absolute imports if that's something you can do.

So, I'm quite happy over here. We have imports that are properly structured, clean, and readable. We can start off by noting that the standard library imports are at the top, and you have the separation between using the from keyword and the import keyword. They also happen to be in alphabetical order, so if you want to go all out OCD you can do that as well.
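As a minimal sketch of the grouping described above (this is not the slide from the talk), the layout could look like the following. The third-party lines are commented out so the snippet runs without Django or requests installed, and `myapp` is a hypothetical package name used only for illustration.

```python
# Group 1: standard library imports, one module per `import` line,
# with plain `import` statements kept apart from `from ... import` ones.
import csv
import math

from collections import defaultdict, namedtuple

# Group 2: related third-party imports. They are commented out so that this
# sketch runs without Django or requests installed.
# from django.conf import settings
#
# from django.db import models
#
# import requests

# Group 3: local application imports (`myapp` is a hypothetical package).
# from myapp.models import Task

# Tiny usage of the imported names.
Point = namedtuple("Point", ["x", "y"])
counts = defaultdict(int)
counts["points"] += 1
distance = math.hypot(*Point(3, 4))
dialects = csv.list_dialects()
```

Each group is separated by a blank line, which is the part of the convention that matters most when the import list grows.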
You can go alphabetical order. Then we have the Django imports, so the related third-party imports. We can see over here that I've added a bit of segregation between the django.db imports and the django.conf imports. I'm not sure if everyone does this, but it would be nice if everyone did. I find imports more readable when you segregate them not just based on the library, but also by what you're importing from it. Over here, of course, we only have a few imports, but if you have a ton of imports, having this type of segregation helps a lot in terms of finding what you want to work with. Finally, we have the requests import, which is a third-party import as well. We don't have an example of an application import over here, but that would follow the same structure, below these imports.

Jumping into the next bit of a problem: over here we have a simple function which can be improved quite a bit. We can see that the function name might not be as descriptive as it can be. We have the items argument as well, and we have the for loop, which is using itm. Of course, over here it's not confusing, because the function is quite slim. But if you had a function that is about 50 or 60 lines and you're at the very bottom of it, you're going to be like, "Yeah, I have no idea what itm is." We can also see some issues with variable naming, not just in the argument and the loop: you can see client over here, whichever client that is, we don't know, and execution, and some duplication down here as well in terms of the exception handling. Of course, it's also missing docstrings and type hinting, which we'll be looking at in the next few slides.

What we've done over here is change the naming a bit, starting off with the function name and the argument name. With just that, we know that we're going to update tasks.
We don't know what type of tasks yet, but we'll see further on. On line 19 we can see that it's a Step Functions client, so that pretty much indicates that we're using Boto3 or something similar. That is it for now. We've also changed the response name, so we know we're getting a response, probably from an API call. Let's jump on to the next slide.

I'm not sure what people call this, but I call it code grouping, pretty much like the import grouping. If you look at line 22 (let me get the pointer, give me a second), this part of the code is handling the status updating. If we go back a slide, we can see that this is a blob of code, right? Of course, I couldn't fit a lot of code in here, so over here it's readable, but if you have a big chunk of code without any segregation, it tends to get quite confusing. By simply adding that segregation, that space between lines of code, you've basically told the engineer: even though these work together, over here I'm doing something, and over here I'm doing something different. It increases readability drastically.

Then we've added type hinting. We can see that in the function argument over here. We know that this is going to be a list of tasks, which are (I couldn't fit all the code over here) Django models in this case, Django objects. We've also added the dict over here, which gives us a lot to work with, because at least we know what we're getting, whereas before we weren't even sure if this was a generator or anything else. That's a step forward for the next person after us to pick this up. And then we have docstrings.
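A minimal sketch of where the function could stand after the renaming, code grouping, and type hinting steps. Several things here are assumptions made so the snippet runs on its own: `Task` is a hypothetical stand-in for the Django model, the client is passed in as an argument rather than created inside the function, `FakeStepFunctionsClient` is a test double, and the ARNs are placeholders. Only the `describe_execution(executionArn=...)` call shape mirrors the Boto3 Step Functions client.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Task:
    """Hypothetical stand-in for the Django model from the slides."""

    execution_arn: str
    status: str = "RUNNING"


def update_tasks(tasks: List[Task], step_functions_client: Any) -> None:
    for task in tasks:
        # Group 1: fetch the execution information from the API.
        response: Dict[str, Any] = step_functions_client.describe_execution(
            executionArn=task.execution_arn
        )

        # Group 2: update the task status. The blank line above marks the
        # boundary between the two steps.
        status = response.get("status")
        if status is not None:
            task.status = status


class FakeStepFunctionsClient:
    """Test double that returns a canned response instead of calling AWS."""

    def describe_execution(self, executionArn: str) -> Dict[str, Any]:
        return {"executionArn": executionArn, "status": "SUCCEEDED"}


tasks = [Task("arn:aws:states:example:1"), Task("arn:aws:states:example:2")]
update_tasks(tasks, FakeStepFunctionsClient())
print([task.status for task in tasks])
```

The descriptive names (`tasks`, `task`, `step_functions_client`, `response`) carry most of the meaning, and the type hints tell the next reader that the function takes a list and works with a dict response.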
Docstrings are mainly the part over here. Usually you add a bit of a description of what's happening in the function, and you document the parameters. There are a few different formats for writing docstrings. I'm using reStructuredText over here, but there are other formats that you can adopt, so that's really up to your preference. There's Google style and others. Basically, we're documenting a bit of the function in order to give more context to someone picking this up.

I've also sneaked in the grouping of the exceptions down here, where we now have ExecutionDoesNotExist and InvalidArn together, rather than having them in separate except clauses as we've seen before. Since they're doing the same thing, it makes sense to have them together. If they were doing two different things, you might want to keep them separate. So this part we've done as well.

Jumping to the next one: if we go back a slide, we can notice that on line 33 we're doing a call. We're handling the call to Step Functions, in this case on AWS, to get the information on the task. Then we also have the try block, which is around quite a bit of code. In this case it's quite safe, because we're just handling that exception as well as updating the tasks. But this function is doing two things at the moment, so it might be something we want to look at, possibly making that part reusable so we don't have duplication of code going forward.

Over here, what we've done is take out the try/except block and the API call and create a separate function. Of course, we've added some type hinting and some documentation over here in order to make it a bit more clean and readable for someone who has to pick it up. In the next slide, we're going to change the main function. So over here, this is what we had originally.
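A minimal sketch of the extracted helper, with a reStructuredText docstring and the two exceptions grouped in one except clause. The exception classes and `FakeClient` below are hypothetical stand-ins defined locally so the snippet runs without Boto3; with the real client the equivalent exceptions would typically be reached through the client's `exceptions` attribute.

```python
from typing import Any, Dict, Optional


class ExecutionDoesNotExist(Exception):
    """Hypothetical stand-in for the client's "execution does not exist" error."""


class InvalidArn(Exception):
    """Hypothetical stand-in for the client's "invalid ARN" error."""


def describe_execution(client: Any, execution_arn: str) -> Optional[Dict[str, Any]]:
    """Fetch the execution information for a single execution.

    :param client: object exposing ``describe_execution(executionArn=...)``
    :param execution_arn: ARN of the execution to look up
    :return: the response dict, or ``None`` if the execution cannot be looked up
    """
    try:
        return client.describe_execution(executionArn=execution_arn)
    except (ExecutionDoesNotExist, InvalidArn):
        # Both errors are handled identically, so they share one except clause.
        return None


class FakeClient:
    """Test double: knows one execution and rejects everything else."""

    def describe_execution(self, executionArn: str) -> Dict[str, Any]:
        if executionArn != "known-arn":
            raise ExecutionDoesNotExist(executionArn)
        return {"executionArn": executionArn, "status": "SUCCEEDED"}


found = describe_execution(FakeClient(), "known-arn")
missing = describe_execution(FakeClient(), "unknown-arn")
```

Keeping the API call and its error handling in one small function means any other code that needs execution information can reuse it instead of copying the try/except block.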
We had two things coupled together, two concerns, and we've improved it by taking the describe execution call out into its own function. This is what we end up with. We can see that it's slimmer, cleaner, and more direct in what it's doing. Something that you might notice about the sample code, which we'll discuss in the next slide, is that we're explicitly checking for the status in the execution information, which is usually called "look before you leap". There's another method of doing it, which we'll discuss in the next slide. But that is the final, clean version of our function. There are a few things you need to keep in mind about this demo code that we did not address, such as chunking the tasks in order not to eat up all the memory, and possibly even doing a bulk update rather than saving each task as you're looping over the items.

As we've just mentioned in the previous slide, we have "look before you leap", and then we have "easier to ask for forgiveness than permission". The difference is that in the first one we're checking whether the key is in the dict before we actually use it. While that is okay, it conveys to the reader that a missing key is part of the usual behavior of the code rather than the exception. The other one actually conveys that not having the key is the exception. So there's a bit of a different mindset. Both are just fine, and I use both. Although they are both fine to use, in some cases the one on the right might be better suited for you. It's just about being aware of the different ways of handling it.

Okay, let me take a sip before I dry myself into a dry fruit.

Jumping into project structure. In the next slide we have an example of a project structure in Django. The idea behind this is quite subjective, in that not everyone prefers the same thing. What we can note over here is that we have few modules: fewer modules, probably more code in the modules. In some cases that might be okay, although it doesn't really set you up for the future, especially if it's a project that grows. Hopefully it does grow, or else it would fail. If it's a project that's growing drastically and quite fast, then this might cause a bit of an issue down the line, because having two models in the models module is okay, but as it grows to 10 or 20 models, it starts to become a bit of an issue.

On the right-hand side we have more packages and modules, but less code in each module. This might be something that you want to adopt if it's a project that you know will grow. If it's something simple, like a microservice, it might not be something that you require. So I'm happy about that.

Also, looking at the example on the right: usually, in Django at least, this works out of the box, although you need to import the modules in the __init__ module. That's okay, although if you have checks on the code, such as whether an import is used, or other checks, you might want to use the __all__ dunder there to define the names that can be imported from the __init__. This shouldn't promote using wildcard imports, though, so that's an important thing to keep in mind. Another note on the example on the right is that you might come across a few issues in terms of circular dependencies. Django does allow you to use strings in order to link to other models, but in some other projects you might not be able to, and you might need to opt for something like a registry or something similar. So that's something to keep in mind.

Okay, so in the next few slides we're going to look at doctest, with a very short introduction, then a simple unit test and a bit of pytest, a simple mock, and test package structure, pretty much the same as we've seen in the previous slide.

Let me highlight it over there. So, doctest. This is the usual syntax for using doctest. Doctests are, simply put, tests inside your documentation, in your docstring. You can even use an interactive text file, which doctest does support. You can read more on the Python website if you're interested. Basically, what we're seeing over here is the input, so we have three arrows and what we want to run, and then the expected output, and we keep on going. We can also catch exceptions. This in no way replaces unittest or pytest or any of that, even though some projects did manage to do that.
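A minimal sketch of the doctest syntax described above. The slide's code is not in the transcript, so this `get_ratio_simplest_form` is a hypothetical reconstruction: the string return format, the `None` result for all-zero input, and the `ValueError` for an empty list are assumptions made for illustration.

```python
from functools import reduce
from math import gcd
from typing import List, Optional


def get_ratio_simplest_form(numbers: List[int]) -> Optional[str]:
    """Reduce a list of integers to its simplest ratio.

    >>> get_ratio_simplest_form([2, 4])
    '1:2'
    >>> get_ratio_simplest_form([10, 20, 40])
    '1:2:4'
    >>> get_ratio_simplest_form([0, 0]) is None
    True
    >>> get_ratio_simplest_form([])
    Traceback (most recent call last):
        ...
    ValueError: numbers must not be empty
    """
    if not numbers:
        raise ValueError("numbers must not be empty")

    # The greatest common divisor of all numbers; 0 only if every number is 0.
    divisor = reduce(gcd, numbers)
    if divisor == 0:
        return None

    return ":".join(str(number // divisor) for number in numbers)


if __name__ == "__main__":
    import doctest

    # Runs every `>>>` example in this module and reports mismatches.
    doctest.testmod()
```

Each `>>>` line is the input and the line below it is the expected output; the `Traceback` form is how a doctest states that an exception is expected.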
It's very painful, especially when you have mocks or other things. So this should, in my opinion, be used as something that brings a lot of value to your documentation. An engineer who looks at this can immediately understand what's happening over here. Even though the code is quite clean and quite self-explanatory, having that example over there makes it more explicit in terms of the behavior. Another benefit, which at least I find quite useful, is that if you're changing this particular function, which does have doctests, you can simply run the tests right here without even moving your hand, you know. Yes, that is all, I believe.

In terms of unit tests, this is quite a simple example, where we're trying to test the previous function that we've seen. We want to test that if we give it a list of certain numbers, the ratio is correct, right? Just to explain this function a bit before we go on, so you understand what it does: this function just changes a list of numbers into its simplest form as a ratio.

Over here on line seven we're specifying the... Actually, let's start from the other way around, because it might be easier to understand. We have a simple test case down here with two arguments, and we've decorated that test case with pytest's parametrize. We're saying that we expect the two arguments to be injected into this test case. So the first argument of parametrize is the names of the arguments you're expecting in your test function.
The second argument is the test data. In this case I called it test data, but it's usually a list of tuples or a list of values, depending on whether you have one argument or more than one. In our case we have two arguments, so we have a list of tuples: the first item links to the numbers, and the second item links to the expected output. This is beneficial because we don't have to write a loop. What happens over here is that pytest knows we want to run this test case with different data, so it does that for us. Rather than having a loop inside the test case, or even having multiple test cases, it's much easier to do it this way.

Then what we're doing is simply asserting. If you're familiar with unittest, what we usually do there is use assertEqual, among others. I would guess assertEqual is the most common, at least from what I've seen. What we use in pytest is the plain assert, which in my opinion comes a bit more naturally. What we're doing over here is calling the get ratio simplest form function that we've seen in the previous slide with the input. These are all linked: numbers, numbers, and numbers, and that links to this one, then that one on the second iteration, then that one, and then that one. Then we're expecting that to match the expected output, which in this case would be this ratio, then None because we have zeros, one and two, one four. Also, if you want to run combinations, you can stack the parametrize decorators, which would then do combinations.

Okay, I think we've covered most of the things here. Another important note, for someone who is not very familiar with unit tests, is that a unit test should test one unit, one single unit of functionality.
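A minimal sketch of the parametrized test described above, assuming pytest is installed. The function under test is a compact, hypothetical version of `get_ratio_simplest_form` (the slide's code is not in the transcript), and the values in `TEST_DATA` are illustrative rather than the ones from the slide.

```python
from functools import reduce
from math import gcd
from typing import List, Optional

import pytest


def get_ratio_simplest_form(numbers: List[int]) -> Optional[str]:
    """Hypothetical function under test; returns e.g. '1:2', or None for zeros."""
    divisor = reduce(gcd, numbers)
    if divisor == 0:
        return None
    return ":".join(str(number // divisor) for number in numbers)


# Each tuple is one test run: (numbers, expected_output).
TEST_DATA = [
    ([2, 4], "1:2"),
    ([0, 0], None),
    ([10, 20, 40], "1:2:4"),
]


# First argument: the names injected into the test function.
# Second argument: the data, one tuple per generated test case.
@pytest.mark.parametrize("numbers, expected_output", TEST_DATA)
def test_get_ratio_simplest_form(numbers, expected_output):
    assert get_ratio_simplest_form(numbers) == expected_output
```

Running pytest on a file containing this would report three separate test cases, one per tuple, so a failure points at the exact input that broke.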
It shouldn't test multiple things at the same time, because that loses a bit of scope and adds to the flakiness of the test cases that you have.

With what we've just said in regards to testing one thing at a time, what we have over here is mocking. In order to test this simple function, my function in this case, we need to mock the generate key function. We need to mock it not just because we decided we want to mock it: we mock things when they are not deterministic, so something that is random.

Thank you for that, Julian. Unfortunately, it's time for our next speaker now, but the audience would love to connect with you in breakout optimal room. So see you there.