Okay, so my talk is about how to refactor like a boss. My name is Michael, and I'm code kung fu on Twitter. Before I continue: if any of you are on Twitter, I'm preparing for this year's PHP Conf. Use the hashtag "come to Asia", mention us at PHP Conf Asia, and mention anyone in the PHP world you'd like to see in Singapore this year, any VIP or VVIP. That's your mission by the end of the night. When you're done, show me the tweet and I'll give you a free t-shirt. They're over there. Unfortunately, we're left with 2XL and a small size. You shouldn't have stolen that. I can't lie. No, that's not a lie. Fine, forget it. Go away.
Anyway, I'll talk about a couple of things: what refactoring is and why we do it, and some common refactoring techniques that I have personally used. For the sample code I'll show you later, I used exactly these techniques. Is the font big enough? Can you guys see? Okay, cool.
So what is refactoring and why do we do it? The age-old wisdom is that when software is left alone, it decays. Use cases change, and if you don't refactor or change your software you end up with expired software, software that is no longer useful. We refactor because we don't want our code to decay; we want it to stay relevant. Refactoring is a change made to the internal structure of your software that makes it easier to understand. When you refactor, you want the code to read naturally. When we look at the different functions, they should communicate intent.
What did you write the bloody function for, right? For the business people here, this is also how you explain it to your boss: I refactor code so that it's cheaper for us to change in the future. Why is it cheaper? Because if you don't refactor, you go back and you're like, wait, what did I write two months ago? You have to spend time reading your code again. If you don't remember, or you forgot your comments, or, God forbid, your comments are outdated, you're like, shit, what is this? When code is refactored to be readable, you can understand it at a glance, by looking through function names that communicate intent. Once it communicates intent, it's easier to read and understand. Then all the muscle memory comes back and you start coding away. That's the ideal situation.
But when you refactor, you shouldn't change the observable behavior of your code. Your software is written to do certain things. Refactoring makes it more readable and cheaper to change in the future, but it should still do the same thing. In doing so you also improve the design of your code, and a better design is easier to read, easier to refactor further, and easier to add new features to. Having tests helps a lot, because test coverage helps ensure that you have not actually changed the behavior. How many times have you written some code and then gone, wait, it wasn't supposed to do that? Wait, what did I change?
When you have tests, they give you immediate feedback that you have made a breaking change somewhere. That matters especially when you have very big software or legacy code. How many of you are still maintaining legacy software from the PHP 3 or PHP 4 days? Or code that you inherited from a predecessor? I'm sure there are many of you here.
Because I want our meetups to be practical, I'll share some practical tips. There are four techniques. Number one: change names so that they communicate intent. Number two: remove magic numbers, because those are scary. Number three: one responsibility per function, or class. It should be common sense, but actually it's not. And the last one: do not be obsessed with primitives, like arrays, strings and all that. I'll go into more detail when I get to that.
So, refactoring technique number one: change names to communicate intent. How many of you have seen mystery variables, like A and B? What do A and B stand for? How do you change them to give them meaning? You look at the implementation and at the data they contain. Usually you look at the data and go, okay, that's somebody's name. So they should just be called first name and last name. Name your variables in a way that communicates what they mean and how they will be used, so that it's easy to glance at them and figure out what they are. Otherwise you have to look at every implementation that uses them, or at the data they contain, to figure out what they're for. So, yeah, don't do that. Next, mystery functions.
How many of you have seen mystery functions? What is foo? Foo takes two arguments. What does it give you? It spews out a greeting. So what is this mystery function? What is this thing for? How should we name it to give it meaning? Any guesses? "Say hello." "You greet somebody." Yeah, so it's a greeting: greeting from first name and last name. My example is a bit contrived, but you get the picture. So: change names to communicate intent.
An extension of that is magic numbers. Suddenly you find a mystery number in your code. What the hell is this 1.4? The function is called amount in USD, you pass it an amount, and there's a 1.4 inside. What does that mean, man? Looking at the function, you can probably guess that 1.4 is an exchange rate. Maybe you ask around, ask a colleague, or check the git log. So what do you do? You refactor it. You change it into something meaningful, something that reveals intent. You replace it with a constant: define it once, then use it in your code. The same applies in a class, where you can lift it up into a class constant. Even Rubyists do this. On a previous project there were mystery numbers floating all over the Rails code, and I was like, guys, what the fuck is this?
What currency is the input in? Sorry? What currency is the input to this function? I assume it's Singapore dollars. Then the function is wrong: we should be dividing, not multiplying. See? No tests. Sorry, I forgot to write tests. See?
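As a rough sketch of these first two tips in PHP: the names greetingFrom, SGD_PER_USD and sgdToUsd, the greeting text, and the reading of 1.4 as Singapore dollars per US dollar are all assumptions for illustration, not the talk's actual sample code.

```php
<?php
declare(strict_types=1);

// Hypothetical rename of the mystery function "foo": the name now says
// what comes back and what goes in.
function greetingFrom(string $firstName, string $lastName): string
{
    return "Hello, {$firstName} {$lastName}!";
}

// Assumed meaning of the magic number 1.4: Singapore dollars per US dollar.
const SGD_PER_USD = 1.4;

// The input is assumed to be in Singapore dollars, so converting to
// US dollars divides by the rate (the bug pointed out by the audience).
function sgdToUsd(float $amountInSgd): float
{
    return $amountInSgd / SGD_PER_USD;
}

echo greetingFrom('Luke', 'Skywalker'), PHP_EOL;
echo number_format(sgdToUsd(14.0), 2), PHP_EOL;
```

With the constant named, the direction of the conversion becomes something a reader can check just by reading the function.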
Writing tests makes sure you don't change behaviour. I'm such a failure here. Anyway, that's two tips down.
The third technique: when we refactor, we try to write functions such that each function has only one responsibility. It takes in certain things, does one thing, and gives you the result you need. As an example, say you run a club, maybe not a country club, some club, and you're the software guy there. You're tasked with maintaining the club membership register as a CSV file. Something like this. Luke Skywalker is a good friend of mine.
So we have this CSV file, and your manager comes to you and says, why don't you come up with a PHP script that manages the registry? If you do agile like I do, it usually comes as a feature story: I want to add new members to the end of the file, and if the email is already in the list, ignore the new entry. So if the email has never been there, add the member. If it's already there, ignore it; don't throw anything, just don't insert it into the file. Simple enough, right?
So you go away, fire up PhpStorm and start coding, and, lo and behold, you've got a piece of code that does exactly that. The first line opens the CSV file in r+ mode, where the plus means you can write to the file as well as read it. Then you loop through the rows. If a row has the same email as the new member, you return false; otherwise you put the new member in and return true. Something along those lines. Simple enough. Anyone can write this, right? Okay.
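The slide itself is not part of this transcript, so the following is only a reconstruction of what such a do-everything function might look like. The signature, the file path argument and the column layout (first name, last name, email) are assumptions.

```php
<?php
declare(strict_types=1);

// Sketch of the "before" version: one function that opens the file,
// checks for a duplicate email, and appends the new row.
function addMember(string $csvPath, array $newMember): bool
{
    $handle = fopen($csvPath, 'r+');   // read and write, starting at the top
    // Delimiter, enclosure and escape are passed explicitly; newer PHP
    // versions deprecate relying on the default escape character.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if (($row[2] ?? null) === $newMember[2]) {   // why 2? a mystery index
            fclose($handle);
            return false;              // email already listed: ignore entry
        }
    }
    fseek($handle, 0, SEEK_END);       // make sure we write at the end
    fputcsv($handle, $newMember, ',', '"', '\\');
    fclose($handle);
    return true;
}
```

The sketch assumes the file already exists and ends with a newline; error handling for a failed fopen is left out to keep it short.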
So it works. But here's a question: how many things do you think this addMember function is doing? Have a look again. It opens the file. It checks for the whatever, I don't know why it's in position 2. It returns false if it finds it; otherwise it injects the new CSV row. Four things: open the CSV, check for duplicate emails, ignore the entry if it's already there, and add it if it's not. That's a lot for one function to do.
Of course, if you're rushed for time, you'll probably just walk away and say, it works, I'm not going to care about this. But it irks me when I read it back. What does it even mean? Why position number 2? The only thing here that makes sense is the function name, addMember. It does add a member, but it's doing quite a few other things too. That is disconcerting. It's doing too much. Ideally each function has only one responsibility. It should do one thing only. So we should probably break up these four things.
And if you look at the code, it's not very readable. If you can feel your eyes twitching, you are in the right company, because we are all craftsmen. We want to build things we can be proud of. I felt very dirty when I wrote that code. Anyway, say you come back to this piece of code two months later. It still confuses you. What the hell is this? So what should you do? As I said, you should abstract away the implementation details into separate functions, like opening a file, closing a file, writing to it.
These are all implementation details. What do we want to do? We want to add a new member to the list. But to do that we need to do other things first: get the current list, check the list, and so on. Those are the things we want to do. How we implement them is why we write functions; we hide the how inside a function. And we ask ourselves what-if questions. What if we come back and want to check for something else as well? Then we're like, shit. If we have to keep making changes to the same piece of code, it gets very frustrating. And the worst response you can give your manager when he asks for a change is, it's not meant to do that. That's not an acceptable answer.
So, four things. Step one: what are the different behaviors? First, when you open that file, what are you really doing? You are pulling together the current list of members. Second, the email check. It's not about email. It's about checking whether that person is already in the list. That is the behavior you need to abstract. Checking the email address is just an implementation detail; what you care about is whether this person is in the list. Third, ignoring the entry if it's already there is kind of the reverse of number two. It's just the result you get from checking whether the member is in the list. And number four, add if it's not there. Really that is just adding a new member to the list. I already checked in the previous step, so here I'm not going to check again. I'm just going to insert the new person. So you've broken it down into three separate behaviors that you want to extract.
The next thing you do is keep the function as it is. As a refactoring step you don't change the actual implementation; you pull out the things it is trying to do and break them into separate functions. First we pull out, or extract, getting the list of members into its own function: the act of opening the file, reading every record in it, pushing each one into an array, and returning that array. I'm sure there are CSV libraries out there that can do this, but I'm using just core PHP functions right now. So I have a function that does just one thing. What does it give you? The list of members. That's all I need to know.
You can write a test around this right now. You can say, given this file, I should get five records, because I have a CSV fixture file with five records inside. Testing a function that does only one thing is very easy. With the earlier function, when you write the test you're like, oh no, I've got to check for this, check for that, check what happens if it's not there, and you end up with a very long test. Breaking the code into smaller functions also helps you write smaller tests, which are more readable. So first, extract getting the list of members, which is basically: open the file, blah blah blah, and give you an array.
The second thing is to extract the check. The function I just wrote gives me an array of members, and now I create a new function that checks whether this person is already in the list. I don't really care what is inside this new member.
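The first extraction described above, getting the list of members, might look roughly like this. The function name getMembers matches the one mentioned later in the talk; the file path parameter and the fixture contents are assumptions made so the sketch can run on its own.

```php
<?php
declare(strict_types=1);

// Reads every record of the CSV register into an array of rows.
function getMembers(string $csvPath): array
{
    $members = [];
    $handle = fopen($csvPath, 'r');
    // Delimiter, enclosure and escape are passed explicitly; newer PHP
    // versions deprecate relying on the default escape character.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row !== [null]) {   // fgetcsv returns [null] for a blank line
            $members[] = $row;
        }
    }
    fclose($handle);
    return $members;
}

// Fixture-style check: a file with two records should yield two rows.
$fixture = tempnam(sys_get_temp_dir(), 'members');
file_put_contents(
    $fixture,
    "Luke,Skywalker,luke@example.com\nHan,Solo,han@example.com\n"
);
var_dump(count(getMembers($fixture)) === 2);
unlink($fixture);
```

Because the function only reads and returns rows, the check needs nothing more than a small fixture file, which is the point made about smaller tests.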
I just know I'm passing in a new member object, or in this case just an array with some data inside. Whatever the main function receives, I pass on to this function, along with the full list of members I got from the other function. So this function needs to know two things: the existing list of members and the new member I'm trying to add. From those two it can work out the answer. How it decides whether this guy is in the list is an implementation detail, which your boss will tell you. But you wrap that implementation inside this one function, which does one thing only: check whether that person is in your list.
So it takes two arguments and loops through the list, which is an array. I'm not sure why I'm using number 2, but the guy who passed me the CSV file had the email address as the third column, so I put a little comment there saying this is the email field. It takes the list, loops through it and compares. If the person is already in the list it returns true early; otherwise it returns false, which means the person is not in the list.
The naming of the function is also very handy, because the name communicates. It suggests what kind of data the function returns. The "is" in front suggests that the returned data is a Boolean, right?
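A minimal sketch of that second extraction, assuming the column layout the talk describes (email in the third column). The talk marks the index with a comment; the EMAIL_FIELD constant here is a substitution that applies the earlier magic-number tip, and the sample members are made up.

```php
<?php
declare(strict_types=1);

// Assumed column layout: first name, last name, email (third column).
const EMAIL_FIELD = 2;

// The "is" prefix hints that the result is a boolean.
function isMemberInList(array $members, array $newMember): bool
{
    foreach ($members as $member) {
        if (($member[EMAIL_FIELD] ?? null) === $newMember[EMAIL_FIELD]) {
            return true;   // early return on the first match
        }
    }
    return false;
}

$members = [
    ['Luke', 'Skywalker', 'luke@example.com'],
    ['Han', 'Solo', 'han@example.com'],
];
var_dump(isMemberInList($members, ['Leia', 'Organa', 'leia@example.com']));
var_dump(isMemberInList($members, ['Han', 'Solo', 'han@example.com']));
```

The function needs only the two arrays, so it can be exercised without touching any file.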
Unfortunately we can't put a question mark in the function name, but that's a whole other story. So we hint to the person using the function by prefixing the name with "is".
The third thing is adding the new member to the list, which we also break out into a separate function. The implementation is very simple: take the new user and insert it into the file. Of course, if you look at this function again you might say, add member to file, what file are we talking about? Maybe I could refactor it to take the name of the file I'm inserting into. In hindsight I probably should have made the file name one of the arguments, but that could be a future refactor and a separate discussion.
So now you have three functions for the three behaviors we talked about. The first retrieves the full list of members, the second checks whether this person is in the list, and the third inserts the person into the file. Three separate functions that do three discrete things, each with one responsibility.
Once we have that, we go back to our main function, addMember, and refactor it using the new functions. getMembers, the first one we wrote, gives us the list of members, and we put that into a local variable. Then we ask, is this member in the list? That returns true when the user is already in the list, and in that case I return false from the whole thing: I will not add this member. So I return early here. If the member is not in the list, I add the member to the file and I'm done. The usage remains the same: I'm still just passing in an array.
So the question I ask of you is, did I change the behavior of this function? No. The function remains the same. When you refactor, make only changes that don't change behavior and don't change your API, that is, how your function is used. Internally, you just rejig the structure to make it more readable and cheaper to change in the future. From a long function I've refactored down to something so readable. How many lines? One, two, three, four, five. Five lines of code. Wah, so nice. And it reads like English: first you get the members; is the member in the list? If it is, return false; otherwise add to the file. Just from reading this you understand more or less immediately what the function does. It shows you the behavior, and for the implementation details you have to drill down further, because they are abstracted away in different functions that you can dig into.
So we're happy, the function works, and the boss is happy. Then after two months he comes to you again and says, eh, cannot. We're getting too many duplicates in the list. I want you to check the name also: if the name is already in the list, ignore the entry. It's very stressful; he's one stressed boss. People with the same name but different email addresses are also being added to the list, and they want to prevent this in future. That is what his feature change actually means. It's a bug that was found. So let's look at the
code change. From here on I'll show you an example of what happens now that we refactored the code two months ago. We look at it again with fresh eyes; we haven't seen the code in two months. I can show you the file as well if you want. Okay, cool, the refactored file is here. If I do a Command-Shift-Minus it collapses everything down to the function names: getMembers, isMemberInList, addMemberToFile, and finally addMember, which ties them together. Looking at just the function names, can you guess where I should make the code change? isMemberInList. You guys are so smart. Awesome.
So we have identified isMemberInList as the place to make the change. This is where the check happens; if it doesn't return the right response, the duplicate goes through. So we need to check the full name here. This is the original function we saw, with the one comment that says email field. To find out what else is in there, you can do a var_dump inside and run it a couple of times. What is in this array? Okay, position 0 is the first name and position 1 is the last name: the first column is first name and the second column is last name. So this is where we need to make the change, and this is the re-implemented code.
In an ideal situation you also have a test for this function, so you go back and change the test: now I also want to pass in someone with the same name. The behavior in this change is a bit different from the email check, because email is just one field and a plain equality check is enough. With first name and last name you want to check that both match; if the first and last names are not exactly the same, you do something else. So the implementation is a bit different. But again, these are all implementation details. Your function communicates an intent, which is to check whether this guy is in the list. How I make the comparison is the implementation detail.
In a further ideal situation you might also extract these two checks. I can foresee that if we get more fields in the CSV file, this function will probably grow. So maybe it's good to take this opportunity to refactor a little more and create new functions that say something like, is the email there, or is the name the same.
And why do we refactor? To make it cheaper to make changes in the future, and more readable, of course. And also to get paid. For freelancers it's actually a good thing when you refactor your code. When your customer comes to you and says, I want to do this, you say, oh, it's going to take a long time, and you charge them extra, when you can actually do it in just one day. Not really; I mean, you can. Developer happiness is important to me, and to everyone.
Okay, so what did we learn from this whole example today? Readable code for the win. We like code that reads somewhat like English. We learned how to break a long function up into separate functions, each with just one responsibility. That also makes future code changes easier, because we have broken the code into logical pieces of the puzzle that we can poke and change when we need to. It's not like one huge function where, oh no, legacy code, if I
make two lines of change here it will break everything. You know the kind of thing. It's scary. So refactoring your code is important, and it's also important that you don't make changes that affect your API. The way your code is currently used shouldn't be affected.
So that's technique number three. Technique number four is, do not be obsessed with primitives. Why? That's coming soon. I'll share more about this next month. Meanwhile the sample code is up there; you can just take a picture of this. It's on GitHub: github.com, code kung fu, refactoring PHP demo. If you peek into the repository right now you'll see the slides are already up there. And that's all I have for this week. Next month we'll talk about the last refactoring technique. Any questions? Really, no questions?
Just want to add one point. One of the mentors in my career taught me that every time you copy-paste code, think of refactoring. Every time you copy 50 lines of code from one place to another, try to understand whether this should now become a function, and extract it if you can, because often people don't. One of the reasons you should refactor is to avoid copy-pasted or rewritten code. If you find a bug in one place, you end up fixing it in six places, or you fix it in only one place and the bug still remains.
Ya, so if you ever have copy-pasted code, you know that's a bad thing. But it's also an opportunity. Copy-pasted code tells you that these two places have some similar behavior. You can look at the two pieces of code and ask, what are the similarities and what are the differences? Once you figure out the similarities, you can pull them out into a main function, and then implement the differences separately. So when you see copy-pasted code, it's an opportunity to find out where the similarities are and what the differences are. So for that, thanks. Any other questions or comments?
Ya. The refactoring that you just did: earlier, if the file had a million entries, the code would still work because nothing was loaded into memory. After the refactoring you're bringing everything into memory, so it won't work.
Oh, it might probably break. A friend of mine once told me that performant code is usually never pretty. Sometimes it's a balance between pretty code and performant code: code that reads well and that you can understand and change easily, or code that performs very well for the thing you need it for. I personally put more emphasis on making code readable, because eventually you'll be the one who has to maintain it. Of course, performant code that is also readable is the holy grail. And as for why I'm using CSV: because it's easy. If you want to go into more detail you would probably re-implement it on a database like MySQL to make it easier to manage and more performant. That is something I will actually talk about next month.
Great. Any other questions? No? Great. So that's all I have. Thank you. So next we have