 Okay, so today I want to talk about refactoring legacy code, and this whole talk is based on a code Cata, which is like a code exercise that you do to practice your skills, called the Gilded Rose Cata. Sandy Mets has been doing a talk this year based on this Cata. She goes a different direction than I do. There's a whole bunch of Catas on a site, CMU, something Todd Sedano, who's down here, maintains that site, and there's a lot of really cool exercises. Anyway, this is a cool Cata, and you should actually try it out sometime, but I'd actually ask that you wait until after my talk to pull it down. Alright, so we are starting a new job today. We're working at the Gilded Rose Inn, which in addition to being an inn, sells a bunch of interesting items, and they have an inventory management system, and the previous developer left, and they need us to come in and take over the system. The system runs as a nightly process, and updates how many days we have left to sell the item, and what the quality of the item is. So as items age, they degrade in quality, and different kinds of items have different rules. It's pretty straightforward. There is a spec. We have no idea if we can trust it. Like most legacy codes, specs are out of date, they lie, they're wrong. We don't need to spend a lot of time on this. We'll come back to it later, but we don't know if we can trust the spec. There is a constraint. This heme does not practice collective code ownership, and there's a very grumpy guy in the corner that maintains the item class and does not let anybody else touch it. Now, this is our first day at work, so we're not going to kind of come in and rock the boat and try to solve this team issue. We're going to hold off a little bit, and so we're going to respect this constraint for today. Our task is to add support for a new kind of item called a conjured item, and a conjured item is just like a normal item, except it degrades in quality twice as fast. That seems pretty straightforward. Let's take a look at the code. There's the forbidden item class that we're not allowed to touch, and we're only going to look at it to notice there's nothing weird here. There's no weird side effects. It's just straight accessors, getters and setters, no magic, no weird side effects, and that's important when you're dealing with legacy code, because you need some firm ground to stand on while you reason about the code you're trying to clean up. So that's all I'll say about the item class, but that's a very important point. This is our database. That's a fairly interesting database. Notice at the bottom there, there is a conjured item in the database already. It probably is not being handled correctly. We don't know that yet, and then there's this update quality method, and I mean we have a task to do. We have to support these new items. That looks like maybe where we want to look, so let's look at that code. Okay, it's been a while since lunch, been a while since snack, so hopefully it's okay to look at this right now. That's nasty, but wait, there's more. Okay, let's look at the tests. Tests should be better, right? There are no tests, none, not one. Public service announcement. Please write tests for your code. All right, so what are we going to do about this? There are lots of techniques for dealing with legacy code. In fact, we could probably write a book with all the tech sneaks in it. Unfortunately, we don't have to, because Michael Feathers already did. If you work with legacy code at all, you need this book. It is awesome. Second thing, we need some safety. We need tests, and so I'm not going to talk about writing tests for legacy code. That's a different talk, but when I was working through this for this talk, I wrote what Michael Feathers calls characterization tests, because what I want to do is preserve the existing behavior of the code. Whether it's right or wrong, I want to make sure that I don't change that behavior while I'm doing my work, and so I wrote characterization tests. I use simple code to make sure I got 100% coverage, and everything I show you in this talk, with one exception that I'll point out, kept the test green the whole time. Okay, and that's important. I made sure I didn't change the visible behavior of this code when I was refactoring. So we're not going to trust the spec, like I said, and we are going to fight the urge to rewrite. When we don't know if we can trust the spec, we don't know if we can safely rewrite. I mean, everybody I've seen do this coda takes the spec and looks at the code and says, I'm throwing that out, I'm going to rewrite it. We're not going to do that. Rewriting is often very expensive and very error prone, and you can't always do it. So we need to have some techniques in our tool bag for when we can't rewrite. What do we do instead? We're going to follow the Boy Scout rule. We're not just going to hack in our change and perpetuate the evil that's in that code. We're going to leave the campground cleaner than we found it. So we're going to try to improve the code in the area where we're working. So we're going to take a lot of baby steps, very tiny refactorings that add up to big changes, but we're not going to boil the ocean. We're going to stay focused on our task. So we're only going to be really messing with the code that we have to touch to implement this new feature. It would be tempting to go rewrite the whole thing, but that's not what we're doing here. All right, and Kent Beck has this great advice for each decided change, make the change easy, which may be hard, and then make the easy change, and we're going to do that. All right, so we have to stay focused on our tasks. Are we there yet? Can we see an easy way to implement support for conjured items? Not one that leaves the campground clean. So let's see what we can do about this code. We're going to start this is the time when we know the least about this code. And so we just want to start getting our hands on it a little bit. And we're just going to do some really simple mechanical changes to try to improve things. I used Ruby Mind for this talk, and it's got some great refactoring tools. And I use those a lot in here. So a lot of these refactoring I'm talking about can be automated in something like Ruby Mind. So the first thing we're going to do, I found this quote when I was preparing the talk, and I had to put it in there. When you start a new programming job, you have to walk right up to the biggest function in the yard and refactor it in front of everyone. So here we go. So the first thing we're going to do is we're going to get rid of some noise so we can start to see the structure of this code a little bit. The first noise I see is this at items I everywhere. There's actually 34 instances of it in this code, the two pages. So let's extract that to a temporary variable. Martin Fowler calls it extract temporary or extract variable. I forget the exact name. So we pull that out. We have this variable at the top, and that's a little bit less noise. Let's zoom in on that a little bit. So we've got this loop and assignment, and then we've got 40 plus lines of code that all refer to item. That method's doing two things. It's looping and it's messing with items. Let's extract those 40 plus lines of code into their own method. That's better. And that loop is kind of weird for Ruby. It is using a Ruby range, which is kind of neat. This code was actually ported from C Sharps. That's why you see some of this stuff. But there's a much more idiomatic Ruby way to say that. And I'll zoom in a little bit. An idiomatic code is more readable. And so we're going to try to make the code a little more idiomatic so that you can see at a glance what it's doing. You don't have to study it. It's like, oh, okay, that's just looping through all the items. So we can use an each here. And that's a little better. Now this method is pretty clean. And it's kind of an island of sanity in this sea of chaos. We have a place to stand while we work. Okay, back to the update method. There's still some noise in here. Notice all the conditionals are wrapped in parentheses. I assume that's a holdover from the C Sharps. Ruby doesn't need those. Get rid of them. There's a semicolon near the bottom. Ruby doesn't need it. Get rid of it. Getting rid of noise. Alright, there's actually two more parentheses I could get rid of. Yes, I know Ryan, but I'm not getting rid of them anyway. So we have this item quality equals item quality minus one or plus one. Ruby has the minus equals and plus equals operator. And again, that gets rid of a little bit of noise. A little more idiomatic, a little bit cleaner. Alright, are we there yet? Can we do conjured items yet? No, I don't think so. So the next thing is there's a code smell on this code known as feature NV. So let's get rid of the feature NV. So what feature NV is, is when you've got a method that is way too interested in the internals of another object. You see all those item dots in there? That's a clear sign of the feature NV smell. There's all this stuff. We're doing two items rather than asking items to do that for us. So we'd like to extract the body of this update method over to the item class, except that we have this constraint. We're not allowed to touch item. So we have to come up with a different strategy here. We've only been working here for a few minutes and so we're not really ready to address the grumpy guy in the corner. So we need to do something else. So what we're going to do is we're going to wrap the items in another class and then we can move some behavior onto the wrapper. And to do that we can use something from the Ruby Standard Library called Simple Delegator. So I've implemented a class called item wrapper that inherits some Simple Delegator. What Simple Delegator does is you construct it on an object and any message you send to the Delegator that it doesn't understand, it forwards onto the original object. So this little refactoring right here where I introduced this class did not change any behavior because all the messages we're sending to item get sent to the wrapper which forwards it onto the item that we're wrapping. But now this gives us a place to move some behavior to. So now we can take the whole body of the update method and move it to a method on item wrapper. There are some tools that can actually do this refactoring automatically. VisualWorks Smalltalk has a refactoring called extract method to component that does this for you. RubyMind does not, so we're going to do it in a few steps by hand. So first of all we're going to copy and paste that whole body of the update method, move it to a method on item wrapper. And because I don't want to change those 43 lines yet, I introduced a temporary variable called item which is the same name that the parameter had, and just assign self to it so that I'm actually talking to the object but I'm doing it through a temporary variable just so I don't have to change those 43 lines of code yet. Now that update method at the bottom really isn't carrying its own weight so we can go ahead and inline that, it's called inline method refactoring, and you can see there item wrapper dot new on item dot update. All right, and now we can go look at that the item wrappers update method and we can inline that temporary variable now, which is actually worse, but now almost all of those self dots are completely unnecessary because when you're calling a getter on self you don't need self dot. When you're calling a setter you need it so we have to keep some of them. That's starting to look a little bit more readable. Are we there yet? Can we do conjured items yet? No, still not. I'm not seeing it anyway. The code is better. We've left it better but it's not a great thing. Now I see some duplication. So let's get rid of the duplication. First of all, duplication is a place for bugs to live because you fix n minus one of the places where the code is duplicated but you miss the other one and you get in trouble. And the other thing that removing duplication does is it allows you to give things names that can help the code communicate better. So the first duplication I see is this little phrase if quality is less than 50 self dot quality plus equals one. Let's extract that to a method called increase quality like so. Now there's one more case that looks like we could extract it but not quite because there's a nested if inside of it. So let's dive in and look at that a little bit. What I'd like to do is pull that inner if out a level but in order to do that I have to make sure that it's safe to do that. So this is the first refactoring we're doing we actually have to think a little bit about the code. Everything else up to this point has been very mechanical. This one we have to think about. So what's it doing inside that inner if? While we're comparing against name we're comparing against sell-in we already saw that name and sell-in are just simple getters there's no magic there. And then there's two calls to increase quality. Those calls both are protected by an if quality is less than 50 down at the bottom. So we can safely pull that if statement out a level like so and now we've got exactly duplicated code and we can replace that with a call to increase quality. All right there's another instance of duplication in this code and that's this phrase now this looks like it might want to be a decrease quality method but there's this extra comparison against name in there that really doesn't belong in decrease quality so again we have to reason about this a little bit and what we see is we've got these two nested if statements there's no else clauses to worry about and the two conditions are completely independent it's not like we're checking for nil and then sending a message to the object that we're sure is not nil these are completely independent so we can just reverse them. Okay again we had to think about that a little bit but not too bad all right and now we can extract a decrease quality method and that's better okay are we there yet can we do conjure items yet I'm really not seeing it but what I am seeing is that there's kind of three main sections to this method the top section is all about updating the quality and there's this little tiny middle section that's about updating a cell end date or cell in days and then the bottom section which is also about adjusting the quality so what I'd like to do is I'd like to group similar tasks so what I want to do is I want to move that cell in clause in the middle either up to the top or down to the bottom so zoom in on that a little bit if we move it to the top what our method looks like is we're adjusting the cell and date of the item and then we're adjusting the quality based on that new cell and date that seems to make logical sense to me that seems to read very well so I'd like to move it up to the top but if I do that cell in is going to be one less than it used to be and so I have to adjust those other two if statements that talk about selling in there so I'm going to do that again we had to think about this a little bit still not too complex reasoning but we do have to think and we have the characterization test to protect us if it goes wrong so now we've moved that up to the top and we'll zoom back out and see the whole method again now we've got this method that does two distinct tasks so let's extract those and give those names so we're going to first extract the top part and call it age and then we'll extract the bottom part and call it update quality okay so now update is a very nice clean method it says exactly what it does actually follow something Kent Beck calls the composed method pattern which I'm not going to go into here but it's a very nice clean method and that's actually our top level entry point in the item wrapper class if you remember so you look at the item wrapper and say oh to update an item I age it and I update its quality I understand that that makes sense now can we do concert items we made it better right no it's still pretty nasty so I think we finally got to the point where all we can really do is tackle these nasty conditionals so it's time to simplify those so this is where things get ugly but now we've got our hands on the code a little bit we've kind of got the feel of it we're starting to get a sense of it and now we're feeling a little more confident in our ability to tackle this code and that's a good thing so the first thing I see is that all of these conditions are negative not equal to not equal to not equal to and negative conditions are generally harder to reason about than positive conditions so I'd like to reverse those now there's the inner ones where a name not equal to Sulfurus there is no else there those ones are harder to negate so we'll ignore that but let's zoom in down at the bottom on just the backstage passes check and I'd like to negate that and make it an equals equals rather than a not equals and to do that you just have to swap those in and the else parts okay just reverse those so that looks like this okay so now it's a positive check not a negative check and we can zoom out one level and do the same thing with aged brie and then we can go look at the top condition now and I like to do that here but this is a compound condition it's a little more complex to negate that condition we still swap those in and the else that's fine but that condition is harder to reverse and thankfully Ryan already taught you about the Morgan's law I'm going to teach you about it again using Ruby this time instead of circuits or gates so if you want to negate A and B so A and B is false if either A is false or B is false and that's basically what De Morgan says and then there's a version for switching ors to ants so let's apply that to our code so there's our condition we want to negate it and then we apply De Morgan's law which looks really ugly and then we simplify those all those knots okay so four quick steps there and now we have the inverted condition and now we can apply that back to our code swap the then and the else parts like we did before and that looks like that okay next let's look at the overall skeleton of this of this method just the ifs and the else's notice this pattern where there's an else with a nested if and else with nested if Ruby has an else if keyword that allows us to get rid of a lot of that nesting so that looks a little better and we can look at that in full context here the next thing I want to do is I want to look at this a little bit at the bottom of both of these if clauses there's an else if name not equal to Sulfurus and then we do nothing so let's zoom in on that a little bit if name is Sulfurus this whole method does absolutely nothing we skip everything and so what I want to do is I want to take these two duplicate conditions and extract them up and make them a guard clause in the method so that we just return if it's Sulfurus and forget about them for the rest of the method okay so that looks like that and then we zoom out a level notice that age does almost exactly the same thing and we can quickly invert that condition make it a guard clause and make it exactly the same now we've got two methods that have the exact same guard clause at the top and they're both called from a higher level method and that's the only place they're called from because we just extracted those two methods we know that so we can move that guard clause up a level and Kent Beck just wrote a blog post a couple weeks ago about something called call graph refactorings and this is kind of a call graph refactoring I actually exchanged some emails with them on the topic to make sure I understood it and I'm not sure I do yet but I want to appreciate I appreciate him taking the time to answer me so we'll do that refactoring right there all right now we can go back and look at update quality again that compound conditionals bothering me especially when we're checking backstage passes almost twice right in a row so what I'd like to do is split that compound conditional and kind of merge the backstage passes cases and to do that we have to duplicate that one increase quality line and that looks like that okay notices are all really tiny baby step refactorings all along the way now if you look at the top if statement and then the if statements that's nested in the if sell in less than zero at the bottom those if statements have the exact same structure and switching on the name of the item seems to be an important concept in this code remember kind of our system description talked about you know different kinds of items have different rules I want to kind of unify those those if statements together and to do that we kind of have to distribute that sell in less than zero check to all the conditions so I'm going to first merge the bottom else clauses that looks like that so we decrease quality and then if sell in less than zero we decrease quality again repeat that for backstage passes and repeat that for aged B and now we have this method with a top level switching on the name and then logic for each of the cases okay are we there yet can we do conjured items yet this is the first time I feel like I can confidently support conjured items and actually have it work and have the code not be significantly worse than it is now so let's go ahead and give it a shot now we're going to make the easy change we spent all this time making the change easy now let's make the easy change so there's our code and remember conjured items degrading quality twice as fast as normal items so we just add a new clause for conjured items we're comparing against the name just like the other guys do now notice we have to call decrease quality twice I really wanted to write this code but that fails the tests and the reason is the decrease quality does a check to make sure the quality never goes below zero so we have to stick with this for now okay so at this point we've actually implemented our new feature this is where I had to change the test because conjured items actually have different rules and I needed to make the test reflect that but I didn't change the test for any of the other kinds of items and our tests are green again we could stop here but we still have some time left so let's see if we can make the code just a little bit better you know we still have some time left for this feature so we want to express some important ideas in the code and the first idea comes in these two methods to decrease and increase quality they're expressing an idea from the spec which is that the quality of an item is never negative and the quality of an item is never more than 50 but the problem is they don't express it at the lowest possible level there's other ways we can set the quality that don't go through these checks and that's why we had to call decrease quality twice so I want to push those down a level so what we can do is we can introduce a quality equals method here now remember we're an item wrapper which is a simple delegator so by calling super we're calling simple delegators version of quality equals which it doesn't understand so it forwards it onto the original item again this is really a no up with an extra method in there but it gives us a place to move some behavior so we can take that quality greater than zero check from decrease quality and push that down to the quality equals method we have to recast it a little bit because it's a new context but it's basically the same check so we just say if quality is less new quality is less than zero then make it zero and we can do the same thing with the quality less than 50 check and that's a little bit cleaner there's more you can do to refactor that there's actually a proposal for Ruby to add a method called clamp to numeric types that does this for you you just say new quality dot clamp zero 50 and that would work but that hasn't been implemented yet now decrease quality and increase quality really aren't pulling their own weight anymore so I'm going to go ahead and inline those so we extracted them because they were duplication but now we simplified them enough that we're going to inline them back it feels like we're waffling a little bit but we're really not we kind of learned some things by extracting them and now we can put them back in so inline decrease quality inline increase quality and now we have those duplicate quality minus equals one lines that we can finally merge together and get the code we wanted in the first place so that's better there's another idea that's not being expressed in this code everywhere where we touch quality we're adjusting it by some amount and so there's this concept of a quality adjustment that seems like it might be important now I did this refactoring in probably six or seven steps which I don't have time to run through so I'm just going to show you the final product so update quality now just says self dot quality plus equals quality adjustment and then quality adjustment goes through all the cases and computes what the quality adjustment should be okay and now what we can do with that let me just double check if I know where I am yes all right so there's another refactoring I want to do here we have this big long conditional and all those cases are independent and there's a refactoring called replace conditional with polymorphism so we're going to do that next so we have our item wrappers and what we want to do is we're going to make subclasses of item wrapper and push behavior down to those subclasses so the HB will go to an HB subclass and so on in order to do that we need to create the right kind of subclass off of an item so we're going to introduce something called the factory method to item wrapper and I'm just going to call it wrap notice it's self-taught wrap it's a class method and all it does right now is delegate to new again we're introducing a place that we can move some behavior to step at a time very small steps every time okay and we're calling wrap instead of new down below and now we can look at the aged breed case now we can introduce an aged breed subclass so the factory now says oh if item name is aged breed then make an aged breed otherwise make a regular item aged breed itself just inherits from item wrapper does nothing special but now we can take the aged breed case of quality adjustment and push that down okay and now we can we can do that again for backstage passes and for conjured monocakes and now our quality adjustment method is pretty clean and all of the checks against name are up in that factory method there's no checks for name anywhere else with one exception we still have that sulfurous check down an update so let's make a subclass for those look at the spec it says sulfurous being a legendary item never has to be sold or decreases in quality so we're going to make a legendary item subclass and add that to the factory and then we can push down the sulfurous behavior of update down to that subclass well what happens in update if the name is sulfurous absolutely nothing we need an empty update method when I have one of those I tend to put a little comment in there just so the next guy coming along next person coming along doesn't think I forgot to put code in there I want to make it clear all right so now our factory is the only place where we do name checks we can look at that a little bit closer there's one more little simplification we can do here and we can use a case statement that makes it read just a little bit better all right so now we want to express some domain concepts we have a few minutes left to do that and I have to go through this pretty quick but let's look at our spec so once the sell by date has passed quality degrades twice as fast does this code say that kind of implicit it sort of does but not really so this is about oh cool this is about five or six little baby step refactorings to get to here so quality adjustment now says if sell ends less than zero then use a past date adjustment otherwise a normal adjustment normal adjustment is minus one past date adjustment is two times normal adjustment so now that's saying quality degrades twice as fast that's better next line in the spec age spree actually increases in quality older it gets that doesn't really say that again five or six steps and we can get to here now this is a little bit weird normal items adjustment is minus one age spree increases instead of decreases so we just negate the super superclass implementation that might be a little too coupled I may not actually do that for real I did it here because to me it expresses the idea that it increases instead of decreases it's the opposite of what the base class does conjured items degrading quality twice as fast as normal items that doesn't really say that that does again five or six steps to get there and finally backstage passes first one of the things that says is the quality drops to zero after the concert so we can use this past date adjustment that we have in the base class and just override it and return minus quality doesn't quite say it drops to zero unfortunately doesn't read quite that cleanly I might do a little more work on that but it's not bad and then we can look at normal adjustment we can clean it up just a little bit again this is a few steps that reads pretty good now let's look at the rest of the spec for backstage passes quality increases by two when there are ten days or less and by three when there are five days or less does this say that pretty close but the spec says five days or less and ten days or less this code says less than five less than ten that's a mismatch is that a bug we don't know maybe the spec is out of date maybe the code is wrong we are not going to blindly fix that bug because there might be people relying on that behavior or other systems downstream a lot of times in legacy code one bug masks another bug so when you're working legacy code you really have to be careful to preserve behavior until you can find out more information and decide if that's really a bug and whether you should really fix it right now so we are not going to do that so let's quickly review the final code remember where we started and virtualize if this bothered you before I actually ran FLOG thank you Ryan for that tool this is where we started the total was 179.5 and that one update quality method was 155.1 so 80 this was actually 80 steps later there's our database quote unquote there's the new update quality method just loops through the items wraps them and updates them there's our factory that we built there's update and age there's how we update quality and there's the subclasses now our total is 67.1 down from 179.5 the worst method is now gilded road initialize which we didn't touch that method that was out of scope for the change we were trying to make so the worst method is one we didn't touch the worst method that we touched was actually the factory method which was a 9.5 so our worst method went from a 155.1 down to a 9.5 of the code we touched that's a pretty good improvement I would argue so what do we learn slow and steady wins the race we didn't rewrite this thing we just very tiny steps very tiny safe steps there's a few places where we had to think a little bit but mostly this was mechanical changes following standard refactorings out of Martin Fowler's book if you haven't read that book you need to read that one too refactoring but just very tiny steps we made the change easy then we made the easy change we made simple mechanical changes first so we could get our hands on the code and start to get a sense of it we reduced noise we removed duplication and we made sure we expressed important ideas and we expressed domain concepts in the code so that the next person that comes along will have an easier time with the code and that's probably gonna be us like Blythe said if you don't have empathy for other people at least have empathy for yourself and we didn't mindlessly fix the bugs there is a GitHub repo that has every single tiny refactoring that I did including one more I didn't talk about so there's actually 81 in there so if you want to go see all the micro steps I did they're there I'll thank my company for sending me Zeal is a local consulting company where I live that they let me come in and do this as a tech talk for them giving lots of great feedback and Kent Beck exchanged some email with me there's some references the slides will be up on speaker deck I don't have time for questions unfortunately I apologize for that any feedback on the talk would be great and the slides will be on speaker deck actually I think they're up already so if you want to see them great and thank you very much awesome thanks very much Randy