 has been a hot topic in software engineering over the past few years. And I think this is rightfully so. Refactoring is an immensely important tool to any developer who must add to and maintain pre-existing code. Now, there are many benefits to refactoring. And frankly, I feel like I'd be preaching to the choir if I went through all of them. But refactoring is still a change to working code. And any change to a code base already in production carries a certain amount of risk. Now, when I talk about the risk of changing software, I occasionally encounter an attitude of, yes, maybe there's some downtime. Maybe it costs us some money. But it's not life or death, right? Well, to this, I answer that software is increasingly being integrated into transportation, which includes planes and self-driving cars, energy sources. This includes nuclear power plants. And most critically, medical devices. Now, this includes pacemakers. The software on a pacemaker literally controls whether a patient's heart beats or not and whether that patient lives or dies. So with all these potentially life-threatening possibilities, it's natural to ask, is refactoring bad then? Well, to this, I'd argue that, like many things, refactoring is neither inherently good or bad. It's how you do it that matters. Now, in my time as a developer, I've encountered two common approaches to refactoring. One is to edit the code and pray or hope that it works as expected. I've heard this referred to as PDD, or prayer-driven development. And the other is cover and modify. Now, this involves using tools to both discover and verify what the code does, covering that behavior with tests, and then making changes to it. And this is from Michael C. Feathers working effectively with legacy code. Now, it probably goes without saying that cover and modify is by far my preferred method for refactoring. Hope and prayer are good for many things, but they should never, ever be used as a refactoring technique. So now that we've discussed the two common approaches to refactoring, I'd like to introduce you to my method of refactoring, which I call surgical refactoring. In surgical refactoring, we change exactly what we intend to change. And here's the key part, only what we intend to change. That means we take extensive measures to eliminate side effects of our refactoring as much as in our power. The first and most important principle of surgical refactoring is to first do no harm. That means none of our changes should break or affect the behavior of existing code. So just like any benefit of a medical surgery will be negated if it caused harm to the patient, likewise any benefit of a refactoring is negated if it breaks other parts of the code or the system that the code runs on. Now, rather than being a prescribed set-and-stone method, surgical refactoring is really a series of good habits. What all these habits have in common is that they reduce risk. Now, when I'm evaluating whether code should be refactored, I generally divide refactoring into two categories. The first is necessary refactoring. Now, this is like a medically necessary surgery. These refactoring are needed for the survival or health of our application. And the second is cosmetic refactoring, which is kind of like cosmetic surgery. Now, I want to make clear there is nothing wrong with cosmetic refactoring, but just that the risk tolerance is different. And we'll go over what risk tolerance means in just a moment. So let's talk a little bit more about what makes a refactoring a necessary refactoring. A refactoring is necessary when we need to add something to the code, but we just can't do it with the code in its current state. Additionally, refactoring is necessary when the code is too inefficient, when it's tying up too much time and resources. So I like to use the example of what I once DDOS'd myself, because a report I was trying to generate took so long and ran so many queries on the database that the server ran out of memory and the application crashed. Now, this was a prime candidate for refactoring. Finally, a refactoring is necessary when without that refactoring, we would be blocked from achieving a business need. Now, if you're refactoring code at work, there must be a business need of some kind for refactoring to be considered a necessary refactoring. This is how you justify that refactoring to project managers, business users, and anyone else who's concerned. So necessary refactoring is have a moderate to high risk tolerance. We still need to do them very carefully, but we can tolerate a higher amount of risk because the benefit of the refactoring outweighs the risk of the change. As for cosmetic refactoring, a cosmetic refactoring is when there is no business need to change the code. It's not blocking us from adding or fixing anything, but something about it just bugs us every time we see it. How many people here have experienced this? Quite a few hands went up. There's just a code smell, some sort of weird syntax. It's just something that we are desperate to fix. Now, the risk tolerance for a cosmetic refactoring is much lower. Even if something about the code bugs us, it's better to have ugly code that works than elegant code that does not work. Now, while we're on the subject of cosmetic refactoring, I want to take a moment to address white space for factoring, and this is because white space for factoring is the ultimate cosmetic refactoring. And as I'm sure many of us here in this room have experienced, it's a massive source of distraction and what I consider unnecessary conflict among team members. So the solution to this is if you do not have a style guide, get one, or adopt one. The GitHub style guide is a particularly good one. And if you come across some white space in an application that you just don't like or doesn't meet your own preferences, if it does not violate the style guide, leave it alone. So now that we've covered the difference between necessary and cosmetic refactoring, let's go over what's involved in surgical refactoring. Now, there are three stages, and these stages are the same whether it's a necessary refactoring or a cosmetic refactoring. The first stage is pre-op. This is what we do before touching the code, diagnosing the code, figuring out what it does, adding inspects to verify that behavior. I usually spend by far the most time in the pre-op phase. Next is the operation. That's when you carry out the actual refactoring. When you make those changes to the structure and style of the code. And finally, there's the recovery stage, where we take additional steps to verify the refactor. Now, I could just talk about each of these stages, but I instead like to take you through an example of surgical refactoring and discuss these stages as we go along. So let's say we're browsing through our applications code base and we come across a method that looks like this. Now, this is based on an actual example sent to me by Michael Lorenz on Twitter. Now, this is not a long method, but it's a very complex one. We have a system call, we have a substitution involving multiple regixes and more. So this is the kind of method, if I were to come across it in my code base, that would make me drop my head to my desk and moan in despair. So let's refactor this using surgical refactoring. And the first stage is pre-op. The main thing we need to do in the pre-op phase is make our diagnosis. What exactly is this code doing and how could we preserve that behavior? So let's start out by looking at this method and then mapping out what we think it does. So the first thing I notice is that this method uses Ruby's system method to execute some sort of command. And I notice that the command it executes is a said command with some flags passed to it. Next, I notice it performs a substitution of some kind, finding a pattern of text and then replacing it with a different pattern. And finally, it looks like it does this to a series of directories and files. So does that mean we know what it does now? Not quite. A map like the one we just made is helpful but is also influenced by our own experiences and expectations about the code. It's not an objective assessment. The only objective, definite and canonical way of knowing what the code does is to execute the code itself. And the best way I've found to repeatedly execute the code, and we're going to need to do that a lot in this refactoring, is through automated tests. So let's begin our refactoring by creating a spec which verifies the first thing this method does, the system call. Let's start off just by verifying the call to the system method happens. Now, even though it might seem pretty obvious from looking the code, it gives us a starting point for our tests which will execute the method. And the first thing I'm going to do when I create my test, I'm going to use our spec for this example, is to create an instance of my class that contains the method I'm refactoring. I'm also going to do this in a let statement which will allow me to use the instance of my class in multiple specs. Now let's create another let statement. This is the argument will pass to the method. Now when I don't have much context about how a method works, I like to just use a string for a sample argument. This way if the test fails, the failure message will probably give me a good clue as to what kind of argument the method expects. Then I'll create my spec, verifying that it calls the system method of Ruby. And to do that, I'm going to add in an expect statement saying that I expect the instance of my class to receive a call to system. And I'm going to expect it to happen with any arguments at all. At this point, I don't really care what's passed to the system call. I just want to verify that it happens as I expect it to. Then we need to actually call the method from the spec. Then we'll run the spec and the spec passes. So we're done, right? Not quite. Since the test passed first, I don't know for certain that it actually executed the code I expected it to. This could easily be a false positive. Now if we were writing this code from scratch, we would write a failing test first, then write just enough code to make it pass, and then refactor, the red-green refactor method. However, when I'm adding test to code that already exists, sometimes the test passed first, and then I use a green-red-green method. That means I add in the test first, and when it passes, I make sure the test fails by changing the code itself. So to do this, I'm going to remove that system call, then rerun the spec, and this time the spec fails. So now we can be confident that the test is a reliable indicator of whether the code is working or not. So now that it's failed, let's put that system call back in and run the test again, and this time it passes. All right, so we know the call to the system method happens, but now it's time to figure out what the system call is expected to return. And to do this, I had to look it up in the Ruby docs. The system method returns true when it executes the command passed to it successfully. So it will return true when that call to the said command returns true, or when the said command executes successfully. So let's create a spec for this. First, I'll set up my spec, expecting that when the method call is successful, it returns true, and then I'll expect that when I call the method on the instance of the class and pass in that sample argument, it will return true. And when I do this, the spec fails the first time. So here's where I take a look at the error message returned from the spec failure for clues as to why that spec failed. And it looks like it returned an error from the said command, no such file or directory. So let's take a close look at that error message. It seems like it's expecting a directory or subdirectory in a .rb file within that subdirectory. So let's set this up in the spec. First, I'll create names for the directory, subdirectory, and file. Again, I'm putting these into let statement so I can use them in multiple specs. And then in the before block of the specs, I'll create that directory in subdirectory, and then I'm going to create the path for that file. This is where the file will be created in my system. And finally, I'll create, write to, and close the file itself. Now, before anyone panics, yes, this is creating an actual object on our system, and yes, this is a code smell. Fear not, we will come back to it. Our primary objective right now is just to get these specs around the method running and verified, then we'll come back and optimize them. So when I run this spec again, the spec passes. So now we know we're passing the directory and file structure it expects and getting the expected value in return. So we've just added a whole lot of setup codes that last test, and it wouldn't be hard for that logic to quickly become tangled. So let's take all that setup code we just created and move it into its own method. This way it's more maintainable and we can use it in other specs as well. So this method will take a path and a file name, then make the directories associated with that path, then create, write to, and close the file itself, which we'll now be ready to use in our spec. So back in that spec we were just working on, I call this new setup method in the before block and pass it the path to that directory and subdirectory and then the name of the file we are creating. Then I run the spec just to make sure it still works and the spec still passes. All right, I promised I would come back to this. Since we're creating actual directories and files, let's clean them up after the spec runs. After the spec runs, let's remove the directory and everything in it, including the subdirectory and the file. So if I run the spec again and the spec still passes. Now if I were to look at the directory I'm running these specs from, I would not see any additional directories or files added to it after the spec run. So let's pause for a moment and look back at the original method. Now we're making good progress. We know that if it calls the system method, we know that we need to, what we need to pass that method for the said command to be successful. Now let's take a closer look at the said command itself. Now said is a streaming text editor. That means it is somehow changing the file or files within that directory structure. We passed that method. And it looks like that substitute command is the core of what the said command is doing to that file or the files within that directory structure. So let's zoom in a little closer. Here it is. Now what the substitute command appears to be doing is to find a match for this first pattern, this first regular expression and then replace it with this second pattern. So we have not one, but two regexes here that we need to figure out what they do. Now whenever I'm dealing with a refactoring that involves a regex, and by the way, if you can refactor a regex without breaking it, you can refactor anything. I like to first extract the regex into its own method, character by character, so I can get a better idea of what it does without all the clutter of the original method. So let's add in a new method and call it the first regex match method. And what this first regex match method is going to do is attempt to find a match in the string that is passed to it using the regular expression on the left. Now right now that's just an empty regex. This is using Ruby's match method for regular expressions and strings. So first I'm going to set up my spec. Now when I'm adding specs around a regex, I like to paste the original regex in a comment in the spec for me to refer to as I decrypt it. Then I'm going to create a string in a let variable. This is what I will pass to this new first regex match method. For now it's just an empty string. Then I set up the spec itself, verifying that the method we call finds a match within the string. And I'm going to expect that when I call this new method and pass it that string, it will not return nil. Now remember, we're using Ruby's match method. This method returns nil if it does not find a match for the regex within the string that is passed to it. So that means if the result is not nil when we call this method in our spec, it means it will have found a match. So focusing on that reference regex, it looks like the first character in that regex is a colon. So I'm going to add that to the string we are passing the method and rerun the spec and the spec fails. All right, it's easy enough to make this pass. Let's add in that colon to the first regex match method. Remember, we're just copying this regex character by character from our original method and verifying as we go along. Now we are technically creating new code here, even though we're copying and pasting. So our tests need to fail first. Then we need to write just enough code to make them pass. And the spec passes. So who's off? Now let's move on to the next character in that regex. The next character I notice in that regex is the Alnum character class. Now this is an older regex syntax, but it matches any capital letter, any lowercase letter, or any digit. Now I might start itching immediately to replace that older character class with a more modern syntax like the slash W meta character. But remember, we're verifying the behavior first before we change anything. So let's add in a letter to that test string, then run the spec again, and the spec fails. All right, we know how to make this pass. Let's add that Alnum character class to the first regex match method and rerun the spec. And this time the spec passes. So the next thing I wanna focus on is that plus sign in the regular expression. This plus sign is a quantifier. When it appears after a character or character class, it means it will find a match for if that character appears one or more times. So let's add in another letter to the test string, just to make that character appear one or more times, and run the spec, and the spec passes. But wait a minute. We were expecting the test to fail first. So let's investigate why it passed the first time. The regex in the state that it is in right now will return a match for any string with a colon and a single letter or number. So if we pass this our string with a colon and two letters or numbers, it will return a successful match after finding the colon and only one letter or number. So rather than just verifying that a successful match happens, we need to make sure that it matches the entire constant of the string that we expect it to. So let's alter the spec. Rather than just expecting that a match will happen, let's test the constant of the match itself. So when we call the 2s method on a Ruby match data object, match data is the type of object that is returned when using Ruby's match method, it will return the entire text match from the string. So we're going to expect that it will match the colon in both letters, not just the colon in the first letter. So let's run the spec and the spec fails. All right, let's make it pass. Back in that first regex match method, we add in the plus sign after the album character class and run the spec again. And this time the spec passes. All right, next step is to focus on those parentheses around the album character class and the plus sign. Now this is known as a capture group, which means it will capture whatever part of the string matches that part of the regex and save it in the computer's memory for us to use later. This time I'm going to test that the match return by the first regex match method contains a capture group. And to understand how to do that, we have to remember that when a match data object is returned by Ruby, it will include any capture groups within the object. Now we access the first capture group by calling a one in brackets on that match data object. Yes, the first capture group is accessed by using one in brackets, not zero in brackets like an array, but that is a topic for another presentation. And the spec fails. So let's go ahead and make it pass. Now as I mentioned earlier, we're technically creating new test-driven code here, even though we're copying it character by character from another method. So after I get a failing test, I want to add in just enough code to make that test pass. So in this case, I'm going to add in an empty capture group for the moment and rerun the spec, and this time the spec passes. Now let's get more specific about that capture group. Rather than just testing that it exists, let's test the actual content. So using that reference regex, it looks like the capture group should contain the match for the ALNOM character class appearing one or more times. Now if we look at the test string we are passing, that should be the letters A and B. So let's test it that first capture group contains the content we expect it to, and the spec fails. All right, so we are not getting that content in the capture group that we expect. So let's move those parentheses around the correct portion of the regex, that ALNOM character class appearing one or more times. And let's rerun the spec, and this time the spec passes. All right, now I would love to take you through adding each additional character of this regex, that space character class in that hash rocket, but for the sake of time, let's fast forward. Let's say that we've added each of these remaining characters of the regex to our new method, making sure we had a failing spec first, and then making the spec pass. So at this point we know what this first regex does, and we even have a sample string which we know will match it. So rather than continuing to test this regular expression in a separate method, let's add in a test for its behavior to the spec exercising the larger original method we are refactoring. And we do that by writing that sample string, something we know the regex will match, to the file we passed the original method in that before block. Then we capture the original content of that file we just wrote to. And let's capture a variable. Let's call that variable original contents. Then we call the method and pass it the directory which contains the subdirectory, which contains the file to the method we are refactoring. Then we're going to capture the new contents of that file after we run the method on it. So let's assign that to the variable new contents. Finally, we'll compare the original contents of the file to the new contents of the file. We're going to expect that the original content should not equal the new contents. The file should have changed when the method ran on it. And the spec passes. Now normally, I would go back into the code and modify it to make sure the spec fails. In the real world, we absolutely should do that. But for the sake of time, let's move on. All right, let's take a pause and stay with me here. We're almost through this and I will get you to your coffee as soon as possible. Now we know this method takes a directory which contains a subdirectory with at least one .rb file. And we know the pattern it is looking for in that file, but we don't know what it does with the match for that pattern after it finds it. So let's zoom in on that substitute command again. And we now know what this first pattern matches, so let's take a look at the second pattern with that substitute command we'll replace the match for the first pattern with. And here it is in focus. So let's go through what this does and then devise a test for it. So this first slash is an escape. Now in a regular expression, some characters have special meaning. If we want to use these characters without using the special meaning, we need to escape them with the slash. And next is the second slash. That's what we escape with the first slash. That second slash and that one mean we are reusing the content of the first capture group. So remember how a capture group in the first pattern captures a certain portion of the string? This second pattern will return the match of the first pattern with whatever it captures in that first capture group. Then we will add a colon after it inserts that first capture group into the file. So let's take a look at an example of this, make it a little clearer. Here's the larger substitute command. Now if we were to find this string, which we know matches the first pattern, it will replace it with this string, the result of the capture group and colon. So at this point, it's probably becoming clearer what this method actually does. It finds any part of the file that uses the old Ruby hash syntax and replaces it with the new hash syntax that came with Ruby 1.9. So let's alter our spec. Let's say we write this original string to the file before running the method, then after we run the method on that file, let's define what we expect the new string to be, what the old string should be changed to. So the new contents of the file after the method has been run should not contain the original string, but should contain the new string. The contents of that file should change. So now let's run the spec and the spec passes. Now again, we would normally go back to the code and make sure the spec fails since we're adding the spec to existing code, but for the sake of the running time of this presentation, let's go ahead and move on. So the next step would be to add in tests for these flags. Now unfortunately, I don't quite have the time to show these step by step. However, when I post the slides of this presentation, I will include the slides which detail how to figure out what these flags do and add in spec coverage for each of them. These are essential parts of the method to bring under test before we make changes to it. So if at any point you'd like me to walk you through this, just grab me sometime at the conference and I'd be happy to take you through it. So for now, let's fast forward. Let's say we've added in these specs and verify that they pass and fail and we expect them to. And at this point, we would have successfully verified every or nearly every piece of behavior in this method. Now we know what it does now. We have it documented. We have tests in place to verify its behavior. Now we can actually make our changes. And by the way, everything we just did was part of the pre-op stage. So let's move on to the actual operation. This is where we make those changes to the code. So let's look back at the code and decide where to start. First of all, when I saw this, I started itching to move the Substitute command to its own method to make things a little less intimidating and confusing at first glance. So let's go ahead and move it into its own private method. And then let's call this private method from our original method and run the specs to make sure the behavior did not change. And the specs pass. So we have successfully made a change to our code without altering its behavior. Now next, I'm really not big on that old regex syntax in the Substitute command. It's really old. I had to look it up to understand it. And I think a more modern syntax would be clearer. So I'm gonna replace the ALNUM character class with a slash W in the space with a slash S, the newer regular expression syntax. So that looks better to me. Now let's rerun the specs and this time they fail. We broke something. The reason is that although Ruby's regular expression can handle meta characters like slash W and slash S, SED uses a much older regular expression engine. This Substitute command is being executed by SED, not by Ruby itself. Now it turns out SED cannot use enhanced regular expressions the way most modern engines can. Meta characters like slash W and slash S are considered enhanced regular expressions. As if regular expressions weren't confusing enough, there are varying types of them. So fortunately, since my specs failed, I know that those characters do not work. So let's go ahead and change these back and this time the specs pass. So taking all that time, adding in the specs kept me from breaking the code. And although there are more changes I'd like to make, such as making the name of the class and the method much more helpful, for now let's move on to the third stage of surgical refactoring. And this is recovery. This is what we do after the refactoring. So after refactoring, there are two things we must know decisively before we deploy this code to production. First, does the code still behave the same after the changes? And second, is it connected correctly? Now this means we need to dive into all the parts of our application and anything outside our application that might use this code. Now I unfortunately do not have the time right now to go into this more in depth but is an essential part of lowering the risk of a refactoring. And now what about QA? If our workplace has a separate QA department, isn't it their job to make sure our refactoring works? It's always good to send any code to QA before you deploy it. But remember that ideally QA should find nothing. Now that doesn't mean they won't. Often there are bugs that we as developers just can't anticipate. But that doesn't mean we should allow any slack in our tests. Surgical refactoring is about doing everything in our power to avoid bugs sneaking into a refactoring. The better we do our job, avoiding all the bugs we can control, the better QA can do its job to find the bugs we cannot control. Now even after rigorous testing and even after thorough QA, sometimes something still goes wrong when a refactoring is deployed to the production environment. Now if this happens, the first step is to take responsibility. As developers, what our code does in production is our responsibility. Not QAs, not test users, and certainly not the ops team. The ultimate responsibility for what code does in production belongs to the developer who wrote the code. Next, evaluate the risk of another change. It's not always the best thing to immediately roll the refactoring back, particularly when it's been sometimes since the refactoring was deployed and other features that might depend on that refactoring have already gone out. Now it might seem like a quick bug fix will fix the problem, and it very well might, but a change to a change is still a change and the risk must be evaluated. Finally, it's our responsibility to fix the problem. Again, not our teammates, not the ops team, it's our problem if it came from our code. If we can't fix it, and I have personally found myself in this situation, it is our personal and professional duty to find out someone who can and make sure the fix gets done. And with that, I'm Nell Schemeral Harrington. I'm a software development engineer at Chef. Here's how to find me on the internet, and that's it for me. Thank you very much.