 There are very few things that draw my attention like a looming deadline. Some part of my brain watches in morbid fascination as the deadline approaches, wondering whether I should call emergency services or perhaps just make popcorn, settle in and watch the show. I can't help but keep a wary eye on it. And that's a wary eye that I am no longer paying attention to the code at hand. And I need all the wary's eyes I can get. Because without them, forget best practices. I revert to less sophisticated problem solving strategies like guessing and desperately copying code off of Stack Overflow. Now sometimes I'm happy. There are moments when the world melts away, your sense of self dissipates. You become completely absorbed in what you're doing. Time appears to slow down and speed up simultaneously. And being awesome feels effortless. So I've taken to coming into the office early in the morning and committing random acts of refactoring. And some people are calling this guilt driven development. But really it's not, refactoring just makes me happy and more refactoring is an optimization. So today I'm going to tell you a story. The arrow here actually needs to point the other way because it's a refactoring story. And it's a story that has a beginning, two middles and an end and a moral. So to be clear, when I say refactoring, I mean small careful steps that improve the structure and the readability of the code without changing its behavior. And tests are implied. So once upon a time, there was an application that performed some incredibly dirty hacks in order to deliver data into an old school, real world hard copy publishing system. And I found this particular specimen in the dark recesses of that code base. It was in a module that was about 300 lines long, most of which was in a single method. And this module talked to code all over the application. It dealt in temporary files, FTP, it shelled out. It used EXIF2 to write metadata into JPEG files that handled encoding. And it had a comment in it that read, a kitten dies every time this code is run. We had it running on a cron job. It had no tests nor was there any documentation. So this is essentially a bucket of business logic and it jumps through hoops. And the whole point is to name a file so that it ends up on an FTP server in the right place. There's a comment. It's almost as bad as the code. Also it's wrong. The input to the method is target. And target is the God object in the application. So it's an object that inherits from a massive ORM base class that has over 300 methods in its public API. And target itself has 500 lines of code on top of that. And the big picture here is that information is going to be shoved onto the string in order to build up that file name. Now there's a bit of code that's not actually shoveling things onto the string. This draws the eye. It seems to be a chunk of something. And it's probably a good candidate for extraction. So another striking aspect of the code is that it's littered with a bunch of low level details. And it makes it harder to see what's going on. But it does seem like amidst all of this clutter, there are chunks of behavior that could be extracted and named. So what are we up against? We have a very large method. It appears to be doing things at two different levels of abstraction. And nothing at the lower level of abstraction is being named. So these are not characteristics of good code. A thing to do with big, ugly code is to break it down into small, ugly code. So there's a critical difference between breaking code down and breaking code. So the first middle of the refactoring story is adding characterization tests for this method so that we can then, in the second part of the story, the second middle, do the actual refactoring. So the question that I start out with is how do you begin testing something like this? We don't know what the inputs look like. We don't know what the outputs look like. We do know that it's in production and it appears to be working because we haven't had any complaints from the customer about it. So the easiest way to discover inputs is to send something, anything, really, into the method and then see what comes back out. So we need an assertion to work against and again, it doesn't matter what that assertion is because it's gonna be wrong. And the really nice thing about being wrong is it tells you what right looks like. So running this fails, obviously. And the error message, helpfully, tells us that we are missing an input and it also points us to the exact spot in the code where this is happening. And what we find on line six comes as no surprise at this point. The message published on is being sent to target and it returns a date. Meanwhile, back in our tests, we now know that the stub represents a target and we're ready to start fleshing out the inputs. The one we need right now is publish on which we know is a date and the failure tells us what our next input is. Digging around in the code reveals that XYZ category prefix is a string and next we have kind which is also a string. Personal is a boolean, ID is an integer and title is a string. And this gives us our first real failure, meaning that we've gotten all the inputs right. So these are the inputs. These are the messages that we need to send to target in order to build up a file name and with the inputs in place, we're also given the output which looks like this. Now all that practice copying and pasting from Stack Overflow comes in handy. Copy the string from the failure into the assertion and the test should pass. And it doesn't because we have random characters. So I chose to use a regex to fix that and did get a passing test. So we're not quite done yet. There are a lot of low level details that are not being exercised by this very nebulous test and there are alternate paths through the method. So we need to improve the inputs for these three lines of code and then add test cases for the lines that have conditionals in them. And at every step, failing tests tell us how to tweak our assertions. And these are trivial to fix and boring to watch so I'm gonna spare you that. I'm gonna show you some highlights. So publish on gets zero padded and pie doesn't get affected by that so we're gonna fall back to E. Kind replaces underscores, so we need an underscore so that we can see it being removed. Title, title appears to strip out everything. The existing input has very little cruft. We need at least a number, some more funky characters, an uppercase letter. I don't know if you saw that but there is definitely something fishy about that regex. What's with the square brackets? I'm too lazy to reason about this so I'm adding a test case overriding the stubs input for title to give it some square brackets and this proves that indeed the brackets get left in place which is most likely not the intended behavior and the test serves as documentation so that we can clarify with the customer later. Now the first conditional deals in personalization and our stub doesn't personalize so we need a test case, test for the case that does and we're informed about yet another input that's missing which is trivial to supply and the other conditional on that line of code provides a fallback in case age is nil and the test for this stubs out the exact same inputs as the previous one and the final conditional is a ternary statement that's trying to determine where to truncate the target's title and our default input for title is neither long nor short so by making it longer we can make sure that it gets truncated and then the only case left to test for is a title that is too short to get truncated and this gives us protection against regressions so what have we actually accomplished so far? We took a completely untested, undocumented piece of code and then with some hand waving we got fake assertions to tell us the inputs the inputs gave us the outputs and the outputs gave us the real assertions which is almost karmic then we had a reason about the code we picked through every single line and determined what inputs would actually exercise that line of code and then we made sure that every single branch of every conditional was being called from a test and the tests themselves here are not providing a specification there's no design being done we're casting a safety net and we are getting documentation out of it up to including the fact that we probably have a bug but the biggest win is that we can start changing things without breaking into a cold sweat so here's what we're gonna do we're gonna isolate the method into its own class and then we're gonna extract a bunch of tiny methods and I'm gonna borrow detailed prescription from Martin Fowler's book Refactoring and it's called Replace Method with Method Object so traditionally this is a refactoring that you do when you have a bunch of local or temporary variables and you don't wanna be passing those temporary variables around the methods that you're extracting so first of all we need to make a home for the code add an initializer that takes the target add an attribute for it and then just copy and paste the body of the method in there the method should not be called on the class it definitely needs a better name it no longer takes any arguments and then back in the old module we need a reference to the new method from the old one so just delete the body of the method reference the new file and instantiate the class pass in the target call name on it and we're green which means that we have a license to go to town on this code so where to begin it always feels good to delete something so let's go ahead and get rid of the comment and now somewhat arbitrarily I decided to go from biggest to smallest so the biggest chunk that we've seen for extraction is the truncated title bit and the actual mechanics of extracting a method are surprisingly more complex than you'd expect create an empty method and name it for what this thing represents or what it does not by how it does it coming up with a good name is often the hardest thing you'll ever do in a refactoring the next step is to copy not cut the lines of code from the source method into the new one then scan the new method for local variables so file name here is declared in the source method and we have a choice to make we can either pass the variable in or make it an instance variable and do the mutation here or we can make this a query method now the whole point of the refactoring is to separate out the two levels of abstraction building up the little file name in one place and dealing with all the fiddly details somewhere else so we're gonna just return the value that we need and all the other temporary variables are local to this method so now we need to assign the result of our new query method delete the temporary variables that are no longer used in the source method and then we're green again now before we move on I'd like to tidy this up a little bit there's some unnecessary work going on in the regex we're doing a case and sensitive match and then we're done casing there's also something in the match of brackets that drives me absolutely crazy the spurious parentheses have got to go and another thing that bugs me is the fact that we have a ternary statement to determine the range for the match if now a match if is not gonna raise an exception if you try to truncate a four character string to nine characters it'll return four characters so the ternary statement can go and since we no longer need truncate two we don't need to worry about the length of the title so that can go as well and with those two lines gone there seems very little point in having a temporary variable so we can ditch that as well effectively leaving us with a single line of code in truncated title so the longest line of code is now the hex digest stuff so hex digest is an absolutely terrible name in this context noise seemed appropriate copy stuff in, assign the result and work green again the personalization line is now the longest can perform a quick extraction publication day, same thing and x, y, z category prefix so in this class which is named x, y, z file the x, y, z bit is redundant and the fact that it's a prefix is completely irrelevant so we can go ahead and drop the x, y, z prefix and the prefix suffix and just call the new method category and then we have kind and this leaves us with a fairly readable method and we could totally leave it at this but there's some spurious stuff going on in here take publication day for instance publication day returns a string so why are we interpolating it the same goes for category and kind and target ID is an integer but string interpolation will already call to string on it and then there's a move that might seem a little bit subtle until you recognize that that kids game spot five differences so which pieces in this code are different from the others all the file name pieces are being shoveled on to the string except the first one so if we start off with an empty string the first line becomes like all the other ones up to a point the extension is not like the others so if you conceptually separate this into two jobs building the actual file name and then slapping the extension on it we're left with two distinct sections one where everything is smushed together and then another where the pieces are separated by underscores so if we extract the smushed together pieces into a method then all the pieces are separated by an underscore and we can shovel them into an array rather than a string which gets rid of all the interpolation syntax then you can join them with an underscore so there done is it perfect of course not is it better well yeah we went from this to this in less than 30 steps and all those extra methods are one liners they hide the unimportant details of the implementation until you decide that you need to reason about them so let me just show you that again before after before after so first we quarantined the method then we extracted stuff now extracting stuff essentially identifies a piece of code that performs a sub-task and gives it a name back when we started out we identified three different problems with this code a long method, two different levels of abstraction and unnamed abstractions and extracting stuff addressed all three of the issues having done that we were still left with some ragged edges so we went ahead and polished up the code a little bit so that was the second middle of the refactoring story I'm going to conclude by taking a look at pointless cruft so in the field of information design there's a term chart junk and it refers to visual elements in charts and graphs that add no value they don't help you understand the data they'll often get in the way of comprehension and they basically just add noise so to give you an idea of what we're talking about this graph gets just about everything wrong and this is the exact same data presented without chart junk the code junk is the code equivalent of chart junk it's not about coding practices it's not about dry or srp or small methods or naming it's purely about noise and what that means is going to depend on the language in the environment that you work in so I've tried to classify code junk in Ruby and came up with this list number ten does a bear shit in the woods comments shouldn't echo the implementation they shouldn't be wrong they shouldn't be imprecise and they should not be misspelled nine it's my conviction that everyone should make their editor show them trailing white space because this so if you leave it in you have noise in your code and then when you take it out you have noise in your diffs that's just annoying eight if it's dead let it rest in peace what were you saving it for anyway you need to learn to let go seven if you do this in Ruby I will come to your house and take your keyboard away what is your excuse this is vile when I see this I secretly wonder if you wash your hands after you go to the bathroom six some code changes nothing Ruby has some intelligent defaults we can use them uh... this is kinda cool uh... it does nothing has no point I don't know how familiar you are with Ruby but there's an idiom where you can use a post fix modifier to only do something to an object if it's actually there except in order to get to this line of code we're already sending a message to the object and if it's not there it will already have blown up alright let's talk about dependencies in particular dependencies that you're not actually depending on may or may not actually uh... impact performance it is going to impact the performance of your tests which is a bit of a different story but mostly you just added noise to your code for how to say no and really mean it this is a true fact this is another true fact don't do work that the computer is already doing for you we saw this earlier uh... not to beat a dead horse but this is gross uh... more stringifying strings same deal and this is kind of the same thing because join is going to implicitly call too strong two-string on every element in the array so the call to map is superfluous here the match of brackets don't need your help we saw that earlier also from today's example the computer here is not doing the work should be the test suites should be curated maintain fed watered coddled and minimized so more tests does not mean more certainty or more correctness now there is an exception and that is if documenting behavior outweighs the cost i haven't actually been able to make up my mind which side of the fence this example falls on finally the compounding sin there's this old saying uh... one plus one equals three for very large values of one and this is revolting uh... it bears a horrifying resemblance to the code that we start out with today when some people say that software is grown not built i'm always reminded of this it's kind of like fungus earlier i said about the refactoring that we extracted stuff we quarantine the method then we extracted stuff and then we kind of polished up the code a little i'd like to make this uh... a little bit more precise but we actually did was that what first we quarantined the method then we extracted stuff and then we eliminated code junk so that was the ending of the refactoring story and before i get to the moral i'd like to mention that the whole thing was on github the refactoring was committed step by step if you want to dig into any of the mechanics or in the test or anything dig through the commit history it's all there if you want to dig deeper into the subject of refactoring there are books martin fowler's canonical text along with uh... the either in the original or translated to ruby and i'm pretty sure there are other translations out there this provides a catalog of step by step refactoring and it also provides a catalog of code smells or issues that you might run into into in the wild and then maps those issues to refactoring that will address them working effectively with legacy code by michael feathers explores a bunch of techniques uh... to help you deal with that special flavor of crazy that you get with a big ball of mud saw uh... design pattern both of those books uh... can be really intimidating they assume that you know a lot about or object oriented design and good code and if you don't you they might leave you more perplexed than enlightened uh... i would highly recommend reading practical object oriented design and ruby even if you don't do ruby uh... most of the book is language independent and it's one of the best reads i've ever had on on good design the moral i uh... i have a vague recollection of that night when the code that we refactored today was written let me rephrase that i vaguely recall writing it when i panic i write god awful code and other people do yoga or go whitewater rafting or meditate and i refactor it soothes me and science if you believe that sort of thing actually explains why the key player in the game is working memory so working memory is the thing that allows you to do mental arithmetic you store off temporary variables in memory as you work through other parts of the problem and the main differentiator in performance of tasks that rely heavily on working memory is the type of strategy that you're able to employ in order to solve the problem so people with better working memory are able to employ more complex more successful strategies here's the deal if you worry in ways that involve mental chatter you use up available slots in your working memory and you basically no longer have the swap space necessary to use complex strategies so you fall back on more simplistic and less uh... successful ones and this is where refactoring comes in because refactoring gives you an exo brain it's a lot like those details that usually end up in working memory can be put into your tests and they get offloaded there so that once you start refactoring you start uh... reclaiming your brain and actively counteract panic now one of the really unexpected things that happened when i started refactoring simply for the act of refactoring itself was that i started optimizing for happiness put something in its own class so that the test would be really fast uh... and the feedback loop that i got was unbelievable and it's a huge enabler for flow and flow is nice um... selfishly isolating code in order to improve my own happiness turned out to be a pretty good deal for the code as well suddenly i was avoiding dependencies i was asking myself constantly can this be a separate class how about a gem what is the unrelated sub problem here and it made a dramatic difference so in summary refactoring makes you smarter by providing you with an exo-brain it's preventative and curative with respect to panic and if you optimize for developer happiness by making your tests really fast you get loosely coupled code that adheres to the single responsibility principle thank you thank you katrina would you like to take a few questions we have time we have about five minutes so if people want to ask questions please just queue up to the mic and we'll take some questions for about four or five minutes or we could just talk over coffee outside it's probably a better choice i do want to say that i work for jumpstart lab we do ruby and rails training uh... if you have a team that's transitioning to ruby or a team writing ruby that would like to level up come talk to us thank you