This is for my talk. Let me explain what I'm trying to do here. It's a testing talk. It's about applying tests to the Flog project, and I'll get into a little bit about what that is and what I'm doing.

I had two motivations for doing this talk, and they have to do with other presentations that I've seen. The first is that every time I see a testing presentation that shows examples of code, what I usually see is three-line methods and very simple tests: this is how you do this. Some of those are great presentations. Bob Martin gave a great talk at RailsConf at one point where he was showing small snippets of code and working through them. Fantastic talk. But most of these examples are contrived, or they're from code bases that are not real in a sense: not something that I would encounter in the wild, not something that I'd ever have to work with.

That's the one motivation. The other motivation for giving the talk is that a lot of times we talk about testing and we talk about TDD: here's how you write a specification or a test before you write the code, and then here's how you write the code. So starting from scratch, from ground zero, here's how you write something that's well tested, and you'll get this great test coverage, which is a great thing. I think that's what we should be doing. The problem is that the bulk of the software that I deal with is not something that I'm writing from scratch, and I think for most people that's the case. So you end up in a situation where you find a code base that has little to no tests, or bad tests, or mediocre test coverage, and you want to improve it, you want to refactor it, you want to do something with it. That means that if you're doing things right, you need to be able to write the tests for this code that doesn't have them or that has bad tests. And nobody really talks much,
especially in presentations, about doing that. So what I wanted to do in this talk is actually give you a real-world example where I wanted to do this. Here's a project, there were no tests, and I'm adding them in so I can do some work.

That being said, my fear about this presentation is that it will be hard to follow. Since I'm dealing with a real-world code base, sometimes I'm going to be looking at 40 lines of code at the same time, and it's real code. It's code that someone else has written, and then I'm going to say, OK, I've changed it this way, so I'm going to be showing a lot of code snapshots. What I'm trying to do to preempt some of the problems that might come up with that is actually push out to you the information that you need to follow along now, follow along at home, or, if you're seeing this on the web in one of the Confreaks presentations, to be able to pull this up and work along with the things that are going on. Due to the time limit on the talk, which is about an hour, I can't cover everything that I did, but I'm going to hit as much as I possibly can.

To make this possible, I'm trying to push this out. So what I'm asking you to do, if you're at all interested in following along with the talk, is do one of these three things, preferably in this order. First, you can run Gitjour and clone the information from me, and you'll get a little project out. That's probably the best way, because it's the local network and it's probably pretty easy. The downside, as we found out this morning, is that something has changed in Git with respect to mkdir, and Git 1.5.6 and above seem to be broken.
I notified Chad Fowler. We'll see if there's a workaround any time soon or if we can put out a patch. But anyway, that's the preferred method. Second, I've been passing around CDs. If you haven't seen one, if you need one, simply take it, copy what's on it off, and if somebody else needs it, just hand it on. Otherwise, if you haven't done either of those things and you're interested, you can get the PDF of the presentation and the snapshot of the information I'm trying to push out from the web, and you can pull that down. That's the least favorite, because it's going to suck up our bandwidth.

So that said, let's get into what this is. I am Rick Bradley, and that's showing up, right? This is about what I'm calling characterization testing. I didn't invent that term. We'll talk a little bit more about where that came from. And as we mentioned, we're going to be talking about Flog.

I am from Nashville. I work for OG Consulting. You can find us on the web at ogtastic.com. We are Nashville based, and we're excited to hear that the Hoedown is going to be in Nashville next year. This is an official offer to try to help out with making that possible, Jeremy, wherever you are. If you want to read what we write, you can find us at blog tastic. We keep a tumblelog, Nealist, and you can find a lot of our projects on GitHub under flogic.

For people playing along at home, these are the resources I'm making available about this talk. The first one is my fork of Flog. I'm calling it Flame. You can find it on GitHub. The talk itself, the Keynote slides, are actually on GitHub also if you want to pull them down. They're under Rick hoedown 2008, and the resources tarball is going to be up there, and definitely at FLOG snapshot.

Let me talk a little bit about what Flog is, for people who may not remember or may not be familiar. It is a project written by Ryan Davis from Seattle.rb. He goes by zenspider.
You'll see that online sometimes. It's part of his suite of tools for torturing code, the idea being: let me put my code under stress so that I can figure out where the weaknesses are and improve them.

What Flog does is analyze the complexity of code. There are a lot of measures from the 50s, 60s, and 70s where people say that if we look at how a parse tree breaks down, and we look at how code behaves when it runs, we can find out certain things about it, and we can use those to measure complexity in certain ways. The feeling is that high-complexity code is going to be code that is poor in some way to maintain, or poor in some way as far as performance, and so on. So you take a little snapshot of code and you apply an analysis to it where you say, I'm going to look at assignments and branches and various method calls, and I'm going to think about how various things, in Ruby in this case, should be penalizing me as far as complexity and manageability. Flog applies that analysis and gives you a set of scores. It'll give you an overall score for your project: this particular piece of software, a Rails app or a Ruby file or library, has this certain Flog score, and the most offensive, most complex methods are these, in this order, and they have these scores, and they break out this way. This test_blah has an 11.2 score, six points of which are because it used eval. There are some branches, some puts, and so on.

We use Flog in our work daily. We look at our own code, but we also do audits for folks. If you have gotten a look at the most recent Rubyist magazine, there's an article in there about auditing code, and one of the things that the author mentions is that he uses Flog to look for complex areas in the software.
These are hot spots, things that we're going to run into, or that the owners are going to run into, when they try to maintain this code.

One of our clients has a stable of developers of varying degrees of experience, and what we find, watching the commits that go by, is that some of the developers do a pretty good job of keeping things clean and readable and performant, while other developers don't. What we see, if we run Flog, is that the code of those developers who are having trouble keeping things clean shows up at the top of the list. What we want to be able to do is to say: here's an example Flog output, at the top of the list there is a method, it has a certain score, and these are the guys that contributed to that method. Over time it lets us focus on who may need some help refactoring code, who needs to pair with somebody else, who may need to come in and get more code review, and those sorts of things.

So my motivation for even messing with Flog at all was to be able to do a sort of Flog plus blaming. If you're using Git, we can pull up Git on a file and say who is involved in this, and if we run Flog against the file we can say, here are these methods. Can we correlate the two and produce the report that I'm looking for? So I said, let me try to find a way to do blaming with Flog so we can report on this.

The problem, or a potential problem, once I got to looking at Flog, is that there are actually no tests. I talked to Ryan Davis about it and asked, is there a test suite that I'm missing? This was held in a Perforce repo in Seattle, and the gems are on RubyForge, so I figured I was probably missing something. He said no, actually there were no tests, because this was a spike. This was something that we wrote in a couple of hours and let out for people to play with, and it's taken on a life of its own.
So you end up in the situation where ideally there would have been tests, ideally those things would have been there, but they aren't. Now I'm in a situation where I could either go through the system as it is and make the changes that I want, because it seems to work pretty well, or I could consider that maybe I should start adding tests to this.

Philosophically, that brought me to a point where I began to ask, what should I be doing? It made me think about my history as a developer, and as I sat down and tried to reason through it, I said, maybe there's a general history of developers, an evolution that comes about, and I tried to document this a little bit. The first thing that comes out when you think about it is that there's the programmer who just walks in the door, and all he wants to be able to do is to code something: can I code anything? A step ahead of that, you may find yourself saying, I've evolved a little bit, I feel like there's probably nothing that I can't code in software. And I remember, not too long ago, those days where all I've got to do is be able to code this.

The problem is that eventually, more often than not, you need to be able to fix the things that you've written. So you step up a notch and say, I think maybe I should work on my problem-solving skills, and I could probably fix that. Moving a step further, if you fix a lot of things and you spend all of your time fixing, you finally start to step back and say, maybe there's a way that I could write things so that they don't need much fixing. When that comes about, eventually you hear about testing. Wait a minute, maybe the way to do this is that I should write these tests to help me along. You get involved in testing, but you're still fixing things and you're still writing software, and you end up at a certain point where you know some things, but you're doing your
testing maybe when it's all done. When all the dust is settled, you'll get to the tests, which, from first-hand experience, unfortunately means that you won't get to the tests.

So what ends up happening is that you finally get a realization. You hear somebody talk for the 50th time about TDD and writing the test first, and then you say, hey, wait a minute, if I write the test first then the tests are already written. You finally begin to evolve into a state that's almost human. And then you go see Uncle Bob talk, and he'll tell you things like, it's not good enough just to do TDD. Let's be simple and let's be rigorous about it.

He's got three rules. If you look at Bob Martin's website, go search for Bob's three rules. What you'll find is that his first rule is that you don't write any production code unless it's to make a failing unit test pass. So if there's not a failing unit test out there, you're not writing production code. That's the first safeguard. The second is that you don't write any more of a unit test than is sufficient to fail. For Ruby, part of his rule doesn't come in, which is that a failure to compile is a failure. In C, C++, Java, and so on, a lot of times just the failure to compile is the failure of the test. But anyway, you don't write any more test than gets it failing. And then finally his third rule: you don't write any more production code than is going to make that test pass.

If you've ever heard of ping-pong testing or ping-pong coding, where you give the keyboard to one person and they write a unit test, and then they hand it to you and you say, OK, great, I've got to make this test pass, and then you make it pass and hand it back, back and forth, and you sometimes switch off roles, that's sort of what he's talking about. But you can do that with yourself. You follow these rules and you write the code, and that gets you to a point where you're starting to
write good, well-tested code.

You can take it a bit further and do what Brian was talking about yesterday in his great lightning talk, which is to say you're going to test everything all the fucking time. That's test infection. Your code is completely test infected, and sometimes it's very difficult. Sometimes you're testing Rakefiles. Sometimes you're testing command-line scripts. Sometimes you're testing GUIs. Sometimes you're testing JavaScript on a web server. How does that happen? You have to figure out how to do that, and part of this talk is going to be about giving some examples for some of these things.

The one exception I would make to Brian's rule, and I would encourage you not to make this exception in general, is when I talk about a spike. People talk about spiking something to figure out how to do something. I consider this scratch paper for testing. I'm going to go off and do something on the side where I'm just trying to figure out what in the world I want to do: what's the behavior, or is it even possible to make this work? I call that a spike. I'm trying to test drive everything, but sometimes I've got to sit down and do that. The difference between what I think most people do for a spike and what I recommend that you do is that when I write a spike, and I'm done with it, and someone says, let me see it, or, let's start from there and use that, I say no, I've got to throw this away, because it was a spike, because it was not written test-driven. It's garbage. It was scratch paper. It's not something I'd ever want to show to somebody. So that's my one exception. If you feel absolutely compelled to say, I'm not going to write tests for this because I don't know what I'm doing, go ahead and do that. Treat it as scratch paper, but make sure you throw it away. Never release it.
Go ahead, Brian. OK, so Brian was saying he agrees: make a spike, throw it away, feel good about that.

I guess part of the evolution discussion is that I feel like I'm somewhere on some evolutionary ladder that says I've got to test. Those slides were a bit of a joke, in that I think there are lots and lots of supermen beyond where I'm at. I see these guys out there, I see Guy Steele and I see Uncle Bob and Martin Fowler, and they're doing things that I can't do. So there's this whole series of steps that are yet to come, hopefully, in an evolution. But I can't allow myself to go through this Flog update without trying to add tests to it. So I could either give up or I could go ahead and write tests.

Having said that, having done enough test driving and having done some refactoring toward testing, one thing that I've learned is that test driving code from ground zero is a different skill than working with existing code and trying to bring tests to it. It's a lot cleaner. You can make your code a lot better starting from zero and test driving your way there. With existing code that doesn't have tests, it's often harder to know where to start. I'm looking for a behavior and that behavior is not here, or there are things that it does that I don't fully understand, but I want to make it do this thing at the end. How do I do that? The code that we're messing with needs change, but it's resisting change. That's one of the lessons here.

So the overall technique we're looking at: we're not going to change existing code without test coverage. I'm not going to refactor this method unless I've covered it with tests. I'm not going to add behavior unless I've covered this system with tests, and I'm probably not going to add behavior until I've covered most of the system with tests. So I know this is what it does, it's well tested, and now if I add behavior, we're there. This set of tests
I'm going to add is like a scaffold. I've got this statue up here, the Statue of Liberty, and I'm going to do work on it. I'm going to put a scaffold around the entire thing to protect it, and once that's done I can work on it and change it. Once we've done that, we're going to be able to refactor, and then we're going to have solid tests and refactored code. That's where we want to be.

Let me bring in a phrase that has a lot of meanings to a lot of people, and that's the phrase legacy code. I'm going to steal a definition, because I happen to agree with it: legacy code is code written without tests. You could have written this code last week. You could have written it today. You could have written it 40 years ago. Code without tests is legacy code. That definition comes from Michael Feathers. He's got a book called Working Effectively with Legacy Code. I've actually got a copy with me, and I can pass it around if anyone wants to look at it. It's one of the best books on testing I've read in a very, very long time. He talks about legacy code, even code that you wrote last week, as code without tests, and he talks about how you deal with this. What are the strategies, the specific things that you do, to get this code under test?

So one thing is that code written without tests is different from code developed from scratch with tests. I've been saying that, and you'll hear it from other people. It's structurally different, and I'll show you some specific examples of how that happens. Test-first code is also different from test-later code, but it's even more different than untested code. So those are some things to look for as we go along.

The reason I'm giving this talk, ultimately, is because I'm working on Flog, and I'm doing that for my own good. But the motivation for getting this thing done, in a sense, is that I've got a talk to do. So I'm doing TDD in a different sense.
I want to rewrite Flog. I want to get this finished so that you guys can benefit from this. Again, here are the links to things on the web. This is more for posterity.

So let me talk about what I've given you: the software that you've picked up either on the CDs or from Gitjour. On the CDs it's expanded. If you download the tarball from the web, you unpack it. What you'll see is a series of 126 directories, and these are all of the commits that I did from the time I took Flog from the Perforce repo in Seattle and stuck it on GitHub until the point where I said, that is now tested well enough for me to begin adding these features that I want.

I've provided all this information in a certain format. First off, every directory has a Git revision hash. There's this big long hash number after the dash. If I were to say git log in the project and give it that number, it would show me the commit that this came from. In the directory there's the full rcov HTML output for that particular snapshot, so I can open it up and say, OK, last month when he did this thing, here's what the test coverage looked like. The next file, flog.txt, is the output of running Flog on this particular code at the same point in time. This code is of course Flog itself, so I'm running Flog on Flog, and we can see how it got better. Did it get more complex in different areas? There's also the Git commit log. So what was the message I wrote? Oh, I refactored this thing, and here's the whole diff of the things that I did. What I was doing when I wrote this presentation was that I pulled up all of those git log diffs, opened them in TextMate, and just went through one by one to recall what it was that I did, pull out the interesting snippets, and say, here's what's going on. There's a text dump of the rcov output, useful for generating statistics if we want to do that. And then there's the whole source tree.
So at that revision in time, this is what Flog looked like.

What I'm going to end up doing as we go along here is to say, here's some code, and this is commit number eight. Right before that Git commit hash there's 0008, and the next one is 0009, so these are in order. You can open up any of these directories, and when I say this is commit number 122, you can pull it open and say, ah, that's the diff, that's the code, I see what's going on. You can also take those hashes and stick them into GitHub. Great place. There are four or five places on the screen where that hash number appears, so you can look through and say, oh, that's hash 1513 et cetera, et cetera, and see what was going on on GitHub.

So how do you do this? I've got an application that has no tests. I want it to do something new. I've got a long road ahead of me. What do I do? The first thing you do is look at the application and say, what in the heck is this thing? Is it even worth trying? What am I dealing with if I try to do it?

So you open up the directory. This is at the bottom right of the screen. This is commit number one. If you open up the files in commit number one, you see this in the source code tree. There's a History file, a Manifest file, a README file, a Rakefile, and a couple of scripts that Ryan uses for some purpose that I don't understand. It's analyzing some of his tools, something like that. But there are two things I do care about. There is bin/flog, which is the command that I run to flog a piece of code, and then there is lib/flog.rb, which is the library that bin/flog uses, the one that does the magic that analyzes the code. Once I looked in here, I knew what this was. That's all I care about, so let me look at these files and see whether this is a feasible project. bin/flog.
This is the command-line script in its entirety. It's a nice, small, condensed script, and I look it over and I see a few things immediately that make me say, hmm, what is this? OK, I see a usage paragraph that's going to tell me how things work. I see at the very bottom the important stuff. It's loading up the Flog class, it's flogging some files, and it's reporting. That's pretty readable. I understand what that's doing.

At the very top of the script: ruby -s. I had to go look that one up. I'm not an old-school Rubyist. I haven't been around for 15 years. ruby -s says that if you give me a -h on the command line, I'm going to set $h, the global variable, to true. That's what that's doing. That's in-line argument processing. So the two other little arrows here point at defined? on a global. That's a warning flag. I've got a $h that's going to be defined, and he's checking to see if it's defined. The reason this is a warning flag for me is that I'm going to write some tests that are going to turn the h flag on and off, but you can't undefine a global, so I can't turn that h flag off in testing. I'm going to hit that later.

So I do a quick look, and I say, OK, if that's all there is to that, I can handle this. Let me go look at the library. Wow, OK, there's a lot more stuff in here. This isn't intended to be seen really closely. You can look through this later, but I'm giving a high-level look. The first thing: there are some libs. It looks like ParseTree or something that Ryan wrote. There are some global variables that look like option stuff. And then there's a class that inherits some stuff, and there's data related to, oh, method, remove method, define, eval, each. OK, these are the opcodes that could be in my program.
So there's data related to this in here. As I scroll on down, oh, there's a bunch of methods, including initialize, flog_files, and report, that were called from the command-line binary. That's the stuff I'm going to be messing with. And then down here below there's a bunch of process this, process that. It's process block, case, call, each, if, else. So these are the things that deal with, oh wow, in my code I see an else, let's give that a three-point-oh, and I see a whatever, let's give that a two-point. There are a lot of these things. I need to expect that, because Ruby's got a lot of opcodes and keywords, so that's kind of what I would expect to see. All right, that seems feasible, and I think that I can probably get my hands around some of this stuff.

Let me look into the things I know I'm going to need. There's an initializer in the library. It calls super, OK, because it inherits from a sort of S-expression thing that Ryan wrote that I don't fully understand, but I know it has to do with ripping apart Ruby. I'm constructing things in the constructor. I'm doing some stuff. I don't know. I figure I can handle that, so let me go on. There are a couple of other methods. There's bad_dog and there's bleed. I don't know what those names mean. I presume I'm going to figure this out. I know that the code works, and I know that Ryan's a pretty good programmer, so let me figure out in time what this does.

Let me look at flog_files, because this is the loop that's going to process all the files I give it. OK, what do I see? Well, I see a big chunk of stuff in the middle, in the square. That looks kind of like something that I might want to extract later. Maybe that's something I can refactor out, because this is kind of a big method. I see some globals from my command-line arguments.
I see something that's kind of confusing there, I'm not sure. And then I see one thing that really tips me off that this might be a headache, which is that flog_files calls flog_files. So I've got a recursive call in here, and that's a great way to program. The problem is that when you test it, you may find yourself trying to break this thing apart, so I might be in for some trouble with that.

There's the report output method. It's kind of big too. There are some globals. There's an exit right in the middle of it. That seems kind of weird for a library, but it's a library used by one thing. OK, not too bad, but there's some stuff.

So I'm still looking for this one part, the magic: how do I add up branches and assignments and calls? And I finally find it, this method called totals. There are some globals in there too, but in the middle of this thing there's tally each and some cases and assignment, branch, a, b, and c, and it does some math. That looks like this is Flog. This is the guts of it.

So I'm looking around, and I see some of these process methods. They're kind of big. I get a little bit wary. I don't know, there's a lot of stuff in here. This one, I don't know what process_iter does. I don't know what to make of all this code. I see bad_dog being called and bleed, both. I'm going to need to figure that out eventually.

So what happens? I get to looking at this, and it's daunting. It's complicated in places. If I take it as a whole, I get a bit worried. But if I focus on what I'm trying to do, if I don't worry about everything in the system, I'm trying to add this functionality that is in these places that I can mostly understand. Let me just go ahead and start doing this. Don't get distracted by what is out there.

So let me just start at the easiest place. I looked at the command-line script. It looked pretty easy. Let me go ahead and try to put this under test. One of the tricks is, how do you put a command-line script under test?
I'm going to run this from the command line. It should do some things. It should fail in certain ways if certain inputs aren't given. How do I do that? Well, looking at it again: ruby -s, there's some usage, I've got some checks, and ultimately I want to be calling flog_files and report.

Here's how you do it. Here's how you put a script under test. This is RSpec. This is a diff, and it's all green. You can look at commit number two, look in the Git diff log, and you'll see this. I'm using Mocha, by the way, not RSpec's normal mocking. If you want to get this stuff up and running, you'll need to know those things.

To put a script under test, first I define just a little method that says run the command. Whatever magic I need to do in my specs, I can say run command, and if I expect this, it should happen, those sorts of things. So run command actually loads the file with File.read and then evals it, and I rescue SystemExit, because there's an exit in this thing. It's going to call exit 0, and that's going to kill our spec in the middle of running it, so rescue that. I also say, before each, remove the ARGV constant, which Ruby is going to use with the -s stuff, and set it to an empty array, so I can set arguments and run the script. And now I can assert: OK, it should run, it shouldn't raise a file not found. That's the simplest thing I can do to test, to make this thing happen.

I've got the first hook around the script, and now I can start to do interesting things with it. So I'm going to test the obvious stuff. What is it really that it's doing? It should make a Flog instance. It should call flog_files. It should report. And since the behavior of the command-line stuff is to just pass a dash, for standard in, if nothing was given, it should do that. Those specs run and they pass, and I can twiddle around some things and see that if I did that, they would fail, and if they did this. OK.
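For anyone following along at home, here is a minimal sketch of that kind of harness in plain Ruby. It is not the spec from the commit: the stand-in script, the $script_log global, and the Tempfile are assumptions made so the sketch runs on its own, and it replaces the contents of ARGV rather than removing and redefining the constant as described above. The part that matches the description is run_command, which reads the script with File.read, evals it, and rescues SystemExit.

```ruby
require "tempfile"

# Hypothetical stand-in for a command-line script such as bin/flog.
# It records what it did in $script_log (an assumption for this sketch)
# and calls exit when help is requested, as a real script might.
FAKE_SCRIPT = <<~'RUBY'
  if $h
    $script_log << "usage"
    exit 0
  end
  $script_log << "flogged #{ARGV.inspect}"
RUBY

# Read the script source and evaluate it in-process so a test can observe
# its effects. SystemExit is rescued so that an exit inside the script
# does not end the test run.
def run_command(path)
  eval(File.read(path), TOPLEVEL_BINDING, path)
rescue SystemExit
  :exited
end

# Write the stand-in script somewhere run_command can read it.
script = Tempfile.new(["fake_flog", ".rb"])
script.write(FAKE_SCRIPT)
script.close

# "Before each": start from a clean log, no help flag, known arguments.
$script_log = []
$h = nil
ARGV.replace(["-"])

# Normal run: the script sees the arguments and finishes.
normal_result = run_command(script.path)

# Help run: the script calls exit, which run_command absorbs.
$h = true
help_result = run_command(script.path)
$h = nil
```

In a real spec the two runs would be separate examples with their own expectations; they are run back to back here only to keep the sketch short.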
I've got a hook around this piece of untested code. It now has some tests.

You can still run into some issues once you get your hooks around it. I haven't changed any of the code. This is awesome. I've got some tests running. I'm going to go the next step, which is to start playing with -h and -I and some of these arguments he's got going in. But it turns out that because this is a global, and I'm testing whether it's defined, I can't twiddle the bits. I can't turn -h, the $h global, on and off the way I'd want to. So I'm going to end up having to change this code to be able to put it under test.

This is one of the things that Michael Feathers talks about a lot in his book. How do you deal with code that you can't test in place, code that you can't feel inside of, or big blobs of code that you can't get a hook around at all? He's got a lot of techniques for doing it, and sometimes the simplest technique is to look at this and ask, what's the most minor change I could make, and painstakingly ask, will that affect anything? Study the code. That's sometimes what you have to do.

So I find myself in this position where he said if defined? $h, and all I can do is say if $h, and it turns out, looking through the system, that this has no impact on anything. I didn't have access to Ryan when I was doing this, so I couldn't call him up and say, Ryan, what did you mean by this, was this important? I didn't have that. So I had to study the code and make this change, and I did the same thing with $I a little bit later, in commit number four. Look it up, there it is. The defined? didn't make any difference. Whether it's on or it's off, that's enough. So I made that change, and now I can write a test that says, OK, before, let me go ahead and set $h on, and we'll see what happens if -h was specified. Afterwards, go ahead and clear it.
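A small sketch of why the defined? check gets in the way, in plain Ruby. The global $demo_h is a hypothetical stand-in for $h, and the two helper methods are made up for illustration; neither appears in Flog. The point is only that once a global has been assigned, setting it back to nil does not make defined? forget it, while a check on the value follows the assignment.

```ruby
# $demo_h stands in for the $h global that ruby -s would set from -h.
# (Hypothetical name, chosen so the sketch cannot collide with a real $h.)

# Original style: branch on whether the global has ever been assigned.
def help_requested_by_defined?
  defined?($demo_h) ? true : false
end

# Changed style: branch on the global's current value.
def help_requested_by_value?
  $demo_h ? true : false
end

results = {}

# Nothing has assigned the global yet: both checks say "no help requested".
results[:before] = [help_requested_by_defined?, help_requested_by_value?]

# What a spec does in its "before" step: switch the flag on.
$demo_h = true
results[:switched_on] = [help_requested_by_defined?, help_requested_by_value?]

# What a spec does in its "after" step: switch the flag off again.
# The defined? check still reports the flag as present; the value check does not.
$demo_h = nil
results[:switched_off] = [help_requested_by_defined?, help_requested_by_value?]
```

With the value check, a spec can set the flag before an example and clear it afterwards, which is the independence between tests that the change was meant to buy.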
It's that 'afterwards, go ahead and clear it' that I was trying to get, so now all of my tests stay clean and independent. We wouldn't have done it this way with TDD. I can tell by what I had to do here that this was not test-driven code. Even if he had written it with tests and then taken away the test suite, I'd know he didn't test-drive this code, because he could not have tested it like this; he would have had to do what I just did to break the dependency. This shows you that test-driving code changes how you write it. Some people are skeptical of that, but it does change how you write code, and this is one of the proofs: your code is easier to test when you test-drive it. Sometimes, when you're getting into things, you find that you now understand what you've got in this little area, but there are hooks spreading out through the system, and you don't know the rest of the system, like me with Flog right now. I don't know what the rest of it is doing. I know that I'm coupled to it. What can I do to deal with this? I don't want to lose track of the fact that I haven't tested everything, and I don't want to lose sight of the fact that I don't know what's going on. I've tested some stuff, but I want to make sure I can come back to the rest later. So what I like to do personally is leave markers in my tests. If you're using RSpec, Shoulda, or anything else that allows you to write a pending spec, you can leave pending specs in place, and when you run your tests they say: this is pending, I haven't implemented this test, so I can come back to it. It's now on my to-do list of things
I've got to get done before I complete this. What I'm specifically talking about here is that in the command-line script there was a block of code that said 'this is the usage'. I've tested -h and I've tested -I, because the script deals with those itself. But there's also -a, -s, -m, and -v, which all set global variables that have to be used somewhere in the system. I grepped through, and yes, they're being used in various places in the system, but they're not dealt with in the place I'm testing now. So I write pending specs that say it should be doing something: it should validate -a, -m, -s, and -v. Until I get to the point where I understand what all this option processing is doing, how it's handled in the code, and can put tests around it, I can't do anything with this. So I move on. That's one way of dealing with the feeling of being overwhelmed and not knowing what's going on: you leave yourself marks that you can come back to. So now I've done what I can with the script itself. The big part is ahead of me: the library. What I'm going to do is test the points that I can see from the command-line script, so I'm moving outward. Every time I get a little bit of knowledge, I'll test a little bit more. I now know that the script is under test, but it calls three things.
I'm going to test those three things, so the entry points of this library have got to be tested. Sometimes in here, when I open up initialize for instance and there's all this stuff I don't understand, I end up leaving a marker right away. I've got more work to do and I know it; the spec may need to verify more state than it does now. If you look through the history of this thing, I come back to this a hundred commits later and can finally clean it up, when I understand what's going on. I know that if I initialize this, since SexpProcessor is the class it inherits from, then at least when it's done it should return one of those. There's a method that gets called in there that initializes something for that, and there's one other thing that I can see it doing, so I just put those things under test. I don't necessarily know what the implications of all that are. I may never know, because I don't know what all the other classes are doing. But I just start at the next spot. I've done everything that I can with the initializer, so I move to the next method. What's the next thing? The meat of this library is flog_files, which says: here's a list of files; they could be directories, they could be files themselves, they could not exist; give me Flog scores for this stuff. So I start going in and flogging a list of files. The code on the left is obviously not intended to be readable, but you can go back and look through what happened in the changes. There's some special-case stuff that I can see, so let me go ahead and put some specs around that. It apparently does this particular thing if the file doesn't exist, and I know that can happen, so let me see what happens if a file doesn't exist. I'm doing science on this piece of software. I'm saying: I don't really know what it does, but looking around,
I think this is what happens. Let me try it. If I give it files and everything works, the happy path, what would happen? Let me write some specs for that. Then: what if I don't specify any files at all? That's the next thing I can think of. That's kind of a degenerate case. Shouldn't that do something? Maybe it doesn't raise an error. Oh, it doesn't. And I leave some markers: it should do something useful, it should do nothing, those sorts of things, which tell me to come back once I begin to figure this out. Watch your pending specs and keep coming back to them. Now I'm going to go back to a term that I introduced in the title: the characterization test. People talk a lot about unit tests, functional tests, integration tests, acceptance tests, and so on. One term you rarely hear is characterization test. This again comes from Michael Feathers' terminology, but it's a more general term than that. It has to do with characterizing the behavior of a piece of software. This is the scaffold. This thing does this. This is my science assertion, my hypothesis.
It does this thing right now. It may not need to do that thing forever. It may not be the right thing to do. It may be a bug; it may be an obvious bug. But I'm going to characterize the fact that this behavior exists. As I'm building these scaffolds, sometimes it becomes pretty obvious that this behavior is either wrong or something that I don't necessarily agree with. So I don't want to write a test suite that says in perpetuity, 'I've proven that this piece of software does all this stuff and it's right.' What I want to be able to say is that most of it is right, but for these bits and pieces, this is what it currently does, and I don't necessarily think that's the way it's always going to be. So I've gotten into a habit, and this is a legacy code thing: when I find behavior that should be characterized but maybe isn't what's desired, I write 'currently'. I make it so that in the code you can read 'it currently passes this to this thing', and when you run the specs you see 'it currently does this stuff'. So we see stuff that's questionable, wrong, or buggy, and we want to change it. Right now we're adding tests so that we can change it later.
I can't change this yet, so I've got to characterize it. So we differentiate the characterization tests we're writing from the unit tests of the messages passed between objects and the integration tests of the system as a whole, and we make them distinctly visible so we can find them later, and so that we or someone else can say: these are the tests, but this one is in question; this looks like a bug, or this was maybe a bad design decision. So there we go again: that 'currently' right there is me telling myself, and telling somebody else, that this is how it does it now. One thing about this is that it's purely textual. It's a string-based representation of a characterization test. Since we're in Ruby, we can do something a little more sophisticated. I can say: I'm using RSpec, so let me open up the example group methods and add one, and I'm going to call it currently. Instead of saying 'it blah blah blah', I can now say 'currently blah blah blah'. Now I can see it in the code and grep for it, and I can see it in the spec output and grep for that as well. But I can also instrument currently differently later and start throwing exceptions, for instance, or start filing bugs automatically. I can do anything I could do with a hook and a code block, so this is a very simple way to do it. Now if I change the test, the characterization test can read 'currently should convert the parse tree into a list of S-expressions'. When I run my specdoc output, that behavior in question is marked 'currently' with stars on both sides. I can see it, and I can grep for it. This is one of the techniques, like the marking technique, for saying this is work in progress. I would never have written a currently spec for something I was writing with TDD, because in a greenfield, from-scratch application I know my intent from ground zero. I know what I'm trying to do.
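The shape of a currently marker can be sketched without any test framework. The MiniSpec class below is a deliberately tiny, hypothetical stand-in and does not reproduce RSpec's internals or the code from the talk; it only shows an ordinary example method next to a variant that tags its description so questionable behavior stands out in the report.

```ruby
# Hypothetical miniature spec runner, just enough to show the idea.
class MiniSpec
  attr_reader :report

  def initialize
    @report = []
  end

  # Ordinary example: behavior we believe is intended.
  def it(description, &block)
    run(description, &block)
  end

  # Characterization example: behavior that exists today but is in question.
  # The stars make it easy to spot and to grep for in the output.
  def currently(description, &block)
    run("*** currently *** #{description}", &block)
  end

  private

  # Run one example and record whether it passed or failed.
  def run(description)
    yield
    @report << "PASS #{description}"
  rescue StandardError => e
    @report << "FAIL #{description} (#{e.message})"
  end
end

spec = MiniSpec.new
spec.it("adds two numbers") { raise "wrong sum" unless 1 + 1 == 2 }
spec.currently("returns nil for an empty list") { raise "changed" unless [].first.nil? }
puts spec.report
```

Because currently is an ordinary method wrapping a block, it is also the place where extra behavior, such as logging every characterization that runs, could be hooked in later.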
It should do this thing, it must do that thing, it can do this other thing. Not 'it currently does this and I'm going to change it'. When you're dealing with legacy code, code without sufficient tests, sometimes you have to say: this is currently what it does, I'm going to fix it later, but I can't do it now. Getting back into the system: one of the globals that's also in use is standard input. $stdin is in use in the I/O routines. How do you deal with the fact that someone is saying, 'Let me get the global out here and output to that file handle', or input from this file handle in this case, when you don't really want that to happen? So here's an example, for posterity. I'm flogging a list of files, and when standard input is specified as an input, it shouldn't raise an exception, and eventually it should read this stuff in and do some stuff with it. In your before block you grab $stdin and save it in an instance variable, and you put a stub in place. Now you go about your business, and when you're done, in the after-each block, you put $stdin back. So now you can say: I've stubbed off read. You can do it with standard output too: I've stubbed off puts. The IO stuff is neutered. I've isolated this; it's a unit test now. It still does what it was intended to do, and I'm still testing through that stuff. Eventually I want to get rid of this $stdin global, but for now this is how I do it. So you move on. Okay, I've made some progress.
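The save, replace, and restore pattern for $stdin can be sketched in plain Ruby. Note the swapped-in technique: the talk stubs read with Mocha, whereas this minimal sketch substitutes a StringIO for the whole global. read_source and with_fake_stdin are hypothetical names.

```ruby
require "stringio"

# Hypothetical stand-in for library code that reads the $stdin global
# when it is handed "-" instead of a file name.
def read_source(name)
  name == "-" ? $stdin.read : File.read(name)
end

# Swap $stdin for an in-memory IO and always put the original back,
# mirroring the before/after steps described above.
def with_fake_stdin(text)
  original = $stdin
  $stdin = StringIO.new(text)
  yield
ensure
  $stdin = original
end

source = with_fake_stdin("puts 1 + 1\n") { read_source("-") }
puts source
```

The ensure clause plays the role of the after-each block: even if the code under test raises, the real standard input is restored so later examples are unaffected.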
I'm feeling pretty good. Then there's that recursive call in flog_files, and this turns out to be a real bitch, because there's just no really easy way to test code that you don't know very well when it makes recursive calls to itself. How do you get the right hooks in there? If you look through the commit logs, you'll see that I did a number of things trying to get this under control. This is one of those areas where I'm learning; every time I go through testing something like this, I learn something new. So I open up flog_files and look right here: flog_files, do, or whatever. Great. Okay, let's see what I can do. One of the techniques that Michael Feathers talks about for dealing with the big blob of code is that sometimes you just have to start pulling things apart. You say: okay, I know what this part does, that's flogging a directory, and this is the remainder of it. Now let me try to put specs on these pieces. I'll talk in a minute about one of the risks you run into doing this, and one of the ways you can help protect yourself as it goes along. You can see me go through this in the logs. I split these things up and then I start looking at them: okay,
I'm flogging a directory. That's one of the things it does when it recurses, so it should get the list of files, it's going to call flog_file once, and it's going to pass the file name. So I'm speccing the pieces that I made. There's no guarantee that this is exactly the same behavior that I had before, so I'm getting a bit risky here. But over the course of seven or eight commits I managed to split this thing out. I've got flog_files at the top, which now does almost nothing: it just iterates over the files you gave it and calls flog_file. flog_file, if it's given a directory, calls flog_directory; if it's given standard input, it does this other thing; and then it reads the data and actually flogs it. flog_directory expands the directory. It all ends up at flog, which tries to process the parse tree, and there I keep Ryan's warnings: how do you catch exceptions, and what happens when ERB shows up? So that's the guts. I pulled this out into pieces, but along the way I did some risky things. Sometimes we're going to make mistakes when we try to do this work. This work is not easy. There's no real textbook for it. There are some strategies, and we can get better at it, but we're going to make mistakes. So there are some things we can do. Small commits are good: I can roll this stuff back. Don't be afraid to roll back, and fix the mistakes as you notice them. In this particular case, I should have had a suite of integration tests in place. I should have had more than just this unit testing I'm doing. I should have had something that says: overall, Flog does this; here are some examples; throw Flog at these examples and see what it did; it should still be doing that. I didn't do that, and this bit me.
I didn't do it till later, I should say. I didn't do that, and it bit me in another place, which to me was a fascinating view of the difference between unit testing and integration testing, or the difference between what some people call behavioral testing, watching the messages between things, and state-based testing, where the system as a whole does these things and you can watch the state and assert on it. In refactoring flog_files I reached a point where what I was doing changed the structure of things. I hit a point where I've got some Ruby in a string and I've got a file name, and that's what I want to put together and say: these are the results for these things, give me the values. But that was not the call stack. The call stack was just one string coming down. It was, I think, the file name originally. No, sorry, it was the Ruby originally. So I find myself changing the method signatures going back up the stack to make it happen. When that happens, and you can see it in commit number 20, I've actually broken Flog, but all of the tests pass. This is the situation you can get into if you're just unit testing. 'I've done all this stuff and this is great.' Run Flog; what does it do? It blows up with an exception. So you find yourself working through this and asking, what could I have done better? What I could have done better was use integration tests. So, better late than never: start using integration tests. And the best thing about this, painful as it may be, is that you've been doing small commits and you've got version control in place. You can roll yourself back and say: here's where I should have had the integration tests. Let's start doing this. Did I really break things? How bad is it? And start recovering. So I went ahead and said, better late than never, I'm going to write these integration tests, and here's a simple integration test.
This is commit number 50. Create some files with Ruby code in them. Make sure it's not proprietary code that someone's going to sue you for putting in the tree. We took some of our open source stuff, dumped it in directories, and said: okay, here's a fixture. Here's a simple little Ruby file that does some stuff. Here's an empty file. Here's a directory of files. Here's a directory with a bunch of files underneath it. And let's make a little wrapper, so here are my files, and if I say give me those files and then run flog_files on them, it shouldn't throw an error. At least let's start there. So I do this. Okay, I've got it to a point now where it doesn't throw errors, and that works. That's kind of cool, but it doesn't tell me whether I'm doing anything useful with my version of Flog. So then I decided to go crazy and get all the information I can out of Flog 1.1.0, which is the version I started from, and include that in my integration test suite. One of the things that's available internally is that it catalogs a call stack: here are all the things that were called, and here are the Flog numbers for those. Then it eventually creates a totals structure, which is even more broken-down numbers. So I took Flog 1.1.0, opened the code up, put a 'puts YAML.dump' of this data structure right in the middle of it, and ran it over those files. It generates a big YAML hash. I take that and stick it into a calls attribute in my spec, and that's what this should generate. Then I'm testing: does my version still generate that stuff? And, in this example, does it produce exactly the total Flog score that Flog 1.1.0 does?
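The record-then-compare approach described above is sometimes called a golden master test, and it can be sketched with the standard yaml library. Everything here is a stand-in: analyse is a hypothetical token counter rather than Flog, and the YAML text plays the role of the data dumped once from the trusted old version.

```ruby
require "yaml"

# Hypothetical analyser standing in for the library under test: it counts
# how often each lowercase word-like token appears in some source text.
def analyse(source)
  source.scan(/[a-z_]+/).each_with_object(Hash.new(0)) { |token, counts| counts[token] += 1 }
end

FIXTURE_SOURCE = "def add(a, b)\n  a + b\nend\n"

# Recorded once from the old, trusted version and then kept with the specs.
GOLDEN_YAML = <<~GOLDEN
  ---
  def: 1
  add: 1
  a: 2
  b: 2
  end: 1
GOLDEN

# On every test run, the current code must still reproduce the recorded data.
def matches_golden?(source, golden_yaml)
  analyse(source) == YAML.load(golden_yaml)
end

puts matches_golden?(FIXTURE_SOURCE, GOLDEN_YAML)
```

A test like this says nothing about whether the recorded numbers are right; it only detects that a refactoring changed them, which is the safety net that was missing earlier.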
So I've done that in these integration tests, and now I've suddenly increased the coverage of the system as a whole without necessarily fully understanding the system. I should have done this right out of the gate. What you should have seen in the first set of slides, as step one, was this. I didn't do that; I made a mistake. You can do that. You can make a mistake, you can recover, and you can move on. What that gave me, between the sets of commits where I didn't have the integration tests and where I do, is in this coverage output. What you want to look at is the second one from the top, which is flog.rb: 45.6 percent goes to 86.7 percent. Total coverage goes from 74 to 96.3. You can get into those coverage graphs and see explicitly what is still not covered, and it's things like this: way down in the Flog library, among all those little process methods, there are a few that aren't covered by the Ruby code samples I put in my integration tests. So evidently I never call super anywhere in that code. I don't use until with a block anywhere in that code. I don't use when. I don't use while. There are about 12 of these. Later I go back and add some more files that cover those remaining cases and get it down to, I think, four methods with weak coverage. I emailed Ryan and said: I don't know how you trigger these; I don't know how you get this stuff to fire. I'm still waiting for an answer on that, and I dug through the Ruby interpreter and some other things to play around. But I've got a lot better coverage now, even though it's not perfect, and I feel a lot better about what I'm doing. Going back, I now know that flog_files is working with the changes I made to fix it, and I know it's producing the same results. I've got a Flog that's like his but a little bit better tested. So, deeper. What have you got? You've got bad_dog. Whatever the heck that is, it's doing something.
It's got a multiplier. It adds a bonus to it, it yields 42 to the block it's given, and then it subtracts the bonus back off the multiplier. So there's a scope here: you're giving it a block, and within that scope it changes the multiplier and runs the block with a 42, whatever the 42 is doing. This is kind of a tricky little bit to test, too, and it's the sort of thing you run into. So find a way to test it. 'Test all the fucking time,' as the man says. How do you get in on this? Sometimes, again, you need to introduce a sensing variable, some way to get in there and detect: when I did this, the thing I can observe tells me the value went up by one or went down by one, or what have you. Since we're in Ruby, we can do some other things; we're a little more flexible. I can introduce an accessor for multiplier. That's an attribute; I can put a read accessor on it and ask, what's the value of multiplier? And I actually went so far, since I want to test this properly, as to make it settable as well. So I'm now changing the code, but at least in a remote, AOP kind of way, putting a hook on it that doesn't affect the code under test and allows me to test it. Whatever it is, it's going to yield, so I can test that. And if I set the multiplier, which is the second thing, and I yield into this method, then inside the block the multiplier, which I can read through the accessor, should have gone up by one, and indeed it does. Then at the end, after the yield is done, it should be back in place. So whatever this thing is doing,
I've now hooked it in such a way that I can say: yes, this is what it does. We'll get back to this a little later. Similarly, you can have problems when you've got methods that are messing with internal attributes and it's difficult to sense what they are. Sometimes it's difficult even with an attribute accessor, without refactoring code and so on. So here's another example, and a pattern that shows up here, if you're paying close attention, is that if you depend heavily on internal attributes all over your code, it becomes really hard to test. It's another sign that if you had done this with TDD, you wouldn't have done it this way. This is not TDD code. I can't get a real handle on it. I can't stub or mock out @total_score, @totals, @multiplier, @calls; I end up having to put accessors on them. So here I've got a case where I'm trying to make sure that the total score gets cleared in the reset method, but the only way I can test that is to run total at the end and see what it returns. So the simplest thing, rather than adding an accessor in this case, is to run one of my integration tests, because that computes the total score. So with simple.rb: I run it, it computes a total score, I reset, and total should now be zero. I've got control over it in a different way than by using an accessor. It's a question: do you like it better this way? Is it better? Is it more readable? Is it more flexible? Hard to say. I'm still learning in this area, I feel. Was this the best way to do this? I'm not sure; there may be a better way. The best way would have been to write it with TDD in the first place, but we're not there.
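The sensing-accessor technique described for the multiplier can be sketched as follows. The Scorer class and with_penalty method are hypothetical, shaped like the method discussed in the talk rather than copied from the project; the accessor is added by reopening the class from the test side, so the method under test is left untouched.

```ruby
# Hypothetical class shaped like the code under discussion.
class Scorer
  def initialize
    @multiplier = 1.0
  end

  # Raise the multiplier for the duration of the block, then lower it again.
  def with_penalty(bonus)
    @multiplier += bonus
    yield
    @multiplier -= bonus
  end
end

# Test-side only: reopen the class to expose the hidden state.
class Scorer
  attr_accessor :multiplier
end

scorer = Scorer.new
scorer.multiplier = 2.0

inside = nil
scorer.with_penalty(0.5) { inside = scorer.multiplier } # sense the value mid-block

puts inside             # multiplier while the block runs
puts scorer.multiplier  # multiplier after the method returns
```

The trade-off matches the one voiced above: the accessor gives direct control over internal state, at the cost of tests that know about an instance variable the class never meant to publish.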
We inherited this. Okay, so, getting down deeper and deeper into some of the core stuff: I want to finally get to the point where I can deal with the method that computes the assignments, the branches, and the calls, the one that gives me a Flog score, which I identified early on in our review as a big blob of code. This is it, and it's not a very big blob of code. I've seen controller actions in Rails that are 200 lines long with no tests, so this is actually fairly easy, but it's real. What I see is that in the core of this method, and I'm just going to go ahead and extract it, there is some computation that is purely functional. It has no real side effects. It sets some things to zero, uses them in computing something based on a tally, which is part of his data structure of the calls that have been made, then squares the values, computes the square root, and does something with the result. So I'm going to pull this out. I'm going to extract a method, carefully and identically, pass in that temporary variable as the argument, and return the result back out to the place it was being held in the first place. This is a pretty clean instance of Extract Method, and once I've done it I can write specs for this piece. I can compute, and here's the square root of the things that I shoved in, and oh, it's exactly the same thing. That's now under test. I can continue on down doing this, and one of the things that I run into, and let me back up because I don't have the slide in front of me, is 'next if $m and method matches no method'. What he's actually doing here, as I found after inspecting this, is: you said -m on the command line, so it makes its way all the way down here as the global $m, and 'method matches no method' turns out to mean I was not in a method scope. I'm in main.
I just opened up a Ruby file and typed puts; well, that puts is in the global scope. So in that case I tally things differently depending on whether you have the -m switch on or off. When I get to that point I'm going to have to deal with it. So I start writing some specs. Have I extracted enough? Can I get my hands on this? I'm going to take a stab at it: when $m is set, it will not compute scores for this stuff, and when it is not set, it will. So I start looking at it, and then I reach the point where I think: I don't know that I've extracted enough out of this to be able to sense this easily and make it clear. So I go ahead and do another step. There's some totaling that goes on: it makes a Hash.new and shoves things in there. So let me extract that as well and write some specs for it. I keep looking at it. I still don't think I'm there. Let me extract out this chunk of code, which ends up as, let's see, increment total score by. Done that. This is the one that I want. I finally get down to the point where that little bit that's left is the self.calls.each do, da-da-da. I'll keep the iterator, because that seems like it belongs in totals, but I'm going to pull out this bit, the 'next if $m'. But next, once it's broken out of the loop, is really return. So I pull it out into a new method, and now I'm just calling the other helpers that I've written. So that behavior of $m, and all this complexity that was in the loop I was trying to spec, is now in a nice, clean little place where I can just say: no loop, no nothing, return if this happens. That cleans my specs up a lot. All that $m stuff I was doing, I just forget about it there and do it here. Then I can make the remaining calls: I can spec the remaining totals method and say it should call this and it should loop over that, and I'm done. And you end up with something, again, that's a lot cleaner.
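The shape of this refactoring can be sketched as three small methods. All names, the tally keys, and the "#none" suffix used to mark code outside any method are hypothetical choices for this sketch, not the project's actual code: a pure scoring function, a guard that replaces the next-if inside the loop, and a loop that only delegates.

```ruby
# Pure function extracted from the loop: no side effects, so easy to spec.
def score_for(tally)
  a = tally.fetch(:assignment, 0)
  b = tally.fetch(:branch, 0)
  c = tally.fetch(:call, 0)
  Math.sqrt(a * a + b * b + c * c)
end

# The old `next if ...` condition, now a method of its own.
# "#none" is a hypothetical marker for code that is not inside a method.
def skip_method?(method_name, methods_only)
  methods_only && method_name.end_with?("#none")
end

# What remains of the totals method: initialise, loop, delegate.
def totals(calls, methods_only: false)
  result = Hash.new(0)
  calls.each do |method_name, tally|
    next if skip_method?(method_name, methods_only)
    result[method_name] = score_for(tally)
  end
  result
end

calls = { "Foo#bar" => { assignment: 3, branch: 4 }, "main#none" => { call: 1 } }
p totals(calls)
p totals(calls, methods_only: true)
```

Each piece can now be specified in isolation: the scoring with plain numbers, the guard with two booleans, and the loop by checking which keys end up in the result.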
So totals is at the bottom. It just does a little loop after initializing a hash and returns the result, and it calls these other, simpler methods. This one has a side effect. They each have a simple, single responsibility. This one has no side effects; it's purely functional. That's nice. This one does the conditional thing and wraps the other ones. So I've actually tested and refactored a little bit of this along the way, and now I've got a sense of what this thing does. Okay, I've got a handle on this. I'm skipping a bunch of things that were done. There are a lot of these little things, and you can only cover so much, but that's why I'm giving it all out to you. Now I've reached a point where, through all these things, I've got a decent test suite. All of the methods that are not process_something have been tested pretty thoroughly. I've got an integration test that covers most of that stuff altogether. I've now read through enough of it to know roughly what this thing is about, and I'm going to start refactoring some things to clean it up, so that the next guy who looks at this, me or somebody else, kind of knows what's going on. I know what bleed does now: it just takes a list of these opcodes that the parser has handed me and runs through them all. It's bleeding them dry. Well, what does that really mean to me? It's analyzing a list. That may not be the best name in the world, but at least it says what it's going to do: in the scope of this processing, I'm just analyzing the remaining list of expressions, so go ahead and do that. One of the things that came up is bad_dog, which I now also know what it does. The 42 that's in it has absolutely no use. It's not used anywhere.
It doesn't affect anything. It's just, I guess, an in-joke, a bit of cleverness, something like that. I mean, I get the joke, but it didn't help me in this process of figuring out what Flog was trying to do. So I can now get rid of it, because I have enough test coverage to be able to do this. Another pet peeve: 'self.whatever' is usually not needed. Yssef, that's for you; I was thinking about him when I made that commit. Clear out the self-dots if you don't need them, change some method names, clear up some things in general. Now let me get back up to the front end, this command-line script, and look at the argument parsing. -s is built into Ruby, and it works, but it's confusing, and it's linked to all these global variables all over the system. OptionParser is in the standard library, and I don't want to add a dependency to this project just because I'm refactoring. OptionParser is a little better and a little cleaner, so let me go ahead and use that. I'm going to replace this usage stuff and the option handling with how you would normally do it with OptionParser. So you can say -a now, and --all, --score, --methods-only, those sorts of things, and I get a usage method out of it. I can put that in, but $a, $s, $m, $v, all these things are still throughout the code. So to be able to do that refactoring, what I first do is say: I'm going to leave all the dollar-sign variables in there, I'm going to put OptionParser in and set an options hash, and then I'm going to set those variables based upon that options hash, so that I can do this in place, run my specs, and say: okay,
I've swapped something out, but it still works. It's still coupled in exactly the same way as it was. That's safe for me; I can do that. Then I can pass in those options, which is still safe. We're still using the dollar-sign variables, but now I'm passing the options into the initializer for the library. So it's Flog.new(options); initialize takes options and just sets them as an attribute inside. I also add an attr_reader for options. I'm skipping some of the specs; you can read them. I spec this and I do this, and then when I'm done, $v goes away and becomes options verbose, $m goes away and becomes options methods, and so on. So now the globals are gone. In three steps, I've refactored to use OptionParser and gotten rid of all the global coupling, and it's just an instance variable on the class. And now I'm back to code that I would consider, for the purposes of what I am doing, not for all purposes, not for everything, but for what I am doing, covered, and I can go back to BDD-style, spec-driven development. So let me go ahead and start, and this is not finished, but I'll show you where it's headed. I'm going to add the specs for a -b, which is blame. When I flog these methods, I want to see who wrote the source code. It's going to call out to git blame or svn blame or what have you, and compile this. So it should take a -b, it should set an option, it should pass that in, and Flog should see that the option is set; and here's the option in my OptionParser setup. So I'm back in BDD land. And I want to say: okay, I've still got to deal with standard input if I'm going to blame stuff, but it doesn't make any sense to blame something from standard input, because I don't know what file it came from and I don't know what the repo is. So where would I check for that in the code?
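Before that question is answered, the end state of the option refactoring described earlier in this passage can be sketched with Ruby's standard optparse library. The option names, the hash keys, and the Analyzer class are hypothetical choices for this sketch, loosely based on the flags mentioned in the talk; this is not the project's code.

```ruby
require "optparse"

# Parse command-line flags into a plain options hash instead of globals.
def parse_options(argv)
  options = { all: false, methods_only: false, verbose: false }
  OptionParser.new do |opts|
    opts.banner = "Usage: analyzer [options] files_or_dirs"
    opts.on("-a", "--all", "Display all results") { options[:all] = true }
    opts.on("-m", "--methods-only", "Skip code outside of methods") { options[:methods_only] = true }
    opts.on("-v", "--verbose", "Verbose output") { options[:verbose] = true }
  end.parse(argv) # parse (without !) leaves argv itself untouched
  options
end

# The library receives the options through its constructor.
class Analyzer
  attr_reader :options

  def initialize(options = {})
    @options = options
  end

  def verbose?
    options[:verbose] ? true : false # this lookup replaces a read of a $v-style global
  end
end

analyzer = Analyzer.new(parse_options(["-a", "--verbose", "lib"]))
p analyzer.options
```

In a spec, the same constructor can be handed a literal hash, so no example needs to set or clear a global to exercise an option.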
Well, flog_file seems to be the best place, the one that knows about that. So, knowing what I've learned about this, I've found the place; that's number one. So let me write a spec for it: if I call flog_file and I'm blaming, that is, on a Flog.new with blame set to true, it should fail. It should raise a runtime error if I give it a dash for standard input. And now let me implement that; that's step three. So again, BDD, we're feeling good. Okay, now I want to actually do the blaming, so I'm going to make a collect_blame method. I don't know what it's going to do yet, but let me write the spec: when I flog something, if the options for blaming are set to true, that's my before-each, it should gather blame information for the file, and when they're not, it should not. Those fail. Throw this in: collect_blame on the file if options blame. Boom, it's implemented. Now all I have to do is fill out collect_blame, and that's where I am now. That's where this project is. So I talked to Ryan and said: hey, this is where this is; any ideas whether I can use ParseTree for blaming? Can I get line numbers out of ParseTree to match things up? That's the research project that's underway right now, and we'll see where that goes. But now we're shortly going to be in a space where we can begin to have Flog blame, and anyone else who wants to contribute to Flog now has a better test suite to work from, and hopefully now an example of how you might put that in. So I think I got in just under the wire, with maybe a couple of minutes for questions. I'd like to thank Allie,
Hi Yssef, Kevin, OTC homies, MC anal J for putting this all on, will help you in Nashville, Mike Feathers, all the caboosers, the GitHubs. Thanks, all you guys, and zenspider, who couldn't be here, and all our other peeps who also couldn't be here today. Any questions?

Well, the first thing that I actually did, looking through the commit history, is I did the total first. So that was actually my first integration test: let's get the hooks around it. I'm sorry, let me repeat the question. So, when I did the integration test, why did I actually pull out the YAML dump instead of looking at what the old flog system reported and just using that? The first thing I did was to actually put those fixtures in place and then ask, what was the total? Once that worked, I felt pretty confident that I hadn't broken anything seriously. Then I looked around a bit. It produces a pretty good bunch of detail just from the output, and I could have used that, but it actually keeps more internal state about what it has analyzed. Is that the right thing or the wrong thing? Looking back on it, I'm not sure. I felt like knowing more information would help guide me towards sort of black-box reproducing the behavior, even though it's kind of cheating by making it sort of white box. Probably the best thing to do is to look at the output, since this way is white box, and do that. At the time I said, I feel bad for not having done this at all, so I'm just going to amp it up as far as I can and make sure I've got it completely.

Um, yes?

I just want to say that I do not use the term legacy code to mean untested code. I've heard others use it that way, and I actually kind of dislike it, because it seems to me that the term untested code is perfectly good and says what it means, and the term legacy code says what it means, and I don't see the benefit of combining them.
All it does is make it harder to refer to the concept that most of us mean when we say legacy code. I see the risk but not the reward in trying to conflate those two terms. So I just wanted to say that.

Okay, not everyone. I mean, I've heard Dave Chelimsky and others use it that way, and I just don't see the point.

Okay. So, to summarize, I guess: the use of legacy code to refer to code without tests can be confusing, and it doesn't necessarily benefit people who want to identify legacy code in other senses, even though it does have some usage. I think the reason I've attached to that term is, (a), because I like the book that Michael Feathers wrote about it, and it's easy to say, oh, you know, legacy code. And I think his book does help, if you know you're dealing with legacy code, to get some tests around it. But for me, I think about most of the code that I wrote two weeks ago: at any given point in time it's probably bad code, or in a lot of cases in my life, legacy code. So I don't know if that makes sense, but it's kind of self-deprecating in a way. It's like, okay, let me humble myself. I'm going back and fixing my code. It's a legacy that I have to deal with. Okay, let me get it better.

I'm just not sure. Since there are many adjectives, there's no reason not to have bad, untested, and legacy all varying independently. I just don't see why there has to be sort of, you know, we can only use two of the

There is some value.

There's sort of no way left to say that. I don't know that I've ever seen, I obtained from three years ago, that it's like, I want there to be a word for that, whether they tested it or not.

Well, yeah, right. Okay, anything else? Anybody else? Okay, well, thank you for entertaining my spiel here.