engineering. I'd like to thank you all for being a part of our big day today. We're going to switch from talking about what you do when you have a perfect process, as we just heard about from GitHub, to talking about maybe a less than perfect process, or a less than perfect code base. I was advised to pitch this as a safari into your legacy code, maybe. Keep the theme going. How many people out here are working on a project that they would classify as a rescue project, or a project that was handed to them in some sense, or a project that has code that they didn't develop? So, that's many of you. And of the rest of you, how many are working totally on projects of your own that nobody else has ever touched? Wow, that's great. So here's the scenario. You get handed a rescue project. You're hired into a new job, or you're a consultant who is given a client project that already has had work done on it, and it's got all of this code, and you don't know what any of it does, and you go to look at it, and you very quickly come to the realization: oh my god, the last person to work on this was an idiot. Yeah. Personally, this happens to me when I look at my own code from six months ago, so I would assume that this is a fairly common reaction. So what's the proper response to this? Well, I think that the only professional response, really, is you shake your fist at the heavens for one minute, curse their name, and then get back to work. You have about one minute to finish complaining about how bad the last person to work on this was, and then you're just stuck with it, because nothing else is productive. So I'm going to talk here a little bit about some strategies for dealing with these code bases, and I think it's useful to mention what I mean when I talk about an application being legacy.
Michael Feathers, in the outstanding book Working Effectively with Legacy Code, which you should all get if you're dealing with this kind of stuff, defines legacy code as code without tests. Which I think was charmingly optimistic in 2004: the idea that all code that had tests would be perfect and clean, and we would all be able to work with it. I think that over time, as test-driven development has become more prevalent, and test-driven projects have been abandoned, we've all found that having tests does not necessarily guarantee that the code is going to be easy to work with. So the definition that I like to use for a legacy code base is that the code is based on lost requirements: the decisions that led to the architecture, the methods, the structures, the tests that exist, were made in response to some pressure, some time constraint, some resource constraint, that you and the people on the team no longer have access to. Maybe there was a reason for that 300-line controller method. You don't know. It's lost to history. Maybe they were paid by the line. I don't know. And the problem with this is that you, coming into this code base, don't necessarily know what correct behavior is. It kind of reverses the normal TDD/BDD relationship between tests and code. If you are working in a strict BDD system, where your tests really are driving things, then the tests actually are the source of truth, and if there's a discrepancy between the tests and the code, you're supposed to believe the tests. But you can't do that in a legacy system. You don't know whether the tests are working. You don't know whether the code is working. You don't know what correctness looks like. And this is one of the things, one of the many things, that happens to you when you go to deal with a project like this.
So when you sit down to a project like this, your goals should be: respect the working code, do no harm, deliver features at a sustainable pace, and improve the code quality as time goes on. Now, one way to look at this is the Boy Scout rule, which is to leave every campsite just a little bit better than you found it. So every time you touch some part of the project, make it a point to leave it just a little bit better than you found it. Clean up one method, one variable name, or something like that. But I also think it's important to respect the fact that you are probably dealing with a project that is in production, and that, as much as we like to make fun of the previous developers as idiots, was probably developed by reasonably intelligent people who were working under pressures that you don't know about. So you need to be aware of the constraints they may or may not have had, and not go in and change things recklessly. So what happens when your project needs to change? You come in on your first day at the new job, you've got this existing code base, and there's a change that's needed. And there are Goofus approaches to this and Gallant approaches to this. Is anybody familiar with the Highlights characters? Yeah. From Highlights magazine, staple of doctors' waiting rooms everywhere. Goofus is on the left, I think. I think the messy hair is the Goofus part of it. Okay. Which one of these is the Goofus approach and which one of these is the Gallant approach? If you're not familiar with the idea here: Goofus beats people up on the playground, Gallant hands them candy. You know, it's not a subtle kind of strip. So one approach would be: hey, we have to make a change, let's start off by writing tests to cover the entire application before we make our change. Another approach would be: oh, well, this is simple, we can just drop that in, we don't need to test it.
So which one of those is the Goofus and which one of those is the Gallant? Well, it's kind of a trick. They're both Goofus. They both have problems. I am not at all immune to the lure of writing tests for everything that isn't nailed down before you start moving forward. I mean, I understand that. But there is a reason why it's not a good idea to start a new project that way. As perhaps evidenced by the Steven Wright line: it's a small world, but I wouldn't want to paint it. The project that you're working on is big. You don't want to paint it. Writing tests for the entire application is not a quick win. You sit down with a new client, they ask you to move one thing on the homepage, and you say, that's great, first I need to write RSpec for three weeks. It's not going to go over very well, necessarily. And more to the point, you can easily introduce bugs through the process of writing those tests. Because, as I'm defining this, by definition you don't understand the project that you're working on well enough to even test it. And if the code is at all complicated, you're going to have to move some stuff around to write the tests, and you'll introduce bugs. And that's going to make the client or the manager even less happy. Not only is it not a quick win, you're going backwards. On the downside of just dropping something in, which is also very tempting: I think that we're all very tempted by the idea that, oh, we just want to move this header over here, that's really quick, we just want to add one new line of data to the page. It sets a bad precedent. I think, as a professional, as a craftsman, if you expect to be working on a basis where you're going to be writing tests going forward, you want to set that idea from the start.
You never have a better chance to do that than the first time you are asked to do something, to show how you plan on working. And to abdicate that responsibility right off the bat is only going to make it harder for you to come back the next time and say, oh, well, now we need to do this test thing. It also doesn't make your life easier over time. One of the advantages of the Boy Scout rule is not just that we're noble craftsmen and we want everything to be perfect. The idea is that improving the quality of the code is improving the quality of your life working on this project. You make things easier for yourself by making the code better. And when you take a new request and just drop something in, at best you're leaving things the way they are, and at worst you're making things just a little bit harder for next time. Okay, so what's the Gallant approach? Well, to me, the Gallant approach is to go back to the basic test-driven red-green-refactor process, but apply it only to new changes. You draw a line in the sand. You say: this is the bad stuff we did before, and I can't do anything about that right now. But what I can do is say that from now on, we will do the right thing. We will go red, we will go green, we will refactor. And as we touch existing parts of the code, we will bring them under test as part of the changes that we make. This is a safe approach. It's an approach that generally limits you to touching the parts of the code that you're actually dealing with, the parts you have an actual fighting chance of understanding. And over time, code coverage and code quality will grow organically. If you're on a large team, perhaps you can affect other members of that team when they see that you're doing something and it's going well. [Audience question.] Very carefully. Actually, hold on to that question, because you're about five slides in the future.
If you still have the question in 10 minutes, raise your hand, raise both hands, and wave them. So, great. It's not that easy. It's easy to say we'll do the right thing from now on, but it's a little bit harder to do it in practice, and that's not even counting any political pressure you might get from people who are unfamiliar with the process, or from other developers. There are practical problems just in dealing with the code base. Like I said, you don't know what correctness is. You don't know what you want to test. On a code base that was not built for testing, it's very hard to isolate objects under test and separate the code that you want to test from the rest of the code base. The Michael Feathers book is very effective in having some techniques for that, and we'll talk about a couple of them in a few moments. But also, Git is very much your friend in this kind of stuff. Does anybody here not know about git bisect? Everybody knows about git bisect? Cool. git bisect is the kind of tool that's really helpful here. It helps you isolate individual changes that may have caused defects in your program. So what are some of the tools that can help you build up test coverage on a legacy code base? Cucumber is one of them, or something similar to it, but Cucumber has the biggest penetration of all the tools like it. Cucumber's biggest advantage in dealing with a legacy code base is that Cucumber is completely independent of the existing code. No matter how messy the underlying code base is, Cucumber sort of floats above it. In a Rails application, Cucumber cares about URLs coming in and, basically, HTML coming out. The 300-line controller method is not a problem for dealing with that. And because of that, it's relatively easy to cover a lot of your application, at least with smoke-test-level tests of does this even run, relatively quickly. But there's a problem.
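To make that concrete, here is roughly what a smoke-test-level Cucumber scenario can look like. This is only a sketch; the page name and step wording are hypothetical, and assume whatever generic web steps your project defines:

```gherkin
Feature: Catalog smoke tests
  # We don't yet know what correct business behavior is, so we only
  # assert that the page renders without blowing up. Cucumber floats
  # above the messy internals: URL in, HTML out.

  Scenario: Product listing page renders
    When I go to the products page
    Then I should see "Products"
    And I should not see "We're sorry, but something went wrong"
```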
Cucumber's kind of slow. And it's not necessarily the best long-term way of keeping these tests together. It's also not very good at isolating the problem, for the same reason that it gives you a lot of coverage quickly: it's working at a higher level. So it's not going to tell you what line in your 300-line controller method is causing a problem. But it will tell you that there's a problem, which is a start. One thing that I have used successfully in the past is a process that's sometimes called test-driven exploration, which is kind of like test-driven development with cheating. You write a test that you know will fail against the unit. For instance, you assert that it returns nil, or negative 75 million. And most of the time that will fail, unless you picked the negative-75-million answer by mistake. Then you can feed the actual answer back in and fill in that test. What is this good for? It's good for providing regression testing. Obviously, you're not really using this for forward development. But you are using it as a way to explore what the code base is actually doing, and it allows you to treat the code as the source of truth. In this situation, you have to treat the code as the source of truth, because you don't have tests. And what you're trying to do, by sort of sounding out the code with tests, is to move some of that information about how the code behaves from the code to the tests, so that you can then change the code and still have information about how the code base works embedded in the tests. Mock testing is another really useful approach, specifically against a legacy code base. Mocks also allow you to use unit tests.
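The test-driven exploration loop described above might look like this in practice. Everything here is a hypothetical stand-in; LegacyPricing plays the part of whatever tangled method you're sounding out:

```ruby
require "minitest/autorun"

# Hypothetical stand-in for a method whose behavior you don't understand yet.
class LegacyPricing
  def total(quantity, unit_price)
    # imagine 300 lines of accreted business rules here
    subtotal = quantity * unit_price
    subtotal > 100 ? (subtotal * 0.9).round(2) : subtotal
  end
end

class LegacyPricingTest < Minitest::Test
  def test_pin_down_bulk_order_behavior
    # Step 1 (cheating on purpose): assert a value you know is wrong...
    #   assert_equal(-75_000_000, LegacyPricing.new.total(20, 10))
    # Step 2: run it, read the actual value out of the failure message,
    # and pin that value here. The code is the source of truth.
    assert_equal 180.0, LegacyPricing.new.total(20, 10)
  end
end
```

The pinned test is a characterization of the behavior as found, not a statement that the behavior is right; it just lets you refactor without silently changing it.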
Mocks help you isolate, basically, because you can take one crazy method that you're trying to put under test, and you can surround it with mocks for all of the things that that method calls out to. Essentially, you're cutting across the tangled responsibilities of the code. You're substituting the much simpler dependencies of your mocks for the more tangled responsibilities that the code may have. That can be a very quick way to start unit testing. The setup there can be hard if the code is really bad, because, potentially, you're paying for people's really bad design decisions in the number of mocks you need to set up. And at some point you're going to want to move away from that a little bit, because if the tests need to know a lot of internal information, through mocks, in order to run, then that's going to make refactoring the code more challenging down the line: the tests now have really deep internal information about the objects they're testing. Another strategy is isolation. This is more of a code strategy, and it involves only touching existing methods for the purpose of calling new methods. This is kind of an extreme strategy for dealing with a messy code base. The idea is that you deal with it by not touching it at all. All you do is, if you have some crazy, insane method, but you have a new condition where you want to write new code against it, you switch off to your own new clean method, or your own new clean class, that is fully test-driven or behavior-driven, and only interacts with the messy existing code at a boundary that you make very sharp. And over time, you have this little island of test-driven, clean code in your system, and you move things to that island as you have to make changes.
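Here is a sketch of cutting across the tangle with a mock, using Minitest's built-in mock object. The Invoice class and its gateway are made-up stand-ins for a method and the tangled collaborator it calls out to:

```ruby
require "minitest/mock"

# Hypothetical legacy class. In the real code base, @gateway might be a
# payment service wired up somewhere deep in a 300-line initializer.
class Invoice
  def initialize(gateway)
    @gateway = gateway
  end

  # The one method we want under test.
  def charge(amount)
    @gateway.charge(amount) ? :paid : :failed
  end
end

# Surround the method under test with a mock for the thing it calls out to.
gateway = Minitest::Mock.new
gateway.expect(:charge, true, [25])

result = Invoice.new(gateway).charge(25)
gateway.verify
puts result  # prints "paid"
```

The test never touches the real gateway, so the tangled dependency simply never comes into play.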
Then it's fairly clear in your system what parts of it are going to be built with tests and built cleanly, and what parts aren't. Another tool is something that Michael Feathers calls, in his book, a seam. The idea of a seam is that you can change the behavior of your code for the purpose of your test without touching the code itself, or perhaps minimally touching it. Ruby offers a number of different ways to do this, some of which are more or less helpful for testing, and some of which sort of feel like mock objects. One thing that's really simple is to start adding new default arguments for your new condition, or for switching off to your new code. If you add a new default argument to a method, existing calls to that method don't need to change. So that is one mechanism for prying your new code into existing code without having to go through and change everything that touches it. And again, that can be a mechanism for you to start building up a part of your code that you're building tests for. Another set of mechanisms basically falls under various forms of monkey patching. You have, perhaps, a method in your code that you really don't want to touch, because it hits a third-party library or something, but it's really sloppily factored, so it's very hard to get around. You can just create a test version of that. You can do that by creating a subclass if you want, and then using that subclass in your test. In Ruby, you can actually monkey patch one method for the purpose of your test: just put it at the beginning of the test. These are sort of emergency, there-is-no-better-solution kinds of things, but they are ways to get you started. You monkey patch the one messy method.
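A minimal sketch of that monkey-patch-one-method seam, with a hypothetical class standing in for the sloppily factored code:

```ruby
# Hypothetical sloppily factored class that hits a third-party service.
class WeatherBadge
  def label
    "Currently #{current_temp} degrees"
  end

  def current_temp
    # imagine a slow call to an external weather API here
    raise "network call we don't want in tests"
  end
end

# Emergency seam: at the top of the test file, reopen the class and
# patch just the one messy method. Nothing else about the class changes,
# and the label logic becomes testable offline.
class WeatherBadge
  def current_temp
    72
  end
end

puts WeatherBadge.new.label
```

This is deliberately blunt; if you find yourself doing it often, that's a signal to refactor toward a real injection point, like the default-argument seam above.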
In some sense, this is a very elaborate mock, potentially, or maybe a more explicit kind of mock. You can also use a singleton class in Ruby. Again, effectively, these are all various ways of mocking that might have more or less benefit in different situations. You can just create your own duck-type class. Another interesting mechanism is to use an object called a pebble, which is basically just a blank object. This is almost like a tracer: I don't understand what this code is doing, I have no idea where this object is getting passed around, this code is a total mess, I can't figure it out. You pass this pebble object into the code base, and all the pebble does is respond to method_missing. Basically, every time a method gets called on that object, it spits out what method was called and where it was called from. So it allows you to sort of trace the life cycle of an object through the system: where it gets called, and by what. And the pebble doesn't have to be its own separate object. You can add this method_missing onto an existing object that's going through the system, and use that to sort of turn the lights on and help trace the behavior of the system against a particular object while you're trying to figure out what's going on. A lot of times you'll pick up a legacy application and it will have sort of half a test suite. The worst case, where the application has no tests at all? That's fine. You can start writing tests. If the application is fully covered, that's great; it's got a lot of test coverage. But there's a, what's the opposite of a sweet spot, there's like a sour spot in the middle, where somebody tried to test for a while and didn't really commit to it. The tests are there, but they didn't keep up.
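A pebble like the one described above can be sketched in a few lines. This version collects its trace in an array rather than printing as it goes, but the idea is the same; the mystery_pipeline method is a hypothetical stand-in for the code path you're trying to understand:

```ruby
# A "pebble": a blank object you pass through a confusing code path.
# Every method called on it gets recorded, along with where the call
# came from, turning the lights on in code you don't understand yet.
class Pebble
  attr_reader :trace

  def initialize
    @trace = []
  end

  def method_missing(name, *args)
    @trace << "#{name} called from #{caller.first}"
    self # return the pebble so chained calls keep tracing
  end

  def respond_to_missing?(*_)
    true
  end
end

# Hypothetical tangled code we're trying to trace:
def mystery_pipeline(record)
  record.normalize
  record.validate!
  record.save
end

pebble = Pebble.new
mystery_pipeline(pebble)
pebble.trace.each { |line| puts line }
```

Each trace line tells you the method name and the file-and-line of the call site, so you can follow the object's life cycle through the system.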
So you have a test suite that goes along with the application that just fails. Step one: even if you're not going to cover the entire application with tests before you start, you really do need to make sure that if there's a test suite there, it runs. That actually really does need to be job one. And you need to look at the tests as though they're code. I think, though, that you need to make this kind of snappy; this is not something you can spend a whole lot of time on, especially if you think the tests are not very high quality. This is something I took from Mike Gunderloy's Rails Rescue Handbook, which is a pretty good book on many of these topics: if you're looking at a test and you can't figure out what it does in five minutes and it's failing, just delete it. It's not helping you. And that's a pretty quick way. That is a one-time deal when you come into a new code base; it is not something that you would want to do over time. At some point, the fact that you don't understand what a test is doing is a reflection on you, rather than a reflection on the test. But in this brief interval, when you come into a new code base and your inability to understand it is probably a reflection on the test, then it's time. At that point, you can get rid of tests that don't seem to be adding value. So when can you refactor against a test suite? I feel like there's a level of refactoring that is relatively simple, that you can start to do at any time when the tests are green. And you do need to be careful here; the accent here is on simple. These are simple method extractions, name changes, stuff that is unlikely to either break a lot of tests or change a lot of code logic, but that allows you to isolate individual parts of the code so that you can test some of them.
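As a sketch of the kind of simple, low-risk extraction meant here (the receipt code is hypothetical), you pull one named method out of a longer one so that piece becomes testable on its own:

```ruby
# Before (hypothetical tangled method, reconstructed in a comment):
#   def receipt_text(order)
#     lines = order.items.map { |i| "#{i.name}: $#{i.price}" }
#     total = order.items.sum(&:price)
#     (lines + ["Total: $#{total}"]).join("\n")
#   end

Item  = Struct.new(:name, :price)
Order = Struct.new(:items)

# After: the total calculation gets a name and can be tested in isolation,
# without breaking any caller of receipt_text.
def order_total(order)
  order.items.sum(&:price)
end

def receipt_text(order)
  lines = order.items.map { |i| "#{i.name}: $#{i.price}" }
  (lines + ["Total: $#{order_total(order)}"]).join("\n")
end

order = Order.new([Item.new("tea", 3), Item.new("mug", 9)])
puts receipt_text(order)
```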
Those, I think, you can sort of do as you come across them. There are some things that are going to be simple and obvious enough that you don't necessarily need test support for them. More complicated things, things where you're actually changing the abstraction structure of the code, where you're adding objects, refactoring classes, changing relationships: those should be tied to new requests. It's really not productive to go in and say, we're going to just move all of this code to a service right now, just because it's a total mess right now. It should come in when there's a request that's big enough to justify the work to write the tests and do the refactor. In that case, you probably do want to build up the test coverage first, or at least start to, and then do them kind of in lockstep: you build up a little bit of test coverage, you clean the code up a little bit, in a step-by-step fashion. So here's what I want you to come away from this with. Working code deserves respect. You shouldn't rip it up just because you don't like the way it's indented or something like that. If it's actually in production and it's serving people, there's value to that, and that's value you need to be careful of when you're trying to clean things up. Test at as high a level as you can. High level may not be a great phrase here, but if all you can do is Cucumber tests, do Cucumber tests. If you can get into individual units, if the code is clean enough to get to individual units, do that. Write the tests that you can. Any testing, in this case, is better than nothing. Always remember to leave things better than you found them, and things will continue to get better; eventually you will move toward having the code base that you want.
Always try to isolate the code that you have under test, either by actually isolating it in the code through a simple refactoring, or through mocks, or something like that. And again, you need to weigh the cost of making a change, when you're trying to clean up a code base, against the cost of not making that change. That's pretty much all I have for you today. A couple of things I just want to mention. I work for Groupon Engineering, and as you may have heard, we are hiring. Also, anybody who is coming next week or two weeks from now to SCNA, the Software Craftsmanship North America conference, in Chicago: I hope to see you there. Anybody who's not coming, take a look; it's going to be really great. I blog at RailsRX.com, I'm on Speaker Rate down there, I'm @noelrap on Twitter, and all that. Thanks for your time. We do have a couple of minutes for questions if you want. Otherwise, thank you.