Welcome to better code through boring tests. If you're in this room, or one of the few stragglers still filing in, don't worry, you're welcome. We got started late. I'm assuming two things: first, that you value testing, and second, that you hate your tests. In part this is because these are the problems I told you I would solve in my abstract, but it's mostly because I think of these two things as the default state of developers. Everyone agrees that testing is important. Even people who maybe don't find so much value in test-driven development, or in other specific testing practices, still agree that you need some kind of testing. That's a blog post that's all about why things that are not TDD are still awesome. But we fight about testing even though we collectively value it, even though we all know you need something, because no one, literally no one, loves testing all the time. Even dyed-in-the-wool TDD fanatics like me. It's okay to admit it. The testing police are not coming for you. Today, I'm going to help you fix this mess. Today, we are going to learn why we hate our tests. Oh dear, this is not a good font color on this screen, is it? Mostly everything else is brighter, don't worry. We are going to learn why we hate our tests and why we fight about them. We're also going to learn about some ways that we accidentally make our tests worse by getting fancy and making them too interesting. But all is not lost. We are also going to learn about ways we can fix the ways we accidentally make our tests worse. While we're at it, we're going to learn how we can make our application code better at the same time. Because at the end of the day, all I want, as a developer, is boring code that I know I will understand in four months. Empathy for my future self; connection and team-building between me and my co-workers. So in order to do this, first we're going to talk generally about some underlying causes of test hate and test fights. 
Then we're going to talk through three specific ways that our tests can get annoying, and learn how to fix their root causes. So, underlying reasons: why do we hate our tests? The simple answer is: tests are code, code is terrible, therefore tests are terrible. But that's cheating. The version without cheating: we write our tests with imperfect knowledge. We write all of our code, our application code included, with imperfect knowledge. We can make educated guesses about what we might need in the future, but all they are is guesses. We can never get away from our assumptions or our current mental model when we write application code. And of course our tests run in the same groove. Now, whenever we've made some incorrect assumptions in our code base, and this has made our tests or our application code, or more usually both, kind of painful to work with, there's always going to be some dyed-in-the-wool TDD fanatic like me who insists on making a jerk of themselves. They will say, "Oh, it's not a problem, you should just try listening to your tests." I know this because I have been this jerk before. I am so sorry. So why is "listen to your tests," when you say it like that, such an annoying and jerky thing to say? Because you are presenting your listener with this. Not only are you giving someone this useless hole-ridden thing and calling it a map, you are telling them they're stupid if they don't magically understand what the question marks are and how to fill them in. And, like, I used to do this, right? So if you are the person who is annoyed by this, while your annoyance is legitimate, I ask that you empathize with the person who is doing the annoying, because I can tell you from intense personal experience that people aren't saying "we need to listen to our tests" because they want to be arrogant know-it-alls. They are saying it because they see a thing that is wrong, and they want to fix it, and they don't quite know how. 
They just know that something is weird, and they want you to give them a chance to fix the weird thing. So in this talk, I want to fill in the question marks a bit, so that no one here has to feel stupid or sound like a jerk again. Before we get started, everyone take a deep breath. Bad code happens. Bad tests happen. This is fine. At one point, there was a "this is fine" dog slide right after this, and I deleted it, because it's actually fine. Code is terrible because the world is complex. It is good for us to be honest with ourselves about whether complexity is useful or necessary, right? And this takes discipline, and et cetera, et cetera, et cetera. Yeah, yeah, yeah. Complexity happens. That is why we are paid a lot of money. Complexity happening to you does not mean that you're a bad developer or a bad human. Next: we invented the more esoteric testing techniques, which I'm going to call test smells later, for reasons, because sometimes you actually need them. You need to be honest with yourself about whether you actually do need them, but none of the techniques I'm about to talk about, and say maybe you shouldn't use, are inherently bad. Finally, and most importantly, I have done everything I am about to show you in production. Everything I am about to call out as a mistake is a mistake I have made. So let's begin. The first test smell we're going to look at today is testing private methods. When you see this line of code in a test, when you see that send method, chances are someone is using it to try to test a private method. Gasp. I've heard a lot of different justifications for testing private methods directly instead of indirectly through public methods. These boil down to three things. One, the method might have weird indirect results that are hard to introspect. Two, maybe the public methods that you'd use to test it are really expensive to test. Three, maybe someone just wanted to isolate the test from weird side effects elsewhere in the code. 
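To make the smell concrete, here is a minimal sketch; the slide itself isn't in this transcript, so the Invoice class and compute_tax_cents method are invented names, not the talk's actual example. The send call at the bottom is the line to watch for:

```ruby
# Hypothetical class whose author hid tax math in a private method.
class Invoice
  def initialize(subtotal_cents)
    @subtotal_cents = subtotal_cents
  end

  def total_cents
    @subtotal_cents + compute_tax_cents
  end

  private

  # Private: callers outside the class can't invoke this normally.
  def compute_tax_cents
    @subtotal_cents / 10
  end
end

# The smell: a test using send to bypass Ruby's access control.
invoice = Invoice.new(100_00)
invoice.send(:compute_tax_cents)  # returns 1000, despite the method being private
```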
The thing about these reasons is that they're hard to argue with. Even if you're feeling uncomfortable about testing a private method, they're legit. So what I'm going to suggest is that instead of trying to confront these arguments, we can sidestep them. Instead of testing a private method, we can, first off, just make the method public. This is an underrated idea. If you're feeling the need to hit something directly in a test, it's because it's important. Maybe it's important enough that in your application code later you'll want to hit it. You always need admin interfaces for the weirdest things. Second off, you can build more introspection logic. If it's hard to check on whether a process completed in a test, or if it's hard to check on whether a process completed properly in a test, then it will also be hard to check on while you are debugging. I promise you. So do your future self this favor. Last, but not least, you can extract a simple class. I'm going to show you a really quick and dirty way of starting to do this. You can use this shim. We replace the contents of the private method with a new object, and we pass the object that has the private method in it into this new class. Then we just copy and paste the code from the private method into that new class's call method. Once you've got this, you no longer have a private method. It's super cheaty, whatever. You can refactor it down later. It'll be fine. Sometimes there are going to be other private methods from the original class that you need to pull in with it. That's fine. Extract those as well. This is a technique that can often get you someplace really good really quickly. If it doesn't, don't worry. Feel free to abandon it. If you pull on a thread, sometimes you just get a bigger tangle. I'm a knitter. This is something I think about a lot. That's why git checkout -- . was invented. It's probably still a smell that you want to test a private method, but sometimes it is OK to fix things later. 
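Here is roughly what that shim might look like, sketched with an invented Invoice example rather than the talk's slide code: the private method's body gets pasted into a new class's call method, and the object that owned the private method is passed in.

```ruby
# The new class: the old private method's body, pasted into a call method.
class TaxCalculator
  def initialize(invoice)
    @invoice = invoice  # the object that used to own the private method
  end

  def call
    @invoice.subtotal_cents / 10
  end
end

# Hypothetical original class, after the shim is applied.
class Invoice
  attr_reader :subtotal_cents

  def initialize(subtotal_cents)
    @subtotal_cents = subtotal_cents
  end

  def total_cents
    subtotal_cents + compute_tax_cents
  end

  private

  # The shim: delegate to the extracted class. Super cheaty; refactor later.
  def compute_tax_cents
    TaxCalculator.new(self).call
  end
end

# TaxCalculator is now public, directly testable, no send required.
TaxCalculator.new(Invoice.new(100_00)).call
```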
We call them test smells, not test errors, because it's not the end of the world if we cannot unstink them right away. In fact, if you don't know how to unstink something in the time that you have, it is usually better, and not just OK, to wait until you have more time and knowledge. In the meantime, practice harm reduction. You can isolate the weird bit and make it obvious that it's weird. Maybe that means leaving the direct private method test in. Maybe it means running the slow, indirect tests. Either way, it means commenting the hell out of it. Leave a TODO. Comments, as my co-worker wonderfully said, are not the enemy. But once you've communicated "this is ugly, sorry, please fix it, we'll know more later, promise," then you can feel free to move on. The next smell I'm going to cover is test duplication. For this, let's pretend that we work for a service that coordinates donations to nonprofits or political groups, perhaps a little like my previous employer. And our clients, the nonprofits we serve, each have a primary contact phone. We can look up whether the phones are mobile or not. So here are some tests that check on that mobile lookup process. We set the client's phone number to a predetermined phone that we know is a mobile phone, and then we check on the result of that mobile phone method. Then we do the same for a known landline phone. Now, one thing I should be clear about is that I'm using minitest as the least common denominator for all my test examples here, but everything I'm saying in this talk applies to RSpec too. If you look at these two test cases, the syntax is different, but the content is not. Both are test cases where we set a client phone number, and in both we verify the method returns the thing we expect. Very similar. That's because the actual thing that you're doing is the same, regardless of the library you choose to express it with. 
So anyway, we've got these clients, and we need to know whether they're using a mobile phone or a landline phone. And then later on, we get donors. Donors are distinct from clients, but they also have phone numbers, and we test them the same way, the exact same way. You may have noticed that only one thing changed on the slide back and forth. See? So we've got some test duplication, and I don't know about you, but any time I see code duplication, it makes my brain feel weird. It makes me itch. It makes me wonder if there's complexity that I am not yet managing to organize. So how should we deal with this duplication? One wrong way is shared example groups. There's a special DSL for these in RSpec, which I'm not going to cover. In minitest, we just share code between the two test classes with a module, just like we would in our application code. So with a shared example group, we sling all the shared code into a module. Note how very little has changed. And then we include this module in our test classes. This is a good testing technique in some circumstances. But I'm calling it out as a mistake here, because it should almost never be your first go-to when you see test duplication. When we look at the application code we're testing, we see why. Our tests are basically just a one-to-one mirror of our application code. This means that our tests are making the exact same assumptions that our application code is making. Over time, that will lock us into these assumptions. When we go to refactor this code later, maybe we'll be changing the application code, but not the tests. And so we'll have this very awkward fight between our application code and our tests. Maybe we'll be frustrated by feature requests that conflict with this kind of over-refactored set of tests we have. 
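Since the slides aren't in this transcript, here is a compressed sketch of the shape being described. The Client and Donor classes, the prefix-based lookup, and the plain raise-style assertions standing in for minitest cases are all invented; the point is how the shared module ends up a one-to-one mirror of the application code.

```ruby
# Invented stand-in: a "known mobile" prefix instead of a real lookup service.
MOBILE_PREFIX = "555-01".freeze

class Client
  attr_accessor :phone
  def mobile_phone?
    phone.start_with?(MOBILE_PREFIX)
  end
end

class Donor
  attr_accessor :phone
  # Identical logic, duplicated -- so the tests duplicate too.
  def mobile_phone?
    phone.start_with?(MOBILE_PREFIX)
  end
end

# Shared example group, minitest-style: shared test code slung into a module.
module MobilePhoneTests
  def test_mobile_phone
    subject.phone = "555-0100"   # known mobile number
    raise "expected mobile" unless subject.mobile_phone?
  end

  def test_landline_phone
    subject.phone = "555-9900"   # known landline number
    raise "expected landline" if subject.mobile_phone?
  end
end

# Each test class differs only in which subject it builds -- a one-to-one
# mirror of the application code's own structure.
class ClientPhoneTests
  include MobilePhoneTests
  def subject
    @subject ||= Client.new
  end
end

class DonorPhoneTests
  include MobilePhoneTests
  def subject
    @subject ||= Donor.new
  end
end
```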
And so, because the new assumptions conflict with the assumptions we're making in our tests, we need to change a lot of test code before we can even think about adding new application code for those features. One alternate thing we could do: we could try to reduce duplication by testing the module directly. In this technique, you create a fake minimal class, you include the module in it, and then you run your module tests against that fake class. This, again, can be a great technique sometimes. It is specifically a great technique when you are trying to test code that you are sure needs to be in a module. No, really, really sure. But it ties you really hard to the assumption that your code needs to go into a module, and that assumption rarely holds true. If we turn that module into a class instead, and then we test that class directly, we reduce duplication much the same way that we reduced it when we extracted the shared example group or tested the module directly. But unlike in those cases, we're attacking the application code first. We're looking at our tests and seeing what we can learn. By listening to this awkward duplication in our tests, we were able to see a cluster of shared functionality. We were able to improve our application code by properly encapsulating the single responsibility of phone number lookup. This is a much more sustainable solution than the way the shared example group doubled down on the module architecture and locked us into it long term. I modeled this example, incidentally, after the many times I have seen or have made API clients that are inserted as modules. But it can apply any time you've got a module that really wants to grow up into a real class. So, the last smell I'm going to show you this morning: inventing the universe. There's a particular kind of test where, to paraphrase Carl Sagan, you are trying to bake an apple pie truly from scratch. Don't even bother reading that code. There's too much of it, for a reason. 
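The class-extraction alternative might look something like this sketch, again with invented names and a hard-coded prefix standing in for the real carrier lookup: phone-number lookup becomes a class with one home and one direct test suite.

```ruby
# The extracted class: the single responsibility of phone number lookup.
class PhoneNumber
  MOBILE_PREFIX = "555-01".freeze  # invented stand-in for a carrier lookup

  def initialize(number)
    @number = number
  end

  def mobile?
    @number.start_with?(MOBILE_PREFIX)
  end
end

# Client and Donor now hold a PhoneNumber; they no longer re-implement
# lookup, so there is nothing left to duplicate in their tests.
class Client
  attr_accessor :phone  # expected to be a PhoneNumber
end

class Donor
  attr_accessor :phone
end

# One set of direct tests against the extracted class:
raise unless PhoneNumber.new("555-0100").mobile?
raise if PhoneNumber.new("555-9900").mobile?
```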
And so you need to invent the universe. You need to plant some trees. You need to water them and pick the apples and harvest the flour and, and, and, and, before you can finally test that the pie came out tasty. Now, the first mistake we can make when we are faced with tests like this is to do something about the iffy setup bit right away. This, again, hurts. Our job as developers is organizing complexity. So when we see complexity that's not organized, when we just leave that massive hunk of code in our tests, of course it makes us itch. But boring tests make boring code. If we made any new test abstractions now, we'd be pretty much doomed to the same mirroring nonsense that we just learned how to refactor away from in the module example. This is because we only have a little bit of knowledge about our domain so far. Specifically, we know exactly this much about our domain: we understand one possible path for creating one possible kind of pie, and we know that this one possible path is strictly required for our application. We know nothing else. Which means that anything we did to abstract this further would be guesswork. We'd be using our current understanding of, and assumptions about, the problem to guess at what abstractions we might need. Guess where else we are using our assumptions: we are using them in the application code. When you try to make new test abstractions in advance of test duplication, you risk making parallel abstractions to your application code. Again, just like we saw in the module extraction example. So the first really important thing you need to do here is wait until you have two tests, or even three. Once we've got multiple test cases, we actually have enough information that we can start to tell what our abstractions are. We can start thinking about techniques that we can use to encode those abstractions. So I'm going to start talking about those now. 
Just like in the last section, I'm going to show you two ways that I've seen people make mistakes by trying to encapsulate abstractions within test-only code, and then I'm going to show you an app-code-only approach that I recommend as an alternative. Things in this section are a little less clear-cut. The techniques I'm about to call out as mistakes are really valid techniques in some contexts; they're just better kept to those very specific contexts. ("Contexts" is a word I probably shouldn't have used quite so often, because it's about to be one of those things you keep on seeing. Never mind.) The first test approach that I'm going to tell you about is shared contexts. RSpec, again, has a DSL for this, but in minitest, all you need to do is write a method and include it in your setup block. Maybe you even put it in a module to share it between test classes. What you're doing with a shared context can often be pretty benign. Admittedly, reinventing the universe from scratch every time you write a test can be painful, and sometimes it's not a particularly useful pain. For example, if your application has a database, perhaps because you are using a popular application framework that I'm not going to name here, it is okay to extract that database-cleaning logic into a shared context or into a helper file. That same popular framework does so by default. There isn't much of anything that you can or should do in your application code about something that's that test-specific. The difficulty lies in telling what's test-specific and what's not. In the case of cleaning your database, one way we can model it is: in production, are we ever actually going to want to drop our database under normal circumstances? Probably not. So it's probably safe to assume it's a test-only thing. 
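In minitest terms, a shared context like that database-cleaning one can be as simple as a module with a setup method. Everything in this sketch is invented to show the shape: the module name, the hash standing in for a database, and the FakeTestBase class standing in for Minitest::Test so the sketch runs on its own.

```ruby
# Stand-in base class so the sketch runs without minitest installed;
# in real code this would be Minitest::Test.
class FakeTestBase
  def setup; end
end

# The shared context: a plain module whose setup hooks into the test lifecycle.
module WithCleanDatabase
  def setup
    super           # keep the framework's own setup running
    @database = {}  # stand-in for real database-cleaning logic
  end

  attr_reader :database
end

# Any test class that needs a clean database just includes the module.
class ClientTest < FakeTestBase
  include WithCleanDatabase

  def test_starts_empty
    raise "expected clean database" unless database.empty?
  end
end

t = ClientTest.new
t.setup
t.test_starts_empty
```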
Now, when we've stripped all the inessential lines out of our test, or moved all the things that we're pretty sure are test-only outward, and we still have a fairly extended test setup phase, like a screenful of code or whatever, it's a sign that an object has a lot of dependencies. It's always wise to examine why these are necessary. And I mean that as a disclaimer: sometimes they are necessary. That's fine. A pie without a filling is super sad. And so, while I'm going to flag a shared context here as a warning sign that perhaps you have too many dependencies, I'm not going to talk down to you all, because when people do that, they imply that the domain complexities people are facing, the ones that have led them to have too many dependencies, are unnecessary or easy to remove. And in practice, that is never true. Luckily, the mitigations for necessary domain setup complexities, and the mitigations for unnecessary domain setup complexities that, for complicated legacy code reasons and/or deadlines, you cannot actually remove right now, are very similar. And so we're going to get to those in a bit, but first I want to cover one more false trail. Again, I should emphasize that this is not a rant against FactoryGirl. I use it in many projects, by choice even. It is a great DSL for describing test and seed data when it is used effectively. It just has a few traps. FactoryGirl enables and encourages us to make factories, that is to say, dedicated test setup helpers. Say the line FactoryGirl.create(:pie) makes a pie with a filling, and the line FactoryGirl.create(:filling) creates a filling with a pie. This is a problem, but it's not a problem because of dependencies. If we just say, "oh, we can't see the direction of the dependencies," we're going back to the magic, and particularly arrogant, elves definition of listening to your tests. The real problem is never that something hides dependencies, because "dependencies," in practice, usually just means code you're not looking at right now. 
95% of the time, I do that on purpose. I want to be able to look at my dependencies when they're wrong, or when they misbehave, or when they're slow, or something like that. But the rest of the time, I extracted them to think about them less. The problem with FactoryGirl.create(:pie) versus FactoryGirl.create(:filling) is that whether a pie belongs to a filling, or a filling belongs to a pie, is really important domain logic. The problem with tests that hide dependencies comes in when the dependency they're hiding is the domain model. As a maintenance developer, I want to know that we make fillings in the course of making pies, not the other way around. So I'd like to suggest that when your test setups get complex, you start thinking about how to move the important things that your test setups are telling you about your domain logic into your application code instead. Because it's your domain; it's important. There are a few different ways to do this. The first and simplest is to pull more of the setup logic into the constructor. But maybe that gives you a really complex, terrible constructor. So you can also make factory methods on a class. These are usually a shorthand for invoking the constructor with a certain configuration. You can also go full Java and build out honest-to-God factory classes. There is nothing wrong with going full Java if your application needs it. I swear, don't worry, you're gonna be fine. Sometimes you've just got this really complex, hard-to-set-up piece of domain logic. And in that case, it's pretty important to understand exactly how that fairly complex piece of domain logic works. So when we hide this logic in FactoryGirl, that's a problem. 
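Here is a sketch of the middle option, a factory method, using the talk's pie-and-filling example; the API names (with_filling, flavor) are invented for illustration. The factory method lives in application code, so it is searchable, and the constructor encodes the domain direction: pies have fillings, not the other way around.

```ruby
class Filling
  attr_reader :flavor

  def initialize(flavor)
    @flavor = flavor
  end
end

class Pie
  attr_reader :filling

  # The constructor states the domain direction: a pie has a filling.
  def initialize(filling)
    @filling = filling
  end

  # Factory method: shorthand for invoking the constructor with a
  # certain configuration, living in app code rather than test helpers.
  def self.with_filling(flavor)
    new(Filling.new(flavor))
  end
end

pie = Pie.with_filling("apple")
pie.filling.flavor  # => "apple"
```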
When we decide to dig it out, what we're doing is akin to a startup giving an agency its prototype: later on, when you've got a little more money and you've proved your concept, a lot of the time you want to bring development in-house, because the core thing that makes your business tick is way too important to not have full control over. And so, just like that, if we delegate this important stuff about how we're supposed to set up our domain, how we're supposed to handle the core functions of our application, to FactoryGirl, we're losing out on the ability to control it and the ability to understand it. So let's look at some befores and afters; let's make this a little more concrete. This is how our invent-the-universe test looks before we start extracting some of its logic into factory methods. Then, if we start using application-code factories judiciously, we can get it to here. We were able to move a lot of code into the application that way, where later developers can discover it and understand it better. We've lost enough lines to actually fit this on one screen, so I'm going to do a little victory lap here. So, all done now, right? We've gotten through everything we wanted to cover, underlying causes, test smells, blah, blah, blah. Not quite yet. Incidentally, I did it this way because it let me make a DC flag. I could not resist making one. Our statehood vote was probably futile, but I'm nonetheless super proud of it. Thank you for humoring me there. So we've just performed a magic trick. By insisting that our tests be as boring as humanly possible, we have isolated a domain concept that we can use to make our application code more readable to new developers. By insisting on boring test code, we have made our application code more boring. Here's the trick part of the magic trick: this is a whole bloody series of subjective value judgments. 
Valuing boring code over clever code is a really popular idea, and it's how I got you in this room. Since we can all nod along to that popular idea, we sometimes miss that it's a value judgment. But if we don't define boring code very specifically, all we are doing when we say that clever code is bad code is creating another way to call code bad without explaining what's bad about it. When I say that FactoryGirl.create(:pie) is the literal opposite of boring, that it is a single line that should strike fear into the heart of any developer new to a code base, and most developers old to it, what I'm reacting to is a history of seeing that one line imply thousands of lines of code that I suddenly need to care about, but that I have no easy way of searching for. But the developer who wrote that line, the developer who's worked on that same code base for six or seven years, might disagree violently with me. Because that developer has a great intuitive feel for the thousands of lines of implicit domain knowledge embedded in that one line of FactoryGirl. He does not need pointers for how to refresh his memory on specifics. And so he might see the Java factory version as useless verbosity, verbosity for the sake of verbosity. I am pointing this out to you so that you can figure out what parts of this talk are relevant to you and your needs, and what parts are not. Regardless of what parts of this talk you keep and what you discard, regardless of what boring application code or boring test code looks like to you, the thing I want you to take away is that whenever you are writing tests, you are writing them about your current understanding of your code, about your current understanding of what your application needs to do. "Duplication is cheaper than the wrong abstraction," to quote Sandi Metz, who is wonderful, applies to test code too. When you do introduce test abstractions, they will likely mirror the assumptions you are making in your application abstractions. 
Application code influences test code, and test code, in turn, influences application code right back. And so, whatever boring means to us, sometimes the easiest way to make our tests more boring, to hate our tests less, is to just make them more boring by sheer force of will, and glare at our application code until it recognizes their good influence and falls in line. That is literally the only thing "listening to your tests" means. It means that places where you hate your tests are opportunities to refactor, so that you hate both your tests and your application code less. This is me, Betsy Haibel. I tweet at @betsythemuffin. More of it is, like, feminism and cats and science fiction than it is code, but it's there. You can also find me on the internet at betsyhaibel.com, and on the GitHubs at bhaibel. I work for a lovely company called Roostify that tries to make mortgages less terrible. You may have heard about them from my coworker Tara, who also talked here recently. We are in fact hiring. Ha ha ha ha. Thank you, ringer in the back. We're very remote-friendly. We have a lovely, friendly culture, and if you are interested, please come talk to me. I would love to sell you on us further. Before we get to the questions part of the talk, I want to do a brief advertisement. The other testing speakers, that is to say Justin Searls and Noel Rappin, as well as Sam Phippen, who organized the testing track, and I are all going to be doing a special question-and-answer, ask-us-anything, tell-us-where-we're-wrong, et cetera, kind of lunch. Sam, Noel, do we know where we're doing this? Okay, find us in the lunch line. Sam, Noel, can you stand up so that people can spot you? Sam's very easy to find; he's got, like, a great beard. And we'd love to see you there. But if there are questions you would rather ask now, in the time we have remaining, then ask away. 
The question was: are there techniques that you can use to feel safer about a test refactor, without worrying that you're changing what the tests are testing in a way that makes them less reliable? The best advice I have there is, again, boring is good. And I mean something very specific by that, rather than a stupid platitude. What I mean is: re-duplicate. Don't try to do anything interesting. One of the ways in which your tests probably got weird and wonky is that you saw a duplicated process, or something complex that you wanted to hide behind a thing, and you abstracted it too early. And one of the best ways you can fix your tests, and get them to a place where there are fewer assumptions encoded in them, where they can support more kinds of application code changes in the future, is to just re-duplicate. Take whatever method you have and replace all of the places where you call that method with the contents of that method, or unpack a FactoryGirl step into many boring lines of test setup, or et cetera, et cetera. The goal with refactoring tests is never to make them clever, never to make them express things more concisely. It is to make them more verbose, so that you can see the places where the shape of your application code is forcing your tests to be verbose or indirect. The next question was: one of the interesting techniques the questioner saw was the idea of moving your setup code off into a module, and what are the downsides of that? So, it depends. It depends specifically on what that module's doing. If that module is doing something that's very test-focused, there are probably going to be very few downsides. The trick is that it adds more indirection to your tests, and because I want my tests to be reliable, because I want my tests to not contain any bugs of their own, I want my tests to not do anything interesting or fancy or particularly unexpected. 
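As a sketch of that re-duplication move, with an invented Client example and an invented helper name: replace the call to a clever helper with the helper's own body, so every assumption the helper encodes becomes visible at the call site again.

```ruby
class Client
  attr_accessor :phone
  def mobile_phone?
    phone.start_with?("555-01")  # invented stand-in for a real lookup
  end
end

# Before: a clever helper hides what the test actually sets up.
def build_mobile_client
  client = Client.new
  client.phone = "555-0100"
  client
end
raise unless build_mobile_client.mobile_phone?

# After re-duplicating: the helper's body is inlined into the test,
# as many boring, visible lines of setup.
client = Client.new
client.phone = "555-0100"
raise unless client.mobile_phone?
```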
And so one of the pitfalls can be, like I said in my talk, when you extract too much domain logic into a test dependency. But the other big trap is that you might accidentally do something weird or unexpected in that test code that you've just extracted. And in that case, particularly if you're using something like RSpec shared metadata to kind of side-load it, or extracting it too far into a helper class, what can happen is that your tests will create too much of an artificial world, something that doesn't actually simulate what's going on effectively, so that when you read them, you have a set of expectations about what's going to happen that does not actually map to what they're doing. Do I recommend testing your test helpers? It depends on how complex they are. If it's, like, one line and you're pretty sure you know what it does, maybe not. But past a certain point of complexity, yes: they're like any other code. They can fail. I don't trust myself. Is that everyone? Wonderful.