All right, let's get started. When I wrote this talk, Apple had just announced iOS 9 along with the iPad Pro, and they were talking about how it's now ready for real production and creative work. So I figured I'd do my first-ever conference talk entirely in Mac OS 9, which is a little risky, but if you don't mind being patient with me, let's dive in. It's a little retro, and it takes a little time to boot up. Interesting design decisions. So here we go, let me open up my presentation.

The title of this talk is How to Stop Hating Your Tests. My name is Justin. I play a guy named Searls on the Internet, and I help run a software agency called Test Double.

So when I say people hate their tests, why do people hate their tests? I think it often starts because new projects are experimentations. They're fun. We're moving fast, we're breaking things. But eventually we hit the tyranny of success: we have to make sure things continue working, so we write tests and introduce CI builds to verify everything. But if we don't plan carefully, those tests eventually get slow, and they bog our teams down. Eventually we reach a point where teams cry out and yearn for the early days, when they could just build stuff and not have to worry about a really slow test suite. I see this pattern often enough on teams that I'm starting to think an ounce of prevention is worth a pound of cure. That's what we're going to talk about today.

Now, in Agile particularly, there are a couple of things people say once a team hates its test suite. The team might say, "Hey, our testing approach isn't working." A response I really hate to hear is, "Well, then you're just not doing it hard enough. You must be testing wrong. Try harder." I've found that when we see the same problem over and over again, the "work harder, comrade" approach just isn't appropriate. We should always be introspecting on our tools and processes to try to improve them. Other people come in and say, "Well, maybe we just need to buckle down. Testing is job one." That doesn't sit right with me either, because from the perspective of the people paying us, testing is never job one. At best, it's job two. They want to see us shipping code. So if we stop the world now to try to write better tests, eventually we're going to get kicked out, because we aren't going to look like we have our act together.

And I'm not saying that if you're not on a greenfield application, if you have a big application and you already hate your test suite, that it's hopeless. There's actually one weird trick to starting fresh with a test suite. That's right: you're going to find out what the one weird trick is. It's real simple. All you've got to do is move your tests into a new directory, and then you create a second directory, and then you have two directories, and you can write a thing called a shell script, and you can run them both. So when you have to go change a crappy test, you can actually redesign it, clean it up, and port it over, eventually decommissioning the old test suite.
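As a sketch of that two-suite idea in a Ruby project — the talk doesn't show this code, so the directory layout and task names here are assumptions, and a Rakefile stands in for the shell script:

```ruby
# Rakefile -- run the old and new test suites side by side
require "rake/testtask"

# The legacy suite: port tests out of here as you touch them
Rake::TestTask.new(:legacy_test) do |t|
  t.pattern = "test/legacy/**/*_test.rb"
end

# The fresh suite: new conventions, new helpers, no baggage
Rake::TestTask.new(:fresh_test) do |t|
  t.pattern = "test/fresh/**/*_test.rb"
end

# A bare `rake` runs both, so CI stays honest while the old suite shrinks
task default: [:fresh_test, :legacy_test]
```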
I hesitated before I wrote this talk, because I'm an expert, and you should never listen to experts. My problem is that I have too much experience around testing. I'm too close to the problem. I've got seven years of navel-gazing about testing and lots of open source testing projects. I've been the guy on every team you've ever been on who cares a little bit more about testing than you. And I get into a lot of Twitter arguments about jargon and stuff that doesn't matter. So my advice is this toxic hell stew that would just demotivate you if I told you what I really thought. That's not what I'm going to do today. Instead, I'm going to try to distill that advice down into its component parts and share a few lessons I've learned over the years.

The first is about the structure of tests — physically, how we organize the files, the directories, the lines, the test suites. The second thing I'm going to talk about is isolation, because I've found that how we isolate the thing we're testing from the world around it is the best way to convey the value we hope to get out of the test. And third, I'm going to talk about feedback: do our tests make us happy or sad? What's it like to live in a code base with this test suite? And again, keep in mind, I'm coming from the perspective of prevention. A lot of this advice makes sense if you get to start fresh; once you've got a big mess, there's not a lot I can do for you.

At this point, I was realizing that Mac OS 9 is really hard to work in, and I needed some sort of artistic motif. So my brother dug up this old Family Feud game — it's an American game show — and we're going to use the Family Feud board. If I point to the board, it tells me the answer to the survey question. So if I point and say "chili crab," it'll show an X, because that's not a correct answer. We're going to work in three rounds.

The first round is test structure: the physicality of our tests. I want to point and say that people really hate tests that are too big to fail. Have you ever noticed that people, especially in Agile land, who are really into test-driven development tend to hate big objects more than other people do? It's obvious that big objects, big code, big methods are harder to deal with. But why is it that people who are really into testing especially hate big objects? I think it's because testing actually makes big objects even harder to deal with. So let's dive into why that might be. First of all, big objects have many dependencies, and that means your tests need lots of setup. Big objects also might have multiple side effects in addition to whatever return value they boil down to, which means you've got a lot to verify. But that's all linear. Where the rubber really meets the road is that big objects have many logical branches, and that means you've got many test cases to write — and that grows superlinearly.

So let's take a look at some code. Right about now is when I realized that Mac OS 9 doesn't have a terminal, because it's not Unix. So I had to go find a new computer. Let's boot up this other one I found — it's also pretty slow. All right, so this is a little bit retro, but it's a working Unix terminal; I can type in commands like whoami. And I'm going to open up a test here. This is a simple validation test of a Ruby on Rails model object: a timesheet. I want to validate whether the timesheet is valid by looking to see whether it's got notes, whether the user is an admin, whether it's an invoice week, and whether the user has entered time. And that first case is really simple.
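Reconstructing that first, simple case — the on-screen code isn't in the transcript, so these attribute names are guesses at what such a Timesheet model might look like:

```ruby
describe Timesheet do
  it "is valid when a non-admin enters time with notes" do
    timesheet = Timesheet.new(
      notes: "Worked on the Initech account",
      admin: false,
      invoice_week: false,
      hours_entered: true
    )

    expect(timesheet).to be_valid
  end
end
```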
But then I start to think about all the different booleans I can flip, and I start writing out the pending contexts for the additional tests, and I realize there are a lot of test cases I have to concern myself with. For what seemed really simple — the validation of a single form element — that's a lot of tests to write.

What happened is I fell victim to a thing called the rule of product. And it's a real thing from math, because it has its own Wikipedia page. What the rule of product tells us is that if you want to find the number of combinations of something, you multiply the number of variations possible for each of those entries, and that effectively gives you the number of branches — the number of potential test cases you need to write. In this case, it's actually the easiest possible case, because they're all booleans, so I'm merely dealing with two to the fourth: 16 test cases.

So go back to the original question: why do people who write a lot of tests hate big objects? Because you might say, "Look, I'll just add one more argument to this method," without realizing that implicitly you're saying, "and double the number of test cases we have to write." But that's effectively what you're doing. So testing does make writing big objects harder. That's why we discourage writing big objects, especially if you're going to try test-driven development or do a lot of unit testing. So I encourage people to stop the bleeding: stop adding to big objects, create an escape hatch, start writing new objects. And for those new objects, limit yourself to one public method and at most three dependencies. If you're not used to writing a lot of small objects, that's going to seem like ridiculous advice. People freak out when I propose this. They say, "We'll have too many small things! It'll be this chaos of so many well-organized, carefully named, comprehensible small things. How will we ever deal with it?" And I think that as programmers, a lot of us get off on our own incidental complexity, because it makes us feel like serious engineers who understand something other people don't. So advice like this feels like programming on easy mode, like it's not real. But it is real, and it actually is easy. It turns out that most of the applications we're writing are not rocket science; we just manage to make them look like it by how complex we are. So I encourage people to just write small objects. If you take one thing away from this talk, that would be a good one.

The next thing: people hate tests that go off script. Code can do anything. Our programs should be unique and creative — that's why we're in this business; we're fascinated by it. But tests really can only do three things, and all tests follow a single simple script: every test ever sets stuff up, invokes a thing, and then verifies the behavior. This has been formalized into a pattern language of arrange, act, and assert — and if that sounds awkward, you might say given, when, and then. Those are the three phases common to every single test. So in my tests, I try to call out which phase I'm in, because it should be obvious — and surprisingly, it's often not. In this example here, I've got an xUnit-style test where everything is sandwiched together, so in this style of test I always add significant white space: the only new lines in the test are between my given, my when, and my then.
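Something like this — a reconstruction, since the slide's code isn't in the transcript, with a hypothetical Invoice object standing in:

```ruby
require "minitest/autorun"

class InvoiceTest < Minitest::Test
  def test_total_includes_tax
    # given -- blank lines are the only separators between the phases
    invoice = Invoice.new(subtotal: 100, tax_rate: 0.08)

    # when
    result = invoice.total

    # then
    assert_equal 108, result
  end
end
```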
If I'm writing a spec-style test — like here, in RSpec — I'm given additional framework goodies. I can pull those setups up into a let statement, and I can split my assertions into separate tests. So now I can have two tests, with my side effect — my "when" — happening in a before block, and each of my assertions happening in its own it block.
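Roughly like this — again a reconstruction, and the subject and its fake repository are hypothetical names:

```ruby
describe UserRegistration do
  # given: setup pulled up into `let`
  let(:users) { FakeUserRepository.new }
  let(:registration) { UserRegistration.new(users) }

  # when: the one side effect lives in a `before` block
  before { registration.register(email: "jane@example.com") }

  # then: each assertion gets its own `it`
  it "persists the user" do
    expect(users.count).to eq(1)
  end

  it "remembers the email" do
    expect(users.first.email).to eq("jane@example.com")
  end
end
```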
Additionally, I try to minimize each of these phases to one line per meaningful action. The late, great Jim Weirich from the Ruby community wrote a tool called rspec-given — a DSL on top of RSpec, which is itself a DSL — that was conscientious about arrange-act-assert in how its API was designed. It was ported to Minitest, then I ported it to Jasmine, and other people have ported it to Mocha and a handful of other tools. What it does is take that profusion of different method names and let me just say Given, Given, Given — these are setups — When I do this, and Then. It really shines in the assertions: because it knows it's an assertion, you can just say Then, this thing on the left should equal this thing on the right, and it will actually introspect the expression and give you good error messages when it fails. You end up with a much terser test where your intention is much more obvious.

And I've found that tests written this way, whether you use a DSL for it or not, are easier to read. They point out any potentially superfluous test code, because anything that doesn't fit one of those buckets is a yellow flag. And they can highlight design smells in your code. If you've got a lot of Given steps, maybe there are too many dependencies in your subject. If you have multiple When steps, it's possible your API is just hard to use — if your API does something, you should be able to do it in one line, with one method, and if you can't, that's a problem. If you have a lot of Then steps, it's probable the code is doing too much; maybe you're violating command-query separation and having side effects as well as a return value. It's good to listen to your tests and learn about the design of your code.

The next thing about test structure I want to talk about is hard-to-skim, hard-to-read code. People are fond of saying that test code is code, so we should treat it with seriousness. But I kind of feel like test code is untested code, so we should treat it with extreme suspicion. Logic that we write into our tests confuses the story — that simple three-step story — and if you've got test-scoped logic like loops and branches, it's really hard to read, and any errors in it are going to be easy to miss, because it's itself untested. People often see syntactical duplication in their tests and want to DRY it up by generating the test cases. In Agile land, if you've ever done a testing kata, you might have done the Roman numeral kata — converting Roman numerals to Arabic numerals — and those tests look really repetitive. So you might get the idea to pull the cases up into a data structure, with the Roman numeral keys on one side mapping to the Arabic numeral answers, and generate a test method from each entry.
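Paraphrasing that generated-test approach in Minitest — a reconstruction, assuming a RomanNumeral.to_arabic method like the one sketched a bit further down:

```ruby
require "minitest/autorun"

class RomanNumeralTest < Minitest::Test
  CASES = {
    "I"     => 1,
    "IV"    => 4,
    "IX"    => 9,
    "XIV"   => 14,
    "XL"    => 40,
    "MCMXC" => 1990
  }

  # loop over the data structure to define one test method per case
  CASES.each do |roman, arabic|
    define_method("test_#{roman}_converts_to_#{arabic}") do
      assert_equal arabic, RomanNumeral.to_arabic(roman)
    end
  end
end
```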
And there's nothing technically wrong with that — it totally works. The problem is that when you think about what we just did, we experienced pain in our test, and our solution was to change the design of the test. My inclination is always to change the design of the code first. I'd rather improve that, because at the end of the day, that's what's more important; otherwise we silence that design feedback. Because if you look at the actual production method, you can see that all that data is hiding in if-and-else statements — all that primitive obsession is still there. So what if we pulled that same set of keys out in the production scope, and iterated over those keys — used that information, codified into a dictionary, to loop over? Now, instead of needing hundreds of examples to make sure it's fully tested, I can make sure I've got pretty good code coverage and just add additional keys without necessarily needing more tests. It's a much simpler approach.
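A sketch of what that production-side refactor might look like — this is one common shape for the kata, not necessarily the code shown in the talk:

```ruby
class RomanNumeral
  # the data structure lives in production code now, not in the tests
  VALUES = {
    "M" => 1000, "CM" => 900, "D" => 500, "CD" => 400,
    "C" => 100,  "XC" => 90,  "L" => 50,  "XL" => 40,
    "X" => 10,   "IX" => 9,   "V" => 5,   "IV" => 4,
    "I" => 1
  }

  def self.to_arabic(roman)
    total = 0
    rest = roman
    # consume glyphs greedily, largest values first
    VALUES.each do |glyph, value|
      while rest.start_with?(glyph)
        total += value
        rest = rest.delete_prefix(glyph)
      end
    end
    total
  end
end
```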
Sandi Metz from the Ruby community has a thing she calls the squint test for sizing up code. If you just squint at a test, I should be able to know at a glance what the subject under test is. Additionally, each of the methods under test should be organized in a symmetrical, easy-to-discover way. And when you have a tool like RSpec or Jasmine, you can say "here's a context, and here's another context," and those should represent each of the logical branches your code can fall under — it makes it really easy to zoom in at a glance on what matters. Additionally, arrange-act-assert should always be called out somehow. If you're doing an xUnit-style thing, you can see arrange, act, and assert with nothing more than the significant white space I talked about.

People also hate tests where there's too much magic going on — and possibly where there's not enough magic, where there's too much verbosity. Both extremes can be a problem, and just like any problem in software, figuring out the right testing API, the right testing library, is a balancing act. When you look at testing libraries, some have very small APIs, which is great, and some have very large APIs. An example from Ruby of a small API is Minitest — mini, as the name implies. The stuff you write with it: classes, a setup override, a teardown override, and every method is a test. It's very easy to learn how to do assertions. The author is a funny guy, so if your tests are order-dependent, there's a little hook for that too — but there are literally about a dozen methods in the whole test framework. Meanwhile, compare that to RSpec: this avant-garde collection of aliases and different ways of setting up hooks and metaprogramming and clever matchers. It's a really, really huge API for testing, with a ton of opinions. Jim's given API is very straightforward — Given, When, Then, and a couple of other niceties like invariants and natural assertions — but because it stands on top of Minitest or RSpec, it's still, cumulatively, that complex. And it's not that there's a right or wrong level of how expressive or terse your testing API should be, but you have to realize the trade-off: smaller test APIs are much easier to learn and get started with, but you may end up with a lot more test-scoped complexity. Bigger APIs, if you have a team that has the time to really learn them and stew in them, are great, because you can write really terse, really intent-revealing tests — but to an outsider or a beginner they might be too intimidating. You have to decide for your own team what the right balance is.

The next thing people hate about test structure is tests that are accidentally creative. If I've learned one thing, it's that consistency is golden — consistency is the most important thing you can instill in any test suite. If we look at this example here, I've got a whole bunch of names, but I can't tell at a glance what the thing under test is. So I change its name to subject, and I always name the thing returned by the thing under test result (or results), every single time. So even if you find a 500-line test that I wrote and it's really hard to navigate, at least you'll know that the thing under test is called subject and the thing I'm asserting on is called result; otherwise both of those things can be easily lost. And when you're generally consistent, any incidence of inconsistency can convey important meaning and detail. For example, if tests A, B, and D all look alike and test C looks different, then as I read them I'm going to realize, oh, there's probably something interesting about C, and I know to read that one with a higher degree of scrutiny than the others. But if every test is a special snowflake and looks completely different, I'm not going to see from the tests that there's something interesting about test C — in fact, my reading of all four of those tests has to be that slow and that laborious, which really slows me down as a maintainer of your test suite. And as somebody who comes onto a lot of teams, I would much rather inherit gobs and gobs of mediocre but very consistent tests than even just a handful of very creative, custom, one-off tests. The consistent stuff is much easier to maintain, because I can make broad-based improvements to it, whereas every creative test is going to need a handcrafted adjustment.

Readers of our code are silly, right? They have this funny habit of assuming that all of our code is meaningful and important. In general, our production code should all be meaningful — but test code often isn't. There's a lot of stuff in our tests that's just there to prime the pump, to make it possible to invoke the thing we're testing. So I try to point out the meaningless stuff in my tests to help my reader out: I try to make unimportant test code obviously meaningless to the reader. I don't want them to accidentally get the idea that this author object here has to have a valid-looking name and a valid-looking phone number and email address, because maybe the method doesn't really need all that. I make it minimally meaningful. So I'm going to change the author's name to "pants," remove the phone number, change the email to something at pantsmail, and update the assertion. Now when you look at the test, you're under no delusion that this has to be a valid author — in fact, you could probably tell, just by looking at the test, exactly how you would implement the display name method. It's a more informative test.
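A reconstruction of roughly what that minimally meaningful test might look like — the display_name format here is my guess, not the talk's:

```ruby
describe Author do
  it "displays the name with the email" do
    author = Author.new(name: "pants", email: "pants@pantsmail.com")

    result = author.display_name

    expect(result).to eq("pants (pants@pantsmail.com)")
  end
end
```

Nothing here looks load-bearing except the two attributes the method actually uses — which is the point.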
So yes: test data should be minimal — I think most people get that — but also minimally meaningful, so that we're able to read it without thinking a bit of test data is more important than it actually is. That self-importance people write into their tests — a bit of bravado to make them look more serious — can actually be really distracting when you're trying to understand what's bare-essential.

So congratulations — we just talked through a lot of what I've observed about test structure over the years. Now let's talk about how we isolate our tests — excuse me, our subjects — from the world around them when we're testing them.

The first thing I run across is test suites that don't have a clear focus and purpose. A lot of teams still define success as simply answering yes or no: is this thing tested? As if that's the only measure of success. But if you were to ask, "Is the purpose of this test suite readily apparent? Does the suite promote consistency across its tests?" — most teams never take the time to think about that. And the pushback I get is, "Why would I try to be consistent? My app does 500 different things; I have 500 different kinds of things I want to test for. Everything's a special snowflake." In reality, there are probably four or five classes of tests that you write — and for each of those four or five types, create a separate test suite. It's a complete fabrication that all of your tests have to go in the same directory and sit side by side with each other. If you have four or five different types of tests, by all means create separate test suites for each of those types, because then each of those suites can have its own conventions and its own test helpers, and you can easily tell, just by looking at any of those tests, the intention somebody is trying to get out of it. I wrote a whole talk just on that topic, called Breaking Up with Your Test Suite, available at that short URL, to try to provoke some thinking about how you might do that.

This is an Agile conference, so everyone's seen the testing pyramid before — we've all seen this pyramid, right? The stuff at the top of the pyramid represents tests that are more integrated; the stuff at the bottom represents tests that are less integrated. If I came onto an average project and had to chart where on the pyramid all the tests I discovered sat, they'd be squiggles all over it: some call through to other units; some fake out their relationships to other units; some hit a database but fake out HTTP APIs; some hit those APIs but still operate below the user interface — and these are all tests in the same folder. So to try to avoid that spaghetti, I always try to start each new application with just two test suites, one at each extreme. At the top, I write tests that are as integrated as I can possibly manage — as maximally realistic as I can, granting whatever affordances I have to when I can't. And at the bottom, I write tests that are as isolated as possible. The job of the tests at the bottom is to make sure each little file listing does what it says on the tin; the tests at the top just make sure I plugged all those things together right. And it's really easy, when you're chasing these two extremes, to answer questions like "should I fake this or not?", because your pair is going to agree: intuitively, we're trying to be as realistic as possible here, so no, you shouldn't fake it. It makes decisions about test structure really easy.

Then, as the need arises — maybe your app gets bigger — you might need to define a semi-integrated suite with a clear purpose, and as long as you establish norms and helpers, that's perfectly acceptable. I was on a project where we were building a bunch of UI components and wanted a functional test layer in the middle; we called them component tests. And we just arbitrarily decided: we're going to fake out all the APIs, we're not going to use any test doubles, we're going to trigger application actions instead of user interface events, and we're going to verify application state instead of the HTML templates being created. We could have gone either way on any of those. The important thing is that we got together as a team, planned up front, and decided to make it consistent this way, so all the test helpers were able to promote terse, clear tests.

The next thing in test isolation that people hate is tests that are too realistic. Now, "how realistic should your tests be?" is a terribly framed question — most people would just say, "maximally realistic, right? As realistic as I can afford." And you might look at your web tests and say, "These are really realistic tests: look, I've got a browser talking to a server that's connected up to a database." But you can poke holes in that. You could ask, "Well, if your DNS goes down — does it test your DNS rules?" And they'll say no, of course not. Does it test your asset fingerprinting before your assets go up to the CDN, in case there's a bug in that? No, because it all happens locally. So the truth is, even if you're chasing "maximally realistic," you still have a border between the things you're faking and controlling for and the things that are real — but it's poorly defined and porous, so you really don't understand what's real and what's fake. And when a bad thing happens, anyone on your team is liable to ask, "Why didn't we write a test for that case?" It's a trap I've seen a lot of teams fall into: they write some tests, then something blows up in production, then they have an after-action report where they all come together and scream, "Why did that happen?" The only answer available in that frame of mind is, "Well, I guess we need to increase the realism of our entire test suite" — get rid of this fake service, or throw away VCR and call through to the real API — increasing the cost of the tests. Because realistic tests aren't free: they're slower to run; they take more time when we're writing, changing, and debugging; they require higher cognitive load, because we have to keep more in our heads at once; and because there are so many moving parts, they have more reasons to fail.

In fact, having clear boundaries can increase our focus, because we know what's tested and what's being controlled for. A team with clear, established boundaries will still have stuff blow up in production, but afterwards they can stand tall and have a grown-up discussion: "We all agreed that testing this class of problems was too expensive, and we didn't predict this failure — it just blew up in production, so we wouldn't have thought to write a test for it. But maybe — let's say it was the Facebook API — instead of making all of our tests hit the Facebook API, we can write one targeted contract test against just that assumption, instead of slowing down all of our other tests." That's the kind of thinking you get when you establish clear boundaries about what to make real and what to make fake.
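A sketch of what such a targeted contract test might look like — the client class and endpoint details here are hypothetical, and you'd run it in its own suite, apart from the fast build:

```ruby
# Verifies just the assumption our code depends on: that a user lookup
# returns a name field. Runs against the real API, on its own schedule.
describe "Facebook API contract", :external do
  it "returns a name for a user lookup" do
    client = FacebookClient.new(token: ENV.fetch("FB_TEST_TOKEN"))

    response = client.fetch_user("some-known-test-user-id")

    expect(response).to have_key("name")
  end
end
```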
We get caught up thinking that realism is a virtue, and that all these other things are just affordances for our own convenience, because we couldn't possibly have the time to run infinitely many realistic tests. But I think less integrated tests are genuinely helpful: they offer better design feedback, because they're closer to the code, and when there's a failure — again, because it's closer to the code — it's easier to understand where the bug is, what to do, and how to fix it.

The next thing people hate about test isolation is redundant code coverage. That's a term you may not have heard, but suppose you've got a lot of different tests: browser tests, tests of your views, tests at your controller layer, tests of your models — and that model may be associated with other models, and you've got tests through there too. So that model is tested eight ways to Sunday, and it feels like a very thorough test suite, and you're really proud of it. You're a test-driven development team, so you write a failing test for that model, you make it pass, you implement the change you want, and you push it up to continuous integration. And then you come back and realize: oh — the controller tests are failing, the view tests, the browser tests, the other model tests — because all of them incidentally happened to depend on that model's original behavior. So yes, it's thorough, but it's also quite redundant. And I've seen, on teams, that as the build slows down, redundant coverage becomes more and more of an albatross hanging around their necks: a story might take an hour on Monday morning, and then two and a half days go to cleaning up the build that broke in increasingly unexpected ways.

If you agree that's a problem, you might ask: how do you detect redundant code coverage? It turns out, the same way we detect all coverage. This is a code coverage report. Typically our eyeballs zoom in on the red stuff, because that's the low-hanging fruit where we can write tests — but there are all these other columns, too, and people ask what's in them. The last column tells you how many hits per line. If you sort by hits per line, and the method at the top is getting hit from 256 different call sites in your test suite, then if I change that method, it's probably going to break a lot of tests. So I want to keep those numbers down. I've been on Ruby projects where there were methods with 45,000 calls in a unit test suite. Minimizing that value can help you target where to reduce redundant code coverage.

Additionally, you might decide that certain classes of tests just aren't that valuable. On a lot of projects, I've seen people throw out view and controller tests and squeeze toward the edges of the pyramid: browser tests to tell you everything's working together, and model tests where most of the application logic lives. You could also try your hand at what might be called outside-in test-driven development, where you work from the outside in and isolate each layer underneath you with test doubles, with fake objects. Some people call that London-school TDD; Martin Fowler — in the room here — termed it "mockist." Growing Object-Oriented Software, Guided by Tests is a great book about this, so some people call it GOOS. I've deviated enough that the GOOS authors asked me to come up with my own name for my variant, so I call it Discovery Testing. I've got a four-hour screencast series up online, free, about how I practice discovery testing — it's a Java example, an implementation of Conway's Game of Life, talking through this approach to TDD. But unfortunately we don't have six hours together today, so I'm just going to have to move on.

The next thing people hate about test isolation is careless mocking. I brought up test doubles, so let's talk about them. If you're not familiar with the term, "test double" is a supertype over any kind of fake object, stub, mock, or spy you might see in a test suite. Incidentally, I happen to own a company called Test Double and maintain several test double libraries, so there's a little bit of self-interest in me helping explain what test doubles are. One of the libraries I maintain is also called testdouble, up on npm — a JavaScript test double library, similar to Sinon. So when I come to conferences and talk about testing, people often assume that I love mocking — I mean, I named my company Test Double — but it's not nearly that simple. It's a nuanced relationship. The way I use test doubles is very careful and rigorous: I have a subject, and maybe I'm thinking that subject should have dependencies A, B, and C as a way to break up the work. Since A, B, and C won't have existed yet, I replace them with test doubles, and then I carefully listen, as I'm writing the test against those fake things, for feedback: is this signature good or bad? Are the types flowing through these three things productive? Can I get a green test at the end of all that without anything awkward? If so, I feel pretty good.
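In rspec-mocks terms, that workflow might look something like this sketch — the subject and its three collaborators are hypothetical names for illustration:

```ruby
describe ProcessesInvoice do
  it "totals the loaded line items and records the charge" do
    # A, B, and C don't exist yet, so stand them in with test doubles
    loads_line_items  = double("LoadsLineItems")
    totals_line_items = double("TotalsLineItems")
    records_charge    = double("RecordsCharge")
    processor = ProcessesInvoice.new(loads_line_items, totals_line_items, records_charge)
    allow(loads_line_items).to receive(:for_invoice).with(42).and_return([:an_item])
    allow(totals_line_items).to receive(:total).with([:an_item]).and_return(99)
    allow(records_charge).to receive(:record)

    processor.process(42)

    # listening for feedback: if this is awkward to set up or verify,
    # the design of the contract is telling us something
    expect(records_charge).to have_received(:record).with(99)
  end
end
```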
But I feel like 99.9 percent of how people actually use test doubles in the wild is this: they try to write a realistic test of a unit without talking to real instances of A, B, and C. Maybe A is easy to use, but B is awkward and C is frustrating, so they use test doubles as a cudgel, just to shut up B and C so they can get their test to pass. And they wind up with an inconsistent, poorly reasoned mess, where somebody else is going to look at it later and ask, "What was the value of this test? What are we really sure of, having read it?" And the answer is: not much. It treats the symptom of the test pain — a hard-to-use dependency — but not the root cause, which is probably that you have an awkward API on that dependency. And it can really confuse future readers as to what the point is. Additionally, it makes me really sad from a branding perspective, because I just named my company Test Double, and most people hate test doubles, because in practice they're so often abused. So if you see somebody abuse a test double, I would take it as a favor if you would say something — share your story with me; there's a hashtag for it that I keep an eye on, and I actually get some funny stuff. If you have any terrible test-double usage war stories you'd like to share, I'd love to hear about them.

The next thing in test isolation is application frameworks. A lot of times people bring up, "How do I write this test inside this app, given that I'm using this particular framework?" — I'm using Spring, I'm using Rails, and so forth. Frameworks are interesting because they provide repeatable solutions to common problems we all see, and the most common problem frameworks solve is: how do I get my app to talk to thing X — to perform some kind of integration for me? So you can think of your application as having a rich, gooey middle — your domain, your custom objects that aren't necessarily coupled to the framework or to its integrations — surrounded by a crust of framework-coupled code: maybe the framework provides HTTP or email integration or database stuff or job queues, that sort of thing. Each of our apps is tangled with its framework to a different extent — maybe a little bit, maybe a whole lot, maybe barely at all — and most of our application is actually just plain old objects. The dilemma is that people forget that frameworks mostly focus on solving integration problems, and the test helpers frameworks provide want to test that you're using the framework correctly — so all those test helpers assume that same level of integration. People who use frameworks a lot tend to default to always asking the framework, always looking to the framework for the answer to every question, and so they end up only ever writing integration tests. We have to remind ourselves that if your code doesn't rely on a framework, its tests don't necessarily need to either. If you've got a plain old object somewhere — or if you can articulate some code as a plain old object — you can always write a plain old test. So if your test suite is coupled to the framework, you can always start a second test suite that runs a lot faster and leaner, without loading up the framework or operating under the context of a real database connection, for instance.
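For instance, a plain old object with a plain old test might look like this minimal sketch — the domain object is hypothetical, and note there's no Rails, no database, nothing to boot:

```ruby
require "minitest/autorun"
require "date"

# a plain old Ruby object: no framework superclass, no persistence
class DueDateCalculator
  NET_TERMS_DAYS = 30

  def due_date(invoiced_on)
    invoiced_on + NET_TERMS_DAYS
  end
end

class DueDateCalculatorTest < Minitest::Test
  def test_invoice_is_due_thirty_days_later
    calculator = DueDateCalculator.new

    result = calculator.due_date(Date.new(2015, 8, 1))

    assert_equal Date.new(2015, 8, 31), result
  end
end
```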
So congrats — we just got through everything I have to say about test isolation. We're making good progress: that's two of the three rounds. In the third round, we're going to talk about test feedback — what's it like to use our tests?

The first thing I want to discuss is useless error messages. People really hate bad error messages from their tests. Here's a gem that I wrote, and the build failed. Whenever a build fails, what's the first thing I do? I pull down the code and then I run the test command. So here I run the tests, I can see my failure, and I'll highlight it: "Failed assertion, no message given." That's a really bad error message — I have no idea what just happened in that test. It means my workflow looks something like this: I see the failure; then I've got to open up the test; then I've probably got to print or debug at that point to see the state of all the variables; then I can change the code and hopefully see the test pass. But this was so laborious that now I feel like I've got to take a break, because every time a test fails, fifteen minutes have elapsed. I see a lot of people brag about how fast their test suite is at runtime without paying any mind to whether they have good or bad failure messages — and if you've got bad failure messages, you might eat all of that speedup in the waste of doing root-cause analysis every time you see a failure.

So think about tests as being responsible for giving us good messages and prompts for what to do next. Here's a test written with rspec-given that I'd expect to give me a much better error message, even though all I'm saying is that the username should equal this other thing. When I dive in, what Jim did was give me expected and actual — he printed out the values and shows me both sides of the comparison — and he keeps evaluating sub-expressions to get at the value of each of those objects. So I can see the username is "Sterling Archer," and if I dive in further, there's the whole user ActiveRecord object. I can see a lot of state just by looking at a glance, which means my workflow is simply: see the failure, probably change the code immediately, and then earn a big promotion, because I'm so much faster at writing tests than everyone else around me. So when you're looking at a testing API or an assertion library, don't just compare how cool and snazzy the syntax is — look instead at the quality of the messages it produces. This is really important.
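To make that concrete, here's a hedged reconstruction of the contrast in Ruby — exact failure output varies by framework version, but the shape is right:

```ruby
# Bare xUnit-style assertion: fails with "Failed assertion, no message given."
assert user.username == "Sterling Archer"

# Telling the framework more buys a much better message:
#   Expected: "Sterling Archer"
#     Actual: "sterling.archer"
assert_equal "Sterling Archer", user.username

# rspec-given's natural assertions introspect the expression itself,
# printing both sides of the comparison (and the user object) on failure
Then { user.username == "Sterling Archer" }
```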
The next thing to talk about in test feedback is slow feedback loops. I think about the number 480 a lot. If you're not familiar, the number 480 is really important in productivity terms, because it's the number of minutes in an eight-hour workday. So if I think about my feedback loop and try to visualize the numbers: say it takes 15 seconds to make a change to my code, five seconds to run a test, and 10 seconds to decide what to do next. That would be a pretty fast feedback loop, at 30 seconds, and it implies an upper bound of 960 useful thoughts every day — and that's if I'm really fast. In reality, I've got all these other responsibilities — non-coding stuff like email and meetings — and I have to pay a penalty of context switching when I go between the technical and the non-technical. Say that affords me two hours of non-code activity each day and a 60-second feedback loop overall: 480 actions per day. But five seconds to run a test is pretty fast, and we just talked about slow tests. Maybe it takes 30 seconds to run my test suite locally, and if that's the case, now I'm up to an 85-second loop. But my email doesn't care, and my meetings don't care, how fast my tests are — so even though that's nominally 338 actions per day, the non-code time also creeps up every time my tests get slower, so the effective loop is really more like 91 seconds. Another thing we talked about was bad error messages: if my messages are bad, it's not going to take me 10 seconds to figure out the next thing to do, it's going to take me a minute. Account for that, and now I'm looking at a 155-second loop — just 185 actions per day, way down from 960.
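The arithmetic behind those figures, as I read it, is simply the workday divided by the length of the loop — a back-of-the-envelope sketch:

```ruby
# 480 minutes in an eight-hour workday
SECONDS_PER_WORKDAY = 480 * 60

def actions_per_day(seconds_per_loop)
  SECONDS_PER_WORKDAY / seconds_per_loop
end

actions_per_day(30)   # => 960  a 30-second feedback loop
actions_per_day(85)   # => 338  add a 30-second local test run
actions_per_day(155)  # => 185  add bad error messages on top
```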
These things add up. If you've been on a project with a lot of integration tests — I was on a project once where running a single empty Cucumber test took four minutes, because there was so much data setup. Four minutes to run a single test is really slow: a 422-second loop, 68 actions a day. And if you've ever had to run a command that takes four minutes, you're probably tabbing over to Twitter or Reddit; you're distracted; you come back, and without fail you realize that three minutes have passed since the test finished running. So what you're really seeing is 11-minute feedback loops, and you only get about 43 useful brainwaves actually implemented per day. You're not going to get a lot done. When you compare 43 to 480, that's pretty significant — in fact, I'd say it's so significant that we just found something very elusive and mysterious in the software world: the 10x developer. So if you're curious how to make yourself more productive as a developer, get out a stopwatch and measure where your time is going, try to identify the activities in your feedback loop and what's holding you back, and then optimize that. I try to profile my productivity in exactly that way — maybe my unfamiliarity with my editor is my biggest hang-up. And if it's still too slow, if there's nothing you can do, you can always implement new features off to the side: create a little scratch project, get fast feedback however you can, and integrate it later.

Next: painful test data. Controlling test data is really hard — it's really an issue of control. At one extreme, every single test controls its own test data; at the other, tests have no coupling to the test data whatsoever, no control over it. Inline control means I create all the objects my test depends on inside the test. Alternatively, I might have a common set of fixtures giving me a kind of playground to work in. I might have a snapshot of a known database and schema that I can work from, loaded at the beginning of my test suite. Or I might do what I call self-priming tests, where I just run the tests against any old environment — like staging — which means the test has to go create a list, create its items if it wants to measure something about them, and then tear everything down itself. You don't have to pick just one strategy. I'd say inline makes a lot of sense for simple units, for models; fixtures work really well for integration tests; data dumps work nicely for smoke tests, so you don't pay that four-minute Cucumber setup; and self-priming is often the only choice you have if you're running tests against staging or production environments, because you don't want direct access to the database. In slow test suites, data setup is typically the biggest contributor to the slowness. I don't actually have proof of that, but it feels really true, so I made a slide. I try to profile slow tests wherever I can, to figure out where the cost is coming from and to evaluate whether I need to change my test data approach.
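As an illustration of that last strategy, a self-priming test might look like this sketch — the API client and its methods are hypothetical, and the ensure-in-a-block syntax assumes Ruby 2.6 or later:

```ruby
require "securerandom"

describe "Lists API against staging", :staging do
  let(:api) { StagingApiClient.new }  # hypothetical client for whatever environment

  it "counts the items added to a list" do
    # prime: create everything the test needs, right here
    list = api.create_list("smoke-#{SecureRandom.hex(4)}")
    api.add_item(list.id, "first")
    api.add_item(list.id, "second")

    result = api.item_count(list.id)

    expect(result).to eq(2)
  ensure
    # clean up after ourselves, since we don't own this environment
    api.delete_list(list.id) if list
  end
end
```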
Next is a concept called superlinear build slowdown — people hate this — a term coined by J.B. Rainsberger. What he identifies is that our intuition about test speed betrays us. If we write a single test and it takes five seconds to run, our brain tells us that 25 times as many tests will take 25 times as long, and 50 tests 50 times as long. This turns out to be incorrect, because what our brain tells us about a five-second test is that we've got five seconds spent in test code. But that's not really how it works: our test is primarily invoking something else. It spends some amount of time in app code and some amount in setup and teardown — in fact, most of the time is probably in those two things — and only a little bit actually exercising the test itself. And as you add more tests, remember, the system is getting more complex and bigger, and your data is getting more complex and bigger, so more time goes into app code: your original test is actually going to get slower over time, which is a little counterintuitive. So you'll start to see the build deviate from where intuition says it should be. At 25 tests, maybe a test goes from three to four seconds, four to six seconds, and now we're starting to really peel away; at 50 tests, maybe seven seconds, ten seconds. That's still maybe 18 seconds per test — not the end of the world — but you've got to remember all those earlier tests are now getting that much slower, too. This compounds geometrically, and what was maybe a 150-second deviation halfway through the project becomes a test suite five, six, seven times longer than your expectation. This is how teams wind up with multi-hour builds that six months ago weren't that big of a problem. The best way to avoid this is to resist the urge, every time we do a story, to always add create-read-update-destroy integration tests as if by rote. Don't assume — that's really unhealthy. Instead, I try to exercise every new feature through an existing test, to snake through the system. And early on, an interesting thing you can do is set a firm cap on build duration: we're never going to let the build get slower than five minutes, so if we want to add something to the build, we've got to take something out first, or we've got to make things faster.
The last thing I'm going to talk about is false negatives. False negatives might be a new concept to you, but at the end of the day it's this question: what does it mean when a build fails? The intuitive response is to say, "Well, it means the code was broken — you broke something." But that's not quite right, because ask yourself: what file needs to change when you've broken the build? Almost always, the answer is, "Well, we forgot to update a test," so we go update the test to fix it. It was actually the test that was wrong, not the code. And this gets at the difference between a true and a false negative. A true negative: the build is red because you actually broke the code, and the fix is to fix the code. A false negative: the build is red because of an unfinished test or an unexpected test failure, and the fix is to go fix the test. Now, when our managers see the build break, they think all of our breaks are true negatives — "Aha, a bug was just caught!" — and that reinforces the value of our tests. But in practice, true negatives are depressingly rare: probably 90-plus percent of all build failures are false negatives, and that's a bit of a bummer. False negatives erode our confidence in our tests. When you as a team are complaining about how much you hate your test suite, a lot of it is probably because most of your build failures are just chores — silly busy work you have to do to get back to green so you can ship. That's why I think people end up hating their continuous integration suites. The top causes of false-negative test failures: lots of redundant coverage, lots of unexpected breakage, and really slow tests. So if you've got a lot of integration tests in your system, you probably deal with a lot of false negatives. What I try to do is track each build failure and identify, postmortem, whether the fix was to the test or to the code — and then track, as a result, how long it took to fix. I can use that to justify to the business: "Hey, we wasted 300 hours last month fixing a bunch of false negatives. Please give me 200 hours next month to invest in figuring out the root cause."

So that's a little bit about test feedback. Congratulations — that's all three rounds. If this talk was a bit of a bummer for you because it was about stuff people hate about their tests, just remember: no matter how bad your tests are, no matter how much you hate your tests, I probably hate AppleWorks and Mac OS 9 more than you hate your tests. It was a really bad idea to do a talk in Mac OS 9. I don't recommend you do this.

Last bit: I told you I run an agency called Test Double. We're based in the US and Canada, and we work with companies all over the world. If anything I said today resonates with you and your team might need some help — whether it's with testing or general software craftsmanship — we'd love to work with you. You can find me on Twitter; I've also got business cards and stickers and stuff, and I'm going to be hanging around. I'd love to chat. So without further ado: thanks, everyone, for sharing your time with me. I had a lot of fun coming out here.

Do we have time for questions? Five minutes? All right, let's take a couple of questions. Nobody has a question? I may have talked too fast — sorry about that. For what it's worth, if I did talk so fast that you missed something, this talk is already up on Vimeo. You can check it out on our blog, I'll tweet out a link on the conference hashtag, or you can just Google "how to stop hating your tests" and you'll find it. No questions — perfect. Glad I could help you all out, and that it was that obvious. Have a great day, everybody, and enjoy Martin's keynote. It was great to meet you.