It's a little dark in here. It's a little dim because all my slides are in black, so please don't fall asleep. I'll try to speak loudly and quickly to keep the energy high.

The story of how this talk came about: I realized last year that conference season tends to line up with Apple's operating system release schedule. I'm a huge Apple fanboy, and I always want to be running the latest and greatest, but at the same time I need to make sure my slides work, because I'm always really nervous about that getting broken. Maybe it was because they announced the iPad Pro or something, but I was thinking: maybe OS 9 is finally ready for me to build an entire talk in it and really put in my typical 80 hours of work building a slide deck. So this talk was written entirely in OS 9. Let's boot it up and see how it goes; please bear with me. All right, we're starting up. It's a little retro, the flat look. I've got to find my talk on the desktop; it's AppleWorks 6, and I've got to find the play button. Here we go.

This talk is called How to Stop Hating Your Tests. My name is Justin; I play a guy named searls on the internet, and as Jonah so kindly introduced, I work at a company called Test Double, a really great software agency. If you don't know me, you might recognize this face as being associated with lots of snarky tweets getting retweeted. I'm told I don't actually look like this anymore, which is depressing for my hairline. This is the Tapper edition of the talk: I've never given a talk at a conference that served free beer, but I hope that means you'll all like this more than you otherwise would. Let's dive in.

So, like I said, this talk is about hating tests. Why do people hate their tests? What I think happens is that there's a cycle that teams almost always seem to go through. First, you start a new project. You're in experimentation mode: you're just making stuff work, you're having fun, you're doing new things, you're iterating really quickly. Eventually you get into production, and it's important that your stuff keeps working, so you start to write some tests. You put a build around it. You make sure your new changes don't break stuff. But without a lot of forethought, without a lot of planning, it's likely those tests are going to be slow, and as they aggregate they're really going to feel like a burden. Eventually you see teams get to the point where they feel like they're slaves to their builds and their tests. They're always cleaning up tests, and they're yearning for the good old days when they didn't have to worry about any of that, when they could just move fast and break things.

I see this pattern often enough that I'm beginning to think an ounce of prevention might be worth a pound of cure here, because once you get to the end, there's only so much you can do. You can say "our test approach isn't working," and how do you respond to that? You might say, "well, apparently you're just not testing hard enough; you need to buckle down." But whenever I see the same problem emerge over and over again, I'm never a fan of the "work harder, comrade" approach. I think we should really introspect our tools and our workflows and be open and honest: maybe how we're testing, or that we're testing, isn't that effective. Another thing I see people do is say, "hey, clearly we're off base here. Testing is job one; let's treat it as the most important thing and try to remediate." You might get away with that for a little while, but testing is never job one to the people paying our paychecks. At best it's job two. Job one is shipping stuff, getting stuff out the door, and you can only get away with that impedance mismatch for so long.

And if you're not greenfield (I'm talking about prevention stuff today, so if you're in a big monolithic app and you're checking out because you're thinking "this is all stuff to do at the beginning of a project; it's way too late for me"): don't worry, not a problem. There's just one weird trick to starting fresh with testing. That's right, you're going to find out what the one weird trick is. Basically, you move your tests into a new directory and create a second directory. Now you have two directories, and you can write a shell script that runs the tests out of both directories while you port them over to the new, clean test suite. Eventually, maybe, you can decommission the old crappy tests.

Now, I hesitated before giving this talk, before producing it at all, because I'm the worst kind of expert. I've got years of navel-gazing in the agile community about testing. I've done lots of open source projects that try to help people with tests. I've been that guy, on every single team you've ever been on, who cares just a little bit more about testing than you do. And I'm in so many highfalutin Twitter spats about terminology and stuff that I don't know why anybody follows me. So my advice, if I were just to tell you what I really thought, is toxic. It would just demotivate you; it's risk-averse; it's no fun at all. My goal today, instead of spewing all that at you, is to distill that toxic hell-stew of opinion down into three component parts. The first one I want to talk about is structure: the physicality of our tests, how we're writing them, laying them out, and organizing our code. The second is isolation, because what I've found is that how you isolate the thing you're testing from the components around it communicates the value you're trying to get out of the test. And finally we'll talk about feedback: do our tests make us happy or sad? Are they fast or slow? What's it like to live with these tests, and what can we do to make it better? Keep in mind, we're always talking from the perspective of prevention here.
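The two-directory trick can be sketched in a few lines of Ruby. The directory names here (`test_old`, `test_clean`) are invented for illustration, and a real project would wire this into its Rakefile or build script:

```ruby
# A sketch of the "one weird trick": gather test files from both the legacy
# directory and the new, clean one, so a single command runs them all while
# the port is in progress. Directory names are made up for this example.
def test_files(dirs = %w[test_old test_clean])
  dirs.flat_map { |dir| Dir.glob(File.join(dir, "**", "*_test.rb")) }.sort
end

# A runner script might then do something like:
#   test_files.each { |file| require File.expand_path(file) }
# and you delete files from test_old as they're ported into test_clean.
```

Once `test_old` is empty, you delete the directory and the script, and you're left with one clean suite.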
These techniques might be new or different for your particular project, but I hope you can pick some of them up along the way and experiment with them yourself.

Now, it was around this point that I realized doing custom artwork in AppleWorks 6 is a huge pain in the ass. My brother had picked up this old Family Feud Apple II game, so we just ripped off the artwork from that, and we're going to operate off this Family Feud board. It's a real board: if I point to it and say something ridiculous like "show me potato salad," it'll give me an X. But I didn't have a hundred people to survey to populate the board; I just surveyed myself a hundred times, so I already know all the answers. It's going to be a really bad game of Family Feud.

First we're going to talk about test structure, and the first thing people hate about their tests is when they're just too big to fail: big, gigantic test files. In fact, I want to pause here. Have you ever noticed that people who are really into testing, especially TDD, really seem to hate big objects and big methods more than other people do? Sure, I think we can all agree that big stuff is harder to deal with than small stuff, but why do testing aficionados in particular hate it? What I've found is that testing actually makes working with big objects and big methods much harder, which is a little counterintuitive. If we analyze the nature of big objects versus big tests: a big object probably has lots of dependencies, so you can imagine that its tests have lots of setup. Big objects typically have lots of side effects in addition to whatever they return, which means your tests might have numerous verifications; okay, that adds to the tests too. But big objects also have many logical branches, based both on the arguments you pass in and on the broader state of the system, and what I find is that this is where things really fly off the rails: you have many test cases to write, based on all of those branches.

We're going to look at an example now. I want to show some code off, but it was at this point that I realized OS 9 has no terminal, because it's not Unix, so I had to go find one somewhere else. Let's boot it up; it takes a minute, sorry, it's old. Almost there. All right, here's our CRT terminal. This is a fully operational Unix terminal, which means I can type in arbitrary commands like `whoami`. Okay, great.

I'm going to write a simple test of, let's say, an ActiveRecord model called Timesheet, which has a validation that depends on whether the user has entered notes into their timesheet, whether they're an admin, whether it's invoice week, and whether they've entered time. I've got the first case down, but then I'm thinking: okay, there are all these other permutations. What if there are no notes? What if it's a non-admin user? What if it's an off week instead of an invoice week? What if they don't have any time entered? Now I'm realizing I've got a lot of different permutations to concern myself with. This really sucks. What happened?

What happened is that I fell victim to a thing called the rule of product. The rule of product comes from combinatorics, and the reason I know that is that it has its own Wikipedia page. The TL;DR for us: if I've got a method with four arguments, then to figure out the number of possible combinations of those four things, you multiply together the number of variations of each one. In this case, that gives us the upper bound of potential test cases we'd have to write, and when they're all boolean attributes we have a really easy case, because it's only two to the fourth. So we "only" have to write 16 test cases for this very trivial validation method.

If you're used to writing a lot of big objects and big functions, it's not uncommon to feel like, oh, I'll just add one more little argument here, without realizing the implication of that statement is "and double the number of tests that I have to write." So if you're comfortable with big code and you're trying to get more serious about tests, my recommendation is to recognize that testing makes big objects harder to deal with (testing is supposed to make things easier to deal with) and, in this case, to stop the bleeding: stop adding on to big objects. Personally, when I'm writing Ruby, I try to limit every new object I write to just one public, callable method, at most three dependencies, and maybe a handful of arguments, never any more than that. So I've got lots of small objects all over the place.

Now, people push back. If you're used to big objects and seeing everything in one place, that's uncomfortable: how are you going to organize all this? People respond, "but then we'll have way too many small things! How will we possibly deal with all of these well-organized, carefully named, comprehensible small things?" We always have to guard against the fact that, as programmers, we get off on our own incidental complexity. This enterprise CRUD stuff we're doing doesn't have to be rocket science, but when we make it convoluted, we feel like that's real hardcore programming. So to some programmers this advice feels like being told to program in easy mode, and my reaction is: yeah, it is easy. We don't have to make this stuff so hard. Just write small objects.
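The rule-of-product arithmetic is easy to sketch in Ruby; the four booleans stand in for the hypothetical Timesheet attributes (notes?, admin?, invoice week?, time entered?):

```ruby
# Rule of product: with four boolean inputs, the number of input combinations
# (the upper bound on test cases) is 2 ** 4 = 16. One more boolean doubles it.
BOOLS = [true, false].freeze

def combination_count(boolean_arity)
  BOOLS.repeated_permutation(boolean_arity).count
end

combination_count(4) # => 16
combination_count(5) # => 32, i.e. "one little argument" doubled the burden
```

Shrinking an object to a handful of arguments attacks this number directly, which is why the small-object advice above pays off in the test suite first.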
The next thing people seem to hate about their tests is tests that go off script. What I mean is this: we revere code because code can do anything. Every program can be creative and unique and solve an entirely new problem in some new and fun way. But tests can really only do three things; they all follow exactly the same script. Every test ever: set some stuff up, invoke a thing, verify the behavior. Every test is us writing the same program over and over and over again. It's been formalized, and you might have seen tests broken up into phases: the arrange phase, then the act phase, then the assert phase. Slightly more natural English might be given, when, and then. The important thing is that in all of my tests, I always take great pains to consistently call out all three of those phases. For instance, if I've got a pretty condensed-looking Minitest test, I'm just going to add a newline after my arrange and another newline after my act. If I do that consistently, then when I'm skimming a test I at least always know: this is setup, this is the behavior being invoked, and these are the assertions, always in that order, so that it reads like that script.

If I'm doing a more BDD, RSpec-y kind of thing, I can take advantage of RSpec's expressiveness, because it has constructs for this stuff: `let`, to set up a variable at the beginning, and anyone who knows RSpec will know that's a setup step. I can use `before` to invoke my action when it has a side effect and no return value. And then I can split my assertions up into separate `it` blocks if I so choose. It might be more verbose, but at least at a glance we know exactly which phase each of those lines belongs to, if we're familiar with RSpec. I also try to minimize each phase to just one line per action, so that it's really clear what it's doing from the perspective of the reader, the stakeholder of the test.
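Here's what that blank-line convention might look like in a condensed Minitest test; the Greeter class is invented for illustration:

```ruby
# Arrange / act / assert, separated by blank lines so the script is visible
# at a glance. The class under test is a made-up example.
require "minitest/autorun"

class Greeter
  def greet(name)
    "Hello, #{name}!"
  end
end

class GreeterTest < Minitest::Test
  def test_greet
    subject = Greeter.new                 # arrange

    result = subject.greet("Grace")       # act

    assert_equal "Hello, Grace!", result  # assert
  end
end
```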
One way I really like conceiving of this comes from the late, great Jim Weirich, who's a huge hero of mine and should be to anyone in the Ruby community. One of the final gems he contributed back to us is called rspec-given, and it provides a Given/When/Then API that tries to be as terse as possible while staying as expressive as your tests need. I think Mike Murphy is in here; I think he was the one who initially ported it to Minitest. I've ported it to Jasmine, and somebody's ported that to Mocha for JavaScript tests. I really like it a lot. What it looks like: instead of using `let`, we use the more natural label `Given`. For our action, we just call `When`. And what I really like are the assertions: you just say `Then` and pass it a block, and it will actually produce really good error messages by using Sourcify to look at the AST of that block, recognizing it's a comparison, and then dynamically building a really good failure message whenever anything fails.

What I like about tests that think in Given/When/Then like this: they're easier to read; they point out potentially superfluous test code, because if something doesn't fit into a Given, a When, or a Then, then something's a little weird; and they can highlight certain design smells. For instance, if you have a lot of Given steps for any bit of code, then maybe you have too many dependencies, or things are too hard to set up, or the arguments are too complex. If you've got more than one When step in order to invoke a bit of behavior, then clearly the API could be more user-friendly: why should you have to take multiple actions against an API in order to see the thing you want to see? And if you've got a lot of Then steps, then maybe the code's doing too much, or maybe you're violating command-query separation by returning a value and also having side effects. Smells like these are what people might mean when they say you're getting design feedback from your tests; you gradually build an intuition for it.
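To show the shape of that reading order without the real gem, here's a toy sketch in plain Ruby. In actual rspec-given you'd write something like `Given(:stack) { Stack.new }`, `When { stack.push(:item) }`, `Then { stack.depth == 1 }`, and the gem builds the failure messages for you; everything below is invented scaffolding, not rspec-given's implementation:

```ruby
# NOT the real rspec-given: just enough plain Ruby to show how the three
# labeled phases read when each one is a single step. All names are invented.
class ToyGivenWhenThen
  def initialize
    @env = {}
  end

  def Given(name, &block)
    @env[name] = block.call
  end

  def When(&block)
    block.call(@env)
  end

  def Then(&block)
    block.call(@env) or raise "Then clause failed: #{block.source_location}"
  end
end

spec = ToyGivenWhenThen.new
spec.Given(:stack) { [] }
spec.When { |env| env[:stack].push(:item) }
spec.Then { |env| env[:stack].length == 1 }
```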
The next thing people hate about their tests is tests that are hard to read. Some people are fond of saying, "hey, test code is code," and I think the implication is that you should take it seriously. But in my mind, test code is untested code, which means it should be treated with derision and skepticism, and minimized. In particular, logic in tests confuses the story of what's being tested. By logic I mean ifs and elses, branches, loops, and so on. Test-scoped logic like that isn't only hard to read because it obfuscates the script we just talked about; any errors in it are also really easy to miss, because no one's testing the tests.

What I find is that the people most excited about testing are often the most eager to introduce test-scoped abstractions to salve their test pain. For instance, somebody might see an opportunity to DRY something up by looping over a data structure and generating test cases. Say this is the test they're looking at: some kind of Roman numeral kata, converting Roman numerals to Arabic, and yes, there is a lot of textual duplication. The temptation is: okay, let's put all of that in a data structure, loop over it, and generate a new test for each entry in the hash. Technically there's nothing wrong with this. The test will work; it'll give you good messages; there's nothing technically incorrect about what this person did. But I think they robbed themselves of an opportunity, because they salved the test pain without asking whether there's a root cause in the production code (which, granted, is much more important than the test code) that they could have used as design feedback instead. Because if this is what their production code looks like, a bunch of ifs and elses, and you look at the constants strewn through them, the data is hiding in there. I could pull that same (or a similar) set of keys out into a hash and simplify dramatically.
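A hypothetical sketch of that refactor: the numeral/value pairs move out of the conditionals and into a hash, so the conversion shrinks to a single loop and only a handful of cases need testing. The implementation details here are mine, for illustration:

```ruby
# Data-driven Roman-to-Arabic conversion: the pairs that were hiding in a
# chain of if/elsif branches become one hash the whole method is driven by.
ROMAN_VALUES = {
  "X" => 10, "IX" => 9, "V" => 5, "IV" => 4, "I" => 1
}.freeze

def to_arabic(roman)
  remainder = roman.dup
  total = 0
  ROMAN_VALUES.each do |numeral, value|
    while remainder.start_with?(numeral)
      total += value
      remainder = remainder.delete_prefix(numeral)
    end
  end
  total
end

to_arabic("XIV") # => 14
```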
Get the data out of that function. And now, because the data is all driven through that initial hash, I don't need to worry about quite as many test cases; I can just test a handful of them. I don't need 25 assertions.

On the reading front: Sandi Metz has a thing you might have seen her talk about called the squint test. I don't have anything quite so fancy, but when I'm looking at a test suite, I open up a bunch of tests at random, look around, and see how obvious it is to tell what the thing under test is. How are the methods under test organized? Are they in the same order as in the subject's source file? I like to use things like RSpec's `context` to indicate the logical branches underneath the method being tested: very orderly, very consistent. Most importantly, I always want to see arrange/act/assert really clearly in the test. Even if I'm using an xUnit-style tool, there's still no reason I can't be super consistent about those kinds of things.

The next thing about test structure: people hate tests that are too magic. (I think some tests struggle with not being magic enough, by being too repetitive, but it's important to keep in mind that all software is a balancing act.) Testing libraries and their APIs are no different; they vary dramatically in expressiveness. Some have very small APIs, so they require you to do more heavy lifting. Some have big APIs, so they allow for more expressive tests, but then there's more to learn. If you look at something like Minitest: I love Minitest because everything is a class, and we know classes; these are methods, and everyone knows methods; the assertions are all really straightforward. Now, of course, Ryan's a funny guy, and he's got some methods you can extend like `i_suck_and_my_tests_are_order_dependent!`, but in general it's a very, very small API, and that's its greatest strength.
RSpec, meanwhile, has a massive API. They've got `describe` and `context`, `subject` and `let`, `before`/`after`/`around` with `:each` and `:suite` and `:all`, `it` and `specify`, `should` and `expect` and `to` and `be_*`, and matchers and rules galore. There's a lot to RSpec, which allows you to write expressive stuff, but boy, it's a lot to learn. What I like about what Jim did with rspec-given: you've got `Given`, `When`, `Then`, and `Invariant`, and instead of having a big assertion library, he mostly does that heavy lifting through introspection, what you could call natural assertions. Unfortunately, it stands on top of all of Minitest or all of RSpec, so you're not actually obviating that complexity; you're just standing on top of it. All that to say: there isn't a right or wrong level of expressiveness in testing APIs, but keep the trade-off in mind. A small one is easier to learn, but you have to be on guard against complex test-scoped logic and one-off helpers creeping into your tests, because obviously you own those forever. A bigger test API like RSpec might increase the burden of ramping up on a project, even as it gets you much terser-looking tests.

The last thing about test structure people seem to dislike is accidental creativity. If I've learned only one thing in this whole journey, it's that consistency is worth its weight in gold. If we look at a test similar to one we saw earlier and open it up: I always look for the thing under test, and I always, always, always name it `subject`. Then I look at the thing I'm getting back from my act step, and I always, always, always name it `result` (or `results`). It's just that little bit of consistency: even if that test becomes huge and terrible, at least I can scroll through and know, aha, that's the thing under test.
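In practice the convention might look like this (the examples are invented): two unrelated tests that nevertheless read identically, because the thing under test is always `subject` and the act step's return value is always `result`:

```ruby
# The thing under test is always named `subject`; the act step's return value
# is always named `result`. Both examples are made up to show the convention.
require "minitest/autorun"

class UpcaserTest < Minitest::Test
  def test_call
    subject = "hello"

    result = subject.upcase

    assert_equal "HELLO", result
  end
end

class SummerTest < Minitest::Test
  def test_call
    subject = [1, 2, 3]

    result = subject.sum

    assert_equal 6, result
  end
end
```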
I can't do that in 90% of the projects that I open. When everything is mostly consistent, instances of inconsistency can actually convey meaning. If I'm looking at your test suite (okay, test A, B, C, D, great) and I stop and notice, hey, test C is different, that must mean there's something interesting about object C. So when I'm reading that test and that source code, my ears are going to perk up; I'm going to be a little more conscientious as I dive in and read it with more scrutiny, because there's probably something special about it. But when every single test is its own special snowflake, I have to bring that same level of scrutiny everywhere I go, and that's way more cognitive depletion as I work through your test suite. As a result, as a developer I'd much rather inherit a gigantic test suite of hundreds and hundreds of very consistent tests, even if they're crappy and mediocre, because they can be improved in broad-based ways, than a handful of handcrafted, artisanal, brilliant one-off tests. With those, after I make some improvement to test A, I'm starting completely from scratch on my comprehension (not to mention improvement) of tests B and C and D and so forth.

Readers also have this silly habit: they assume that all the code we write has meaning. That's actually not true at all of test code. A lot of the code we write in our tests is simply plumbing to facilitate the downstream behavior we're trying to assert. So I try to point out the meaningless stuff, so that my readers can laser-focus on the stuff that actually matters; I try to make unimportant bits of test code obviously meaningless to the reader. For instance, look at this case: we're creating a new author, and he's got a very realistic-looking name and phone and email, but it turns out none of that matters. So I'm going to change his name to "pants," eliminate the phone, and change his email to just "pantsmail," because that's all I really need to drive the assertion I'm after. What might have been a confusing situation (why do we need a valid author here?) turns out to be very simple string interpolation, and now, because the test is so minimal, anyone could probably implement the function just by looking at it. Test data should be minimal, but I also strive for minimally meaningful: don't make something meaningful in your test data unless it really needs to be.

Congratulations, we've talked a lot about test structure. Let's move on to talk about test isolation. The first thing here that I think people really mess up is not having a very clear focus in their test suites. When teams are defining what success looks like, most are just happy to ask the question, "is our stuff tested, yes or no?" If yes, I feel pretty good. But there are so many more nuanced, important questions, like: hey, is the purpose of every test readily apparent? Does the test suite promote consistency, so we can maintain it? And when we answer no here, people push back. The most common feedback is, "I have all kinds of different tests that I need to write, because of all these conditions; my system does a million things." But I think if you're really careful about it, you could probably identify three, four, five different types of tests that would cover 80% of what you need. Not every single test needs to be its own special snowflake. So what I encourage teams to do is actually create separate test suites; we can make as many folders in our system as we want, right? Create a separate suite, with its own directory, its own configuration, and its own set of conventions, for every type of test that you're going to write in your system. That way, each one is supremely consistent.
I actually did a whole talk on just that topic, called Breaking Up with Your Test Suite, with some strategies and thinking behind it. One way to visualize test suites is this thing from agile land called the testing pyramid. The short version: tests illustrated up at the top are more integrated, more realistic; tests at the bottom are less integrated, more like unit tests. Most test suites that I come across are one big, gigantic folder. I'll open up one test at random, and it calls through to other units; okay, fine. I'll open up another test at random, and it fakes out its relationships with maybe just some of the other units around it. Another test might hit the database but fake out third-party HTTP APIs. Another test might call through to those APIs but operate underneath the user interface. So they're all over the place, and that means that every time a pair is going to work on a test or write a new test, they have these low-value arguments: hey, should we mock that? Should we not mock that? And that is not a valuable discussion to keep having.

So instead, I just try to start with two suites, each approaching one extreme. I write one test suite at the top that's as integrated as I can possibly manage; that way, if I'm in an argument with my pair about whether to mock something, the answer is no, let's make it as realistic as we can. And similarly, I write another test suite that's as isolated as possible, so the answer is always to isolate everything from all of its collaborators, unless you get to the point that you're writing a pure function. Now it's really easy to know what to mock: everything. The bottom suite's job is to make sure each individual little file listing does exactly what it's supposed to do. The top suite's job is to ensure that when it's all plugged together, everything seems to work. And that seems to work pretty well for the first several months of any project.
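A sketch of those two extremes, with all class names invented: the isolated test always substitutes a fake collaborator, the integrated test always uses the real one, and neither ever has to debate the question.

```ruby
# The subject and its collaborator, invented for illustration.
class TaxCalculator
  def tax_for(subtotal)
    (subtotal * 0.1).round
  end
end

class Invoice
  def initialize(tax_calculator:)
    @tax_calculator = tax_calculator
  end

  def total(subtotal)
    subtotal + @tax_calculator.tax_for(subtotal)
  end
end

# Isolated suite: the answer to "should we fake this?" is always yes.
fake_calculator = Object.new
def fake_calculator.tax_for(_subtotal)
  5 # canned value; this test only cares about Invoice's own logic
end
isolated_total = Invoice.new(tax_calculator: fake_calculator).total(100)     # 105

# Integrated suite: the answer is always no; call through the real thing.
integrated_total = Invoice.new(tax_calculator: TaxCalculator.new).total(100) # 110
```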
Eventually, things get complicated enough that you might identify a middle tier where you'd like to have a test suite around, say, branching logic, because maybe things aren't so imperative anymore. And that's fine, as long as you agree up front on the norms of that test suite, because we have to be careful of things getting all squishy again (squishy: you know, squiggly lines). In this case, last year I was on an Ember team, and we agreed we were going to start writing a middle-tier type of test for the Ember components we were writing. All we agreed up front was: we're going to fake out all the APIs; we're not going to use any test doubles in these tests; we're going to trigger logical actions, not user-interface events; and we're going to verify app state, not what's rendered into the DOM. We could have gone either way on every single one of those questions, but because we agreed up front, things were much more consistent and much more manageable, and now everyone knows at a glance exactly what's happening in those tests.

Another thing people hate is test suites that are too realistic. I think teams get trapped here because it's almost as if we pose the question about fake stuff versus real stuff as, "well, how realistic should your tests be?" That's an unfair question, because of course anyone's going to say "maximally realistic, I guess; as real as possible; I want to make sure my stuff works." So somebody might be proud of their very realistic-looking test suite, because they've got a real browser opening up against a real server against a real database, and they feel like, yes, that is maximally realistic. But you can poke holes in that. You can say: well, does it, for example, test through your DNS configuration and make sure your DNS is working? Well, no, of course not. Or does it actually make sure that the cache invalidation strategy for your CDN works? And then they'll say, well, no, it doesn't test that far.
So what they're really saying is: yes, there is this boundary of real and fake, but it's very poorly defined, and we don't understand it very well. That gets you into trouble in situations where, say, something breaks, and somebody asks, "hey, why didn't we write a test for that?", and there's no good response other than "yeah, I guess we should have tested that." It gets teams into this trap: they write some tests; stuff blows up in production (there's no helping it; it's going to happen); and then they hold an after-action report, where they all sit around the table asking, why did this happen, how can we prevent it, never again, let's go back and increase the realism of all of our tests. Now, there's nothing wrong with realism per se, except that realistic tests are slower, because they have to do more stuff. They take more time to write and more time to change. They're harder to debug. There's a higher cognitive load, because they're doing more stuff, and they fail for more reasons, because there are more moving parts. So there's a real cost to increasing realism that we often don't acknowledge. Additionally, having really clear boundaries, wherever we set them, helps us increase focus on what's being tested and what's being controlled.

So that seems like a good idea. When I find teams that have really clear boundaries about what they're testing, who all agree up front, "we're going to be real about this and fake that," they write some tests, and stuff will still blow up in production, because there's no helping it. But at the end, they can stand up, have a backbone, and acknowledge: we agreed up front that testing for this class of issues was too expensive and wasn't worth the time; we can absorb the hit. Additionally, it's really hard to automate stuff that we didn't predict, and we did not predict this production failure, so it's likely we wouldn't have had a test for it even if we had been looking for it.
And finally, we might be able to write a targeted test of just the one thing we're worried about, without increasing the realism of all of our tests. We talk about realism in tests as if it's a universal good, but I don't think it's necessarily an ideal, because less integrated tests are useful too. They give us feedback about what our objects' APIs are like to work with (that design feedback again), and any time there's a failure, it's much more local, so it's easier to understand what went wrong, and we spend less time figuring out how to fix things and make them work.

The next thing about test isolation that people hate, even though they might not know the phrase, is redundant coverage. Suppose you're really proud of the fact that you've got a huge test suite testing everything: you've got browser tests, view tests, controller tests, model tests, and models that relate to other models, so you get incidental coverage from the fact that this model depends on that model, and when I test one I'm also getting some free coverage of the other. You're really proud of having very thorough coverage. But say you get a new requirement in that model down there, and you're a test-first team, so you write a failing test, then write the code to make it pass, and you think, great, let's push that up to continuous integration. (This slide is a guy in a construction hat, so I think it qualifies as meeting the construction theme here today; I rarely ever do stuff on-theme like that, so I'm proud.) Anyway, we push it to CI, and what we find is that all the other stuff just broke: the controller test depended on that model's former behavior, and so, it turns out, did the views and the browser tests, incidentally. Now, what might have taken me 30 minutes on Monday morning has me spending two and a half days cleaning up all the tests that broke incidentally.
it was thorough, but it was also highly redundant, and redundant coverage can really kill team morale, because in the early goings, when you can run all your tests locally and you do so really frequently, you catch this stuff; but as the suite gets too big to run frequently locally, eventually you push it up into the cloud and you're just cleaning up CI for days and days whenever you have a new feature. You can detect redundant coverage — it turns out there's a really cool tool we can use to detect it, and it's the same way we detect any code coverage. It's just that normally, because there are cool fancy colors, we all home in on that first column and we're like, aha, the coverage is low here, let's increase it. But the other columns are interesting too. It turns out that last column, if you sort it by hits per line, tells you things like: hey, look, that method at the top is hit 256 times when I run my tests — if I change that method, there are probably a whole lot of tests that are gonna break. So it's something that's worth looking for. Another approach, just heuristically, is to identify a clear set of layers to test through. So for instance, in this case we might really be honest with ourselves: maybe those view tests and those controller tests aren't adding a lot of value; maybe the browser test is enough, plus just testing at our model layer, eliminating entire classes of tests that might not be adding much value. We can also try our hand at what you might call outside-in test-driven development, where you work from the outside in, but every layer has test doubles that fake out the layers below it. Some people call that London-school TDD, or mockist, or you might have heard of the book GOOS, Growing Object-Oriented Software. I've lately started calling what I do in this area discovery testing. If you'd like to watch a free screencast series, there's one up at this URL — it's also on our blog — where I talk through how to do outside
in TDD, if you're interested. And I can't talk about mocking and outside-in stuff without talking about how people really hate test suites that have a lot of careless mocking in them. If you're not familiar with the term, a test double is a supertype of any kind of fake or stub or mock or spy that you might use to, like a stunt double, stand in for a real thing when you're writing a test. Also, like Jonah mentioned, I come from a company called Test Double — so we've got a company named Test Double, and we also maintain several test double libraries, so there's a little bit of brand confusion in this section of the talk. Additionally, I didn't want anyone else to grab the npm package testdouble, so I was like, all right, let's go make a test double library for JavaScript, so that's a thing now that's taking up most of my nights and weekends. And that means that when I'm at a conference, people are like, oh, well, clearly you're pro-mocking, you love mock objects, right? Well, "I hate mock objects because they suck on my project" is a conversation that I've had too many times, and I have to say, well, no, it's kind of more complicated than that, because I've got this very rigorous approach to using mock objects. The simplified version is: say I've got some subject that I want to write. I'll start with a test of it, and I'll imagine, what are the things that this thing might conceivably depend on? Well, maybe if I had A, and I could pass the results of A on to B, and then on to C, that would carve up the work for me nicely. They don't exist yet, so I start with fakes of those things, and then I use the test as a sounding board, and I'm actively listening for things like: what are the data contracts between these imaginary dependencies, how would the data flow between them? Then I get a passing test that just wires all those things together, and if there's any sort of awkwardness, it's really cheap to fix those A, B and C dependencies
because they don't even exist yet — so it's a really great way to get some design feedback early, in a cheap way. Most people, I don't think, have this nuanced relationship with mock objects. I think most people try to write a realistic test, and some of those dependencies actually exist and are wired up and easy to work with, but some are hard to work with, some are a pain to set up or instantiate, and so they just use mocks as a cudgel to, like, shut up B and C: I'll replace you with fakes because you're too hard to deal with, I just wanna get my test passing. They get the test passing, they're exhausted, they push it up — but really, at the end of the day, what they did treats the symptom of that test pain, but not the root cause: that they've got these bad relationships with their dependencies. It confuses future readers, because now they don't know what's really being tested, what's the value the person wanted to get out of their test. And it makes me really sad, because my brand is so unfortunately tied up in test doubles, and it gives mock objects a bad name. So I implore all of you: if you see somebody abuse a test double, please say something — feel free to hashtag it #MockYourMocks and I'll follow along. The last thing about test isolation: people seem to really hate when they don't have a clear story about how to integrate with their application frameworks. Frameworks are really useful — they provide repeatable solutions to common problems, right? But the most common problem a framework solves is: how do I get my app to talk to X other thing? It's solving integration concerns. And if you think of a framework like this, I think of the code in the middle as being my domain code — it's just stuff that's pertinent to the app that I'm writing — and the stuff in the thin candy shell is all the stuff that's coupled to the framework, mostly concerned with how that framework helps me integrate with other stuff: stuff like HTTP or email or
databases or job queues, like Sidekiq — we've got Mike here, right? Mike Perham — he maintains Sidekiq, and Sidekiq is a great job-queueing solution for Ruby, if you don't use it. I just saw Mike out of the corner of my eye and I wanted to give Mike a shout-out: Sidekiq rocks. All right, so we all have different levels of tangling with our frameworks. Maybe you have a moderate amount of code that's decoupled from your framework and a moderate amount that's coupled to it. Maybe you have a lot — maybe you use Rails, which is really coupled to the framework. Maybe you don't use a lot of frameworks and it's just kind of a little thin shell around it. But regardless, there's this dilemma, right? Because most frameworks focus on integration problems, and they also tend to offer us a lot of test helpers to help us write tests, but those test helpers are very integrated, because the framework is trying to solve integration concerns. So naturally the test helpers assume a high level of integration, which results in people who look to their framework for everything writing only integrated tests, and that leads to that realism problem we just talked about earlier, where all of your tests are too realistic, too slow. So if some code doesn't rely on a framework, your tests don't need to rely on the framework either — you can just write plain old Ruby objects and plain old tests, and things will go much more nicely. You might have one test suite that's aware of the framework and calls through all the framework stuff to make sure everything's working in an integrated fashion, but you might have another test suite that just focuses on your domain code, with no attachment or awareness or loading of the framework. Not only is it faster, but it gives you a much clearer focus on what you own versus what you don't. So that was a little bit about test isolation — good job, we're two-thirds of the way through this show. Test feedback is the last thing we're gonna talk about. I wanna start with useless error messages, cause it's
something that really sucks about living with a lot of test suites. So let's say, hypothetically, I just broke the build. Now what? For instance, whenever I break the build at Test Double, our branding changes from green to red — if you go to the website. It might be because it's April Fools' Day, but apparently all of our sites are red today; I found out in Slack this morning when I was trying to take a different screenshot. But seriously: now what? If I run that test that just failed up in Travis and I pull it down — let's say I'm running rake here — I wanna see that test failure locally. I look for the message to figure out what went wrong, and I see: okay, "failed assertion, no message given." That is not helpful. My workflow for solving this is now: I see the failure, I read the test, now I've gotta put in print statements or actively debug to figure out what the values are and what's going on, then I can change the code, then I can see the test pass, and finally I need to take a break, cause that just took me 20 minutes. That's not a good workflow. Now, you might brag about how fast your test suite is, but if you have really bad error messages when things go wrong, all that time you're saving in the speed of your test suite might be going into analysis whenever any test fails unexpectedly. A different test here — this is one that uses rspec-given, and it's still really terse, but because rspec-given has good messages in mind, when I run the test I'll get a better message. Even though I'm just saying user.name == "Sterling Archer" here, I can see in the error: oh, "Sterling Mallory Archer" didn't equal "Sterling Archer" — I got it, I see the comparison right next to each other. And what's cool is rspec-given will actually continue invoking both sides until it doesn't have any callables anymore, so you can see underneath there it's actually printed out the entire user object for me, and so now I might be able to diagnose the entire problem just by
looking at the message output. So now my workflow is really fast: I see the failure, change the code, and then I earn a big juicy promotion because I'm so much faster at my job. So judge assertion libraries — as well as how you're using assertion libraries — not just on how fancy or fluent or nifty the API is, but on the quality of the messages they're giving you. Another thing about feedback: slow feedback loops. 480 is a number that I think about a lot. It's the number of minutes in an eight-hour workday, so let's use that as a baseline for how much time we have every day. Say I take 15 seconds to change my code, it takes me five seconds to run a test, and it takes me 10 seconds to interpret what just happened and figure out what to do next. That's a real fast feedback loop — that's a Gary Bernhardt-speed feedback loop — that's 30 seconds, and it would give me an upper bound of 960 useful thoughts I'm allowed to have in a day, which isn't a lot. Now, most of us, like me, have a lot of non-code responsibilities at work, so factor in some for that, factor in the context switching back and forth, and that would be a 60-second loop, or 480 actions per day. And if this is our baseline, that would give me enough time for like two hours of meetings and email and talking to humans and stuff. Now, where things fly off the rails is: let's say instead of five seconds, my app is getting bigger and now it takes me 30 seconds to run my tests locally. Well, that increases my feedback loop up to 85 seconds and drops me down to 338 actions a day. But unfortunately my email inbox doesn't care how fast my tests are, so I've got to increase the amount in my feedback loop for non-coding activities — really it's more like 91 seconds now. Another thing — we just talked about bad messages, right? If your messages are really bad, maybe instead of 10 seconds it takes you a minute to figure out what to do next, so you're looking at like 155 seconds — that's only 185 actions per day
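The arithmetic in this stretch of the talk is easy to sanity-check. Here's a minimal Ruby sketch of it — the method name and the truncating-division model are my framing, not something from the talk:

```ruby
# Back-of-the-envelope model of the feedback-loop math above.
# 480 minutes in an eight-hour workday; each "action" is one
# change-run-interpret loop. Integer division truncates, which
# matches the figures quoted in the talk.
MINUTES_PER_DAY = 480

def actions_per_day(loop_seconds)
  (MINUTES_PER_DAY * 60) / loop_seconds
end

actions_per_day(30)   # => 960  (Gary Bernhardt speed)
actions_per_day(60)   # => 480  (plus context switching)
actions_per_day(85)   # => 338  (30-second test runs)
actions_per_day(155)  # => 185  (plus bad failure messages)
```

The later figures fall out of the same formula: a 422-second loop gives 68 actions per day, and an 11-minute (660-second) loop gives 43.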
These little things are starting to add up; they have big impact. I've been on projects, too, where it took me four minutes as a baseline to run an empty Cucumber test, because there were so many factories and so much data setup, and so that feedback loop was super duper slow: 422 seconds. I could only, ideally, run that test 68 times per day. But in fact, if you're waiting four minutes at your terminal, you're gonna get distracted by email or Reddit or Hacker News or Twitter, and so inevitably this gets bigger, because I come back and, like, shit, my tests have been done for three minutes. So you gotta chalk that up to distraction, and that's like an 11-minute feedback loop, and so then I have, at best, 43 actions per day — and that's if I don't even factor in the fact that my brain has been rotting from just how frustrating a job that is. And 43, as we can all see here, is a much smaller number than 480, so this stuff is really significant. In fact, this is really special, because we just found it together: we found the ever-elusive 10x developer. So whether you believe in a 10x developer or not, you'd be surprised how a few seconds here and there really start to add up. I encourage everyone to pull out a stopwatch and think about where your time is going: what does your feedback loop look like, and what can you do to optimize it? And if you're in a system that is just too far gone, and stuff is too slow, consider spiking out new stories — new stuff that you have to do — off to the side, and then integrating it later with your app, so you can at least be productive when you're creating new stuff, even if it's going to be slow once it's back in and integrated. Another thing people hate about their tests: painful data. Controlling test data is really hard. There's kind of a spectrum of how much control over your test data you might have: you could use inline models, where you're just creating instances by hand in all of your tests; you could use
something like Rails-provided fixtures; or you could use a data dump, like a carefully created SQL file that you load at the beginning of your tests; or you could use what might be euphemistically called self-priming tests, where you don't have control of the test data, and if you want to test an API you have to first use the API to go and create all the stuff that you want to test, and then tear it down manually at the end. Some people are under the impression that you can only have one test data strategy in your entire project, but it might make sense for different test suites to have different test data strategies. So for instance, you might use inline for models, you might use fixtures for integration tests, data dumps are great for smoke tests if you want those to be fast, and then self-priming is really the only game in town if you're trying to write tests against staging or production environments, where you shouldn't have direct access to the database. In slow test suites, I find data setup is usually the biggest contributor to that slowness, and it's usually the first place I look — I don't have any proof of that, but it felt true, so I put it on a slide. It's always the first thing that I profile, and I'm very eager to rip it out — like, if I'm using factory_girl, rip it out and start over if I can. Not to pick on factory_girl necessarily, but it's really, really difficult, once you have a ton of tests, to completely change the test data setup, because it's so tightly coupled to the tests themselves. All right, let's talk about something J.B.
Rainsberger coined: super-linear build slowdown. Our intuition betrays us. We have this intuition that says: if I've got one test and it takes this long, 25 tests are gonna take 25 times as long, 50 tests are gonna take 50 times as long. The reason our intuition tells us this is that if we've got a five-second test, we assume that's five seconds running that test. But in fact we've got time spent in app code, and we have time spent in data setup and teardown: maybe it's spending two seconds actually exercising the application code, two seconds in setup and teardown, and only one second in the test. What this counterintuitive aspect means is that as our system gets a little bit bigger, as we add more tests, the app is getting more complicated, there's more interaction of features, so the app-code part is actually a little slower, as is the setup and teardown — we have more fixtures, we have more factories, we've got more stuff going on — so every test is getting marginally slower as we add new ones. And so our intuition says we should be way along that green line, but reality starts to deviate even relatively early on. At 25 tests, maybe we start to see: oh yeah, we're spending about four seconds in app code, six seconds in setup and teardown. And it's not uncommon at all for me to talk to a team who's been working for a year and they're like, yeah, the build's about 30 minutes, and that really stinks, we don't know why, but it's still manageable or whatever. But if we go up to 50 tests here, maybe things get a little bit slower — there's more interaction — and very often just one or two little commits will make everything a lot slower, and we won't catch it, because we're so carefully acclimated to these tests, and then you just see things fly off the rails; it's like a geometric curve going up. So that same client that called me and said, oh, it's about 30 minutes? Not uncommon at all:
like, if it takes a year to get you to a 30-minute test suite, three months later you're at a three-hour test suite, and everything in your organization has come to a crawl — because in the first 25 tests maybe we only deviated from our expectation by 150 seconds, but by the end it's 500 seconds in the second 25 tests. This is a really common phenomenon that I see over and over again. So in general, when I'm writing a new story for any new application, I avoid the urge to just create some new CRUD integration tests to go along with it. I try to find a way to exercise that new feature with an existing suite, or with an existing test snaking through the system, wherever possible, so I'm not slowing everything down. Another thing you can do is set a budget for your tests. You might say: hey, we're gonna allow no more than five minutes of runtime for our entire test suite, and once we get up to five minutes we have to stop — we either have to make the tests faster, make the system faster, or throw out a test before we can write a new one. The last thing we're gonna talk about is false negatives. If you ask the question, what does it mean when a build fails, people will say: well, it means the code's broken, right? Especially our managers — they probably all think that. But that's not really quite true, because what file needs to change to fix the build? Well, more often than not, we have to update a test to fix it, so in that case it was a test that was broken, not the code. And this gets to this distinction between true and false negatives. A true negative: when you have a red build, it means the code is broken, and the fix is to go fix the code — that's great. A false negative: red means we forgot to update a test somewhere, so essentially our work was unfinished, and the fix is to go update the tests. True negatives reinforce the value of our tests; they make us feel good. Unfortunately, they are depressingly rare — I can
only think of a few times in the last six months that a test actually caught a bug for me, so that's a bit of a bummer. False negatives erode our confidence in our tests; they are what make tests — especially slow test suites — feel like a chore; they bum teams out. The top causes I've found for false-negative test failures: you've got a lot of redundant coverage, so you have unexpected failures; or you have a lot of slow tests and you're not able to run them all locally — i.e., you have a lot of integration tests. So if you've been zoning in and out of this now-way-too-long talk (sorry about that), the TL;DR is: write less integration tests — write fewer integration tests — and you'll be a lot happier. One thing I do is track whether each build failure is a true or a false negative, and how long it takes to fix it. This is an interesting exercise: if this is a new concept to you, it might be interesting to look back at the last couple of weeks of failures and see how many were true negatives and genuinely valuable, and how many were false negatives and just waste. Because that waste can be used to analyze, like: hey, we wasted 40 man-hours last week; we could use that same amount of time to invest in broad-based improvements to our tests, to reduce redundant coverage or reduce unnecessary integrated tests. So congrats, we got through all three of these sections. This talk is a little bit of a downer, because it's about 15 things people hate, but remember: no matter how bad your tests are, I probably hate AppleWorks more than you hate your tests. This has been a real pain in the ass, but I'm really glad I got to share all this with you today. If your team is hiring — I know everyone is always hiring, and they're always looking for senior, awesome developers, and it's hard to find them — Test Double is an agency, and we would love for you to hire us instead. We work with existing teams, we
love working on big gnarly apps alongside engineers, helping each other get better as we go. If you're interested in helping improve how the world writes software, consider joining Test Double. I'm gonna be around all day — we've got Dustin Tinney here and Katie Miller, also from Test Double — and we'd love to talk to you. Most importantly, thank you for sharing all this time with me. I really appreciate it. Thank you, Justin.