be talking about what good test coverage is and why it matters. So I'm going to start with why this talk. Before a couple of years ago, I did a lot of Java development and a lot of PHP development, and I'd never written a test in my life. I think maybe once I had a college assignment where I wrote assert true equals true and thought, this is dumb. But then I joined Pivotal Labs. Pivotal Labs is, you know, big on test-driven development and testing, and I got immersed in this world of testing code. And I found it really interesting. I love testing. I love TDD. I love to argue about it and talk about it and try to convince people it's a good idea.

So I was surprised a few months ago when I read a blog post with this title: "TDD is dead. Long live testing." I'm sure many of you read it. DHH wrote it, and he gave a very lovely talk at RailsConf around it. And it interested me. The blog post, to be honest, is a little bit of a mess; there are a lot of different points mixed in there about TDD and about test practices that don't all fall under that title. But there is one quote I pulled from it that really piqued my interest, which I'll read: "I test Active Record models directly, letting them hit the database, and through the use of fixtures. Then layered on top is currently a set of controller tests. But I'd much rather replace those with even higher level system tests through Capybara or similar." Now, you'll notice this doesn't really have a lot to do with TDD being dead or long live testing. But it is an interesting point.

Before I go any further, I wouldn't be a good Pivot if I didn't defend TDD for a quick moment. So here's why I like TDD. I think a lot of the time it helps you develop faster. It promotes good design practices, which is something I know DHH would disagree with me about. It promotes YAGNI: it keeps you focused on building exactly what you need, nothing more, nothing less. It helps promote the single responsibility principle: if testing is difficult because I have five things going on in a single object, it helps me find those kinds of smells. And I think the best part for me is that it promotes testing to first-class status: because I'm writing tests first, I'm building up test coverage as I go. But it does bring up a good question, which is: am I getting good test coverage from the tests I write when I do TDD? And we'll come back to that.

After that blog post was published, Uncle Bob had a great response, which I'll also read a quote from: "If you have a test suite that you trust so much that you are willing to deploy the system based solely on those tests passing, and if that test suite can be executed in seconds or minutes, then you can quickly and easily clean the code without fear." It's kind of a run-on sentence, but let's distill it down to its two main points, framed as questions you can think about. Does my test suite give me confidence to deploy the system? If my tests are passing, can I say "ship it"? And is my test suite solid enough, fast enough, and reliable enough to let me refactor, change, modify, and add to the code?

So with that in mind, I'm going to start talking about test coverage, beginning with the traditional definition. There are a few different ways of measuring coverage.
And when I say coverage, I mean, you know, you have a tool that says you have 85% coverage or 40% coverage. There are different ways to measure that, so I'm going to give a quick introduction.

Here's a piece of code: a "do I need groceries" method. I'm a bachelor, so as long as my fridge and my pantry both have food, I don't need groceries. So I say: if either of them is empty, I get groceries; otherwise, I don't. Function coverage says: I write a test, I call do I need groceries, I pass some arguments, I assert. I have called the function, therefore function coverage has been met.

Now we can get a little more specific. Here's another piece of code. It returns the numeric day of the week, and it takes a flag that says whether or not I want today's date. Statement coverage says: I write a test, I call numeric day, I assert on the result, and I've covered every single statement inside this block of code. But you'll notice there's a huge problem with this, which is that if I pass false to this method, it will blow up, because the date is uninitialized. And yet statement coverage has been met.

Now back to the "do I need groceries" method, and branch coverage. You're familiar with the method now; here are my tests. In one case I pass empty and full and I get true; then I pass full and full and I get false. Going back to the code, I've covered both branches of execution, so branch coverage has been met.

Same piece of code: condition coverage is another type of measure. What condition coverage says is that each side of my Boolean expression has evaluated to both true and false. So in this case I pass empty and full, and I pass full and empty, and I've satisfied condition coverage. But you'll notice I don't actually have a test that returns false.

Further down we get condition/decision coverage, which is a combination of branch and condition coverage. Now we have roughly what you'd probably write if you were going to test this: the empty/full, the full/empty, and the full/full cases. You can go deeper still. There's modified condition/decision coverage, which says that every condition should be shown to affect the result on its own, and there's multiple condition coverage, which is basically sending in a whole truth table. You see those more in safety-critical applications like aviation.

So now we have a basic understanding of how our tools measure coverage. In Ruby, most of the coverage tools measure line coverage, also known as C0 coverage, which is more or less statement coverage. There are some differences there, but I'm not going to get into them.
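The slide code isn't captured in the transcript, but a minimal Ruby sketch of the kind of methods and tests being described might look like this (the names and details here are assumptions, not the speaker's actual slides):

    require "date"
    require "rspec/autorun"

    # Sketch of the two methods from the slides (details assumed).
    def do_i_need_groceries?(fridge, pantry)
      fridge == :empty || pantry == :empty
    end

    def numeric_day(use_today)
      date = Date.today if use_today
      date.wday # when use_today is false, date is nil and this blows up
    end

    RSpec.describe "coverage measures" do
      # Statement (line) coverage: this one call executes every line of
      # numeric_day, even though numeric_day(false) would raise on nil.
      it "exercises every statement of numeric_day" do
        expect(numeric_day(true)).to be_between(0, 6)
      end

      # Branch coverage: the decision evaluates to both true and false.
      it "exercises both branches of do_i_need_groceries?" do
        expect(do_i_need_groceries?(:empty, :full)).to be true
        expect(do_i_need_groceries?(:full, :full)).to be false
      end

      # Condition coverage (ignoring short-circuit evaluation): each operand
      # of the || takes the value true in one test and false in the other,
      # yet neither test ever produces a false result overall.
      it "exercises both conditions without a false outcome" do
        expect(do_i_need_groceries?(:empty, :full)).to be true
        expect(do_i_need_groceries?(:full, :empty)).to be true
      end
    end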
But let's talk about the downsides of measuring coverage this way. First, not all tools are created equal. As I just said, Ruby is lacking when it comes to the deeper condition/decision coverage and those more meaningful metrics; it mostly just tells you whether every line has been executed.

Second, missing code. In this example I have a little block of code: I test a variable and say, if it's foo, do some action; if it's bar, do some different action; otherwise raise. But what if there's an extremely valid case where that variable could have been assigned baz? Test coverage tools can only tell you how covered the code you've written is; they can't tell you how covered the code you haven't written is. This is also known as a fault of omission, and it will be missed by your coverage tool.

Third, assertions. The first test here calls do something and asserts the result is true. A test coverage tool will tell you that you get exactly the same amount of coverage from that test as from a second test with no assertion at all. So coverage tools can't tell you that you've actually tested the behavior; they can only say, yep, these lines of code were executed.

Fourth, coverage goals are dangerous. Managers like to do this a lot: if something gives you a number, they'll set a target. So if a tool reports percentage coverage, they'll say we should hit 85% coverage. And what inevitably happens with a team that's told to have 85% coverage is that they get 85% coverage. No matter how well-intentioned the development team is, they'll run into situations where testing is difficult, and they might test the easy things that get them up to the percentage, and the number becomes a crutch: 85% coverage means I'm done. But if my team hits 85% coverage and that was the goal, it doesn't mean the other 15% of the code was unimportant. It just means the metric is driving the behavior of the team. So be careful about setting goals with metrics like that.

Now, I've spent a lot of time dissing coverage metrics, but they actually are useful. They can help you find weaknesses in your test suite. For example, if the tool tells me this section of code has 40% coverage, then if the tool is smart enough to tell me my coverage is bad, my coverage probably is bad. But the reverse isn't true: if my coverage tool tells me I have 100% coverage, I could still have a suite with no assertions in it. So I can't trust the good numbers, but I can use the numbers as a guide to tell me where the problem areas are and which areas need more coverage.

Metrics don't tell the whole story, though. This whole idea that 85% coverage is fine is just a tiny piece of the story. And this goes back to what DHH said that interested me, where he said he wants to see more Capybara tests. I disagree with that, and I'm going to talk about why.

So let's talk about types of tests. For those of you who write tests, this is going to be a little bit boring, but bear with me. The most basic type of test is a unit test. The green block here represents a unit of code: it's got some inputs, it's got some outputs, and I test just that block. If my unit has dependencies, I stub or mock them out; I do not allow that code to communicate externally. I just test the unit that is under test. Unit tests are fast to run and fast to debug, because I have a distinct unit of code under test, so if the test fails I know exactly where to look. They cover a small logical subset of functionality, which is what makes them good documentation. It's great to come in to modify an object and have a nice unit test suite that lets me see what it's doing and helps me figure out where to add what I want.
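To make that concrete, here's a minimal RSpec sketch of a unit test where the dependency is replaced with a test double (the GroceryList class and its fridge collaborator are made-up names, just for illustration):

    require "rspec/autorun"

    # A made-up object with a single collaborator, for illustration only.
    class GroceryList
      def initialize(fridge:)
        @fridge = fridge
      end

      def items
        @fridge.empty? ? ["milk", "eggs"] : []
      end
    end

    RSpec.describe GroceryList do
      # The real fridge is stubbed out, so nothing outside this unit runs:
      # the test is fast, and a failure can only point at GroceryList itself.
      it "wants groceries when the fridge is empty" do
        fridge = double("fridge", empty?: true)
        expect(GroceryList.new(fridge: fridge).items).to include("milk")
      end

      it "wants nothing when the fridge is stocked" do
        fridge = double("fridge", empty?: false)
        expect(GroceryList.new(fridge: fridge).items).to be_empty
      end
    end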
So then we step out a little bit and we have integration tests. This is that same code from earlier, but now I'm not going to stub or mock my dependencies; I'm testing the integration of my unit of code with its dependencies. Integration tests are generally slower to run, because there's more going on: if the dependency happens to be MySQL, then MySQL is going to be a bottleneck. They're harder to debug than unit tests, because there are more things that could go wrong along the way: if a test fails, it could be my code, it could be my dependency, or it could be the integration between the two. But that's not to say they're not useful. They're extremely useful. They verify that things talk to each other, and pretty much anything you write is going to have integrations between pieces of code, so I want to verify that those work. They're also really good at defending against regressions. I could have a developer who maybe didn't run the whole test suite: he changed his little unit of code and pushed it, but the integration tests will run and tell me, oh, you actually broke something, because somebody else depended on you doing something one way and now you're doing it a different way.

And I realize I ignored cost. I'm going to call unit tests low cost, because they're fast to run and fast to debug. Integration tests, because there's a little more going on, I'm going to call medium cost.

And then we get to the big one: acceptance tests. This is what DHH wants to see. These are also known as system tests or functional tests; I'm going to call them acceptance tests. Here is my whole system. If this were a Rails app, this would be my models, my views, my controllers, with some MySQL and some Redis in there, and I'm going to test the whole system as one. Acceptance tests are usually slow to run; I have yet to find an acceptance test that runs as fast as a unit test. They are by far the hardest to debug: for any single line that fails, there could be eleven things that failed underneath the hood. Sometimes I get useful data back that says, hey, undefined method, and sometimes it's just, hey, this thing didn't show up on the page, good luck. But they are useful. They verify that business requirements are met. Acceptance tests are like having a bunch of QA people in your automated test suite: they run through and click on things and say, yep, the user can pay for the service, or the user can create a blog post. And they provide the most feedback about regressions; nothing is going to be better than actually running the system and making sure it all works together. But because of their speed and their difficulty to work with, I'm going to call them high cost.
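For a rough idea of what those two levels look like in a Rails app, here's a sketch; the User model, the with_outstanding_balance scope, and the /billing page are all hypothetical names, and these specs assume rspec-rails and Capybara are already set up:

    # Integration-style test: the model really talks to the database, so it
    # verifies that my code and the database work together (medium cost).
    RSpec.describe User, type: :model do
      it "finds users who still owe money" do
        User.create!(name: "Arthur", balance_cents: 500)

        expect(User.with_outstanding_balance).to include(User.find_by(name: "Arthur"))
      end
    end

    # Acceptance test: Capybara drives the whole stack the way a user would
    # (high cost: slowest to run and hardest to debug when it fails).
    RSpec.describe "Paying for the service", type: :feature do
      it "lets a user pay" do
        visit "/billing"
        fill_in "Card number", with: "4242 4242 4242 4242"
        click_button "Pay"

        expect(page).to have_content("Payment received")
      end
    end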
Before I put it all together, I want to talk quickly about risk assessment. How do I find high-risk code? The obvious high-risk code is any code that could cause bodily injury, economic loss, data loss, anything irreversible. That's extremely high risk; there's no question about it. But there are two other things I would include in high-risk code. One is high-traffic code. If I have a system that requires my user to be logged in, the login page is extremely high traffic, and any bug, no matter how small, will affect every user along the way. The effects of a small bug get multiplied by the high traffic. The other is very complex code. If I have some code that, say, pulls in eight lists of users, collates them, and then decorates them with more data, and there's all this logic going on because the domain is fundamentally complex, it's high risk: if another developer comes in to fix it, there's a high likelihood they're going to break it, because they need to absorb the entire context first.

So when we put all of this together, how do we use the evaluation of risk and the evaluation of cost? For high-risk code, I generally want coverage at any cost. That's not to say I should just throw high-cost tests at the problem, but I should be less concerned about having high-cost tests covering high-risk code. For low-risk code, I definitely want low-cost coverage. If I have to debug an acceptance test that verifies the admin has a link to the admin page in the header, well, that's a low-risk feature, and I don't want complicated tests that I have to maintain covering it. And then there's code that lives in the middle, where I want a balance.

The purpose of this talk is that I want you to be able to go back, when this conference is over and the hangovers are cured and you're back at work on Monday, and ask yourself these questions. If you're not using test coverage metrics, could they be useful? There are a lot of good tools out there, and with the caveats I mentioned, I think they can be valuable, so if you're not using them, I'd recommend considering it. If you are using them, are you using them correctly? Are you focusing on the things with low coverage, and are you still evaluating the things with high coverage instead of trusting the metrics to give you the answer? Have you evaluated what the highest-risk code is? I'm sure you know some of the high-risk stuff intuitively, like: can the user pay for my product? That's high risk because it's extremely important. But have you covered all the high-traffic cases? Do you find yourself stuck maintaining high-cost tests? I've been stuck spending two days figuring out why acceptance tests flake out on CI and why the developers have to run the suite three times to get a green build, and days and days get lost to that. Is the time I'm spending doing that worth it? Is the test that's flaking out testing something that's really not that important? Because if I spend four days fixing a test for something minor, that's not an efficient use of time. And with that, can I move my test coverage away from the higher-cost tests toward some of the lower-cost tests? Can I get the highest return on investment for the time I'm putting into my test coverage? To put it all together, the way I feel about it emotionally is: do I feel happy with my test coverage? Testing is supposed to make our lives easier, not harder.
And, going back to what Uncle Bob said, is it giving me the confidence to deploy things and to change things? Is it making my life easier? Those are the kinds of things I want you to think about. And if the result of this talk is that some of you delete a few unnecessary acceptance tests and write some more tests around things you realize are higher risk than you thought, then my mission today is done. So with that, I just want to say thank you for having me. Thanks to Stefan for organizing this event. Thanks to everybody for being here. And I'll take some questions.

[Audience question.] Yeah, what I would say is, oh, sorry, it's kind of a long question, so I'm going to try to restate it. Basically, the question was: in web development, the integrations are the most complex piece, and acceptance tests seem to be the best way of making sure those integrations work together, so what are my thoughts on the idea that, because my integrations are hard, I need more acceptance tests? I agree with that, but I would say that full-on Capybara tests are sometimes overkill. Ask: what are the integrations, and can I test them in smaller pieces? Rather than having these big test suites that click around on the app, if I'm worried about how my controller and my database communicate, then maybe I write controller tests that talk to the database, and those are integration tests. And I was going to bring that up anyway: if you write controller tests that talk to the database, that's not a unit test; a lot of people seem to have that misconception. The database is a dependency. But if that's the complex part, then that's what I would test. Now, I've seen projects, and this is a question I don't have an answer to, where there's a heavy JavaScript front end talking to an API, and it's very hard to test just those pieces, so a lot of times we depend on acceptance tests. I would say, if that's the only way you can come up with, then yes, go for it; that's how you're going to test that integration. But if you can lock down the API, create fixtures, and test the JavaScript alone against the API of your backend, that's a more efficient use of your time, because now you have tests that run faster on both ends and you don't have these big Capybara tests along the way.

[Audience question about coverage tools for JavaScript.] That's a good question. I actually don't know any good coverage tools for JavaScript; I'm mostly familiar with the ones for Ruby. There are a few services out there like Code Climate, and a lot of them use SimpleCov under the hood, which is a gem. I'm not too familiar with what's available for JavaScript, but if there isn't anything, I recommend you write it, because I think we could use more test coverage tools, especially ones that can do something more. A lot of the Ruby tools are limited by the fact that they do statement coverage and line coverage, and I think we are lacking in Ruby in particular for branch coverage and condition coverage.
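For reference, wiring SimpleCov into a Ruby test suite is only a few lines at the top of the test helper; a minimal sketch (the filter paths are just examples):

    # spec/spec_helper.rb -- SimpleCov must start before the app code loads
    require "simplecov"

    SimpleCov.start do
      add_filter "/spec/"    # don't count the tests themselves
      add_filter "/config/"
    end

    # ...then require the application and the rest of the test setup.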
[Audience question about resources for learning testing.] Testing in general or test coverage? Testing in general — that's such a wide topic. If you Google "Ruby testing," you'll find a million good resources; Google pretty much helped me research this talk. There are a lot of good blog posts out there. When I first learned Rails and started to learn the TDD process, there were a couple of great tutorials that are very heavy on the TDD style of Rails development, and those really helped me get started. But if you're just starting to test, I would say, don't worry about any of this. Just start testing. Just get out there and add test coverage to your suite. It's really only when you have a lot of test coverage, and you find yourselves getting bogged down in testing your app, that you need to start considering this risk-versus-cost analysis.

[Audience question.] The question was: have I worked with third-party developers and had to get them to add test coverage? I have. The approach I would take, if you're starting with code that has no coverage, is to start with the high-risk code, so first evaluate what the highest risks are. First of all, they have to buy into the fact that testing is good; if they're not going to buy into that, I'm not going to get anywhere. But if they've bought in — hey, we want to add tests to this code — then you want to find the high-risk code first and test it at almost any cost. There's a great book about working with legacy code, and in it the author talks about how, when you have a giant piece of legacy code that you have to start adding tests to, you start by adding the big, broad, important tests. So if I'm taking over an app that lets people pay for a service, that's probably the most important part, so I add a test: can I pay for the service? And then, can I use the service? If it's a blogging service, can I write a blog post? You start by adding those big tests, and then, as you go further along, you can start adding the lower-risk and lower-cost tests along the way.

All right, I think that's my time. Thank you very much.