Our next talk is going to be Michael Lynch. He's going to give us a great talk about why good developers write bad tests. Let's give him a round of applause. Hello, everyone. Welcome. Thank you for coming. My name is Michael Lynch, and this talk is called "Why Good Developers Write Bad Tests." So there's the answer. My whole talk is just these two bullet points, so for the next 30 minutes we can take a nap or something. Now, you probably need some more convincing. All of you in this room are good developers, because you're here at PyTexas. You're expanding your skills. You're willing to hear a talk from a new speaker who, right off the bat, accuses you of writing bad tests. A problem that a lot of us have is that we learn good practices from production code. We learn to refactor our code to eliminate redundancy. We learn to write small, tightly scoped functions. We've been told these are good practices, and in fact we're told they're the best practices. So when it comes time to write tests, we don't really question them, because why would you question the best practices? We just keep using them. But in a lot of cases, applying these best practices from production code actually leads to worse test code. Throughout this talk, I want to give you a few examples of how that can happen. Before we go any further, just to clarify the semantics: I'm going to be talking mostly about unit tests. A lot of these techniques apply to various kinds of testing. They apply best to unit tests, but they apply to others in general. We're going to keep talking about unit tests just to keep things concrete. Unit tests, as Andy talked about yesterday, are tests at the smallest level of granularity: you're testing that a particular function is correct. So if we had a function called fahrenheit_to_celsius, here's a simple unit test: you pass it 212 degrees Fahrenheit, and we expect that the answer is 100 degrees Celsius.
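A minimal sketch of the test just described, assuming a simple conversion function (the exact code from the slide isn't reproduced here):

```python
import unittest


def fahrenheit_to_celsius(temperature):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temperature - 32) * 5.0 / 9.0


class FahrenheitToCelsiusTest(unittest.TestCase):
    def test_boiling_point_of_water(self):
        # 212 degrees Fahrenheit is the boiling point of water: 100 Celsius.
        self.assertEqual(100, fahrenheit_to_celsius(212))
```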
And so if this is ever not true, it means that something drastic has happened in the world of standard measurements, or, more likely, you broke your code and you need to fix it. Here's an example of an anti-pattern I see a lot with people who are really good at writing production code but not so experienced at writing test code. We're just going to walk through it. We're testing this class called AccountManager. We call self.account_manager.get_score for the user Joe123, we get the initial score, and we make sure the initial score is 150. There are two big problems with this right off the bat. First, where did this Joe123 account come from? Why does the account manager have a Joe123? Is it just hard-coded? Does every single account manager have a Joe123? And similarly, why is Joe123's score 150? Where is that coming from? A lot of times when I see a test like this, the developer says, oh, the answer's in the setup. So you scroll up to the setup. If you're not familiar, setUp is the method that the unit test framework calls automatically before every test. In our setup, we're creating a mock database. It's got a row for Joe123 with a score of 150, and it assigns that to the account manager. The developer might say, okay, all the answers are in the setup, so this is a good test, because everything you need is in the setup and in the test function itself. I think this is actually a bad test, and throughout this talk I'm going to explain why. To understand that, you have to see that test code isn't like other code. When we think about production code, it's too complex to read all at once. If every application were 10 lines of code, we wouldn't need things like functions or classes or modules. You could just write those 10 lines and read them top to bottom. We invented all of these techniques, things like functions and classes, to break large bodies of code into small, consumable chunks. And so good production code is well-factored.
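The anti-pattern walked through above might look something like this sketch, with a toy AccountManager standing in for the real class (the names and interface here are assumptions, not the actual slide code):

```python
import unittest
from unittest import mock


class AccountManager:
    """Toy stand-in for the class under test."""

    def __init__(self, database):
        self._database = database

    def get_score(self, user_id):
        return self._database.get_row(user_id)["score"]


class AccountManagerTest(unittest.TestCase):
    def setUp(self):
        # The critical values live up here, far away from the test itself.
        mock_database = mock.Mock()
        mock_database.get_row.return_value = {"user_id": "joe123", "score": 150}
        self.account_manager = AccountManager(mock_database)

    def test_get_score(self):
        # Why joe123? Why 150? The reader has to scroll up to find out.
        initial_score = self.account_manager.get_score("joe123")
        self.assertEqual(150, initial_score)
```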
It allows us to think about very complex pieces of code in small, narrowly scoped pieces. The components are in logical chunks. We break things into layers of abstraction where we can understand each thing at its own layer. Test code is very different. If you think about a unit test function, it basically is its own tiny application. It's very simple. It usually just starts up, calls one function, and exits, but it's its own application. So in the case of test code, it is simple enough to read top to bottom. Most unit tests you could write in just 10 lines of code, so you don't need things like classes and functions to break up the complexity. The other thing that's different about tests is that developers often read tests in isolation. If you read a production function, a method on a class, you often have to read other parts of that class to understand it. You have to read the member variables. You have to read the other functions it calls. But if a unit test fails, often reading that unit test and understanding that single function is enough to understand why it failed. You generally don't have to read other functions in the test suite to understand what went wrong. And lastly, tests must be correct by inspection. We have confidence that our production code is correct because we have tests that exercise it. You generally don't have tests for your tests, because you would just end up in an infinite regress where you're writing tests for everything else you're writing. So for tests, the last line of defense is the fact that you can read them and reason about them. Good test code maximizes obviousness. It makes it so that you can quickly jump into a failing unit test and understand quickly what it's asserting and why it's currently failing. And it should minimize cognitive load.
The more complex you make your unit test functions, the harder it is for developers to reason about them, and the easier it is for them to overlook bugs in the test function. So this, I don't know if this comes up on the projector. I took this photo in my basement. I've got a radon remediation system in my house that sucks air from below the floor of my basement out above the roof of the house, to make sure I don't die of radon poisoning. And it's got a very interesting unit test: there's a glass tube filled with liquid, and the tube is connected by a straw into the vacuum tube. I know that the vacuum tube is creating suction because it's sucking the liquid up to this four mark there. This is a very interesting unit test for the physical world, because it's very obvious when it fails. If it's not reading four, if, for example, I pull out the straw so that it's not getting suction, the water recedes to the zero level. So it's very hard for there to be a false negative here. It's very hard for the level to read four when there isn't suction. This is a very good unit test in that it's very clear, and it's very hard for something to go wrong without me seeing it. We see this a lot in the physical world. If you've ever done hobby electronics, you probably recognize the image in the center as a multimeter. It's a very simple tool, just these two probes, and it lets you diagnose what's wrong with a circuit board, which is often very complex. Or look at a weather vane: it's a very simple tool in that it shows you which way the wind is blowing. It just points in the direction the wind is blowing. And the thing about the multimeter is that it's a layer simpler than the thing it's testing.
This is a good thing to keep in mind with unit tests as well, because unit tests need to be correct by inspection, so they often need to be at least one level simpler than the thing they're testing. The weather vane makes the same point: if you see a test fail, you don't want to worry about whether the problem is in the test itself. You want that to be as small a possibility as possible. With a weather vane, if it's pointing north, you have pretty high confidence that the wind is blowing north. It's very unusual for a weather vane to point north when the wind is blowing east. You want to apply those same principles in your unit tests. One thing that's very helpful when you're writing unit tests is to think about whether, if another developer comes along when this test fails, they can diagnose what the problem is. Going back to the unit test we had before, it's very hard for another developer to understand what's wrong, because they don't have information about Joe123, where that account comes from, or why the score is 150. We can solve that by inlining the setup method. I showed you before that there was a separate setUp method. If we just take the body of that setup and put it right in the test function, that solves the problem, because now we have everything we need to know right in the test function. The reader doesn't have to scroll up and down in your unit test file to understand what's going on. The other benefit is a common pattern in unit testing: the structure of Arrange, Act, Assert. Arrange is where you set up your preconditions. Act is where you act on the object under test, in this case the account manager. And Assert is where you assert that the postconditions are correct. If you inline your setup in this way, you achieve this structure that a lot of people will recognize quickly, and it makes it easier for them to understand your tests.
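The inlined version of the earlier toy example might look like this. This is a sketch with assumed names, showing the Arrange, Act, Assert structure:

```python
import unittest
from unittest import mock


class AccountManager:
    """Toy stand-in for the class under test."""

    def __init__(self, database):
        self._database = database

    def get_score(self, user_id):
        return self._database.get_row(user_id)["score"]


class AccountManagerTest(unittest.TestCase):
    def test_get_score(self):
        # Arrange: every precondition is visible right here in the test.
        mock_database = mock.Mock()
        mock_database.get_row.return_value = {"user_id": "joe123", "score": 150}
        account_manager = AccountManager(mock_database)

        # Act: exercise the object under test.
        score = account_manager.get_score("joe123")

        # Assert: check the postconditions.
        self.assertEqual(150, score)
```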
So the reader should understand your test without reading any other code. Everything should be right there in the test function where they can see it, without a lot of scrolling. Next I want to talk about DRY. We talked about this yesterday: DRY is "don't repeat yourself." We've generally learned this in production code, and it's a good idea in production code. Here's an example of some production code, some calls to a SQLite database. If a lot of us in this room saw this code, we'd say, okay, a lot of this code is redundant. Almost every line in both these functions is repeated; the only thing that's different is one segment of a string. So for a lot of us, a refactoring like this looks pretty good. We're abstracting away all the things that are repeated, so get_user_names and get_user_ids now differ only by the one thing that's actually different about them. If this is production code, a lot of us would look at this and say, yeah, I like this refactoring. It's fewer lines of code, less redundancy, so this is great. Now, in the unit test I showed you before, where I inlined the setup, some of you are maybe thinking ahead and saying, okay, that's fine for that one function, but what if you've got another function, like test_increase_score? Now you've got these three statements that are exactly the same, and that's a big problem, because we know we don't like redundancy. Here the thing to keep in mind is that eliminating redundancy is not a goal in itself. You eliminate redundancy because it serves some other goal. The goal you want to keep in mind is simplicity and obviousness, and I think those two are the most important things in unit testing. In this case, you're making trade-offs either way. If you do refactor this into a setup, sure, you're eliminating redundancy, but you're also increasing complexity.
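The SQLite refactoring mentioned above might look something like this sketch, using hypothetical get_user_names and get_user_ids functions (the actual slide code may differ):

```python
import sqlite3


# Before: almost every line is duplicated between the two functions.
def get_user_names(connection):
    cursor = connection.cursor()
    cursor.execute("SELECT name FROM users")
    return [row[0] for row in cursor.fetchall()]


def get_user_ids(connection):
    cursor = connection.cursor()
    cursor.execute("SELECT id FROM users")
    return [row[0] for row in cursor.fetchall()]


# After: the shared lines move into one helper, and the public functions
# differ only by the column name, the one thing that actually varies.
# (String formatting is safe here only because the column name is a
# hard-coded internal constant, never user input.)
def _get_user_column(connection, column):
    cursor = connection.cursor()
    cursor.execute("SELECT {} FROM users".format(column))
    return [row[0] for row in cursor.fetchall()]


def get_user_names_refactored(connection):
    return _get_user_column(connection, "name")


def get_user_ids_refactored(connection):
    return _get_user_column(connection, "id")
```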
Now you're making it so the reader has to jump around your test file in order to understand the test. So you're making a trade-off either way. In my opinion, it's not a big deal to inline these three statements. You add seven lines. It's going to vary for everybody how many lines you're comfortable repeating, but for me, just these three lines, I feel like it gives you a lot of simplicity, and for unit tests I think that's more important. So: accept redundancy if it supports simplicity. Maybe you'll indulge me on that last example and say, okay, three statements, I don't mind doing that. But what about when you've got an object that's really complicated to instantiate? Imagine this AccountManager class, instead of just taking one parameter, takes three, and they're hard to instantiate, too. Now you've got something like 15 lines of code, and you're probably saying, I'm not going to copy-paste these 15 lines into every single one of my tests, that's crazy. And I agree, that would be crazy, because at 15 lines, it's taking up so much of your test function that it obscures what you're actually trying to test. If a reader comes and tries to understand what behavior you're trying to assert, they get lost in all the setup. Your initial instinct might be to refactor this into helper methods, but first you should take a look at the interface you're using. In this case, the first parameter is a user database object. Okay, that sounds fine. It's an account manager that manages users, so sure, it can take a user database. The next thing, where it gets kind of weird, is that it takes a privilege database that's wrapped in a privilege manager object. So already there's a red flag: it's taking two objects that sit at different layers of abstraction. And lastly, it takes a URL downloader.
And that's very logically different from the two previous parameters. I see this case a lot, where the class interface has evolved over time so that the parameters don't really make sense anymore, and it becomes this thing that's very hard to instantiate. The problem is, if it's difficult to instantiate in your tests, it's also going to be difficult to instantiate in production. So if you're tempted to write a helper method here, you can instead refactor your production class, and in so doing you'll improve both your tests and your production code. Improving your production code simplifies your tests. When you find yourself tempted to write a helper method to make your tests easier to write, think first about whether you can refactor your production code to make everything easier. Sometimes you have no other choice and you have to write a helper method. Sometimes you just don't have the luxury of refactoring the production class, because maybe it's used in 200 places. If you do have to write helper methods, you have to make sure you don't commit the cardinal sin of test helper methods, which is to bury critical values in your test helper. When I say a critical value, I mean any value that the reader needs to know in order to understand why your test is correct. I'll give you an example. This is similar to the test I showed you before. Here's our test method: we've got the account manager, we call this self.add_dummy_account helper method, then we call adjust_score and assert that the score equals 175. Here we see the same problem I've been showing you, where there's hidden information. We don't know why the final score is 175, because the test assumes there's this user Joe123 and it assumes their score was 150. The helper method buried these critical values. We needed to know those values in order to understand why test_increase_score was correct.
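A sketch of both the buried-values anti-pattern and the rewrite discussed next, using toy Account and AccountManager stand-ins (the real slide code and names are assumptions):

```python
import datetime
import unittest


class Account:
    """Toy stand-in: pretend all four parameters are required."""

    def __init__(self, user_id, score, creation_date, favorite_color):
        self.user_id = user_id
        self.score = score
        self.creation_date = creation_date
        self.favorite_color = favorite_color


class AccountManager:
    """Toy stand-in for the class under test."""

    def __init__(self):
        self._accounts = {}

    def add_account(self, account):
        self._accounts[account.user_id] = account

    def adjust_score(self, user_id, delta):
        self._accounts[user_id].score += delta

    def get_score(self, user_id):
        return self._accounts[user_id].score


def make_dummy_account(user_id, score):
    # A helper that respects the rule: it fills in only the values that
    # are irrelevant to the test, and never touches the object under test.
    return Account(
        user_id=user_id,
        score=score,
        creation_date=datetime.datetime(2019, 4, 13),
        favorite_color="blue",
    )


class BuriedValuesTest(unittest.TestCase):
    """The anti-pattern: critical values hidden inside the helper."""

    def add_dummy_account(self):
        # Cardinal sin: joe123 and 150 are buried down here, along with
        # an interaction with the object under test (add_account).
        self.account_manager.add_account(
            make_dummy_account("joe123", score=150))

    def test_increase_score(self):
        self.account_manager = AccountManager()
        self.add_dummy_account()
        self.account_manager.adjust_score("joe123", 25)
        # Why 175? The reader can't tell without finding the helper.
        self.assertEqual(175, self.account_manager.get_score("joe123"))


class VisibleValuesTest(unittest.TestCase):
    """The fix: critical values stay in the test method itself."""

    def test_increase_score(self):
        account_manager = AccountManager()
        # joe123 and 150 are visible right here, so 150 + 25 = 175 is
        # something the reader can verify without scrolling.
        account_manager.add_account(make_dummy_account("joe123", score=150))
        account_manager.adjust_score("joe123", 25)
        self.assertEqual(175, account_manager.get_score("joe123"))
```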
It also commits a slightly less severe sin: there's this call to account_manager.add_account inside the helper. You want to avoid this too, because it's burying interactions with, in this case, the object under test. As much as possible, you want to keep all interactions with the object under test in the test method itself. It makes it much harder for the reader to understand what you're doing to the object you're testing if you sprinkle interactions with it throughout different functions. You can rewrite this, and you can still use a helper method, if you just respect the law of not burying critical values. This is the same code, rewritten so that we're still eliminating some boilerplate in this make_dummy_account method, but you can see that all the information we need is right there. We see Joe123 is created. We see that they start with a score of 150. So it's very intuitive for the reader to follow: 150 plus 25 is 175. That all makes sense. I basically don't have to show you the helper method, but I can. All it's doing is eliminating some of the dummy work; presumably these parameters to Account are required values, and the helper is just eliminating the stuff that's not relevant to your test. So don't bury critical information in your test helper methods. Keep all the information the reader needs in the test itself. Now I want to talk about naming. Say you're writing some production code and you have the choice between these two function names: user_exists_and_account_is_in_good_standing_with_all_bills_paid, or is_account_active. A lot of people in this room would probably choose the second one, because while the first one is very precise, you don't want to burden your team with a super-long function name. If this were a Java conference, I think everybody in the audience would be saying both of those names are far too short.
But it's PyTexas, so we value conciseness. There's a difference in tests, though. In production, you don't want to force your teammates to constantly write out a super-long function name. But in tests, you never write calls to test functions. You write out a test's name exactly once, when you define it, because the test framework itself is what calls the test function. Because of that, conciseness still matters, but you can err more on the side of being precise and verbose. Here's an example. Imagine you're editing this class called Tokenizer, you run the unit tests, and you see a failure that says: FAIL: test_next_token, with a message like "'' is not None". If you were the person editing Tokenizer, you probably wouldn't know why this test failed. This is a very common naming pattern I see, where people just prefix "test" onto the name of the function they're calling in the test. The problem is that you're forcing the developer to go read the test implementation to understand why the test failed. If you instead go crazy with test function naming, you can give a lot more information. If you were modifying this class and you saw that the failing test was test_next_token_returns_none_when_stream_is_empty, that's very clear. You modified some behavior where the test expected that when the stream parameter is empty, next_token returns None, and somehow you broke that. And you can fix it without ever having to go read the test implementation. That should be a goal of your test naming: your tests should be named so well that people can diagnose failures just by reading the name of the test that failed. Lastly, I want to talk about magic numbers. If you're not familiar, a magic number is a numeric value or string that appears in code without information about what it represents. Here's an example of a magic number.
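The two naming styles side by side, as a sketch with a toy Tokenizer stand-in (the class details are assumptions):

```python
import unittest


class Tokenizer:
    """Toy stand-in for the class under test."""

    def __init__(self, stream):
        self._tokens = stream.split()

    def next_token(self):
        # Return the next whitespace-separated token, or None when empty.
        if not self._tokens:
            return None
        return self._tokens.pop(0)


class TokenizerTest(unittest.TestCase):
    # Unhelpful: the name just repeats the function under test, so a
    # failure forces the reader into the test implementation.
    def test_next_token(self):
        self.assertIsNone(Tokenizer("").next_token())

    # Better: the failure line alone says exactly what behavior broke.
    def test_next_token_returns_none_when_stream_is_empty(self):
        self.assertIsNone(Tokenizer("").next_token())
```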
We're calling this function calculate_pay and passing it a value of 80, and we don't really know what that 80 represents. We as a developer community have basically all decided that magic numbers are evil, so we vanquish them whenever we see them. A rewrite of that would use named constants: we have named constants called HOURS_PER_WEEK and WEEKS_PER_PAY_PERIOD, so anybody reading the code can understand where the 80 is coming from. They don't have to guess about why the number is 80. We're so used to this in production code that we've brought the practice with us into test code. If you're somebody who hates magic numbers, this test looks correct to you: we're not using any magic numbers, we're using named constants, and so by that standard it's a good test. But if you're somebody who's very devoted to getting rid of magic numbers, the next slide might be kind of shocking, so brace yourself. This is the same test with magic numbers, and it's a lot more readable. You can trace it very easily: 72, 80, like that. It's very intuitive, and it's half the lines of code. I think in test code we're used to this idea of eliminating magic numbers, but magic numbers are actually fine in test code. The other issue that was kind of hidden in the example with named constants is this line, where we calculate the expected billable hours by adding together the starting hours and the hours increase. The problem with this is, I said earlier that we want to minimize the amount of logic in our test code. We want to make it as obvious as possible. Here it's very simple logic, but it's nevertheless logic, and the problem is that in our production code there's almost certainly a line that's almost exactly like this. I see this a lot: people take a complex calculation from production code and basically copy-paste the exact same thing into their tests.
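A sketch of the two versions of that test, assuming a hypothetical HourTracker class (the 72 and 80 echo the numbers mentioned from the slide; the class itself is an assumption):

```python
import unittest


class HourTracker:
    """Toy stand-in for the class under test."""

    def __init__(self, starting_hours):
        self.billable_hours = starting_hours

    def add_hours(self, hours):
        self.billable_hours += hours


# Named-constant version: production habits carried into test code.
STARTING_HOURS = 72
HOURS_INCREASE = 8


class HourTrackerTest(unittest.TestCase):
    def test_add_hours_with_named_constants(self):
        tracker = HourTracker(starting_hours=STARTING_HOURS)
        tracker.add_hours(HOURS_INCREASE)
        # Hidden logic: the expected value duplicates the production math,
        # so a bug in that math can go undetected.
        expected_billable_hours = STARTING_HOURS + HOURS_INCREASE
        self.assertEqual(expected_billable_hours, tracker.billable_hours)

    # Magic-number version: shorter, and the expected value is an
    # independent fact the reader can check at a glance (72 + 8 = 80).
    def test_add_hours_with_magic_numbers(self):
        tracker = HourTracker(starting_hours=72)
        tracker.add_hours(8)
        self.assertEqual(80, tracker.billable_hours)
```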
The problem is, if there's an error in that calculation, you don't know it, because you're using the exact same calculation in both your production code and your test code. If you just embrace magic numbers in your tests, you can avoid this. A lot of the reasons we always use named constants don't really apply in tests. So: prefer magic numbers to named constants in test code. In summary: keep the reader in your test function, and make sure they're not jumping all around your test file to understand why the test function is correct. Eliminating redundancy isn't a goal in itself; accept redundancy if it makes the tests more obvious and simpler. If you're tempted to write helper methods, first think about whether you can refactor your production code to eliminate the need for them. Avoid burying critical information in your test helpers; make it so the reader can stay in your test function without having to look outside for critical information. Go crazy with long test names, because you never have to call your test functions. And embrace magic numbers. They're your friends, and they're not as evil as they are in production code. And that's all. Thanks to PyTexas for having me, and to all the volunteers. This was originally a blog post, so if you want to read it as a blog post, you can just Google "good developers bad tests." And if you want to tweet it to everybody and say it's the best blog post you've ever read, that's totally fine, I don't mind. I'm on Twitter at @deliberatecoder. There's my email. If you want to read these slides, they're at mtlynch.page.link/gdbt: good developers, bad tests. And do we have time for questions? We've got some time for questions. Does anybody have any questions, or want to yell at me for besmirching DRY? So the question is: how did I decide magic numbers are okay in tests?
Yeah, good question. In production code, we want to avoid magic numbers because usually the number we choose isn't arbitrary; it's usually related to something else in the production code. Named constants eliminate that problem: if the number is related to some other number, the named constant enforces that relationship. Whereas in tests, that doesn't really apply. Usually the value isn't related to something else in your tests. The other thing is that in production code you want to show that the number isn't arbitrary, whereas in tests the number often is arbitrary. You're just choosing a value that meets some condition. For example, you might want to choose a negative number to make sure your function handles negative numbers. It could be negative five or it could be negative ten; it doesn't really matter, you're just choosing something negative. And it should be the test name itself that tells the reader what's special about the magic numbers you're choosing. But thank you, that was a good question. Question over there? Yeah. The question is whether it's okay to write test classes that hold your test functions. Yeah, I personally use the unittest module in Python, so I write all my unit tests in classes. And I think classes can be a good way of giving context: when a test fails, you also see the class name, and that can help the reader. So these aren't hard and fast rules. They're not absolutes.
But the idea I want to get across is that you should be thinking about what's going to help the reader understand why the test fails, what's going to make it obvious to the reader. So if using a test class makes it easier for the reader to understand quickly what the test does and why it's failing, I'd say use whatever tool you can. So, yep, other question? Hey, thank you, I did have one, on redundancy versus focus. If your AccountManager was the thing that was failing, and you had copied and pasted that code into every test instead of putting it in the setup, now you've got a bunch of tests failing everywhere AccountManager was copied and pasted. Would you prefer to see all of those fail, as opposed to one test showing the failure? I guess, if you've got a setup, then they would all fail anyway, right? Yeah, well, it would actually be different; it's a failure versus an error. Okay. Yeah, I understand what you're saying. So there's a situation where you're breaking all of your tests, and you could maybe avoid that if things were better factored. For me, I feel like that's fine: if you do something that breaks all of your tests, then all of your tests should break. But I think that's a matter of opinion. It kind of depends on what your team thinks and how you feel about the code. So, any other questions? Oh, question up here. I'm surprised nobody's yelled at me about the DRY stuff yet. So when you say you have to have all the useful information in the test, how do you square that with encapsulation, where most of that information is hidden behind functions?
So I think this goes back to what I said in the beginning. We use encapsulation in production code because there's too much code to understand if you just read it top to bottom. In unit tests, there's often little enough logic that you can just read it top to bottom. So you're making a trade-off. If you use encapsulation to factor out some of the redundancy, you're eliminating redundancy, but you're increasing complexity, because you're forcing the reader to go outside your test function to understand what's happening. There's no one rule that's going to answer it in all cases. We're engineers, and part of that is making trade-offs. I think people are too used to eliminating redundancy, so they just do it without thinking about it. But often it's better to just have the simplicity of redundant code that appears in every test function, so that the reader can read it top to bottom. So, are we out of time, or more questions? The next question is about fixtures, where defining a fixture once might be the best approach. To me, that's similar: I can appreciate the explicitness of it, but I think you still pay a penalty in forcing the reader to jump around. I think there's tremendous value in letting the reader just read a test function from top to bottom without having to go read a fixture. And sometimes it's fine, if there's no critical information in the fixture, if it's just incidental information. Yeah, it depends on what your preferences are. And one more question. I know this is heresy at this conference, but the Go community says a little bit of copying is better than a little bit of dependency, and I think that's the thing we're talking about here.
If you break something in the class you're testing and not all of your tests break, you only need to change the tests that did break, and it's more explicit: these are the exact things that I broke. If what you break is in the setup, you've effectively broken all the tests; you've effectively changed all the tests without that actually being necessary. So you like breaking the dependency by getting rid of the setup. Okay. Any other questions? All right, let's have a round of applause. Thanks, everybody. Thank you.