Hi, everybody. Thanks for having me here. I hope everybody can hear OK. So: testing design. This is me on the internet. You can find me on GitHub; my email's on the last slide, and you can come talk to me afterwards. I work for Magnetic. We're in advertising: we do real-time bidding in online advertising. We use Python, PyPy, in production, lots of fun stuff; you can talk to me about that also.

OK. So luckily, this talk is fairly simple. So simple that the core ideas fit on a slide and a half, which is basically this: we know, or think we know, a lot about what makes software design good, and the nice thing about testing is that most of those things translate quite easily into what makes test suites good. Tests are just code like anything else. So all the principles we think help us out when writing software, the ones you obviously think about as you're writing it, translate pretty well over to tests: make sure your objects only do one thing, separate concerns so things stay fairly simple, compose objects together, that sort of thing. Most of those principles translate pretty directly into testing. Try to keep unit tests and integration tests of any sort down to testing one specific thing. Try to make sure your tests are both simple and transparent, because you're not testing your tests. All the principles we have for regular software design apply pretty well to testing.

Getting down slightly to specifics, there's a three-step process that typically gets drilled into our heads when writing tests, which is that there are three steps to a test.
First you set up some stuff; then you exercise, meaning you do whatever it is that you want to test in that test; and then you verify that what you expected to happen is what actually ended up happening. This applies fairly uniformly across all the different types of tests you end up writing, and if you actually think it through as a three-step process while you're writing tests, your tests end up clearer, they end up more self-documenting, all the sorts of nice things we like out of our test suites.

One particular thing people sometimes say about this three-step process is: make sure your verification is only one assertion. You write a test; make sure it has only one assertion. It's kind of a peculiar thing the first time you hear it. Most people's first thought is, how do I actually make that happen? They remember back to tests they've written with a long list of assertions. But even more than that: what's the actual benefit? What am I going for by keeping my tests down to a single assertion? So you stare at this example here, a fairly simple function that just takes a bunch of dictionaries and adds them together, adding up the values of matching keys. And then you have this alternative; it's green, it's better. So what is the actual difference between the two examples? Why is this any better? Of course this is a simplified example, but it's the first representation of this idea of keeping your tests down to one assertion. And the most obvious benefit, to answer that straight off, is that the most important thing about tests is their failures, because tests are destined to fail but meant to pass.
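The slide's code isn't reproduced in the transcript, so here's a hedged reconstruction of the idea: a `merge` function (the name and exact behavior are assumptions) that adds up values across dicts, tested once in the discouraged many-small-assertions style and once with a single whole-value assertion.

```python
import unittest

def merge(*dicts):
    """Add together the values of matching keys across several dicts."""
    result = {}
    for d in dicts:
        for key, value in d.items():
            result[key] = result.get(key, 0) + value
    return result

class TestMergeManySmallAsserts(unittest.TestCase):
    # The style being discouraged: one small assertion per key.
    def test_merge(self):
        result = merge({"cats": 1}, {"cats": 2, "dogs": 1})
        self.assertEqual(result["cats"], 3)
        self.assertEqual(result["dogs"], 1)

class TestMergeOneAssert(unittest.TestCase):
    # The preferred style: a single assertion against the whole value.
    def test_merge(self):
        result = merge({"cats": 1}, {"cats": 2, "dogs": 1})
        self.assertEqual(result, {"cats": 3, "dogs": 1})
```

Both tests pass here; the difference only shows up when they fail, which is the subject of the next section.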
So first you've got to see the failure. And the difference between those two slides is basically how much context you get when that test fails. The main benefit we're aiming for with this idea is more context when our tests fail. Rather than seeing something like "well, this isn't this", which is what you get with assertions that look like the first example, making the larger assertion gives you extra context. And that's useful for a lot of reasons. One of them is that while the first version only tells you that what you got isn't what you expected, the second tells you that, plus possibly the extra information of in what way the actual implementation in your code differs from what you expected. For example, maybe you're swapping values for keys for some reason. In this example that's pretty unlikely from an implementation point of view, but in real-world code it's quite common for things like that to happen. If all you see is "this isn't this", you have less ability to just look at the failure and say: that doesn't look right in multiple places, and the combination of the places where it doesn't look right tells me what I did wrong. In particular, unittest supports this with its type equality protocol, where you can get all sorts of nice context like that.

Moving on a bit: so now we like the idea of having one assertion in a test, for the sake of that extra context. But sometimes it turns out not to be possible. Often now we're shifting from unit tests over to integration tests. And sometimes what happens is that there are two sort of worlds of assertions you want to make. Sometimes you want assertEqual or some assertion of that sort, which are basically data-comparison assertions.
They say: I have some values and some expected values, and I want to make sure those two match up with each other. But a lot of the time, when you're writing applications, what you actually want to assert at some point in your test suite is more like: I want to assert that some collection of things is true, that the state of my object or my application is true. And unittest won't have assertions for those, because unittest doesn't know about them; they're part of your application. It's basically the difference between making assertions about some data and making assertions about the meaning of some application-specific thing. To take a specific example: you want to compare some strings? Cool, unittest can help you; you just make an equality assertion. But if in your application those strings are actually HTML, unittest probably isn't going to help you, because while the standard library has a bunch of parsers for things, it doesn't have assertions for that. So if you want to assert that two pieces of HTML are equal as HTML, not necessarily as string literals, you're out of luck. And that's unfortunate, because it's useful. In the test suites that we write, this is what we're doing a lot of the time, and things like changes in whitespace are just annoying. If you can compare HTML as HTML, that's way more useful and makes the tests way less brittle, but it's not something you're going to find out of the box.

So here's a pretty specific example, fake but very similar to a real one: a test for an ad server, a web application where you give it an ID and it shows you the piece of media associated with that ID.
And we have our three steps here. We do a little setup; there's some hand-waving, basically an in-memory database that we're adding this advertisement to. Then we hit the URL in our application that's actually supposed to be showing it, and then we make three assertions: I want to make sure I got the right status code; I want to make sure I'm actually properly setting headers, for content type or whatever; and I want to make some assertions about the body of my response. You stare at this for a moment and try to apply the rule from before, one assertion per test, and it's not necessarily obvious how to make that happen in this case, because all three of these assertions are useful; they're all part of "does my response actually work correctly?" The same mindset we had before leads some people to split this into three tests with the same setup and exercise, which is not ideal for reasons I'll skip over at the moment. Instead, skip straight to something like this, where I have nothing highlighted because it's way more code on a slide than I expect anyone to read. So what is this? It's the conversion of those three assertions into one assertion that actually encapsulates some meaning. And the meaning this assertion is trying to encapsulate is: responses have content; I sometimes want to assert against that content; there are a couple of things that need to be true when I make that assertion; and I want this assertion to take care of all of them. So we're checking all of the same things here, with the addition of content length, inside the assertion method. And you end up with something that looks more like this.
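The actual assertion method from the talk isn't shown in the transcript, so here's a sketch of the shape being described, with invented names (`FakeResponse`, `assertResponseContent`) standing in for whatever the real ad server's test suite uses:

```python
import unittest
from dataclasses import dataclass, field

@dataclass
class FakeResponse:
    """A stand-in for whatever response object the web framework returns."""
    code: int = 200
    body: bytes = b""
    headers: dict = field(default_factory=dict)

class ResponseContentMixin:
    """One application-level assertion encapsulating everything that must
    be true for a response to 'have this content'."""

    def assertResponseContent(self, response, body,
                              code=200, content_type="text/html"):
        self.assertEqual(response.code, code)
        self.assertEqual(response.headers.get("Content-Type"), content_type)
        self.assertEqual(int(response.headers.get("Content-Length", -1)), len(body))
        self.assertEqual(response.body, body)

class TestAdServer(ResponseContentMixin, unittest.TestCase):
    def test_show_media(self):
        # Hand-waving over the setup and exercise steps from the talk;
        # imagine this response came back from hitting the ad URL.
        response = FakeResponse(
            code=200,
            body=b"<img src='/ad/12.png'/>",
            headers={"Content-Type": "text/html", "Content-Length": "23"},
        )
        self.assertResponseContent(response, b"<img src='/ad/12.png'/>")
```

The three separate checks (plus content length) now live in one named place, which is the refactoring benefit discussed next.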
And obviously we've cut down the number of lines in the test, but I think we've also gained some extra clarity: an assertion with this name tells you directly what you're trying to assert against. And we get the same benefit we get any time we take a bunch of code and refactor it into one place: any improvement you come back and make to this assertion immediately affects all the tests that use it, in some positive way. You obviously have to be careful with that, because you can silently break tests that way too. But if you are careful, it means things like this: someone makes a change, a test starts failing in this assertion, and they notice it fails in a particular order; say it fails first on the content length. They decide that isn't as helpful as it would be if the assertions were reordered: if an assertion fails, maybe you want to see the body first, and have that be the failure message, because a different response body than you expected tells you more information. They can go in and reorder the assertions, and now, for something that's probably true across your whole test suite, you get better failure messages everywhere, just because someone noticed and went and made that improvement. It's the same benefit we get any time we take some particular operation we're trying to perform and refactor it out into one place that we can concentrate on.

So what happens if you do something like this? The proposal of starting to build assertions on top of the data-comparison assertions means you end up with a sort of hierarchy of assertions.
So at the bottom, you have your data-comparison assertions, because at the end of the day you are just comparing two objects, so that has to happen somewhere. But on top of that, you can add layers of meaning. Rather than just comparing strings, now you have an assertion built on top of that which is really about comparing HTTP responses. At the end of the day it's still just comparing a bunch of values, but it becomes a much more powerful assertion, able to add all the nice messages and things you might want once you know you're actually dealing with HTTP responses. And on top of that you can build even more interesting things. Some of these, in our case, exist only in theory; they don't actually exist yet, much as I would like them to. But you can come back and layer, on top of the assertions about HTTP responses, assertions about whether this is valid HTML, or on top of that, assertions about how these things render in a browser, under whatever conditions you actually feel like placing around them.

We're actually doing well on time, so I'll spend a bit more on this slide: where do we go from here, assuming we've all agreed on using assertions to build more meaning on top of the data-comparison assertions? What comes out of that is kind of interesting. I think after a while of doing this, what comes out of it is what I lovingly call mix-in hell, by which I mean we build up all of these assertions in a bunch of mix-ins, in our case.
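To make the layering concrete, here's a toy sketch of an HTML-level assertion sitting on top of a plain string comparison. It's an assumption-heavy simplification: it only normalizes whitespace between tags, where a real version would parse the markup. The point is the shape of the layer, not the normalization.

```python
import re
import unittest

class HTMLAssertionsMixin:
    """Toy sketch of a 'layered' assertion: two HTML strings count as
    equal if they differ only in whitespace between tags. A real version
    would parse the HTML; the layering is what matters here."""

    def _normalize(self, html):
        # Collapse whitespace between tags and trim the ends.
        return re.sub(r">\s+<", "><", html.strip())

    def assertHTMLEqual(self, first, second):
        # At the bottom of the hierarchy it's still a data comparison:
        self.assertEqual(self._normalize(first), self._normalize(second))

class TestLayered(HTMLAssertionsMixin, unittest.TestCase):
    def test_whitespace_is_irrelevant(self):
        self.assertHTMLEqual("<p>hi</p>\n  ", "<p>hi</p>")
```

A browser-rendering assertion would be one more layer on top of this, built the same way.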
And that's great, it gets you a ton of benefit, and I'll tell you specifically what some of these mix-ins do before I tell you why there's a nicer way of doing all this. For example, we deal with GDBM, the GNU dbm key-value store; it's in the standard library. But there's no object layer on top of it, no way of comparing GDBM databases; they're files. So we have a mix-in that will take a GDBM database, compare it to a dictionary, and tell you whether the two are equal. We have a mix-in for logging; this is something I think everyone has written at least once, and there are some packages that actually try to make it helpful for you. It basically attaches a handler to the logger from the standard library logging module and then lets you make assertions about things that have been logged. That assertion is quite often written and rewritten by people. We have a mix-in with assertions about our own proprietary log format, which is quite crazy, so we parse it into something more sane and make assertions against that. We have the response-content mix-in I mentioned, with content assertions: something has content, doesn't have content, has content that looks a certain way. We use StatsD with Datadog: you send StatsD metrics, and it basically puts a nice UI on top of that and shows you graphs and things like that. So you want to make assertions about a metric having been incremented, all those sorts of things. What happens is you basically end up growing this companion to your test suite: I have all the things I want to test, and then I have all these useful assertions that either don't exist elsewhere or have some meaning specific to my application, and I'm able to use them in all the places in my application.
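The logging mix-in described above can be sketched roughly like this; the names (`watchLogger`, `assertLogged`) are invented, not from the talk's real code:

```python
import logging
import unittest

class _RecordingHandler(logging.Handler):
    """Collects every record emitted to the logger it's attached to."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

class LoggedMixin:
    """Hypothetical mix-in: attach a handler, then assert about what got logged."""

    def watchLogger(self, name=""):
        handler = _RecordingHandler()
        logger = logging.getLogger(name)
        logger.addHandler(handler)
        self.addCleanup(logger.removeHandler, handler)
        return handler

    def assertLogged(self, handler, message):
        self.assertIn(message, [r.getMessage() for r in handler.records])

class TestLogging(LoggedMixin, unittest.TestCase):
    def test_something_logs(self):
        handler = self.watchLogger("app")
        logging.getLogger("app").warning("bid rejected")
        self.assertLogged(handler, "bid rejected")
```

Worth noting that since Python 3.4 the standard library ships `assertLogs`, which covers much of this pattern out of the box, which rather supports the talk's point that these assertions keep getting rewritten until someone shares them.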
And I called it mix-in hell because it's sort of annoying: the coupling of inheritance to actually adding these assertions to your tests is kind of annoying. So I will mention that there is something else, up on the slide, which encapsulates this idea of "I have this collection of things that I call an assertion, and I want to be able to use it all over the place": testtools matchers. They sort of claim to solve this problem, so they're worth checking out if you're convinced.

The last thing I want to say is a sort of call to action, in part to myself. I think a lot of people are writing these sorts of assertions, and they're useful, right? If you want to compare some HTML, where are you going to go look for an assertion that actually does that in a way similar to how I described? I think we need to start sharing these things that we're writing. When was the last time you downloaded and installed a package whose job was to add a bunch of assertions? I think it's a bit more common to do that for testtools matchers, and there possibly are some packages that do it, but I'm not sure how widespread they are; I'm sure someone will tell me. Regardless, there's this whole layer of things it's possible to build on top of the simpler comparison assertions, and it would be nice if we built up a habit of sharing them, so they only have to be written once and we can distribute that benefit across everyone. That's what I've got. Thanks, everybody.

Thank you, Julian. So we have at least five minutes for questions. There's one microphone here; is there another one in the room? No? Okay, so if anyone has questions, I'll bring the microphone to you, or as close to you as I can. One question here.
I would like to know how strict you are with this one-assertEqual rule. Because going back to your first example: what if your data structure, say a dictionary, is more complex than that, but in this test specifically you only want to test for cats and dogs? If you do a direct comparison, you'll get a big diff which isn't telling you anything, right? So you'd need two assert statements that look into that dict. What would you do there?

Yeah. So, sorry, didn't mean to cut you off. I gave this talk last week in London and it took 40 minutes, not 20, which is why I thought I'd be close on time, and someone asked that same question; I think he's actually here. The answer is that there's no general answer; we're all just going to cave at some point sometimes. The ideal answer, I think, is that I try to use those cases as prompts to look at the code again and see whether there's a possible way to split apart whatever it is that's producing that output. Because if you have this one thing being output and a bunch of assertions on pieces of it, then possibly what you actually have in code land is something that really should be spitting out a whole bunch of different results, with something later that combines them. So I try to be fairly strict, because that's turned out well in that particular case: it sometimes tells me useful things about the actual code I'm writing and how to break it up differently. But sometimes it happens. Yeah. Okay, thanks.

Yeah, so if I caught you correctly, it's: make fewer assertions, but make them more meaningful by using custom assertions. And that works well, I guess, if you have generic things that you test for, like HTML output, and you can even share and reuse them. But I guess in a lot of cases those will be specific to your application.
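The speaker's answer is to restructure the code, but for completeness, a common in-test workaround (not suggested in the talk) is to project out just the keys under test, so a single whole-value assertion still gives a focused diff without dragging in the rest of the dict:

```python
def subset(mapping, keys):
    """Project out just the keys a given test cares about."""
    return {k: mapping[k] for k in keys if k in mapping}

# A result with extra fields this particular test doesn't care about
# (all values here are made up for illustration):
result = {"cats": 3, "dogs": 1, "request_id": "abc123", "elapsed_ms": 17}

# Still one assertion, with full diff context and none of the noise:
assert subset(result, ["cats", "dogs"]) == {"cats": 3, "dogs": 1}
```

This keeps the one-assertion property while sidestepping the big-irrelevant-diff problem the questioner describes, at the cost of hiding the extra keys entirely.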
And I had something recently with a pytest fixture, where I realized: geez, I'm putting so much logic into this assertion, into this fixture, that I need a test for it. So I ended up writing a test for my fixture, and then I suddenly noticed the afternoon was over, and I went home thinking: is that really the way forward? My tests are smarter, and they're certainly more readable, but now I have to test the test assertions. Do you see any way out of that? What's your experience?

It's the same as with anything in testing, I think, which is that it's always a balance. That's obviously not good; I don't want to be writing tests for my tests, and I don't think anybody does. The nice thing, I think, is that the assertions we've written often come out of noticing: I have a bunch of places where I want to make the same set of assertions, and that means this is a good place to factor out. We try not to start from the other side, which is: I have this thing I want to assert about, so I'll start writing the assertion method and then find places to use it. Because then it often does turn out like that: you end up shoving a lot of things into that assertion, and there'll be flags in the assertion to do different things. Somebody tried that, and it actually got merged, I think. Yeah, sorry, this is embarrassing, it's going to be on video: there's a flag somewhere in our test suite for doing comparisons on HTML, and it switches between doing an assertIn and an assertEqual, and I hate it so much. So it happens, unfortunately. I don't have anything that can help with that other than: use your best judgment as it's happening.

Any other questions? Hold up your hands if you have a question. No?
Well, since we've got some time, I've got a question. Aha, oh no. So, we both work on the Twisted project, and I know I've been told off umpteen times for having multiple assertions in my tests. But I wonder: you've just moved all your assertions into a custom assertion wrapper. Oh no. Then you're going to fall foul of what JP always preaches to us, which is that if the test fails on multiple assertions, you have to run it multiple times to get all the failures. So would it be worth, instead of having those multiple assertions in the wrapper, just collecting all of the information and putting it in one single container?

Yeah, yeah. You couldn't have gone with the softball question? So, I agree with him, and with that: sometimes it's nicer to just shove it into a container. I think it's very easy to be lazy in that case, because it's a sort of non-obvious thing. Until someone tells you to do it, I think people who try to apply this rule think, okay, I'll just shove it in a tuple and make tuple comparisons, and then decide that seems ridiculous; it's not a solution people are likely to come up with unless someone tells them. To be perfectly honest, I think the actual solution for that problem is that testtools has this other nice thing, which is failures that don't actually stop execution of the test. Right, that's what he always talks about. I haven't actually seen it in action. Yeah, they're pretty great. So if you have a bunch of assertions and you want both to execute all of them and to get all of the context, I think something like that is the actual way to go. Okay, great.

Any final questions? Oh, there is one more. Have we got time? Yeah, I think so.

Did you have a look into pytest?
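testtools implements this properly with its `expectThat` matcher support; as a dependency-free illustration of the idea being discussed (with invented names, not testtools' real API), here's a sketch that records every mismatch and fails once with all of them:

```python
import unittest

class SoftAssertionsMixin:
    """Toy sketch of 'don't stop at the first failure': collect every
    mismatch, then fail once with all of the context. The names here
    are invented; testtools does this for real via expectThat."""

    def setUp(self):
        super().setUp()
        self._mismatches = []

    def expectEqual(self, actual, expected, label):
        # Unlike assertEqual, a mismatch here does not stop the test.
        if actual != expected:
            self._mismatches.append(f"{label}: {actual!r} != {expected!r}")

    def assertExpectations(self):
        # One failure carrying every recorded mismatch at once.
        if self._mismatches:
            self.fail("\n".join(self._mismatches))

class TestResponse(SoftAssertionsMixin, unittest.TestCase):
    def test_everything_with_all_context(self):
        self.expectEqual(200, 200, "status code")
        self.expectEqual("text/html", "text/html", "content type")
        self.assertExpectations()
```

If both expectations had failed, one test run would report both mismatches together, which is exactly the run-it-multiple-times problem the questioner raises.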
Because I think it solves some of the problems you described.

Yeah, I'll be perfectly honest: I don't know all of the surrounding things that pytest adds. I believe you; I'm sure it has similar things. Beyond eliminating the TestCase class, I know it has fixtures, but I don't know its other components.

I mean, you don't need a matcher library, because pytest shows much more of the context.

Yeah, I know it shows nice things: it'll show locals in frames when tests fail, things like that. I don't know which particular features, though. I can't imagine it can provide the same sort of layered assertions, because those are things you're probably defining in your application. I know pytest gives you some nice things, but I haven't used them. Okay. Yeah.

Okay, thanks, Julian. That was a great talk, very informative.