In my mind, nobody represents or epitomizes a Ruby hacker more than Ryan Davis, and I think no library represents the spirit of Ruby to me more than Minitest. I'm a huge fan of both Ryan and Minitest, so I'm excited. Up next is Ryan Davis talking about building your own testing library.

Yeah, so I'm going to be talking about making test frameworks from scratch. As said before, my name is Ryan Davis. I'm known elsewhere as ZenSpider, or on Twitter as "the ZenSpider," because someone's camping it. I'm a founding member of SeattleRB, the first and oldest Ruby brigade in the world, and I'm now an independent consultant in Seattle, and I'm available. I'm also the author of Minitest, which happens to be the most popular test framework in Ruby, and I only mention that because I didn't know it. I learned it last week, and I'm a bit astounded that we're beating RSpec, and apparently have been since July.

So, setting expectations. This is something I always do; I really want to set these things up front. This is a very code-heavy talk that will go into detail about the whats, the hows, and the whys of writing a test framework. It is 326 slides, a little more than nine slides a minute. That's 50% more than I've ever done before, so I'm going to be talking at this pace and going as fast as I can. The slides are published; you can find them there. One thing I find about doing more slides (and again, 326 is ridiculous) is that the more slides there are, the more I'm able to help connect the dots, the more explanation they provide, and the more comprehension you walk away with.

So first, a famous quote that was not said by a famous person: "Tell me and I forget. Teach me and I may remember. Involve me and I learn." That was not Benjamin Franklin. Who actually said it is a bit of a mystery, but it's usually attributed to him. Who knows? The internet says it's him, so it must be. But I think the quote points out an important problem in code-walkthrough talks. Just to back up a little, don't get me wrong: not all code walkthroughs are bad. Some of them are absolutely great, and they're absolutely necessary for work; I'm really only talking about presentations. Quite simply, I could write this talk in my sleep by working through the current implementation of Minitest and explaining each and every line. And you'd learn nothing from it, and you'd forget it almost as quickly as you heard it. Some of the many problems of code walkthroughs are that they're boring, they're top-down, and as a result they focus on the whats and not the whys. Those are good reasons to tune out and not learn a thing, and they're also good reasons for me not to do a code walkthrough of Minitest.

That quote that isn't by Benjamin Franklin, or rather the real quote, what is believed to be its predecessor, is much better. And this screen is dark. "Not having heard something is not as good as having heard it. Having heard it is not as good as having seen it. Having seen it is not as good as knowing it. Knowing it is not as good as putting it into practice." And I'm not gonna mangle the source on that. For those of you who think that's too many words, here's a more concise version. Is Aaron in the audience? For my most ferrety of friends, since that's too many words, and there's no brain emoji, I did the nerd one instead.

So, starting from scratch: that is the point of this talk.
Working up from nothing is the closest I can get to letting you join me in building a test framework from scratch. I will try to describe this in a way that you can literally code it up at the same time and understand the steps you went through to get there. Now would be a good time to open up your laptops if you want to join in. Further, this is not going to be Minitest by the end of the talk; it's going to be some subset. I'm going to apply the 80/20 rule to show you most of what Minitest can do in a minimal amount of code. To emphasize that, I'm going to be referring to it as "microtest" from here on out. Does anyone get that joke? A couple of you? Finally, I encourage you to deviate from the path I'm going to lay down and experiment. Do things differently, and you might understand my choices better. This talk is an adaptation of a chapter of a book that I'm writing on Minitest; there will be more info on that at the end of the talk.

So, where to begin? At the bottom. The atomic units of any test framework are the assertions, so let's start with plain old assert. In its simplest form, assert is incredibly straightforward. It takes only one thing, the result of an expression, and it fails if that isn't truthy. That's it. This framework is complete. You have everything you need. Thank you. Please buy my book when it comes out. Are there any questions? Okay, no. Although doing a 326-slide talk in five minutes would be great.

I'd like to have a few more bells and whistles, but before I add those, let's figure out what I wrote and why. In this case, I chose, quite arbitrarily, to raise an exception if the value of the test isn't truthy. I could have chosen to do something other than raise an exception, like throwing, or pushing an instance of something onto a collection of failures and returning. And while there are trade-offs to all of those choices, it doesn't really matter what you choose to do as long as you report the failed assertion at some point. I mostly chose to raise an exception because exceptions work exceptionally well, no pun intended, for my brain. Trust me, that wasn't in the notes. There's an added benefit in that raised exceptions interrupt the execution of the test and jump out of the current level; we'll see more on that later.

So, if you ran this expression, 1 == 1 would evaluate to true, that would get sent as the test arg to assert, and it would do nothing in response. That's a pass. However, if you ran this expression, 1 == 2 would evaluate to false, that would get sent as the test arg to assert, and that would wind up raising an exception. At some point there will be mechanisms to deal with those exceptions and gracefully move on, but for now the whole test suite grinds to a halt on the first failed test.

One problem we have is that the raised exception reports the place where raise was called, in assert, and not where the assertion was called. I want it to only talk about the place where the assertion failed, on line 5 of test.rb. So I'll clean up the exception a bit by changing the way raise is called. Raise allows you to specify what exception class to use and the backtrace you'd like to show. I'll use caller for the backtrace, which returns the runtime stack at the point where you call it. Now we see where the test actually failed, which is much more useful to the developer dealing with a failure. Next, I'm gonna add a second assertion.
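Before moving on, here's a minimal sketch of the assert described so far. The exception class and message text are my own choices; the talk's slides may differ:

    def assert test
      # raise with an explicit backtrace (caller) so the failure points at
      # the assertion call site rather than at this raise
      raise RuntimeError, "Failed test", caller unless test
    end

    assert 1 == 1   # passes: assert does nothing
    assert 1 == 2   # raises, with the backtrace pointing at this line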
Now that we have plain assert, we can use it to build up any assertion anyone could possibly need. At least 90% of my own tests are handled by simply checking equality, so let's do that next. Luckily, it's incredibly simple to implement: I just pass the result of a == b to assert, and assert does the rest. And here's how you use it. Where two plus two equals four, it passes true to assert, and where two plus two doesn't equal five, it passes false to assert. Assert does its thing, and we know how that goes. This is really all I need to do most of my work quite happily.

But the way it stands right now, it has a pretty unhelpful error message when a failed assertion raises. First, the backtrace is pointing at assert_equal, not at our actual call to the assertion. Didn't I just fix this? Sort of. I'm using caller for the backtrace, but that includes the entire call stack, including the other assertions, so I need to filter the assertions out of it. Let's fix that by deleting everything at the start of the backtrace that is in the implementation file. Here, I simply assign to a local variable. I still use caller, but then I use drop_while to filter out anything at the start of the array that comes from the current implementation file.

But it's still ugly. The failure still just says "Failed test." Let's make it possible for assert_equal to supply more information. Let's pull the error message up into an optional argument and use that argument in raise. That allows us to change assert_equal to use it, with a helpful message that details why we failed. Now we get a message that looks like this. That's much more useful, and though probably not 100% resilient, it'll do for now.

One more assertion. One mistake people make time and again is misunderstanding how computers deal with floats. Stop. The rule is really simple: never, ever, ever test floats for equality. There are exceptions to this rule, but if you stick to it, you're going to do fine every time. So while we have assert_equal, we should not be using it for floats. What should we use then? We'll make a new assertion just for comparing floats. It should see if the difference between two numbers is close enough, where close enough is simply going to be within one one-thousandth. We could make it fancier later, but for now, done is more important than good. And this is what it looks like. It's almost exactly the same as assert_equal, except we're using the formula stated previously. And this is how it's used, and it simply works right out of the gate.

So what does that mean? It means that our assert method is finally general-purpose enough that we can write all the other assertions. While writing all the other assertions you want is fine and good, necessary even, that would take hours, and I'm only 25% through my slides. So consider that an exercise for you after the conference. What's your favorite type of thing to test? Hopefully you're not actually doing floats that much. How would you write an assertion for it? Go do that, and then have a cookie.

Once you can write one test, you'll want to write many tests, and that starts to introduce problems of its own. It'd be nice to keep them separate. There are many reasons why you'd want to break up your tests and keep them separate: organization, refactoring, reuse, safety, prioritization, parallelization, and so on. While it'd be nice to separate our tests from each other and organize them, how do you do that?
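Before tackling that, here's roughly where the assertions stand, as a sketch. The message wording and the drop_while filter are my reconstruction of what's described above:

    def assert test, msg = "Failed test"
      return if test

      # drop frames from this implementation file off the front of the backtrace
      bt = caller.drop_while { |frame| frame.start_with? __FILE__ }
      raise RuntimeError, msg, bt
    end

    def assert_equal a, b
      assert a == b, "Failed assertion: expected #{a.inspect}, got #{b.inspect}"
    end

    def assert_in_delta a, b
      assert((a - b).abs <= 0.001, "Expected #{a} to be within 0.001 of #{b}")
    end

    assert_equal 4, 2 + 2          # passes
    assert_in_delta 0.3, 0.1 + 0.2 # passes, where assert_equal would fail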
We could do something really quick and easy, like this: call a test method, provide it some sort of description, and give it a block with your assertions. And it's equally easy to implement: you implement test, you yield, you're done. This gives us the benefit that you can name the tests and put them in blocks so you can see that they're separate. But they're leaky, and leaky tests infect results. Now that you can write multiple tests and keep them organized, we need to be able to trust them. The problem is that these tests aren't actually all that separate from each other. Here at the top you can see that we have a local variable a. We assign to it, we test it in the first test, and it passes. The second test modifies it, and it passes, and that causes the third test to fail. We really want those tests to be completely independent from each other, but the fact that one test can mutate a local variable that is used by another is simply a mistake. This goes against good testing practice, which says that tests should always pass regardless of what was run, what order it was run in, or anything else. Otherwise you don't trust the tests, and trusting the tests is crucial.

We'll fix this by using methods. I might need to stop using the clicker. There are a number of ways we could try to patch this up. The simplest, perhaps, is just not to do it. Instead, just use Ruby. Ruby already provides a mechanism for separating blocks of code and making them immune to outer scopes: it's called the method. Here we can see that we have a local variable a in all three cases, but they're all separate local variables. The nicest thing about this approach is that it's absolutely free; there's no extra cost beyond what you're already paying by using Ruby. It's also important to remember that by using plain Ruby, anyone can understand it. It does have some drawbacks. First, you have to run the methods yourself. That's fine for now; we'll address it later. Another, perhaps more pressing, issue is that there's code duplication in the previous examples, but there are simple ways to get around that too, and I'm not going to bother going into them at this time. Stick to plain Ruby, and it should be pretty easy.

Now that we have multiple tests separated by methods, how do we get them to run? The same way you run any method: you call it. We could come up with a more complicated way, and we will, but this will do for now. Methods are a good means to separate tests, but more problems arise: unique method names are necessary, and it's harder to organize, reuse, and compose, and so on. Luckily, Ruby comes with another mechanism: classes. Didn't I just say to keep them separate, though? Yeah, but it would be nice to be able to organize them, to clump them up in units. So how do we do that? Take the previous code, wrap it in a class, and that's all there is to it.

But how do you run those? Wrapping the methods in a class breaks the current run. In order to fix it, we need an instance of the class before we can call the method. So we add that to each line, and we're passing again. This change doesn't really do anything; it groups the tests in classes, but it does put us in an ideal position to make the tests run themselves. Right now we manually instantiate and call a method for each test. Let's push that responsibility toward the instance and have it run its own test, by adding a run instance method that takes a name and invokes it with send. It doesn't look like much either.
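Here's a sketch of that step. The class and test method names are my own stand-ins, and it assumes the assertions from earlier are visible to the class:

    class MathTest
      def run name
        send name   # dispatch to the named test method
      end

      def addition_test
        assert_equal 4, 2 + 2
      end

      def subtraction_test
        assert_equal 0, 2 - 2
      end
    end

    MathTest.new.run :addition_test
    MathTest.new.run :subtraction_test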
In fact, by adding the call to run, we've made it a bit more cumbersome, but this will make the next step super easy. It also gives us a location where we can extend what running a test even means. For example, the run method would be a good place to add setup or teardown features. What would you add? Running tests manually is still pretty cumbersome, so let's address that next.

Now that an instance knows how to run itself, let's make the class know how to run all of its tests. We can use public_instance_methods and then filter on all the methods that end in _test. public_instance_methods returns an array of all the public instance methods on that class or module, and we can use Enumerable's grep method to filter them. We wrap that up in a class method that instantiates and runs each test, allowing us to collapse this into this.

This would be a good point to pause and apply some refactoring. What we've got is well and good, but we've only put it in one test class. In order to really benefit from this, we should push it up into a common ancestor, so let's make a parent class and refactor. We simply reorganize the methods into a new class called Test, and while we're in there, we scoop up all the assertions and throw them in as well. Then we subclass the new Test, and that's all there is to it. Do that to all the test classes, and they all benefit from the code reuse.

This makes it super trivial to have a bunch of classes of tests that can run themselves, so let's push that further. The one thing we have left to address is where we manually tell each class to run its tests. Let's automate that too. Since we're using subclasses to organize our tests, we can use the class's inherited hook. Every time a new Test subclass is created, it'll be automatically recorded, and Test will fire all of them off when told. First, we need some place to record the things that we need to run. Then we use the inherited hook to record all the classes that subclass Test, and from there it's trivial to enumerate that collection and tell each one to run its tests. This allows us to rewrite this into this. This would be an ideal place to put that into its own file, so you can just require it and kick everything off. That's all there is to it.

At this point, we're able to run all the tests we want, but microtest is kind of hamstrung. Now that generalized testing is supported, we need to be able to know what happened. On the one hand, silence is golden. On the other, you don't see an exception raised, but you also don't see that anything ran, so you're not quite sure that it did. I think this is one of those situations where the Russian proverb "trust, but verify" is a good policy to have. Let's give the framework a way of reporting its results and see if we can't enhance things while we're in there.

How do we know what the results of the run are? I personally would be pretty happy just seeing that something ran, so let's start with that as a minimal goal. Here comes my favorite slide I've ever written, ever: let's print a dot for every test run. I think Tufte would love that slide. So let's add a print and a puts, and we're done. This is a stupid simple thing to do; the emphasis, perhaps, is on stupid. Quite simply, we print a dot for every test, and we add a newline at the end of the run to keep it pretty. Doing so, we see this. Now we see that we ran three tests and they passed, and we have usable information about our test run. But what about failures? What happens when a test fails?
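For reference, here's a sketch of the framework at this point, before any failure handling. The _test naming pattern and the run_all_tests name follow the talk's description, but the exact code is my own reconstruction:

    class Test
      def self.inherited klass        # records every subclass as it's defined
        classes << klass
      end

      def self.classes
        @classes ||= []
      end

      def self.run_all_tests
        classes.each(&:run)
        puts                          # finish the row of dots
      end

      def self.run                    # run every *_test method in this class
        public_instance_methods.grep(/_test$/).each do |name|
          new.run name
        end
      end

      def run name
        send name
        print "."                     # one dot per passing test
      end

      # the assertions (assert, assert_equal, assert_in_delta) live in here too
    end

With that in place, a test class just subclasses Test, and a single call to Test.run_all_tests (or a require of a file that makes that call) kicks everything off.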
Currently, if a test fails, the run immediately quits, since it's raising an unhandled exception. It's not too terrible, but it does mean you only see the first problem that arises, and that might not provide as much insight as seeing all the failures at once. So let's clean this up. We'll rescue the exceptions and print out what happened for each one. Now we see all the tests regardless of failures. We also don't see loads of backtrace; we just see the failed assertion. And while it's perhaps not the prettiest output in the world, it is much better than before.

But there are still several things I don't like about this code. I don't like that the logic for running a test and doing IO is mixed into the same method; it's just messy. So I want to address that and, in the process, refactor the code to be more maintainable and capable. The problem I have is that the run class method is doing way more than just running. There are about four categories of things it's doing. The first thing I want to do is separate the exception handling from the test run. I really don't like that the test run is handling both printing and exception handling, but I especially don't like the exception handling, so let's address that first.

Since the Test class run calls the Test instance run, which calls your actual tests, it's two hops up from where any actual exceptions are getting raised. We should refactor this and break up the responsibilities. I want run_all_tests to only deal with running test classes from the top. I want the run class method to run all of the tests inside that class. I want the run instance method to run a single test and handle any failures. And then I want something else entirely to deal with showing the test results. So let's move forward with that goal in mind.

First, let's push the rescue down so that the instance method returns the raised exception, or false if there was no failure. Then we change the class method to print appropriately based on the return value. Now we have exception handling pushed down to the thing that's actually running the test. Having exceptions travel only a single level usually means that you're in a better place to deal with them, and by doing this, we've converted some exception handling into a simple conditional.

Next, let's look at the IO. Let's extract the conditional responsible for IO into its own method and call it report. By doing this, we put ourselves in a better position to move it out entirely. So let's do exactly that. We'll extract the report method into its own class called Reporter, and we'll grab that puts we had in there too while we're at it. This lets us rewrite run_all_tests into something much cleaner. We pass the Reporter instance down to run, and we use it to call report instead. The name was a block variable before, so we need to pass that to report as well. By doing this, we've removed all the IO from the Test class and delegated it elsewhere.

Throughout these changes you should be re-running the tests to ensure that everything works the same. But in this case, it doesn't: the class name is wrong now that we've pushed the reporting into a separate class. For now I'm going to go the quick-fix route, passing in the class. I'm intentionally focusing on fixing the bug, not on using the right abstraction. Sometimes that's the right thing to do because you need to get stuff done, but you pay a price in doing so, and we'll see that.
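A sketch of this intermediate stage, with the rescue pushed down and the IO extracted into a Reporter. The output format is my own guess, and the missing class name in the failure message is exactly the bug just described:

    class Test
      def run name
        send name
        false                # no failure
      rescue => e
        e                    # hand the exception back instead of letting it escape
      end

      def self.run reporter
        public_instance_methods.grep(/_test$/).each do |name|
          e = new.run name
          reporter.report e, name
        end
      end
    end

    class Reporter
      def report e, name
        if e then
          puts "Failure: #{name}: #{e.message}"  # the test's class name is out of reach here
          puts "  #{e.backtrace.first}"
        else
          print "."
        end
      end

      def done
        puts
      end
    end

run_all_tests then builds one Reporter, passes it to each class's run, and calls done at the end.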
So let's add a new argument to report, call it k for class, and pass the current class, which is self in any class method, to report. This fixes our output back to what we expect. But we had to add a third argument to do it, and that should be a hint that we're doing things wrong. So let's try to address that now.

We don't need to pass the actual exception to report. We can pass anything that has all the information we need to report on the test result, and what better thing than the test itself? All we need to do is make the test record any failure it might have and then make that accessible to the reporter. Let's add a failure attribute to Test and default it to false. Then we modify run to record the exception in failure and return the test instance instead of the exception. Now we can use the accessor in Reporter to get to the message and the backtrace.

Let's clean up the mess we made with the third argument. Now that e is the test instance, we're able to get rid of the k argument, and that gets us back to two arguments, which is still one too many. Let's try to remove name. With a little tweaking, a test instance can know the name of the test it ran: first by adding an accessor, then by storing it in initialize, then by passing the name to the initializer instead of run, and finally by removing the argument from run and using the accessor. This means that a test instance stores everything Reporter needs to do its job, and we can get rid of the name argument.

There's one more thing I don't like, and that's mixed types when you don't need them. Right now, e is either false or the instance of a failed test. But tests already know whether they've passed or not, so the false isn't helping, and it can be the source of pesky bugs. Let's get rid of it. By adding an ensure block, we can get rid of that false and make sure that the run method always returns self. One thing I want to point out is that you really need that return in this case; otherwise it might return one of the values from the previous blocks. Next, let's add a predicate alias for failure, failure?, and switch Reporter to use this new predicate method on the test.

This looks pretty good now. I just don't like the name e anymore; it's no longer an exception but a test instance, so let's rename it. Let's call it result. This makes the code much more descriptive, albeit longer. At this point I could call it a night. I really could. But the output is still a bit crufty; I want to enhance it. It would be nicer if we separated the run overview, the dots and Fs, from the failure details. Something like this. Let's change report to store off the failures and print them in done. That's pretty easy to do. First we need a new failures array to store them, so we make an accessor and initialize it. Then we print an F when we do get a failure and store off the result. Finally, we move the printing code down into done, as an enumeration over that array. And our output looks much, much better now.

But we're still not quite done. There are still a couple of things left that are getting on my nerves. We've changed report and done quite a bit; they no longer do what they say they do, so let's rename them. report becomes shovel, or chevron (<<). done becomes summary. And we use those names in Test. At this point, I'm pretty happy with the code. But that does not mean we're done yet. I'd like to add some more enhancements.
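Here's a sketch of where Test and Reporter land after all that refactoring. Again, the exact formatting is my own; the shape follows the steps above:

    class Test
      attr_accessor :name, :failure

      def initialize name
        self.name    = name
        self.failure = false
      end

      alias failure? failure

      def run
        send name
      rescue => e
        self.failure = e
      ensure
        return self    # the explicit return matters, as noted above
      end
    end

    class Reporter
      attr_accessor :failures

      def initialize
        self.failures = []
      end

      def << result                  # formerly report
        if result.failure? then
          print "F"
          failures << result         # save the details for the summary
        else
          print "."
        end
      end

      def summary                    # formerly done
        puts
        failures.each do |result|
          puts "Failure: #{result.class}##{result.name}: #{result.failure.message}"
          puts "  #{result.failure.backtrace.first}"
        end
      end
    end

The class-level run then becomes something like "reporter << new(name).run" for each test name, and run_all_tests calls reporter.summary once every class has run.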
One common problem is that people often write a test that depends on side effects of a previous test in order to pass. If that test is run by itself, or run in a different order, it will fail. That goes against our rule of testing: tests should pass regardless of their order. An easy way to enforce this is to simply run the tests in random order. That's easy to do in our current setup, but I'd rather not mix too many things into this method. So let's start by extracting the code that generates all the tests to run. Now Test.run only deals with enumerating test names and firing them off, so we're in a better place to randomize the tests. And that's all there is to it. We could get fancier and push it up so that run_all_tests randomizes over both the classes and the methods, but this is a good compromise, and I'll leave that as an exercise for you.

So we're done for now. What did we wind up with? We wound up with only about 70 lines of code, and that's a good portion of what Minitest actually does. It's well factored. It has zero duplication of any kind. The complexity score is incredibly low: it flogs in at only 70, or about one per line, I guess, or about five per method, which is half the industry average. And even without comments, the code is incredibly readable. The Reporter, the Test class methods, and the Test instance methods all fit, rather readably, on one slide. You guys can actually read that, can't you? It's a little fuzzy. It's not bad. It actually runs twice as fast as Minitest, because it's doing less. Don't switch to it, though. The worst thing about this talk is that I spent about nine slides for every two lines of code, but that's a price I'm willing to pay to explain this well.

So how did we get there? We started with the atom, then we worked up to molecules. We gathered tests into methods and methods into classes. We taught the class how to run one method, then how to run all of its methods, and then how to run all the classes. At that point, we added reporting, error handling, and randomization as a cherry on top.

And this is where the idiot pitches his book. I'm hoping to soon publish a small book under Michael Hartl's Learn Enough series. If that goes well, then I'm hoping to write a much more complete book on Minitest. I will have a sample chapter coming out soon for review that I'll provide for free, but it's not ready yet. So please follow me on Twitter for announcements. Thank you.

I would prefer to write tests with methods. The question was, do I prefer to write tests with methods rather than what Rails provides, the test-block form? And yes, I absolutely do prefer methods. The problem with the block form is that the blocks are closures, right? So they're closing over all the outer scopes and always holding on to everything they were able to see at that point. Aaron? Yeah, it's the closure that I actually have a problem with, not the string or the descriptive name. I understand that by doing it my way, you have underscores or whatever in your name instead of spaces. I'm okay with that.

Why did I create Minitest? I created Minitest because I inherited test/unit from Nathaniel Talbott, and it scared the crap out of me.
It was incredibly complex for what it wanted to do, and I didn't understand how all the mechanisms worked. I already had, I don't know, 50 projects or so at the time, and I wanted to see how little code I could write to make all of my tests run and pass again. It turned out to be, if I remember right, 98 lines of code to do all the assertions and the mechanisms that I needed, and thus Minitest was born. I think Minitest is now 1,500 lines of code, but that's compared to, I think it was, 6,000 lines for test/unit, and God knows how many for RSpec. But even at 1,500 lines, and that includes comments and blank lines and everything, it's still incredibly readable, it's very well maintained, and it's faster than almost anything out there, which is actually the most important part for me. Nope. Thank you very much.