Thank you, everyone, for lasting so long. I know it's been a taxing day. My name is Ryan Davis. I'm known elsewhere as zenspider. I'll be talking today about making a test framework from scratch. I'm a founding member of the Seattle Ruby Brigade. It's the first and oldest Ruby brigade; in fact, we came up with the "brigade" naming in the first place. I'm now an independent consultant in Seattle, and I'm available. I'm also the author of minitest, which happens to be, according to the Ruby Toolbox, the most popular test framework in Ruby. And I only mention that part because I'm kind of astounded that I'm beating RSpec.

So, setting expectations, something that I always do at the beginning of my talks: this is a very code-heavy talk. I'm going to go into detail about the whats, the hows, and the whys of writing a test framework. I've got a lot of slides coming up — about 9.4 slides a minute if I ignore the 30 minutes I'm supposed to talk and go to 35. Either way, I'm hitting about 9.4 slides a minute. I've given this talk twice already; it's already been recorded and published, so you can watch it there if you need to. I generally find that adding more slides adds more explanation and comprehension, and generally makes the talk go smoother, as long as you don't have AV problems. So the presentation has been published, the slides are up at the URL above, and there is a facsimile of the code that I'll be presenting on GitHub at that URL as well.

First, a famous quote not said by a famous person: "Tell me and I forget. Teach me and I may remember. Involve me and I learn." Who actually said this is a bit of a mystery, but it's usually attributed to Mr. Franklin. Whether it's a legitimate quote or not, I think it points out an important problem with code walkthrough talks. Don't get me wrong, not all code walkthroughs are bad. Some of them are actually quite good, and they're absolutely necessary for work. I'm only talking about code walkthrough talks. Quite simply, I could write this talk in my sleep by simply working through the current implementation of minitest and explaining each and every line. You'd learn nothing from it. Sorry — you'd learn almost nothing from it, forgetting it almost as quickly as you read it. Some of the many problems with code walkthroughs are that they're boring and they're top-down; as a result, you focus on the whats and not the whys. Those are all good reasons to tune out and not learn a thing, so they're all good reasons for me not to do a code walkthrough of minitest.

The quote from before wasn't by Benjamin Franklin, and the real quote it's based on is much better: "Not having heard something is not as good as having heard it. Having heard it is not as good as having seen it. Having seen it is not as good as knowing it. Knowing it is not as good as putting it into practice." I'm not going to try to murder the original pronunciation. Here's a more concise version for those who think that was too many words, and here's a version for Tenderlove, who is my furriest of friends.

So, starting from scratch — that's the point of this talk. Working up from nothing is the closest I can get to allowing you to join me in building up a test framework from scratch. I will try to describe this in a way that you can literally code it up at the same time that I describe it and understand the steps that you went through to get there. Now would be a good time to open up your laptops if you want to attempt this. Many have tried, few have succeeded.
Further, this will not be minitest by the end of the talk, obviously. Instead it's going to be some subset, and I will apply the 80/20 rule to show most of what minitest can do in a minimum amount of code. To emphasize that, I will be referring to this as microtest from here on out. (There are always a couple of people who Google it. I love it.) And finally, I encourage you to deviate from the path and experiment. Do things differently and you might wind up understanding the choices that I made. Finally, this talk is an adaptation of a chapter of a book that I'm writing on minitest. More info on that at the end of the talk.

So where to begin? At the bottom. The atomic unit of any test framework is the assertion, so let's start with plain old assert. In its simplest form, assert is incredibly straightforward: it takes only one thing, the result of an expression, and it fails if that isn't truthy. And that's it. You have everything you need to test. Thank you, please buy my book when it comes out, are there any questions? Okay, no. I would like you to have a few more bells and whistles than that, but before I add those, let's look at what I wrote already and figure out why.

In this case, I chose quite arbitrarily to raise an exception if the value of the test isn't truthy. I could have chosen to do something other than raising an exception, like throwing, or pushing some sort of failure instance into a collection of failures and returning. There are trade-offs to all of these choices, but it doesn't really matter as long as you wind up reporting the failed assertion at some point. I mostly chose to raise an exception because exceptions work well for my brain. There's an added benefit in that it interrupts the execution of the test; it jumps out of the current level of the code. We're gonna see more of this later.

So if you ran this expression, 1 == 1 would evaluate to true, which would get passed as the test argument to assert, and that would do nothing in response; it's a pass. However, if you ran this expression, 1 == 2 would evaluate to false; that would get passed as the test argument to assert, and that would wind up raising an exception. At some point there will be mechanisms to deal with those exceptions and gracefully move on, but for now the whole test suite will grind to a halt on the first failed test.

One problem we currently have is that the raised exception reports the place where raise was called, inside assert, and not where the assertion was called. I only want it to point at the place where the assertion failed, on line five of test.rb. So I'll clean up the exception a bit by changing the way raise is called. Raise allows you to specify what exception class to use and the backtrace you'd like to show — I didn't know that until about four weeks ago. I'll use caller for the backtrace, which returns the current runtime stack at the point where caller is called. Now we see where the test actually failed. That's much more useful to a developer dealing with the failure.

Second, we're gonna add our second assertion: assert_equal. Now that we have plain old assert, I can use that to build up any assertion that anyone would possibly need. About 90% of all the tests that I write use assert_equal. Luckily, it's incredibly simple to implement: I just pass the result of a == b to assert, and assert does the rest of the work.
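A rough sketch of the assert and assert_equal described here might look like the following. The exact form is my reconstruction, not the code from the slides.

    # Simplest form: fail (by raising) unless the value is truthy.
    def assert test
      raise "Failed test" unless test
    end

    # Cleaner version: tell raise which exception class to use and pass
    # `caller`, so the failure points at the test, not at assert itself.
    # (This replaces the definition above.)
    def assert test
      raise RuntimeError, "Failed test", caller unless test
    end

    # assert_equal just hands the result of a == b to assert.
    def assert_equal a, b
      assert a == b
    end

    assert_equal 2 + 2, 4   # true gets passed to assert: nothing happens, it's a pass
    assert_equal 2 + 2, 5   # false gets passed to assert: it raises, and for now everything halts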
And here's how you'd use it: where two plus two does equal four, it would pass true to assert and pass; where two plus two doesn't equal five, it would pass false to assert. The rest you already know. This is really all I need to do most of my work quite happily, but as it stands right now, it has a pretty unhelpful error message when a failed assertion raises.

First, the backtrace is pointing at assert_equal. Didn't I just fix this? Mostly. I'm using caller for the backtrace, and it includes the entire call stack, including the other assertions that may be up the stack. So I need to filter those assertions out of it. Let's fix that by deleting everything at the start of the backtrace that is in the implementation file itself. This is still ugly, though: the failure just says "Failed test". So let's make it possible for assert_equal to supply more information. Let's pull the error message up into an optional argument and use that argument in raise. Then we change assert_equal to use it with a helpful message. Now we get error messages that look like this. That's much more useful, although it still may not be 100% resilient; it'll do for now.

Let's add one more assertion: assert_in_delta. One mistake people make time and again is misunderstanding how computers deal with floats. The rule is really, really simple: never, ever, ever test floats for equality. Yes, there are exceptions to this rule, but if you stick to it you'll be fine 100% of the time. So while we have assert_equal, we should not be using it for floats. We're going to write assert_in_delta instead, which is just for comparing floats. What it should do is see if the difference between two numbers is close enough, where "close enough", in our version at least, is simply going to be within 1/1000. You can make it fancier later, but for now done is more important than good. So this is what it looks like — almost exactly the same as assert_equal, except using the formula stated previously. And this is how it's used, and it works right out of the gate.

So what does that mean? It means that for now assert is solid enough for general-purpose use and we can write other assertions. Writing other assertions is fine and good, necessary even, but that could take hours and I'm only 25% through my slides. I will leave this as an exercise for you after the conference, but think now about what your favorite thing to test is. How would you write an assertion for it? You should go do that, and then have a cookie.

So, once you can write one test, you'll want to write many tests. This starts to introduce problems of its own; it'd be nice to be able to keep them separate. There are many reasons why you'd want to break up your tests and keep them separate: organization, refactoring, reuse, safety, prioritization, parallelization. It would be nice to keep them separate, but how do you do that? We can do something really quick and easy, like this. We'll write a method called test. It takes a string describing the test itself, and it takes a block of code with assertions in it, and it's equally easy to implement: you ignore the argument and you yield. This gives us the benefit that you can name the tests and put them in blocks so that you can see that they're separate, but they're leaky, and leaky tests infect results.

Now that we can write multiple tests and keep them organized, we need to be able to trust them. The problem is that these tests aren't actually all that separate from each other. Here we can see a = 1 at the top. The first test asserts that it is one, and it passes.
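Here is my sketch of where the assertions land after those tweaks — optional message, filtered backtrace, assert_in_delta — along with the little test helper and the leaky local variable example. The names and messages are my own, not the slide code.

    # Improved assert: optional message, and a backtrace filtered so that
    # frames from the implementation file itself are dropped. (In the real
    # setup the assertions live in their own file, so only framework frames
    # get removed; here everything is in one file for the sake of the sketch.)
    def assert test, msg = "Failed test"
      return if test
      backtrace = caller.drop_while { |line| line.start_with?(__FILE__) }
      raise RuntimeError, msg, backtrace
    end

    def assert_equal a, b
      assert a == b, "Expected #{a.inspect} to equal #{b.inspect}"
    end

    # Floats only get compared within a tolerance: 1/1000 for now.
    def assert_in_delta a, b
      assert (a - b).abs <= 0.001, "Expected #{a} to be within 0.001 of #{b}"
    end

    # The test helper: takes a name and a block, ignores the name, yields.
    def test name
      yield
    end

    # The leaky example: a local variable shared across all three tests.
    a = 1

    test "a is one" do
      assert_equal 1, a   # passes
    end

    test "a can be two" do
      a = 2
      assert_equal 2, a   # passes, but it has mutated a
    end

    test "a is still one... right?" do
      assert_equal 1, a   # fails: a leaked in from the previous test
    end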
The second test mutates that local variable, tests it, and it passes. And the third test, which expects to be like the first test, fails because the variable has been mutated. What we really want is for those tests to be completely independent of each other; the fact that one test can mutate a local variable that is used by another test is simply a mistake. This goes against good testing practices, which state that a test should always pass regardless of what was run, what order things were run in, or anything else. Otherwise you don't trust the tests, and trusting the tests is crucial.

We're gonna fix this using methods. There are a number of ways that we could try to patch this up. The simplest, perhaps, is just not to do it in the first place. Instead, just use Ruby. Ruby already provides a mechanism for separating blocks of code and making them immune to outer scopes. It's called the method, and it's free. The nicest thing about this approach is that it's absolutely free: there's no cost to using it that you aren't already paying by using Ruby in the first place. It's also important to remember that by using plain Ruby, anyone can understand it. It does have some drawbacks, though. First, you have to run the methods yourself. That's fine for now, and we'll address it later. Another, perhaps more pressing, issue is that there's code duplication in the previous examples. There are simple ways to get around this too; I'm not gonna bother going into that at this time. If you stick to plain Ruby, it should be pretty easy to do, so that's an exercise for you again.

Now that we have multiple tests separated by methods, how do we get them to run? The same way you run any method: you call them. We could come up with a more complicated way, and we will, but this will do for now. Methods are a good means to separate tests, but more problems arise: you need unique method names, and it's harder to organize, reuse, compose, et cetera. Luckily, Ruby comes with another mechanism to solve this: classes. Didn't I just say to keep them separate, though? Yeah, I did, but it'd be nice to organize them; there's some balance in between. So how do we do that? Well, we take the previous code and we wrap it in a class. Done.

But how do we run those? Wrapping the methods in a class breaks the current run. In order to fix it, we need an instance of each class before we call the method, so we add that to each line and we're passing again. Now granted, this doesn't really do anything for us. It does group the tests in classes, but more importantly it puts us in an ideal position to make the tests run themselves. Right now we manually instantiate and call a method for each test. Let's push that responsibility toward the instance and have it run its own test, by adding a run instance method that takes a name and invokes it with send. This doesn't look like much either; in fact, by adding the call to run we've made it more cumbersome. But this will make the next step super easy. It also provides us with a location where we can extend what running a test even means. For example, the run method would be a good place to add setup and teardown features, or anything else you might wanna do. What would you add?
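A sketch of the shape things take at this point — tests as methods, wrapped in a class, with a run instance method that dispatches via send. The class and method names are my own choices, not necessarily the slides'.

    class XTest
      def assert test, msg = "Failed test"   # carried over from the earlier sketch
        raise RuntimeError, msg, caller unless test
      end

      def assert_equal a, b
        assert a == b, "Expected #{a.inspect} to equal #{b.inspect}"
      end

      def first_test
        assert_equal 1, 1
      end

      def second_test
        assert_equal 2, 2
      end

      def run name
        send name        # invoke one named test on this instance
      end
    end

    # Running them is still manual for now: instantiate, then call run.
    XTest.new.run :first_test
    XTest.new.run :second_test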
Running tests manually is still pretty cumbersome, so let's address that next. Now that an instance knows how to run itself, let's make the class know how to run all of its instances. We can use public_instance_methods and then filter on all the methods that end in _test. public_instance_methods returns an array of all the public instance methods on a class or a module, and then we can use Enumerable's grep method to filter those. We wrap that up in a class method that instantiates and runs each test. This really doesn't do anything different than we were doing before; it just enumerates over the methods. But it allows us to collapse this into this.

This would be a good point to pause and apply some refactoring. What we've got is well and good, but it's only on one test class. In order to really benefit from this, we should push it up to a common ancestor. Let's make a parent class and refactor toward that. We simply reorganize the methods into a new class; we're going to call it Test, because we're very creative. Note that we also scooped up all the assertions while we were at it. Now we make all of our test classes subclass that class, and that's all there is to it. Do that to all the test classes and they all benefit from code reuse. This makes it super trivial to have a bunch of classes of tests that can run themselves, so let's push that further.

The only thing left to address is where we manually tell each class to run its tests. So let's automate that too. Since we're using subclasses to organize our tests, we can use an underappreciated feature of Ruby: the inherited class hook. Every time a new Test subclass is created, it will be automatically recorded, and Test will kick all of them off when we tell it to. First we need some place to record the things we need to run. Then we use the inherited hook to record all the classes that subclass Test. From there it is trivial to enumerate that collection and tell each one to run its tests. That allows us to rewrite this to this, and that would be ideal to put in its own file so you can just require it and kick everything off. And that's all there is to it.

So microtest is kind of hamstrung at this point. It runs tests, and that's great, but now what? Given that testing is supposed to be informative, it'd be nice to know what actually happened. On the one hand, silence is golden: if you don't see an exception raised, then everything worked, right? I think this is one of those situations where the Russian proverb "trust, but verify" is a good policy to have. So let's give the framework a way of reporting the results, and see if we can't enhance things while we're in there.

How do we know what the results of a run are? I personally would be pretty happy just seeing that something ran. Let's start with that as a minimal goal: let's print a dot for every test run. As a side note, this is quite possibly my favorite slide I've ever made. Look at that. It's a dot. Tufte would be so proud of me — it's something about ink density. So let's add a print and a puts and we're done. This is a stupid simple thing to do, the emphasis perhaps being on stupid. Quite simply, we print a dot for every test, and then we add a newline to keep it pretty at the end of the run. Doing so, we'd see this. And now we see that we ran three tests and that they all passed. That's actually really good information to know; I'm much happier.
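My reconstruction of the framework at this stage: a parent Test class that records subclasses via the inherited hook, finds test methods with public_instance_methods and grep, runs each one, and prints a dot per test. Names like run_all_tests are assumptions based on the description, not the slide code.

    class Test
      def assert test, msg = "Failed test"   # assertions carried over from earlier
        raise RuntimeError, msg, caller unless test
      end

      def assert_equal a, b
        assert a == b, "Expected #{a.inspect} to equal #{b.inspect}"
      end

      TESTS = []                 # every subclass gets recorded here

      def self.inherited x
        TESTS << x
      end

      def self.run_all_tests
        TESTS.each(&:run)
        puts                     # a newline to keep the row of dots pretty
      end

      def self.run
        public_instance_methods.grep(/_test$/).each do |name|
          new.run name
          print "."              # one dot per test run
        end
      end

      def run name
        send name
      end
    end

    class MathTest < Test
      def addition_test
        assert_equal 2 + 2, 4
      end
    end

    Test.run_all_tests           # prints "." and a newline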
But what about failures? What happens when a test fails? Currently, if a test fails, it immediately quits, since it's raising an unhandled exception. That's not too terrible, but it does mean that you only see the first problem that raises, and that might not be the problem that you actually want to deal with. It also might not provide as much insight — as much of the pattern-matching ability that humans have — as seeing all the failures at once. So let's clean this up. We'll rescue exceptions and print out what happened. We go ahead and wrap that up in a begin, we throw a rescue in, we capture the variable, we print out the message, and we print out the first line of the backtrace. Now we see all the tests regardless of failures, and we also don't see loads of backtraces, since we're only printing that first line of the backtrace. Perhaps this is not the prettiest output, but it is much better than before.

But there are several things that I don't like about this code. (How am I doing on time? I'm actually doing okay on time.) I don't like that the logic for running a test and doing IO is mixed up in the same method. It's just messy. So I want to address that, and in the process refactor the code to be more maintainable and capable. The problem I have is that the run class method is doing way more than just running. Here we can see that there are about four categories of things that it's doing.

The first thing I want to do is separate the exception handling from the test run. I really don't like that the Test class run is handling both printing and exception handling, but I especially don't like the exception handling. So let's address that first. Since the Test class run calls the Test instance run, which calls your actual test, it's two hops up from where any actual exceptions are getting raised. We should refactor this and break up the responsibilities. I want run_all_tests to only deal with running test classes from the top. I want each class to run its individual tests. I want each test instance to run a single test and handle any failures. And I want something else entirely to deal with showing the test results. So let's move forward with that in mind.

First, let's push the rescue down one level, so that the Test instance run returns the raised exception, or false if there was no failure. Then we change the Test class run to print appropriately based on the return value. Now we have exception handling pushed down to the thing that's actually running the tests. Having exceptions only travel up a single level usually means you're in a better place to deal with them. By doing this, we've also converted some exception handling into a simple conditional.

Next, let's look at the IO. Let's extract the conditional responsible for IO into its own method, and we'll call that report. By doing this, we've put ourselves into a better position to move it out entirely. So let's do exactly that. We extract the report method into its own class called Reporter. We're gonna grab that puts too while we're at it and put it in a method called done. This lets us rewrite run_all_tests into something that is much cleaner: we instantiate a reporter object, we pass it through, we call done at the end. I think I just got ahead of myself there. We create a reporter instance in run_all_tests, we use that throughout, we pass the reporter instance down to run, and we use that to call report instead. And because name was a block variable before, we need to pass that down to reporter.report as well. Doing this, we've removed all IO from the test class and delegated it elsewhere.

Throughout these changes, you should be re-running the tests to ensure that everything works the same. But in this last case, it doesn't: the class name is wrong now that we've pushed reporting into a separate class.
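A sketch of where that refactor lands: the rescue pushed down into the instance-level run (returning the exception, or false), and all of the IO pulled out into a Reporter with report and done. This is my approximation of the intermediate state, including the missing-class-name bug just mentioned; it is not the code from the slides.

    class Reporter
      def report e, name
        if e then
          print "F"
          puts
          puts "Failure: #{name}: #{e.message}"   # the test class name is missing
          puts "  #{e.backtrace.first}"           # here -- that's the bug to fix next
        else
          print "."
        end
      end

      def done
        puts
      end
    end

    class Test
      def assert test, msg = "Failed test"        # carried over from earlier
        raise RuntimeError, msg, caller unless test
      end

      TESTS = []

      def self.inherited x
        TESTS << x
      end

      def self.run_all_tests
        reporter = Reporter.new
        TESTS.each { |klass| klass.run reporter }
        reporter.done
      end

      def self.run reporter
        public_instance_methods.grep(/_test$/).each do |name|
          e = new.run name               # false, or the exception that was raised
          reporter.report e, name
        end
      end

      def run name
        send name
        false                            # no failure
      rescue => e
        e                                # hand the exception back up one level
      end
    end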
For now, we're gonna go the quick-fix route by passing in the class. I'm intentionally focusing on fixing this bug, not on using the right abstraction. Sometimes that's the right thing to do, but you pay a price in doing so. So let's add a new argument to the report method; we'll just call it k. We're gonna pass in the current class, which is self in any class method, to report. This fixes our output back to what we expect to see, but we had to add a third argument to do it, and that should be a hint that we're doing things wrong. So let's try to address that now.

We don't need to pass the actual exception to report. We can pass anything that has all of the information that a reporter needs to report the test result. And what better thing than the test itself? All we need to do is make the test record any failure that it might have, and make that accessible to the reporter. Let's add a failure attribute to Test and default it to false. Then we modify the Test run to record the exception in failure and return the test instance instead of the exception. Now we can use that accessor in the reporter to get the message and the backtrace.

Let's clean up the mess we made with the third argument. Now that e is the test instance, we're able to get rid of the k argument. This gets us back to two arguments, which in my opinion is still one too many, so let's try to remove name. With a little tweaking, a test instance can know the name of the test that it ran: first by adding an accessor and storing it in initialize, then by passing the name to the initializer and not to run, and finally by removing the argument from run and using the accessor instead. We've just shifted things forward a bit. This means that a test instance is storing everything that the reporter needs to do its job, and we can get rid of the name argument.

One more thing that I do not like is mixed types when you don't need them. Right now e is either false or the instance of a failed test, but tests now know whether they've passed or not, so the false isn't helping and can be the source of pesky bugs. So let's get rid of the false. By adding an ensure block with an explicit return — and I always need to make sure that I point that out — we can get rid of the false and make sure that the run method always returns self. Next, let's add an alias for the Canadian accessor, failure? ("failure, eh?"), and switch the reporter to use this new predicate method on the test. This looks pretty good now. I just don't like the name e anymore, since it's no longer an exception but a test instance, so let's rename it. Let's call it result. This makes the code much more descriptive, albeit a bit longer.

Okay, at this point I could call it a night, but the output is still a bit crufty. I want to enhance the output. It would be nicer if we separated the run overview — meaning the dots and any failure indicators — from the failure details. Something like this. So let's change report to store off the failures, and print all of them in done. This is pretty easy to do. We need a new failures array; to store them, we make an accessor and initialize it. Then we need to print an F in the case of a failure and store off the result. Finally, we need to move the printing code down into an enumeration over that array. Now our output looks much, much better.

But we're not quite done, because I have five minutes left. There are just a couple of things left that are getting on my nerves. We've changed both report and done quite a bit; they're no longer doing what they say they do.
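My sketch of where this lands: the test instance carries its own name and failure, run always returns self, failure? is the predicate alias, and the reporter batches the failure details until done. The names follow the description above; the exact code is my reconstruction, not the slides'.

    class Reporter
      attr_accessor :failures

      def initialize
        self.failures = []
      end

      def report result
        if result.failure? then
          print "F"
          failures << result             # details get printed later, in done
        else
          print "."
        end
      end

      def done
        puts
        failures.each do |result|
          failure = result.failure
          puts
          puts "Failure: #{result.class}##{result.name}: #{failure.message}"
          puts "  #{failure.backtrace.first}"
        end
      end
    end

    class Test
      def self.run reporter
        public_instance_methods.grep(/_test$/).each do |name|
          reporter.report new(name).run  # the instance now carries everything
        end
      end

      attr_accessor :name, :failure
      alias failure? failure             # the Canadian accessor, eh?

      def initialize name
        self.name    = name
        self.failure = false
      end

      def run
        send name
      rescue => e
        self.failure = e                 # record the exception as the failure
      ensure
        return self                      # run always returns the test instance
      end
    end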
So let's rename them: report becomes << (shovel) and done becomes summary, and we'll use those names in Test. At this point I'm actually pretty happy with the code, but we're not done yet. I'd like to add some more enhancements.

One common problem that people often have writing tests is that their tests wind up depending on side effects of a previous test in order to pass. In that case, if a test is run by itself or in a different order, it's going to fail. This goes against that previous rule of testing, that tests should pass regardless of their order. An easy way to enforce this is to run the tests in a random order. That's easy to do in our current setup, but I'd rather not mix too many things back into this method. So let's start by extracting the code that generates all the tests to run. Now Test.run only deals with enumerating test names and firing them off, so we're in a better place to randomize the tests. We just do that with shuffle. (The button's not working.) That's really all there is to it. We could get fancy and push it up to run_all_tests and randomize across the classes as well as the methods, but again, this is a good compromise and I'll leave that as an exercise for you.

So we're done for now. What have we wound up with? Well, we wound up with about 70 lines of Ruby that does a good portion of what minitest actually does. It's well factored and has zero duplication of any kind. The complexity score is incredibly low: it flogs at about 70, which is about five per method, which is about half of the industry average outside of Rails. Even without any comments, the code is incredibly readable. The reporter in the first column, the Test class methods in the second column, and the Test instance methods in the third all fit on one slide. That's not bad. It actually runs about twice as fast as minitest, because it does less. The worst thing about this talk is that I spent about nine slides per two lines of code, but that's a price that I'm willing to pay in order to make this as explainable as possible.

So how did we get there? We started with the atom and built that up to molecules. We gathered tests into methods and methods into classes. We taught the class how to run one method, then how to run all of its methods, then how to run all of the classes. Then we bothered with adding reporting, error handling, and randomization as a cherry on top.

This is where he pitches his book again. I'm hoping to soon publish a small book under Michael Hartl's Learn Enough series — Learn Enough, or Learn Enough to Be Dangerous, either way. If that goes well, I'm going to do a complete book on minitest and perhaps testing philosophy, I don't know. I will have a sample chapter coming out soon for review. It's not ready yet, so please follow me on Twitter for announcements. Thank you, and please hire me.
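As a coda, here is my own compact approximation of roughly where those 70-or-so lines end up, pulling the whole walkthrough together: assertions, Test, Reporter, shovel, summary, and random order. It is a reconstruction from the talk's description, not the code published in the talk's repo.

    class Reporter
      attr_accessor :failures

      def initialize
        self.failures = []
      end

      def << result
        if result.failure? then
          print "F"
          failures << result
        else
          print "."
        end
      end

      def summary
        puts
        failures.each do |result|
          failure = result.failure
          puts
          puts "Failure: #{result.class}##{result.name}: #{failure.message}"
          puts "  #{failure.backtrace.first}"
        end
      end
    end

    class Test
      TESTS = []

      def self.inherited x
        TESTS << x
      end

      def self.run_all_tests
        reporter = Reporter.new
        TESTS.each { |klass| klass.run reporter }
        reporter.summary
      end

      def self.run reporter
        test_names.shuffle.each do |name|   # random order keeps tests honest
          reporter << new(name).run
        end
      end

      def self.test_names
        public_instance_methods.grep(/_test$/)
      end

      attr_accessor :name, :failure
      alias failure? failure

      def initialize name
        self.name    = name
        self.failure = false
      end

      def run
        send name
      rescue => e
        self.failure = e
      ensure
        return self
      end

      def assert test, msg = "Failed test"
        return if test
        backtrace = caller.drop_while { |line| line.start_with?(__FILE__) }
        raise RuntimeError, msg, backtrace
      end

      def assert_equal a, b
        assert a == b, "Expected #{a.inspect} to equal #{b.inspect}"
      end

      def assert_in_delta a, b
        assert (a - b).abs <= 0.001, "Expected #{a} to be within 0.001 of #{b}"
      end
    end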