He's from Portland. He works on the New Relic Ruby Agent team with me and a couple of other fine people. If you use New Relic, you have used Jason's code, less so my code, maybe some of my documentation or a test. Jason writes a lot of things and he's really smart about all the things you would want to ask him about the agent. He is also kind of an atypical Portlander in that he brews his own beer and mostly just rides his bike everywhere. He has chickens in his backyard and a giant beard. We don't have very many of those in Portland, it's very rare. So a big round of applause please for Jason Clark. Hello everyone, glad to be back here at Ruby on Ales. It's awesome to have a chance to talk to you again. So last year I talked primarily about home brewing and the beer side of things, so this year I thought it would be fitting if I talked a little bit more about the Ruby end of things. The topic today is testing the multiverse, and what I mean by the multiverse is that sometimes when you build a gem, it is going to build on top of other libraries that exist. The most common case might be something like Rails, where you have something you've written that needs to work against multiple versions. So we're going to talk about what you need to do to make that something you can test consistently and easily. But rather than just dive into a whole bunch of technical details, I want to frame this as a story, some of which might be a little fictional. There may be some elements that aren't true; I'll leave it to you to decide which. But in the Ruby space there has been a library that has just taken off in the past few years, that has an incredible amount of momentum, and everybody's talking about it, everybody's using it. And of course that framework is Ruby on Bails. So Ruby on Bails is the best possible framework that you could use for making command line applications quickly and easily. 
So this kind of harkens back a little bit to Rails, a framework you might be familiar with for doing web programming, but you know, since the Unicorn Puma Wars of 2018 kind of broke things up and caused a lot of schisms in the community, it made this opening for command line apps to take off. It was sort of a renaissance of the command line for us. So all of you will have seen something like this. This is what a standard sort of Bails command would look like. We have our class, we derive from Bails::Command, and then we provide it with a run method. This is just about the simplest thing that you could do, and it's awesome, the conventions that it brings. In practice it looks like this, running on the command line: I've got my Twitter command. It's just awesome the way that I can build these command line things so easily. But like any massive shift in the ecosystem, Bails opened up a lot of other possibilities for people to think about and created other ideas out there. And one of those was the PIOR metric. Now, anybody here familiar with this metric? No? Well, that's good, because I made it up. So it would be kind of weird if you knew about it. It stands for Programmer Input Output Ratio. This is a ratio that measures the amount of text that you input against the amount of output that it generates, and it's scored like golf: you want the lowest possible number that you can get. And so I got really interested in this; it's a way better way to measure your productivity than lines of code. This is how much leverage you're getting out of your framework. So obviously it's a great idea; it's time to go log an issue for it. And so I put something in on the Bails project to say, hey, could we track these metrics? Is there something we could do so that all the Bails commands I'm executing tell me what sort of productivity boost I'm getting out of them? 
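Since Bails is fictional, here's a sketch of what the slide's command class might look like, with a tiny stand-in base class included so it actually runs; all the names here are invented for illustration.

```ruby
# "Bails" is fictional, so this sketch includes a minimal stand-in base
# class. The convention from the talk: derive from Bails::Command and
# provide a run method -- that's the whole thing.
module Bails
  class Command
    # Stand-in dispatcher: instantiate the command and run it.
    def self.execute(args)
      new.run(args)
    end
  end
end

class Tweet < Bails::Command
  def run(args)
    "tweeted: #{args.join(' ')}"
  end
end

Tweet.execute(["hello", "Ruby on Ales"])  # => "tweeted: hello Ruby on Ales"
```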
Well, you can see from the comments at the bottom, they're like, eh, this doesn't really belong in the core. But you can go build it yourself if you want to, right? It's open source, that's how this works. And so the straw project was born. Straw is a plugin that builds on top of Bails to let us take metrics out of the commands that we're running. And it generates data kind of like this. The first number in there is going to be how much output, so we can calculate that. The second thing is going to be the status code, so I can tell whether things are errors or not. There's all sorts of analytics and parsing that I can do off of this. Now, at this moment you might be wondering how many of these beers I'd had previous to this presentation, or where I'm headed. But this is the meat of it, because as a good Ruby developer, I want to be able to test the stuff that I've written. And so my tests look a lot like this. I'm using Minitest, because why wouldn't you? The test instantiates a Bails runner and goes and runs a thing, and then I've got some helpers that let me look at the output from things, look at the data that I've actually stored. So life is good. My code's well tested. Straw 1.0 ships and people all over the place are able to get these numbers and figure out their productivity and measure what's really going on. So life is good. But life doesn't stand still in open source. So Bails had a new release, came out with a 2.0 version. And all of a sudden, the way that I wrote my tests, the tests are still valid, like not much changed in the way that I plugged into Bails for the second version. But I'd really like to be able to run the tests with both sets of dependencies. I want to be able to check that I haven't broken things as those projects that I depend on and build on top of evolve. Well, luckily, there's something in the Ruby ecosystem that makes this pretty easy. And that is the Bundler project. 
So Bundler, we're almost all familiar with it. With your typical Bails app, you're going to have the Gemfile in your directory. That spells out a number of dependencies that you have, and you run commands to install those gems or run commands with it. But Bundler actually has some other tricks that you can play with it. Most of the time, we've all created Gemfiles as just a Gemfile in your directory. But you don't have to do it that way. You can be more specific about where to get those dependencies from. So to be able to test against multiple versions, what we're going to do is create more than one Gemfile. I'm going to put them down in the test directory and name them sanely after the Bails versions that I care about. So I've got a Gemfile for Bails 1 and a Gemfile for Bails 2, and then my other test files. Those Gemfiles look just like the sort of Gemfile that you've dealt with day in and day out. You can specify whatever dependencies and whatever combinations of things you need. Well, this is all great, but how do I get my tests to actually run and use those things? Bundler provides a way through the BUNDLE_GEMFILE environment variable. Now, I'm only exporting it here just so the thing fits neatly on a line; you can do it inline with the command as well. But when I set BUNDLE_GEMFILE and point it at that alternate location, when I bundle exec my tests, they're going to load the particular set of gems that are in that Gemfile for me, instead of what might be in my default Gemfile or just on my system. So we see everything runs neatly. We get our test output. Everything passes. Life is pretty good. Now, this introduces a little bit of difference from what used to happen with my tests, though. When I run the tests at a particular time, I can't necessarily tell what dependencies I might be running against. It used to be I was just running my tests. 
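A hypothetical test/Gemfile.bails1 might read like this; the gem name and version are stand-ins, but the shape is just an ordinary Gemfile:

```ruby
# test/Gemfile.bails1 -- illustrative contents; any normal Gemfile works
source "https://rubygems.org"

gem "bails", "~> 1.0"
gem "minitest"
gem "rake"
```

You'd then select it with something like `BUNDLE_GEMFILE=test/Gemfile.bails1 bundle exec rake test` (path and task name assumed for the example).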
Now, I might be running them against this version of Bails or that version or this other set of dependencies. So one nice trick that you can play to help yourself out in this case is to put something like this somewhere in your test helper that gives you a little output, a little more context about what's going on. And so when you've got long streams of output as you start testing against lots and lots of versions of things, you can tell what you're actually running against. This ends up giving us a little nugget in our output to help us figure out where the tests are actually breaking, if we ever do introduce a regression. Now, we can actually go one step further and use other things that Bundler provides: get the list of the specs that have been loaded, build up a nice little string out of that, and spit that out. And now we have the full set of dependencies that have been loaded by Bundler for us. And depending on what sorts of sub-dependencies the other gems you have might have, which will move and evolve over time, this can be really critical to finding inconsistencies and problems that may not reproduce with any other combination of gems. Well, this is pretty sweet. It's nice to be able to test against these different versions, but setting that BUNDLE_GEMFILE, bundle exec-ing, and all of that is kind of a pain. And we like to automate things in Ruby. So enter our rake commands. When we say rake multiverse, this will take care of running these commands with the appropriate additional parameters that we want along the way. Take a look at how that's implemented. It's just a basic rake task like you might have seen before. It takes a couple of parameters. When we call it without any arguments, we're just going to look for all of the Gemfiles that are in our test directory. We iterate through each of those. 
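One way to sketch that helper: RubyGems tracks which gem specs have been activated, so a few lines in a test helper can print the whole resolved set. The method name and message format here are made up; `Gem.loaded_specs` is the real API.

```ruby
# Something like this in test_helper.rb prints the full resolved gem set,
# so long CI logs show exactly which dependency combination was running.
def loaded_dependency_summary
  specs = Gem.loaded_specs.values.map { |s| "#{s.name}-#{s.version}" }.sort
  "Running tests with: #{specs.join(', ')}"
end

puts loaded_dependency_summary
```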
And for each of those, we backtick out to run exactly the same test invocation that we would have previously, prefixed with the right environment variable to tell it to go use the particular Gemfile that we selected. So now we can run our whole set. It will find any of those Gemfiles that we've dropped in that directory and run our tests for them. It's also nice sometimes to only run a certain subset, though. And so we also provide a way to say bails10, providing the latter portion that the Gemfile's name is suffixed with, and it will find that via a file glob built from that suffix. But then the rest of how it runs is exactly the same. That just selects a different set for us to go with. Well, so that's great. We're now able to test against multiple versions, and we've got the beginnings of a setup that's going to let us grow this suite as we go along. But one of the great things about open source is that other people are out there finding your problems and your bugs. So this came in, this issue about rake multiverse. Somebody, somebody, who is that? Oh, yeah, it's my boss. So he cloned the repo down and tried to run the tests, and it immediately fell over because he didn't have the right versions of the stuff. Well, your response to this may initially be something along these lines: I don't know, it works for me. But that's not really a great answer. We do want this to work with a clean installation. And so our problem is that we're bundle exec-ing and running with a certain Gemfile, but we've never actually taken the time to make sure that the things are installed. So there's an easy fix for this. You can remember it with the acronym ABC: Always Bundle Constantly. This is probably a pretty good rule if you're just working in Ruby anyway. But here it's particularly pertinent. 
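The core of that task can be sketched like this; in the real project it would live inside a rake task, and the directory, Gemfile naming, and method names are all illustrative.

```ruby
# Sketch of the multiverse runner's core (the real thing is wired up as
# a rake task). Find the per-version Gemfiles; an optional filter like
# "bails10" narrows the glob to a single Gemfile suffix.
def multiverse_gemfiles(dir = "test", filter = nil)
  pattern = filter ? "Gemfile.#{filter}" : "Gemfile.*"
  Dir.glob(File.join(dir, pattern)).sort
end

def run_multiverse(filter = nil)
  multiverse_gemfiles("test", filter).each do |gemfile|
    # The same test invocation as before, prefixed with the environment
    # variable that points Bundler at this particular Gemfile.
    puts `BUNDLE_GEMFILE=#{gemfile} bundle exec rake test`
  end
end
```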
So the easiest possible way we can handle this is to just run our bundle install in front of the command that we're running there. Now, you'll see I've pulled setting BUNDLE_GEMFILE out of the inline command, because as this grows, we're going to see a couple of different commands that are all going to take advantage of that environment. This allows us to set that in one place rather than every time we build the command line. So this is a good place to start. But unfortunately, and apologies to Terrence if he's in the room, running a bundle install every time I run my tests... it's not that Bundler is slow, it's that this is doing a lot more work than I used to do for my tests. This is taking steps that didn't used to have to happen, just to make sure that everything is OK. And this might go talk out on the network. It might try to download stuff. So it would be nice if we could sidestep some of that effort some of the time. There may be better ways to do it, but one step that we were able to take is here on screen. We set our BUNDLE_GEMFILE, and then rather than running the full bundle install, we tell it --local, and we run this. This will not try to contact RubyGems; it will try to resolve things based on what you have installed locally. And so there's some potential that it'll cut out some of that network chatter that goes on. If that succeeds, then we're fine, and we'll just carry on and run our tests after that. If it fails, then maybe we turn around and do a full bundle install. Maybe we've missed some gems; we don't have everything that we need. So we get our environment set up, and then we carry on and bundle exec our tests, run our tests with the environment that we were looking for. So at this point, there may be some of you thinking the same thing that this issue that came in from Matt was saying. 
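A sketch of that "ABC" step with the `--local` fast path might look like this. The injectable `runner` lambda is only there to make the sketch testable; the real thing would just shell out, and the method name is invented.

```ruby
# Hypothetical bundling step: try resolving against locally installed
# gems first (no network), fall back to a full install only on failure.
def ensure_bundled(gemfile, runner: ->(cmd) { system(cmd) })
  ENV["BUNDLE_GEMFILE"] = gemfile
  # --local resolves against what's already installed, skipping RubyGems.
  return true if runner.call("bundle install --local")
  # Local resolution couldn't succeed; do the full install.
  runner.call("bundle install")
end
```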
So apparently there's a gem out there called appraisal. Or appraisals. It takes care of a lot of this for you. This is an actual gem, as opposed to some of the other things in this presentation, which may be fictitious. And it takes care of a lot of the mechanics of what we've talked about so far. It's built for exactly this case. What you do is provide an Appraisals file, and one of the nice parts is that what you provide here are just the deltas against your core Gemfile. So this is just the things that need to vary per test. You'll notice it doesn't list the rake and Minitest dependencies, because those are already expressed in my main Gemfile; it will merge those together. We do an appraisal install, and it generates a set of files that looks very much like the sort of files we had hand-coded ourselves. With appraisal, you can then prefix any command with appraisal, the particular appraisal set that you want to run it for, and then the command. So this isn't bound to just running your tests. You could run other things and have them run in that particular environment. Now, this is really nice. If this is your destination, if this is as much complexity as you need, this is a fine gem to use to get this sort of multi-environment testing configured without having to write a lot of stuff yourself. But we've got a few other things and other places where we're going to land, so we're going to carry on with the framework that we've built to this point. But before we can move forward, we've got to look back. So this issue came in. Again, thanks, boss. For some reason, he's using a really old version of Bails. Like, he's using a 0.8 version. I don't even know why somebody wouldn't have upgraded to a newer version. It's kind of crazy. But the truth is that, writing the sort of thing that I'm writing, I don't want it to break. 
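For reference, an Appraisals file for the (real) appraisal gem holds just those per-environment deltas; something like this, with the fictional bails gem standing in for whatever you actually vary:

```ruby
# Appraisals -- only the dependencies that vary per environment;
# everything in the main Gemfile gets merged in automatically.
appraise "bails-1" do
  gem "bails", "~> 1.0"
end

appraise "bails-2" do
  gem "bails", "~> 2.0"
end
```

Then `appraisal install` generates the per-environment Gemfiles, and prefixing a command, e.g. `appraisal bails-2 rake test`, runs it against that set.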
Even if it's an old and crufty version that I don't want to support, I don't want things to blow up when people try to mix it together. I want my stuff to be safe. So let's take a look at what it takes to do that and how we can test around it effectively. Our first step is that we need to add an environment to test this in. So we make a new Gemfile, add in the dependency that we want, and sure enough, when we run the tests, we get an undefined method run on the Bails runner class that we're calling. So this gives us a little bit of a hint. We go digging through the history on the Bails project and we find that somewhere between 0.8 and the 1.0 release, somebody thought, you know, it used to be called execute; we should call it run instead. And some of our stuff depends on it being called run. So this would be the point where you'd have to decide: do you make your code work with it, or do you just want to skip the tests? Is this an unsupported version, basically? And that's what we're gonna choose. We're gonna choose to say, you know, Tim, I'm sorry that you're stuck on this old Bails version, but we're just gonna have to move forward. So we write a little helper somewhere in our code base that does an appropriate check for the version ranges and the conditions where we wanna apply things. And just for clarification, I know that's not the right way to do version checks. You'll see version checks like that in this presentation; you should be more cautious than just doing string compares, but it reads a lot nicer on the screen. A little aside about that as well: if you're a gem author and the only place that your gem version shows up is as a hard-coded string in your gemspec, please don't do that. Please pull it out into a constant that other people can access at runtime. 
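A more careful version of that helper might lean on `Gem::Version`, which does proper semantic comparison (plain string compares go wrong as soon as "0.10" meets "0.9"). The module, constant, and method names here are hypothetical stand-ins for whatever straw would define.

```ruby
# Hypothetical version gate for straw. The slides use string compares
# for readability; Gem::Version compares versions correctly.
module Straw
  MINIMUM_BAILS = Gem::Version.new("1.0")

  def self.supported_bails?(version_string)
    Gem::Version.new(version_string) >= MINIMUM_BAILS
  end
end

Straw.supported_bails?("0.8")  # => false
Straw.supported_bails?("2.0")  # => true
```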
Like, there's a couple of projects that I know of, that I'm not gonna call out, that have this, and it makes it really hard for people building on top of you to do any sort of version checking. So my plea to you: make a constant, put your version somewhere other people can check it. Once we have this check, it's pretty easy for us to skip running all of those tests that no longer apply, that are there to test functionality we know doesn't work with this version. We can just gate our entire test file, top to bottom, and say, hey, don't run this test case if this is not a supported version. But that's not really quite enough. It's not just that the tests shouldn't run; I wanna make sure that I'm not doing anything damaging if I'm loaded alongside that unsupported version. So there's a couple of ways that we can write tests to check that as well. The first one is to actually run the underlying pieces and make sure that we are not interfering with them. In this case, we set our internal writer to a little object that's gonna blow up if we ever try to touch it. So basically anything in straw that tries to write out information is gonna fail this test. And then we do a basic run of the execution and make sure that everything came through fine. This shows us that we have not interfered with the functioning of the library that we're building on top of, in the case where we are not on a supported version. Additionally, if you're patching methods, or there are other signatures by which you can tell that your code has injected itself or built on things you wanna avoid, you can even use Ruby's introspection facilities to check that you're not doing those things in this particular environment. So making sure that you don't do things in an unsupported place is something that you can test for. You don't have to just hope that everything is gonna be okay when you land in that unfamiliar spot. 
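That introspection check can be sketched in a few lines with `Module#method_defined?`. Every name below is a hypothetical stand-in: a pretend old-Bails runner class and an invented name for the method straw's patch would add on supported versions.

```ruby
# Sketch: assert via introspection that no patch was applied on an
# unsupported version.
class BailsRunner            # pretend this class came from old Bails
  def execute; "executed"; end
end

# straw's patch (applied only on supported versions) would add this:
STRAW_PATCH = :run_with_straw

def straw_patched?(klass)
  klass.method_defined?(STRAW_PATCH)
end

# On an unsupported version, the test can assert the patch is absent:
raise "must not patch unsupported Bails" if straw_patched?(BailsRunner)
```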
But having looked at the past, now it's time for us to look to the future. So Bails 3 came out. Bails 3 has this awesome feature to let you unrun commands that you ran. It's like undo for the command line. This is so awesome. Like, I don't know, you don't seem as excited about it as I am, but we want to support it from straw. We wanna make sure that this works, right? So what this looks like in an app is that you just have an unrun method, symmetrical to your run method. It's really beautiful the way these conventions work. We write a test for it, and the test looks something like this. We spin up our runner, we give it an untweet. We ask it to untweet something for us, and sure enough, if we run these tests against an environment that includes Bails 3, everything is happy. The tests went green. This is exciting. But does anybody wanna guess what happens when we run those tests against an older version of Bails? Any guesses? It's sad, the tests don't work. In this case, the output looks something like this. If you remember from my description of the data, the one there indicates to us that this had an error. So when it tried to run the command, it probably didn't find the untweet command, didn't know what to do, threw its hands up. So a lot like the unsupported versions, we need to gate this, because there's certain tests and certain functionality that only apply if you're on a certain version of Bails. There's a couple of different ways that you can slice this, and it's somewhat up to you which fits best. For one, you can simply wrap the method definitions for the tests, an entire block of them, and say, hey, if you're not in a supported area, just don't even define these test methods, so they won't end up getting run. I've also seen it done with an early return from within the body of a test, based on that same condition. 
Your mileage may vary as to which of these you like better stylistically, or there may be some other reason you care. But my actual favorite way to do this, when I have significant functionality that I need to gate off, is to make a whole other test file that's related to it. Test files don't have to be one-to-one with class files. They don't have to be one-to-one with anything. And so what we do is make a file that's entirely centered around testing the unrun capabilities in Bails, and we put that check up at the top. A key thing here is that I have output telling me that if we are in a version of Bails that doesn't support this, it's explicit that I am skipping these tests. This is kind of important, because maybe you get those version checks wrong, maybe something gets skewed slightly, and you may not know that you are not running some of your tests. So this gives you output to let you know whenever you're skipping tests, so you can make sure that's actually what you want to be doing. Well, like all great projects, Bails inspired some competitors. There are people with other ideas about how to do things, other vantage points on how to write command line applications, and simpler ways of doing things tends to be where they land. So that was the genesis of the Kruner project. Kruner code for doing the same sort of thing we saw earlier is gonna look like this. Gotta admit, that's a lot simpler than the Bails example. I mean, there's no classes, there aren't any method defs. You know, maybe this is more your speed, maybe this is how you'd want to write them. The output works the same sort of way for us. And as the author of the straw project, I want all of my command line things to be able to tell me the metrics that I'm looking for. I want everything to work with this. But this poses a bit of a problem. 
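That version-gated test file could be sketched like this; the constant stands in for a real check like a `Straw.supported_unrun?` helper, and the loud skip message is the point.

```ruby
# Hypothetical version-gated test file for the unrun feature.
SUPPORTS_UNRUN = false   # stand-in for a real version check

if SUPPORTS_UNRUN
  require "minitest/autorun"

  class UnrunTest < Minitest::Test
    def test_untweet
      # ... exercise the unrun path against Bails 3 ...
    end
  end
else
  # Be loud about skipping, so a skewed version check can't silently
  # keep a whole file of tests from ever running.
  skip_message = "SKIPPING unrun tests: this Bails version lacks unrun"
  puts skip_message
end
```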
All of the tests that I've written to date work with Bails; they should only work with Bails. They're not written to work with Kruner. I need to write separate tests that work with Kruner, and I need some way to keep these segregated. So what we're gonna do is make a construct that I'm gonna refer to as the test suite. Rather than having just one monolithic group of tests that I run within my gem, I'm gonna have separate suites of them, each centered around a set of dependencies that can all run mostly the same tests. In this case, that would be the distinction between running for Bails and running for Kruner. This actually ends up being fairly straightforward with the work that we've already done. In our multiverse task, we add some parameters to be able to pass in the suite that we're running with. And then we break things out so that we have a separate test runner class. Now, instead of just running rake test, we're gonna execute the separate test runner to go and run the things that we're looking for, because we're getting a little more functionality; there's a little more work that it's doing over there. That test runner looks a lot like this. It takes on some of the responsibilities that the rake test task did for us, loading up the right files and setting some of our paths for where things should be. But at the end of the day, this will select a set of tests that are in a named subdirectory and go and load those up and run them for us. So now we can say things like run multiverse for Kruner, and this will run all of the Kruner tests against all of the Gemfiles that we've defined for those, and only the tests that are for Kruner, rather than the ones that are for Bails. So this is pretty nice. This is the big thing on top of appraisal that is why I stuck with building this out by hand. 
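The core of that suite-aware runner can be sketched as: each suite is a subdirectory under test/, and we only load the test files that live in it. Class name, paths, and file-naming convention are all illustrative.

```ruby
# Sketch of a suite-aware test runner: selects and loads only the test
# files under test/<suite>/.
class TestRunner
  def initialize(suite, root = "test")
    @suite = suite
    @root  = root
  end

  def test_files
    Dir.glob(File.join(@root, @suite, "**", "*_test.rb")).sort
  end

  def run
    test_files.each { |file| require File.expand_path(file) }
  end
end
```

So `TestRunner.new("kruner").run` would load only the Kruner tests, leaving the Bails suite untouched.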
Because this gives us flexibility to partition things a little differently and gives us these suites to separate out dependencies. But the life of an open source maintainer is never quite done. We get this in. All of the time that I've been showing you these commands, I've just been doing rake multiverse. But apparently, if you bundle exec rake multiverse, it breaks, which is kind of weird. I mean, I didn't quite understand what was going on with this. Really grateful that Jonin gave it a plus one. It was important that I had his feedback there. But it was an interesting pursuit to figure out why this happens. So if we look at our test execution: if you bundle exec your rake command, bundle exec rake multiverse, this task is going to be running within the context of what that bundle exec does. It's going to load the Gemfile, set everything up, and load the dependencies for our default Gemfile there. Well, one thing that I didn't really know is that Bundler also influences any sub-shells that come out of that. So our process invocation there, where we backtick out to Ruby again, actually ends up being not just the command there, but effectively this. It sets BUNDLE_GEMFILE to the same Gemfile that is currently loaded, and it also includes a RUBYOPT to require bundler/setup. So before we ever even get to my test runner's code, Bundler has loaded that Gemfile, taken care of all of the setup, and set all the paths. And if you look in Bundler's source, once it has set things up, it's not going to do it again. So when my test runner comes along later on and tries to reset what the Gemfile is, to point at this other set of dependencies that actually belong with my test, it's just going to look and go: I'm already set up, I'm good to go. But it doesn't actually have the dependencies that I wanted. Fortunately, Bundler provides us a way out. It provides us with with_clean_env. 
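A small sketch of using it: inside the block, Bundler restores the environment to its pre-Bundler state, so sub-shells no longer inherit the parent's BUNDLE_GEMFILE and RUBYOPT. Note that newer Bundler releases rename `with_clean_env` to `with_unbundled_env`, so this sketch picks whichever exists; the Gemfile path is a made-up example.

```ruby
require "bundler"

# with_clean_env (with_unbundled_env on newer Bundler) strips the
# BUNDLE_GEMFILE / RUBYOPT settings that bundle exec injects.
clean = Bundler.respond_to?(:with_unbundled_env) ? :with_unbundled_env : :with_clean_env

ENV["BUNDLE_GEMFILE"] = "/some/parent/Gemfile"   # what bundle exec would set

inside = Bundler.public_send(clean) do
  # Sub-processes started here don't inherit the parent's Gemfile choice,
  # so the test runner is free to set BUNDLE_GEMFILE itself.
  ENV["BUNDLE_GEMFILE"]
end
```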
So anything that happens inside of this block doesn't receive that treatment where Bundler modifies the sub-shell environment. And so this can run clean. It doesn't end up loading the Bundler stuff at process start, and we can load the Gemfile that we want at a later stage, during our test runner. So yeah, it's true. Maybe more people would run things if it didn't take so long. And if you're like me, probably one of the first things you'd think of is maybe you could fork, right? I mean, we've got multiple test processes already, we're backticking out to things. That would be a plausible solution. But unfortunately, I also received this issue right about the same time, asking for JRuby support. And as we all know, the JVM doesn't fork. So that threw forking out the window as a way to handle this in the tests. But fortunately, there are some other simple ways we can get around it. I'm gonna show you an approach using threads. So what we're gonna do is spin up a thread for each of the processes that we're gonna invoke. We start an array to track all of those, and then inside of our with_clean_env block, we spin up a separate thread. All that thread is gonna do is step over and invoke the test runner for us, the way that we wanted. We do need to remember to join at the end, because we gotta wait for all of those threads, and their processes, to finish before we can get back. Now, this works pretty well for a little bit, until we get this issue in. I didn't notice, because my tests don't fail, so I didn't notice that it wasn't failing properly. But it's kind of an issue that it doesn't return those status codes right. So we gotta track that. We've gotta do a little bit of manual handling for it. So here's how we accomplish that. We need to initialize a status. 
This will be the status that we eventually return from our process. We'll take advantage of the fact that a non-zero status means an error, and so for us to error things out, we can just tally up the exit statuses that we get from each of the sub-processes. The dollar question mark gives us back the status of the last process we backticked out to. Now, the pedants in the audience may point out that there could be race conditions here, and you're totally right. There are more complicated things you might want to do to make all of this totally bulletproof, but this mostly works for the purposes of the example. Then we gotta make sure that we exit with that status that we've tallied up. And that's the last thing. So now we're back to failing properly. We do love Unix; it's the right way to do things. But of course, the life of an open source maintainer is never done. Now somebody noticed, while trying to debug stuff, so that's good, they're trying to pry, they're trying to do some things, that it doesn't work. And it actually makes a lot of sense why this wouldn't work, if we take a closer look at how we're doing our invocation. Here we're saying puts and then backticking out to this other command. It runs this other process. If there's a pry, some point where a debugger breaks over there, well, it's gonna stop, it's gonna print out some of the state, and then it's gonna wait for you to put some commands in, right? But our backtick is sitting there waiting for that process to finish before it sends any of the output back for us to put out here. And so as long as that's blocking there, we're not gonna see any of the output it's given us. Now, funnily, you could actually enter commands, and the commands will get passed through fine and it'll catch those, but you won't see the output. 
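Putting the threads and the status tally together, a JVM-friendly sketch might look like this. `$?` is thread-local in Ruby, so each thread reads its own child's status; the mutex guards the shared tally (and, as the talk admits, a real version would want more care to be fully bulletproof). The method name and command strings are illustrative.

```ruby
# Sketch: run each sub-process in a thread, collect exit statuses, and
# return a tally that is non-zero if anything failed.
def run_commands_in_threads(commands)
  statuses = []
  mutex = Mutex.new
  threads = commands.map do |cmd|
    Thread.new do
      puts `#{cmd}`                     # backtick out to the sub-process
      # $? is thread-local, so this is this thread's child's status.
      mutex.synchronize { statuses << $?.exitstatus }
    end
  end
  threads.each(&:join)                  # wait for every child to finish
  statuses.sum                          # non-zero means at least one failure
end
```

At the end of the rake task you'd then `exit run_commands_in_threads(...)`, so the whole run fails Unix-style if any sub-process did.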
So there's a couple of ways you can address this, but the easiest is just to make things run serially in the main process. Our test runner, which by this point we'd wrapped up in a nice class, has some things that run, and by default it's just running with the arguments that we're passing along. We can leverage the fact that we've built this separate test runner, though, to run things directly in process with our main rake invocation. And then your pry, and anything else that's expecting to be in that process, is gonna be fine. So that would look a little like this. I named the task serial because it just runs in process, directly there. You're gonna have some limitations: once you've loaded a Gemfile here, you've run one combination and one suite, and you're done. But most of the time, if you're debugging, that's probably what you want. And so here we invoke our test runner directly, everything will work nicely, and you'll be able to debug your stuff. So that's where we've been. Hopefully this has given you a few ideas about how you can test things when you get into those more complicated scenarios, with a lot more dependencies that you need to check your stuff against. We've seen how Bundler supports us a lot; it gives us a lot of the tools that we need for these sorts of alternate environments. We've seen a little bit of rake, a few tricks there that can let us do some fun things and automate our work more easily. We've seen how you can handle different versions, and how you can parallelize your tests and break things up into suites to manage them. I hope that with all of this information, your testing will look more like this, and less like this. Thank you.