My name is Bryan Liles, I'm from Baltimore, and I'm here representing Thunderbolt Labs, but I'm not talking about Thunderbolt Labs today. We're gonna talk about Ruby simulators. This is a talk about writing simulators in Ruby. Now, you might say to yourself, who would wanna write a simulator in Ruby? I mean, it sounds pretty preposterous. Ruby is not known as a great scientific or mathematical language. A lot of the top minds who are actually creating this kind of software don't use Ruby, or really they don't even use general-purpose programming languages at all. They use things like Mathematica or crazy things like R. But you know what, we're crazy people, so we're gonna write a simulator in Ruby, or at least talk about writing simulators in Ruby. So why Ruby? Well, the first thing, this is RubyConf. I'm sure everyone in here loves Ruby. I love it. It's one of my favorite languages; it's like one or two among my favorite languages. I like to code in Ruby because Ruby is very expressive. I have not found anything that I've tried to code that I could not just sit down and hammer out in Ruby. I've done a lot of Java and I've done a lot of other languages, and sometimes I scratch my head trying to figure out exactly how I would codify an idiom in code. Another thing I like about Ruby is there's no compile-run loop. You write the code and you run it. And if it breaks, you fix it and you run it again. There's not a lot of setup, there's no linker, there's nothing like that. If you're using MRI and you just have Ruby on your machine, or even if you're using JRuby, you don't need much else to get Ruby up and running. The next thing about Ruby is everything is an object. And I really just enjoy the fact that I'm gonna model the world and I'm just going to apply my OO hammer everywhere I can, because OO is the best way of doing things.
And I'm just kidding here. So let's get past the introductions and talk about the building blocks of simulations. When you're building a simulator, what we're actually doing is taking these concepts called models, and we're giving them inputs and they're gonna spit out outputs. We're gonna use multiple models to build an effect, and then we're going to reason about the effect that was built. Before we talk about that, there are some vocabulary words. I hope you guys brought pencils and internet so you can look up these words. The first word is deterministic. And I thought the best way to show what a deterministic model is would be to write some Ruby code that isn't really a real model. We've all written code like this. This is a model of the world and it has a method, answer to life. What makes this deterministic is that no matter how many times you run the answer to life method on the world model, you're always gonna get 42. Very simple example. Continuing on with models: models can have inputs, like I said. In this case, we have a triangle class and we wanna solve for the hypotenuse. I hope everybody in here knows the Pythagorean theorem, so hopefully you can check my math. A squared plus b squared, yes, equals c squared. You'll notice I'm giving two inputs, the length of a and the length of b, and I'm taking the square root after adding up their squares, and you will always, once again, get the same answer. So once again, this is deterministic. So here's another example. One of the things that I do with modeling is we are actually building models of infections and things like that. But infections are boring and gross. But you know what? I bet when you were small, cooties was fun. And it's funny, I'm gonna tell a little story here. I gave another version of this talk in Belgium where, you know, everybody speaks Flemish and not English.
And I put a slide up there that said cooties on it. And they said, who? People were actually Googling while I was talking to figure out what cooties were. You know, I'm glad I'm back in the United States where you guys actually know what cooties are. So once again, we're talking about deterministic models, models with no randomness in them. If you give them the same input, you will always get the same output. And we're actually talking more about our simulation now. So there's a cost for cooties. You know what? If this side of the room gets cooties, it's gonna cost like $10 to give everybody a pair of cootie shots. So these are things that we want to model. But our model won't be deterministic. There's another vocabulary word, and I hope I spelled this right. I'm sure I did: it's stochastic. Stochastic means that these are models that have a little bit of randomness in them. I only have one slide on this because I think I can illustrate it pretty succinctly here. Say the guy on the left in the front row, with the gray sweatshirt on, has cooties. What is the percentage chance that this other guy, two seats away from him, is gonna get cooties? I'm sure it's pretty high, but according to my model here, we have a rand, so every time you run this, there's a chance that you won't get it and there's a chance that you will get it. Notice the chance is one tenth of 1%, so it's not very high, but he's been looking at him the whole entire time, so I'm sure there'll be lots of chances for transmission. So now we are experts in models, and I wanna say this very simply. Models, we just modeled the world.
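The slides are not part of this transcript, so the following is only a minimal sketch of the three models described so far. The class and method names (WorldModel, Triangle, CootiesModel, transmitted?) are assumptions made for illustration; only the value 42, the Pythagorean theorem, and the one-tenth-of-1% chance come from the talk.

```ruby
# Deterministic: no randomness, so the same input always gives the same output.
class WorldModel
  def answer_to_life
    42
  end
end

class Triangle
  # Pythagorean theorem: c = sqrt(a^2 + b^2)
  def hypotenuse(a, b)
    Math.sqrt(a**2 + b**2)
  end
end

# Stochastic: the outcome depends on a random draw.
class CootiesModel
  TRANSMISSION_CHANCE = 0.001 # one tenth of 1%, the figure quoted in the talk

  # The injectable generator is an assumption; it makes runs repeatable.
  def initialize(rng = Random.new)
    @rng = rng
  end

  # True when a single exposure transmits cooties.
  def transmitted?
    @rng.rand < TRANSMISSION_CHANCE
  end
end

puts WorldModel.new.answer_to_life  # always 42
puts Triangle.new.hypotenuse(3, 4)  # always 5.0

model = CootiesModel.new(Random.new(1))
exposures = 100_000
hits = exposures.times.count { model.transmitted? }
puts "#{hits} transmissions in #{exposures} exposures"
```

A single exposure almost never transmits at that rate, which is why the sketch counts hits over many exposures.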
What we are doing is just coding what we see and what we know. Mathematicians will have large numbers of differential equations, and they use Mathematica, and from what I hear it takes minutes and minutes and minutes to run, but models don't have to be that hard, and like I said, Ruby's expressive. Everyone in this room, even if you don't really understand Ruby per se or you're a Ruby neophyte, you understand kind of what's going on here, and like I said, the expressiveness is the win. So let's talk about the Rubies that I like. MRI is great. With 1.9.3 and the latest releases of 1.9.3, you have fast run cycles for your tests. You have a lot of gems out there. But the problem with MRI is I just don't understand its garbage collector, and I just can't get my head around threading and a couple of other things. So let's talk about JRuby. What do I like about JRuby? It's fast, and I have to give kudos to the JRuby team over there. 1.7, the release, is way faster. I mean, startup is still kind of slow, you guys know this, but when it gets up and humming, it cooks. And here's the proof, because we all like to use micro benchmarks to prove all of our cases. This slide is actually borrowed from a slide later in the talk where I show the code for big arrays. What's going on here is you'll notice that I've run it twice. The arrow and the three is actually my prompt. The three means I'm using MRI on 1.9.3 and the little diamond means I'm using JRuby. I'm using RVM to switch between Rubies. In each run, the first line is populating an array with, I think, a million items and then querying it randomly a million times, and the second line is populating a hash with a million items and querying it randomly a million times. So you notice up top, it's 42 and 141, but notice at the bottom, it's 41 and 73.
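The benchmark just described can be sketched roughly as follows. This is an assumption about its shape, not the code from the slide: the helper names bench_array and bench_hash are made up, and the timings it prints will not match the numbers quoted in the talk.

```ruby
require 'benchmark'

N = 1_000_000 # "I think it's a million" in the talk

# Fill an array with n items, then read n random positions.
def bench_array(n, rng)
  array = Array.new(n) { |i| i }
  n.times { array[rng.rand(n)] }
end

# Fill a hash with n entries, then look up n random keys.
def bench_hash(n, rng)
  hash = {}
  n.times { |i| hash[i] = i }
  n.times { hash[rng.rand(n)] }
end

rng = Random.new(42)
Benchmark.bm(6) do |x|
  x.report('array') { bench_array(N, rng) }
  x.report('hash')  { bench_hash(N, rng) }
end
```

Running the same file under MRI and under JRuby (for example after switching interpreters with RVM) gives the kind of side-by-side comparison shown on the slide.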
Benchmarks, micro benchmarks do lie, but come on, that's almost twice as fast. That's a big deal, and that's why we are pursuing JRuby for this exercise. And I know everybody likes pictures. I'll explain what this picture means later on in the talk, but look at the slope of this line. This is from an old version of our simulator. Our simulator has iterations, so these dots are tracking the time of each iteration. Notice the slope goes kind of up and there are some outliers, so I drew a little trend line so you can see the slope, and you'll notice towards the end I was getting into Ruby's garbage collection, so that's why the times are skewed off the line. So, same thing with JRuby. The first thing you'll notice is that the slope is much lower. Now look at the far left part of the graph. Does anyone know what that is? Take a guess at why there are a lot of dots on the left side that go up, why they're not on the trend line. Does anyone know what that might be? What did you say? That is the JIT warming up. So notice that after the JIT warms up, it cooks. It really does move quickly, having a nice JIT. And I'll be frank with you, I have not tried this in Rubinius. No, no, there are technical reasons why I haven't tried this in Rubinius, but I think even with Rubinius, with a real working JIT, I mean, we are getting some real performance gains. And I'll tell you, this code and this code were the exact same code; one was just running with JRuby and one was running with MRI. So another thing I like about JRuby is the JVM. The JVM is a lot of smart guys over a lot of years writing a lot of neat code. I don't understand it. I don't understand all the ins and outs of HotSpot, I don't understand all the ins and outs of garbage collection. I do know that it uses all the memory you have.
So right here I have one of these newfangled Retina MacBook Pros and I got it with the 16 gigs of memory. I've never in my life said I'm gonna run a process on my Mac that will use all the memory in my box. I just wrote one. Another thing is it uses all the cores. And I'm not gonna talk about Rubinius anymore, because I'm not picking on Rubinius and I'm not picking on MRI, but MRI can only kind of use all the cores: the good old global interpreter lock limits you to one core. So let's dig into that. Whenever you have things to execute on MRI, it kind of looks like this. Each one of those orange blocks is a new instruction. So what happens when you run Thread.new? Well, not quite what you would hope. It does allow for concurrent execution, it's just all on one core. And who here actually has a one-core machine that you code on? Right, so it's just a waste of money. With JRuby, same thing: the orange things are the blocks to be executed, and you run Thread.new and, hey look, you potentially could be running on multiple cores. I mean, we don't know this, because the operating system is smarter than us, but the potential is there. But you know, I don't wanna poop all over MRI, so I wanna give a solution. Who knows how to run on multiple cores, who knows how to break the global interpreter lock in MRI in a C extension? Well, it's actually not that hard. You just have to write a little bit of C. So this is C, this function rb_thread_blocking_region. The first argument is the actual function to call and the second one is what you're going to pass into it. Whatever you run that way can use all the cores; that code will actually run outside of the global interpreter lock.
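As a rough illustration of the Thread.new half of this comparison, here is a small CPU-bound sketch. The count_primes helper and the workload sizes are assumptions chosen only to keep the cores busy. The same file can be run under MRI and JRuby; the results are identical, and only the elapsed time would be expected to differ on a multi-core machine.

```ruby
# CPU-bound work: count the primes below limit by trial division.
def count_primes(limit)
  (2...limit).count do |n|
    (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
  end
end

limits = [20_000, 20_000, 20_000, 20_000] # one chunk of work per thread

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
threads = limits.map { |limit| Thread.new { count_primes(limit) } }
results = threads.map(&:value) # value waits for the thread and returns its result
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "primes found per thread: #{results.inspect}"
puts format('elapsed: %.2fs', elapsed)
```

This sketch uses plain Ruby threads only; it does not touch the C-extension route described just above.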
So I mean, there are Ruby gems that use this, well, C extensions that use this, but you know what, it's not readily accessible. We get this for free, easily, without having to write C extensions, in JRuby. So enough about the JVM. Who here does statistics, who likes statistics, who knows statistics? Everybody who has their hand up probably knows more than I do. But you know what, I can still share. So we have a bunch of numbers, and why do we use statistics? We use statistics because we want to reason about output and data. We have this list of 10 numbers, and they are random. I graphed them using Numbers on the Mac and it looks kind of like this. And if you look at this, you have no idea how the data points relate to each other. So the simple tenets of statistics are: let's look at the mean. The mean is the average. It's not the median, not the middle value, but what you get when you add them all up and divide by how many there are. So the mean is dot 42. And I shouldn't say dot 42, it should say 4200 or 4300. And then we look at the max. And then we also look at the min. And the most important thing is we look at the standard deviation, because if you're running a stochastic simulation and your numbers are all over the place, maybe your model's not tuned correctly. So we always look at the standard deviation to make sure that the numbers coming out are at least similar. Maybe there's an accepted amount of error. And I'm actually surprised that Ruby doesn't include this, but there's a nice gem called descriptive_statistics that you can install. There are actually two ways to use it, and the first is not the way I do it. You can install this gem, and then you can require descriptive_statistics.
And what it does is it actually extends core classes. And I know we don't like that. So what you can do instead is require descriptive_statistics/safe. Then, say I have an array a. I can go a.extend(DescriptiveStatistics), and I get those methods like standard deviation, min, max, average, and all the things that Array does not already include. So another neat thing about statistics is distributions. So I was writing some Ruby. And first of all, this is not Ruby. One thing you're gonna learn when you're writing statistics or writing simulations is that Ruby just does not give you everything you need. Does anybody know what language this is? Yeah, it is R. And you know what, you wouldn't normally see it like this. You would probably see it like this, which would give it away really quickly. This is R. What this does is generate something that looks like this. What is this? Does anyone know what this is? It's a normal distribution. So we'll use the normal distribution. I guess the canonical example is your professor in college. The class was hard and someone had to get an A. So what he would do is redistribute everybody's grades on this bell curve. Most people are getting Cs and only the top few are getting As, no matter how bad their grades were. So what we do is use distributions to model our numbers so they are something that we can expect. And we were talking about standard deviation earlier. With R, I can actually draw the standard deviation. So I wanted it to be 0.5. And so it's actually one. So if I was examining this graph to see what output I would expect, I would only look in the gray block.
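To tie the summary statistics from earlier to the gray one-standard-deviation band, here is a minimal plain-Ruby sketch that needs no gem. The helper names are assumptions, the standard deviation is the population form, and the normal samples come from a Box-Muller transform, which is a stand-in for the R code on the slide rather than a translation of it.

```ruby
def mean(values)
  values.sum.to_f / values.size
end

# Population standard deviation: root of the mean squared distance from the mean.
def standard_deviation(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / values.size)
end

# One draw from a normal distribution using the Box-Muller transform.
def normal_sample(rng, mu = 0.0, sigma = 1.0)
  u1 = 1.0 - rng.rand # in (0, 1], so the logarithm is always defined
  u2 = rng.rand
  mu + sigma * Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
end

rng = Random.new(2012)
samples = Array.new(10_000) { normal_sample(rng) }
m = mean(samples)
sd = standard_deviation(samples)
inside = samples.count { |v| (v - m).abs <= sd }

puts format('min %.2f  max %.2f  mean %.2f  sd %.2f', samples.min, samples.max, m, sd)
puts format('%.1f%% of samples fall within one standard deviation', 100.0 * inside / samples.size)
```

For a normal distribution, roughly two thirds of the samples should land inside that band, which is the region the gray block marks.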
And the cool thing about this is that this code right here, you can't do this with Ruby right now. There's a project out there called Protovis, which, I don't know if it's still going on. I think they encourage everyone... No, there's a project called Protovis. And Ruby people, we take projects and we name them things like Rubyvis. And it can actually generate graphs like this. But the problem is that the people who were doing Protovis retired that project and created something great called D3.js. But we'll talk about that later. So here's another way to generate a distribution, or generate a graph, in R. This right here is a beta distribution. Beta distributions take two values. If you look up beta distribution on Wikipedia, it's gonna take two inputs. The two is alpha, the five is beta. And depending on those two numbers, it draws a different graph, or rather a different distribution. Notice this one leans more to the left. I don't know all the fancy technical words for that, so I don't wanna confuse anybody. Moving on, there are also other types of distributions. Actually, there's a whole list of distributions. This one right here is a Weibull. And I only put this in the slides because I like how Weibull sounds. I just wanna say Weibull, Weibull, Weibull, Weibull all day long. So a Weibull with a shape of one generates a graph that looks like this. So how is this useful? Actually, you know what? Someone told me a few minutes ago what this is useful for and I already forgot. So just know that you can do this. So going back to Ruby, because we are at RubyConf and we are talking about doing simulations in Ruby. There's a gem out there called distribution. You can gem install distribution. And this is how you would use it.
So remember that graph that I had of the beta distribution? It kinda went up on the left side and then slipped back down to the right. We can generate that distribution in Ruby, and it's really simple code. I just put in .2, and notice I have a two and a five there, and it generates that number 2.73. So what I'm saying is that whenever x on the graph is .2, the value is gonna be 2.37. And as you notice, I drew a little arrow there to show you that. So how would we generate a graph like this from Ruby? Let's see, more code. You're gonna notice a little thing about this talk, which is that I put a lot of code in it. I just like looking at pretty colored code. So what I'm doing right here is generating an array that includes all the values of the distribution, and I'm sampling it 1,000 times. And because we're using Ruby, we have to generate our graphs in Ruby. Using state-of-the-art Ruby technology, I get a graph that looks like this. Remember the pretty R graph that went up and down? Yeah, I'm just not getting that. What this is, is spark, written, I think, by Zach Holman at GitHub. And this right here is actually on the console. So it's actually just text, and I colored it. This is the state of the art right here. So don't tell anybody about this stuff. I mean, this is new stuff right here. So once again, another distribution. And right here there's actually a slight typo. When we have distributions, there's the PDF, which is the distribution that I showed you before, but there's also something called the CDF, which is the cumulative distribution.
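The PDF and CDF of the beta distribution with alpha 2 and beta 5 can be sketched in plain Ruby without the distribution gem. Everything here is an assumption for illustration: beta_pdf uses the textbook density formula, the CDF is approximated with a running trapezoid sum, and sparkline is a tiny stand-in for the spark command-line tool.

```ruby
# Beta density: x^(a-1) * (1-x)^(b-1) / B(a, b), with B(a, b) built from Math.gamma.
def beta_pdf(x, alpha, beta)
  norm = Math.gamma(alpha) * Math.gamma(beta) / Math.gamma(alpha + beta)
  x**(alpha - 1) * (1 - x)**(beta - 1) / norm
end

TICKS = %w[▁ ▂ ▃ ▄ ▅ ▆ ▇ █].freeze

# Render non-negative values as one line of block characters.
def sparkline(values)
  max = values.max
  return '' if max.nil? || max.zero?
  values.map { |v| TICKS[(v / max * (TICKS.size - 1)).round] }.join
end

steps = 40
xs = (0..steps).map { |i| i.to_f / steps }
pdf = xs.map { |x| beta_pdf(x, 2, 5) }

# CDF approximated as a running trapezoid-rule sum of the PDF.
cdf = [0.0]
(1..steps).each do |i|
  cdf << cdf.last + (pdf[i - 1] + pdf[i]) / 2.0 / steps
end

puts "PDF #{sparkline(pdf)}"
puts "CDF #{sparkline(cdf)}"
```

The first line rises and falls like the density; the second only ever climbs, which is the defining property of a cumulative distribution.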
And a good example of that would be pregnancy. A woman goes 40 weeks before having a baby, so somebody could create an actual probability distribution for when a lady is going to have a baby, what the percentage is. But if we use our graph from before, notice that it goes up and down. What a cumulative distribution says is that you can never really go down. So as you near your due date, the graph will keep going up. And once again, using Ruby state-of-the-art technology, I created a graph to show you that. Any questions about that graph? So I mentioned spark earlier, and if you're on a Mac and you have Homebrew, you can brew install spark. It's neat. You can just pass it a list of numbers and it'll create a graph for you. And here's what I was talking about with pregnancies before, a little sample code here. Another thing I wanna talk about is sampling distributions. In a lot of cases with our stochastic models, we're gonna wanna sample a distribution. My distribution describes some kind of randomness, and I wanna just pull a random variable out of it. So what I did is I wrote this gem called Vose. Imagine you're rolling a die, and the die is not loaded. You have six outcomes, and each has the same chance. What this does is similar, but the die is loaded; not every outcome is equally likely. What my Vose alias method will allow you to do is sample a distribution in constant time. And come to find out, when you write web code for years and years and years, you tend to not think about things like constant time. It's like, I got caching, who cares? Whenever you do things like this, there's no caching.
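The alias method mentioned here can be sketched in plain Ruby. This is not the speaker's gem or its DSL: the AliasSampler class, its constructor, and the loaded-die weights are assumptions made for illustration. Setup takes linear time in the number of outcomes; each sample then costs one random index and one biased coin flip.

```ruby
class AliasSampler
  # weights: hash of outcome => weight (weights need not sum to 1)
  def initialize(weights, rng = Random.new)
    @outcomes = weights.keys
    @rng = rng
    n = @outcomes.size
    total = weights.values.sum.to_f
    scaled = weights.values.map { |w| w * n / total } # average becomes 1.0

    @prob = Array.new(n, 1.0)
    @aliases = Array.new(n) { |i| i }

    small, large = (0...n).partition { |i| scaled[i] < 1.0 }
    until small.empty? || large.empty?
      s = small.pop
      l = large.pop
      @prob[s] = scaled[s]  # keep column s with this probability...
      @aliases[s] = l       # ...otherwise fall through to outcome l
      scaled[l] = scaled[l] + scaled[s] - 1.0
      (scaled[l] < 1.0 ? small : large) << l
    end
    # Any leftovers keep probability 1.0, which absorbs floating-point round-off.
  end

  # Constant time: pick a column uniformly, then flip that column's biased coin.
  def sample
    i = @rng.rand(@outcomes.size)
    @rng.rand < @prob[i] ? @outcomes[i] : @outcomes[@aliases[i]]
  end
end

# A loaded die: six comes up half the time.
loaded_die = AliasSampler.new({ 1 => 0.1, 2 => 0.1, 3 => 0.1, 4 => 0.1, 5 => 0.1, 6 => 0.5 },
                              Random.new(3))
rolls = Array.new(100) { loaded_die.sample }
counts = rolls.each_with_object(Hash.new(0)) { |face, acc| acc[face] += 1 }
puts counts.sort.to_h.inspect
```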
So what this thing right here does is sample a distribution 100 times, but it uses the Vose alias method, and notice that it's just a simple little DSL for rolling a die that is based on that distribution. So let's talk about big data, because, you know, this is actually how I got my talk accepted here, I think: I put big data in the talk description. But you know what, I really thought that I could come bigger than that. So let's talk about huge data. This is the new thing that's gonna come. So let me talk about our simulation. You'll notice that I haven't really talked about our simulation yet, because before, when I was explaining this, I just jumped right in and threw a lot of code at people, and I didn't tell people all the building blocks so they could understand how awesome I am for writing these kinds of simulations. So let's say our simulation has 100 people, and the simulation goes for 3,650 iterations, which for our example will be 10 years. If we do a little bit of multiplication, we notice that we have 365,000 actions. So because we are big data and we're using ActiveRecord, this is what I thought we would do: we would just create an ActiveRecord class called Observation. And every time we had one of those, what's that number, 365,000 actions, we'd just create a new observation. So you know, that didn't work that well. The problem is, what if we have 3 billion actions? That is what will happen if we run 100,000 people over 100 years. We can't run an ActiveRecord create 3 billion times. I mean, we just can't do it. It's slow enough doing it one time. So now you have to think: we have all this data, and if I turn on all the logging in the system, it generates over two gigabytes of data in two seconds. And you just think about that.
I mean, this is something running on this box. This is not even big metal. And it actually makes my SSD kind of whine. It does 400 megabits for like two seconds. And you say, this is kind of crazy. So we really can't use ActiveRecord. So you know, since this is RubyConf, I figured we would just do the easy thing. Maybe we'll dump it to Mongo or we'll dump it to Couch. Before I started, I had one problem: I was trying to create a simulation. If I dumped it to Mongo or I dumped it to Couch, now I had like five problems, trying to figure out, is my data actually even there? So, you know, I'm an outside-the-box thinker. So I said, I'll think outside the box. Anyone familiar with these two databases, VoltDB and Druid? These are these newfangled OLAP, all-in-memory databases, and they're really awesome. You should look at these. I only put these in my slides because I don't wanna hear MySQL, Postgres, blah, blah, blah all the time. People need to look at these new types of databases that can handle real-time data and can do real-time transactions on real-time data. I'm talking about, if you're an ad serving provider and you're doing, let's say, 100,000 impressions per second, these databases can handle it. So I said, you know what? I'm gonna be a Luddite here. I'm just gonna insert it into Postgres like this. So here's just a little Rails thing. You can't create a thousand ActiveRecord objects easily. So what I always do is go right down to the connection, and I build the statement up myself and execute it myself. So the next problem with simulations is you gotta worry about memory management, and once again Rails made me soft. Once upon a time I was actually a C and assembler programmer, and we actually had to think about memory management. But with Ruby you're like, no, screw it.
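Stepping back to the hand-built insert described a moment ago: a minimal sketch of the idea is to assemble one multi-row INSERT per batch instead of creating one ActiveRecord object per row. The table name, the column names, and the bulk_insert_sql helper are hypothetical; the slide's actual statement is not in the transcript.

```ruby
# Hypothetical table and columns, for illustration only.
TABLE   = 'observations'
COLUMNS = %w[day person_id infected].freeze

# Build a single INSERT statement for a whole batch of rows.
# Every value is coerced to an integer, so nothing unescaped reaches the SQL.
def bulk_insert_sql(rows)
  tuples = rows.map do |row|
    "(#{row.map { |v| Integer(v) }.join(', ')})"
  end
  "INSERT INTO #{TABLE} (#{COLUMNS.join(', ')}) VALUES #{tuples.join(', ')}"
end

rows = [[1, 101, 0], [1, 102, 1], [2, 101, 1]]

# Send the rows in batches; each batch becomes one statement.
rows.each_slice(1_000) do |batch|
  sql = bulk_insert_sql(batch)
  puts sql
  # In a Rails app the string would go to the raw connection instead, e.g.
  #   ActiveRecord::Base.connection.execute(sql)
end
```

The batch size of 1,000 is an arbitrary choice here; the point is only that the number of round trips drops from one per row to one per batch.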
I'm just gonna do things like this. I'm going to create a billion people and then I'm gonna put a billion people in an array. Because you know, what's the worst that can happen? So I have this listener here, and this is just an example. This listener is an observer. You'll see that new person is a callback. When the simulation runs, every time a new person is created or born, that person gets appended to the people array. Seems perfectly fine to me. So this is what happens. You create the simulator and pass it the listener, then you run the simulator, and then you can look in the listener and inspect the people. So what happens if we do this again? Because I can't run the simulator just one time; I run the simulator maybe 10 to 15 times so I can get a good amount of sample data. So if I do this again, guess what happens? These next few slides are gonna show you one of the other reasons why you should use JRuby over MRI, and I'm not here just advocating it, but these next few slides are awesome. So anybody know what this app is? I'm sorry it's really fuzzy, because it was a small image. What is it? It's VisualVM. Look on the left side, the y-axis: this is showing how much memory this thing uses, running on this Mac. This is only two runs of a simulation. I was trying to do 10 runs. So notice that on the first run it hit seven gigabytes of memory on the heap, and then garbage collection came through and cleaned up a whole bunch of stuff, but the second time it went up to almost 11, or actually went up to 11 gigabytes. And after that, what happens when JRuby runs out of memory is it does something really, really cool.
So if you have four cores in your box, JRuby will be like, you know what? You're out of memory, now I'm gonna fuck you. What it does is, since the garbage collector, I believe, runs in another thread, it says, well, you know what? That garbage collector is too slow. I'm gonna run something in another thread. All of a sudden your machine is screaming and all the CPUs are pegged and you can't Ctrl-C the application anymore. You just have to wait. So let's look at this. This is actually the other side of the same image. I couldn't fit it all on one slide because it's wide, but if you look on the bottom, there's a lot of GC, and what's happening there is actually a memory leak, and I'm surprised no one called that out. I'm populating an array and then creating another object, but I'm never releasing anything in that array, so JRuby's like, I'll keep it. So what I did is a very simple thing: after we do a simulator run, we just call reset on the listener, which sets people to an empty array, and we do the same thing again and we get something more sane here. Same code, the only change was that reset line, and notice that whenever it runs, it just uses less memory, and it gets rid of all the people that it's never going to use. Why persist things that we aren't gonna be using? They're only used for calculations. So just a little reminder, and this is a reminder for Ruby code: we can leak memory like crazy in Ruby code. Rails proves it every single day. So once again, the other side of that graph. So I've been talking for a while and I haven't talked about building a simulator yet. Oh, because I passed dash, yes. Because by default it's 500 megs? Yeah, that doesn't work. Yeah, actually, if I wanna troll myself, yes.
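The listener and its reset fix can be sketched as below. The class names, the new_person callback signature, and the tiny population are assumptions; the real simulator is not shown in the transcript. The point of the sketch is only that a listener shared across runs keeps every person alive until its array is replaced.

```ruby
Person = Struct.new(:id)

# Observer that collects every person the simulator creates.
class Listener
  attr_reader :people

  def initialize
    @people = []
  end

  # Callback fired by the simulator each time a person is created.
  def new_person(person)
    @people << person
  end

  # Drop all references so the garbage collector can reclaim the people.
  def reset
    @people = []
  end
end

# Stand-in simulator: it only creates people and notifies the listener.
class Simulator
  def initialize(listener, population)
    @listener = listener
    @population = population
  end

  def run
    @population.times { |i| @listener.new_person(Person.new(i)) }
  end
end

listener = Listener.new
3.times do |run|
  Simulator.new(listener, 1_000).run
  puts "run #{run}: listener is holding #{listener.people.size} people"
  listener.reset # without this line the array grows by 1,000 on every run
end
```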
You know what, and it's not a fault of the language, it's the fault of me, the programmer. I'm leaking memory. That's, oh, it's okay. Okay, I'm holding memory. You know what, I'm gonna retract what I just said, based on what Ryan said. I am not leaking memory. I'm holding on to stuff that I don't need. Which actually makes a lot of sense. So now on to building a simulator. So let's talk about the simulator. In my simulator, I have a group of people, and I have eight people here. What the simulator does is run over a period of time. So what happens is this girl gets with this guy, and then this guy gets with this girl, this girl gets with this guy, this guy gets with this guy, because that's how my simulator rolls. And we try to figure out what happens and how cooties are being spread. So just to show you a little example of why we are doing this, I prepared a short video. It was just one time. I thought it would be okay. I would have asked him to get tested. But I didn't want him to think I didn't trust him. I didn't know I could catch it on the playground. We were in love, so I didn't think about it. All I did is trade Lunchables. Every year, two million kids are infected with cooties. Why didn't you tell me, baby? I thought cooties were something that happened to other kids. Cooties! I just wanted to play tag. I never thought I'd be it. And the numbers are growing. I blame myself. Shh! I made a big mistake. And even though a vaccine is available... Circle, circle, dot, dot, now you have the cootie shot. Most children never get inoculated. Speak to your kids about cooties before cooties speaks to them first. So you're getting a little example of my passion for solving, or at least being able to tackle, this epidemic of cooties. So back to our simulation. Our simulation is actually a big loop. You can just think about it like this: every day we just increment the day and we run it again.
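A minimal sketch of that big loop, under stated assumptions: eight people as in the talk, one of them starting with cooties, random pairing each day, and a made-up 5% transmission chance per mixed pair. The class name, the Kid struct, and the pairing rule are all illustrative; the real simulator's matching and calculations are not in the transcript.

```ruby
Kid = Struct.new(:id, :infected)

class CootiesSimulation
  attr_reader :day, :people

  def initialize(people, transmission_chance, rng = Random.new)
    @people = people
    @chance = transmission_chance
    @rng = rng
    @day = 0
  end

  # The big loop: increment the day, run one step, repeat.
  def run(days)
    days.times do
      @day += 1
      step
    end
    self
  end

  def infected_count
    @people.count(&:infected)
  end

  private

  # One day: shuffle everyone, pair up neighbours, maybe transmit within a pair.
  def step
    @people.shuffle(random: @rng).each_slice(2) do |a, b|
      next if b.nil?                   # odd one out sits this day out
      next if a.infected == b.infected # nothing to transmit in this pair
      a.infected = b.infected = true if @rng.rand < @chance
    end
  end
end

people = Array.new(8) { |i| Kid.new(i, i.zero?) } # person 0 starts with cooties
sim = CootiesSimulation.new(people, 0.05, Random.new(8)).run(365)
puts "day #{sim.day}: #{sim.infected_count} of #{people.size} have cooties"
```

Because each pairing draws from the random generator, this is a stochastic model: a different seed gives a different epidemic curve.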
So what do we do each day? There's a group of people that are alive in our simulation. We look at each person and we determine, hey people, who is ready to pair up and transmit cooties today? So we find people who are compatible, that's what the simulation does, and then we group people together. And then we do some complex calculations to see if cooties are actually shared, and that's a technical diagram right there. And that's actually all the simulator does. And you know what, I do have code. Just wait until later on this afternoon. I have an unembargoed version of the simulator that I think I can share with you guys. I will put it up on GitHub so you can see what a simulator in Ruby looks like. So moving on. Before you can write this, like I was talking about earlier, Ruby just does not like you putting a thousand, or no, actually not a thousand, a million items in an array. It just says you shouldn't do that. And you know what, we really should not be putting a million items in an array. Our computer science classes, our data structures classes, told us that there are just much better ways of storing things. And the same thing with the hash. So earlier I was talking about that benchmark that I showed the output for. This is the code, to make it simple. Once again, I just populate an array, put a million items in it, and then query it randomly a million times. The same thing with the hash. And to recap, we noticed that JRuby is faster than MRI. But you know what, that doesn't really mean anything. Micro benchmarks are bad. It doesn't really mean anything in the grand scheme of things. So notice I've got the three up there in my prompt.
This is actually a run of, and it's probably the version of the simulator I'm going to share. This is actually with a thousand people over a hundred years, 36,500 days, and you notice it ran in 129 seconds. But if you notice, JRuby ran in about the same time. So just because your micro benchmarks are faster does not mean that it will actually double the speed of your app. There are just other things going on. So the next thing I want to do is talk about algorithms in Ruby. Ruby is just missing a whole bunch of neat algorithms. We don't have a real heap in Ruby in the standard lib. We don't have priority queues. And actually those are things that we could actually use. And let me show you a demonstration of that. So every day we have an event. We actually keep track of events. And if we put events into this array and we pop them off, that's kind of cool, but that's not describing what we want to do. Actually what we're really doing is having a priority queue. So really what I want to do is I want to say, add this new event at priority 10. And then the next thing, I don't have to actually go through my array to figure out what I'm doing. All I actually have to do is say, hey event, pop off the next item, and it will pop off the thing, in this case with the lowest priority, which will be 10. So that's cool and all. So I like to benchmark and I benchmark all the time. So here's the cool thing. I actually have an array. And then I have an array where I'm inserting at a position. And then I have the implementation of this priority queue that comes out of that algorithms gem. You know, it's nice for the nice DSL and the nice ability to do this, but you notice, look how much slower it actually is, because this implementation of the heap and the priority queue is actually coded in Ruby. And this was actually a Ruby Summer of Code project. I don't remember who did it, but I mean, it's a great effort.
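To make the priority queue idea concrete, here is a minimal sketch of one built on a binary min-heap in plain Ruby. It is not the algorithms gem's implementation or API; `EventQueue`, `push`, and `pop` are names assumed for illustration, and the lowest priority number comes out first, as in the example above.

```ruby
# Minimal binary min-heap priority queue. Illustrative only; this is not
# the algorithms gem. Lower priority numbers are popped first.
class EventQueue
  def initialize
    @heap = [] # array of [priority, event] pairs kept in heap order
  end

  def push(event, priority)
    @heap << [priority, event]
    sift_up(@heap.size - 1)
    self
  end

  # Remove and return the event with the lowest priority (nil if empty).
  def pop
    return nil if @heap.empty?
    top  = @heap.first
    last = @heap.pop
    unless @heap.empty?
      @heap[0] = last
      sift_down(0)
    end
    top.last
  end

  def empty?
    @heap.empty?
  end

  private

  # Move the item at index i up until its parent is no larger.
  def sift_up(i)
    while i > 0
      parent = (i - 1) / 2
      break if @heap[parent][0] <= @heap[i][0]
      @heap[parent], @heap[i] = @heap[i], @heap[parent]
      i = parent
    end
  end

  # Move the item at index i down until both children are no smaller.
  def sift_down(i)
    loop do
      left, right, smallest = 2 * i + 1, 2 * i + 2, i
      smallest = left  if left  < @heap.size && @heap[left][0]  < @heap[smallest][0]
      smallest = right if right < @heap.size && @heap[right][0] < @heap[smallest][0]
      break if smallest == i
      @heap[smallest], @heap[i] = @heap[i], @heap[smallest]
      i = smallest
    end
  end
end

events = EventQueue.new
events.push(:recover, 30).push(:pair_up, 10).push(:infect, 20)
order = []
order << events.pop until events.empty?
```

Both `push` and `pop` take O(log n) time, whereas inserting at a position in a sorted array is O(n) per insert; as the benchmark in the talk suggests, a pure-Ruby heap can still lose to a plain array in practice.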
But look, in Ruby, we just don't have very fast data structures. On a tangent, Python, which is another language that some of us love to hate, has NumPy. They have pandas. They have so many nice things. Python, because a lot of scientific communities use it, actually puts effort into making real fast data structures. So, moving on. So once you have a simulator, another thing you're gonna have to think about is this. You're gonna build a simulator, and even me, Supreme Coder up here on the stage right now, when I build simulators, the first time I run them, they're actually wrong. So maybe I don't have the right amount of randomness. So like I said, a model is in and out, in the little box here. And what we need to do is we need to be able to train our simulator, like train the values inside of our simulator, to actually return the right values, because we already observed this or we just know these to be the right empirical values. So here's a graph here. And what this is, because I've observed this, this is what I expect my data to look like. So infections per 1,000 people over time, the slope of the graph should look like this. So what we would do, and unfortunately, I really wish I could share this code with you guys, but I'm just gonna talk about it. One of the things, and this is one of the value adds that Thunderbolt is working on, we are actually working on a machine learning project in Ruby. I don't hear of a lot of people doing machine learning, like actual machine learning, in Ruby. So what we are doing is actually building ways to train, so we can say, okay, I input this, I expected this, but I actually got this. Now what I'm going to do is actually create a large Bayesian network and do something similar to what SpamAssassin does or what Google does to your spam. And we're actually gonna traverse the network to see, is this the right value? Is this the right value?
If this is this value, we'll just return this number. And so what we're actually doing is the computer is actually learning: whenever it has error, it actually looks at a standard deviation of the error and actually uses that to rationalize what the next value possibly could be. Hey, machine learning in Ruby. It's not fast, but it is machine learning in Ruby. And you know, if we need to make it faster, because we're using JRuby, we'll just write it in Java. And moving on, so the last part about a simulator is you gotta have the visualizations. These are graphs I showed earlier. This graph and this graph were created with this kind of R code. But we also use gnuplot. You should learn gnuplot. To create a graph like this, all you would need to do is create a diagram that looks like this and you would just run gnuplot on that. And like I was also talking about earlier, we definitely are using D3.js. I'm not ready to show this code, but this lets you know that if you're doing visualizations now, and I don't care what platform you are on, if you're using something where you can use the web, really have a nice look at D3.js. So coming towards the end of this talk, we learned a lot of lessons. Ruby lets you iterate very quickly. But the problem is Ruby is not very quick. So what do you do? You write the slow parts in C++ and Java, which is also the win of having JRuby. Take advantage of JRuby and the JVM and be polyglots. If you're just gonna be Ruby, Ruby, Ruby, Ruby, Ruby only, you're not gonna be able to do this. You're really gonna have to learn more than one language to do this correctly. Another thing that I wanna say is that TDD is hard. I don't know if any people here have tried to TDD an implementation of an algorithm that you found on Wikipedia, where they're using sigmas and alphas and all that stuff. That's hard. Just write it and then write the test second. But the test is there.
Don't beat yourself up over trying to be a good developer just because you are trying to follow some kind of standard. When all you're really trying to do is implement a proof that's already working, just write the test second. I beat myself up over this constantly. Another thing is that Ruby lacks solid science libraries. For Ruby to be taken seriously in this kind of space, we just need this. I mean, we need the NumPy. We need the matplotlib that Python has. I mean, I'm not moving this project to Python. We're really dedicated to doing this in Ruby. So we will try to build what we can, and of course we will open source what we can, but we have to acknowledge that Ruby is just not great in this space, even as a general purpose language. So that's it. I planned on going 40 minutes, 45 minutes, and it's been 44 minutes. So thank you guys for not doing it all during the talk. That's it. So any questions? If they're hard, I will not answer them. I promise you. Right here. So you were saying that we don't have a lot of these libraries, and the problem is that we then have to implement them in either C++ or Java. So as we have different runtimes, we're gonna end up duplicating a lot of work. Like the advantage of just using Ruby is that generally the VMs are pretty compatible. So how about this? If there was a great C extension for algorithms and some of the statistical stuff that I was doing, I probably would not have looked at JRuby. And the reason I'm using JRuby is because actually the real simulator is written in Java, because it's just that much faster now, and I can at least take advantage of all the numerical stuff that's on the JVM. So that's. So we can look at the libraries that are out there. I can't tell you the number, the priority queue, the numbers that you have being so much slower is because the code's slow, not because Ruby's too slow. Right. No, what did you say?
And what Ryan said is it's not that Ruby sucks inherently. What he's saying is that the reason that it's slow is probably that the code that's in the algorithms gem is just not very good. And you know something, whenever you code for money and you're not coding for fun, it's hard to actually sit and replace things when you're actually under a deadline. So that's kind of hard. Jesse? So there's been a lot of movement on SciRuby lately. They actually think out of cram from. So they're working at basically doing this kind of stuff, like getting us these tools. So the distribution gem comes out of SciRuby. There's a statistics gem that comes out of SciRuby. And then what I was talking about, Rubyvis, is part of SciRuby. What I'm saying is that I'm glad, and I do know that it's speeding up again. It's just that it's not there yet. So it's not a complaint. It's just saying this is what it is right now. We will try something different. So, yes? You mentioned something about testing. How and what do you test in a stochastic model? So he's asking me, how do I test a model that has randomness in it? In a stochastic model, there's a couple of ways you can do this. There's two easy ways. The first easy way is, whenever you create the model, whenever you initialize the object, take a second argument, an options hash, and in there put a key called rand and pass in your own random number generator that can actually only return five. So you know what the randomness is always gonna be. The second way you can do it is you can run it a whole bunch of times and then sample it to see that it's actually within, and like if you're using minitest, there's the within, is that what it is, within, you can actually use within and actually determine if it's close enough. And actually, that's how I test it in the Vose alias method. I just actually run it 10,000 times and just sample it and make sure that it's close enough.
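Both testing approaches can be sketched briefly. Everything named here is hypothetical: `CootieModel`, its `transmission_rate:` parameter, the `:rand` option key, and the `FixedRand` stub are illustrative, and the stub returns a fixed float rather than the "five" mentioned in the answer so that it fits a probability check.

```ruby
# Hypothetical stochastic model; CootieModel and its options are illustrative.
class CootieModel
  def initialize(transmission_rate:, options: {})
    @rate = transmission_rate
    # Any object responding to #rand can be injected; defaults to Random.
    @rand = options.fetch(:rand) { Random.new }
  end

  # True when the random draw in [0, 1) falls below the transmission rate.
  def transmits?
    @rand.rand < @rate
  end
end

# Approach 1: inject a generator that always returns the same number,
# so the outcome of each call is known in advance.
class FixedRand
  def initialize(value)
    @value = value
  end

  def rand(*)
    @value
  end
end

always = CootieModel.new(transmission_rate: 0.3, options: { rand: FixedRand.new(0.1) })
never  = CootieModel.new(transmission_rate: 0.3, options: { rand: FixedRand.new(0.9) })

# Approach 2: run the real model many times and check that the observed
# rate lands close enough to the expected one.
model    = CootieModel.new(transmission_rate: 0.3)
trials   = 10_000
hits     = trials.times.count { model.transmits? }
observed = hits.to_f / trials
close_enough = (observed - 0.3).abs < 0.05
```

The tolerance of 0.05 is an arbitrary choice for this sketch; with 10,000 trials at a rate of 0.3 the standard deviation of the observed rate is under 0.005, so the check should fail only if the model is genuinely off.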
One of the things that I do in my tests is I will have a setup that sets the random seed to 42. So I always know that I've got the same sequence of numbers coming out. That's actually another good way as well. That sometimes has problems across versions. And another thing, the reason why I always stray from that is because sometimes you'll do that and then you'll call random somewhere else, and you don't realize what the next value is, and now you're off. And that's the only reason, because I've been bitten by it before, because I wasn't careful. So any other questions? All right, well thank you guys. You guys have been great.