I'm going to be talking about Rubinius. But what have you done for me lately? It's been a really great year for Rubinius. At last year's RubyConf, we announced that we were finally doing a 1.0 RC of the project, and now we're just about ready to do our third release, 1.1.1, a bug-fix release for the first feature release.

But let's do this real quick, because I don't know what kind of audience I've got here. What is Rubinius, just to get this out of the way? It aims to be a modern Ruby implementation with a specific philosophy: use Ruby to do as much of the work as we can inside the system, and allow Ruby to extend the system. The quickie example is this one. This is not a contrived example; this is the actual, copied-and-pasted implementation of Math that we use in the system, the one everybody uses. It's not just some slide, this is the real deal. This is the only implementation of Math that we have, and it looks exactly as you would write it. This is probably what you would write. That's the canonical idea of Rubinius: the way that you would write it the first time is probably the way that we would write it as well.

We aim for 1.8.7 compatibility, and we'll talk a little more at the end about future things, 1.9, that kind of thing. We run Rails 2.3 and we track Rails 3 head; we try to make sure that we're always going to be compatible with the next version of Rails so that people don't have that pain, and people can depend on that. A big part of the project has always been the ability to run C extensions. When we started it off, we really hammered hard to make sure that we can run C extensions, so that people didn't have to re-implement all of that functionality that they use: Mongo, all those different things.
Nokogiri, things like that, that we didn't want to have to redo again; we wanted to give that power to the system as well. We've got a number of spin-off projects such as RubySpec and FFI. And really, the idea is to be drop-in compatible. You could be using Rubinius; you could just drop it right in there and not have to really change much about what your workflow is.

Technology-wise, what does it mean to be a modern Ruby, a modern VM? We try to bring a lot of the technological innovations that have happened in the last 30 years to Ruby. Those are things like optimized machine code, a bytecode virtual machine, sophisticated garbage collection, all those things that computer science people have been working on. How can we bring those to Ruby? That's always been the aim.

A quick example of something that we really pride ourselves on is bringing these things in and making them available to Ruby in a way where you don't even really have to do anything; you just sort of get them for free. That's the idea of being drop-in compatible. So, normally in Ruby, when you've got an object with instance variables, you don't have to declare those instance variables. You can just use whatever ones you want and they just sort of appear on that object. In fact, you can even add instance variables in methods that were mixed in, and they still all work the way that you would expect. Consequently, in Ruby your instance variables are normally implemented sort of like this: basically just with a hash. We've got the keys and the values. It's what you would expect from a straightforward implementation of this: you don't know what the keys are going to be in the future, so you leave it up to some kind of implementation that lets you add arbitrary keys.
The problem is that if I have a thousand of these objects, a thousand instances, now I have a thousand hash tables, and all of them redundantly have a key "name" that points to some value and a key "age" that points to some value. Those keys are all redundant; all the hash tables are the same. Because we implement so much in Ruby, we needed to figure out how to get our memory usage down. We have so many Ruby objects that if we improved how objects use memory, memory usage would just go down out of the box.

So we came up with a technique that lets us turn it from this into something more like this. Now what we've got is the instance variable values in a simple array, plus a shared hash that lives on the class. Instead of a thousand of these hashes, you've got one of them, and each instance has just an array; the shared hash maps the names to indexes in that array. By and large, this ends up being a huge memory-efficiency improvement. And because we do so much of the processing in Ruby, because we've got a really nice, sophisticated VM to do these kinds of things, we can just spy on all the instance variables that a class might be using and add those to this hash one after another. You don't have to do anything to improve the memory usage of your program; you just write Ruby in a very normal way, and our system can figure out the best way to lay these things out in memory. Just to emphasize the point: this is actually what it ends up being inside of Rubinius. There is not actually a separate array; we get this nice, very simple, compact representation of the object with the actual values stored right there.
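The layout trick can be sketched in plain Ruby. This is only an illustration of the idea, not Rubinius's actual internals; the class names are made up. The naive version gives every instance its own hash of names, while the packed version keeps one name-to-slot map on the class and a flat array of values per instance:

```ruby
# Illustration only: naive hash-per-instance layout versus a packed
# layout with one shared name->slot map per class.

class NaivePoint
  def initialize(x, y)
    # Every instance carries its own hash, with redundant keys.
    @ivars = { :@x => x, :@y => y }
  end
end

class PackedPoint
  # ONE map, shared by every instance of the class.
  SLOTS = { :@x => 0, :@y => 1 }.freeze

  def initialize(x, y)
    # Each instance is just a flat array of values.
    @values = [x, y]
  end

  def ivar_get(name)
    @values[SLOTS.fetch(name)]
  end
end

p PackedPoint.new(3, 4).ivar_get(:@y)  # => 4
```

A thousand PackedPoint instances share a single SLOTS map, whereas a thousand NaivePoint instances carry a thousand copies of the same keys; that is exactly the redundancy being eliminated here.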
And this has a really big effect: I think we went from something like 120 megs in a very simple case down to something like 80 megs. That was a really big improvement from just this one simple thing. This is a quickie example of the kind of thing that we try to do every single day: keep building a better VM, keep building it up, but in a way that you get for free.

One thing that we try to emphasize a lot is that we want to be friendly to developers. If I'm selling something, I'm selling it to other developers. So it's really about developers. Developers, developers, developers, developers. And what do developers love most? They love APIs. They love being able to poke at Twitter and fiddle with things. So we build our tools around APIs as much as we can. If we want to build a tool, we build an API for it first and then we build the tool on top of it. Even if the API is kind of crufty, even if it's kind of weird, we build some version-zero API that we can build on, that people can use, that we can get feedback on. That allows our users to build more, because they can use those APIs; they can extend them, they can take them to the next level in ways that we couldn't with just the tools.

A really great example of something that is very rarely an API is our compiler. Our bytecode compiler is fully API-driven. Originally we didn't set out to make it an API; it turned out that this was the easiest way to do it. But it's become really important. (And we've got a duplicate slide in there.) The real example of this is a language called Fancy, which is kind of just a language that a couple of guys thought it would be fun to write. It's different from Ruby.
It's actually more Smalltalk than Ruby, and they're just experimenting, as we all kind of do, with something new. But because Rubinius has its bytecode compiler all API-driven, they had the ability to get their language running directly on top of Rubinius bytecode, because they can reuse our compiler. They don't have to go through and rewrite all of the logic of a compiler; they can simply write to our API to create these really interesting applications that, personally, we don't have time to come up with. Because we're writing those APIs, people can do it for us.

So that's all nice and good, but if Rubinius is a drop-in replacement, why should you even bother to use it? If it just drops in, exactly the same as what you had before, then what's the catch? What's the big deal? Why should you spend the time to try it? That's what we're going to talk about. Okay, these slides do not go the way I thought they did, but that's okay. Because we say we focus on developers, and because we really are Ruby developers, we sit every day working on Ruby code. Occasionally we work in C++ or other languages for what we can't do in Ruby, but by and large, every day we sit there writing Ruby code. So we've had a lot of experience as Ruby developers writing what is essentially a very large Ruby program, and because we've got control of the whole system, we've got a lot of tools that were born from that need. From those two points, why you should use Rubinius and this idea of tooling, we've built a lot of interesting tools that can help you solve the day-to-day problems that you've got. So in this next section we're going to go over three simple problems that pretty much every Ruby programmer runs into, and how Rubinius can help you solve them. The first one is improving an algorithm.
Who here has ever gotten code from a co-worker that you didn't like and been told to improve it? Raise your hand. If your hand is not up, you should give one of these other people with their hand up a hug, because the reality is it can be extremely painful. And when you're told not just to fix a bug in it, but to actually improve it, that makes it even more difficult. So let's look at some tools that Rubinius gives you to help do that.

The first step of trying to improve an algorithm is always to benchmark. You should never start off trying to improve something performance-wise without benchmarking it first. So you write out a real simple benchmark script; it looks something like this. You've got your co-worker's code and you know that you need to make calculate_awesome_score faster. But you don't know what "faster" is yet, because you don't know what the speed is right now. So you write this to measure it, and you get some output that looks like this.

And really, what everyone in here did is look at the rightmost number. Probably very few people even looked at the other three numbers. So we're going to take a minor tangent and talk about this output, because all of it, not just the rightmost number, will inform how we figure out what our solution is. Let's break it out into slides that are a little easier to see. User is really low, system is super low, this total looks like the addition of those two, and then real is massive. What do these individual columns mean? User is the time that your program spent running user code; in other words, the time running loops, doing calculations, all that kind of stuff. System is the time spent inside the kernel. So when you go out to do I/O,
that's where you're actually asking the kernel to perform some operation for you, so your user time stops and your system time starts. It's obvious, therefore, that the total is going to be the sum of those two things. And the real at the end there is the elapsed wall-clock time. What is wall clock? That's the time you would measure on a stopwatch: if you started a stopwatch when you hit enter and stopped it when the program finished, that is what real would say. And that's generally why people look at that bottom number, because it's the one the human experienced. You, running this thing, experienced the real number, the wall-clock time. But what you should be asking yourself is: why does the total not equal the real? If the only places to spend time were in user time and in system time, then why are these not equal? We'll get back to this in a second; knowing why will help you figure these things out in the future.

So we've benchmarked it. We go on to step two: you need to profile it. You know it's taking almost 30 seconds to perform this one operation. Where is this time going? Rubinius gives you a built-in profiler that's wired straight into the VM and gives you instrumented profiler results that are always available. So we've run the program again, this time passing -Xprofile. Generally in Rubinius we have these -X options, our internal options, so they won't collide with the normal Ruby options. In this case, we're saying: turn on the profiler. And we get some output that looks like this. We run it on our sample here, and we see... okay, well, this is not very useful. That's probably what you would say.
If you only stopped here, you might write me an email saying, "Evan, your shit sucks. I ran this benchmark, bro, and look, 99% of the time is in your code. I can't even use your benchmark." You close up your laptop and maybe you go back to your work, which is sad. Because you need to go a little bit further. You need to figure out: why is this thing, receive_timeout, whatever that is, taking up this much time? You can't stop here; you have to go to the next level. (Ooh, my slides. That's okay.) What we do is run it again, but this time passing an extra option. I'm not going to bamboozle you, it's just -Xprofiler.graph. This tells the profiler that we want more information about what happened. We don't want just the simple view of "this took up a lot of time, this didn't take up any time"; that's an oversimplified way of figuring it out. We want to know where this time is going, and profiler.graph gives us the ability to figure that out.

Now we're going to get a whole truckload of information, probably about 40 times more than we got out of the last run. It's going to be a lot of data. But remember, we already knew that this thing called receive_timeout was where the time went. So in this big truckload of data, we find the line for receive_timeout. And the quickest possible intro to this format: see how this line is indented further to the left than the thing above it? The line that is indented to the left is the primary entry. So what this is saying is that Kernel#sleep called receive_timeout, it called it 10 times, and it took up 30 seconds, 99.9%. Now we're on to something. What we're actually seeing here is how Kernel#sleep is implemented: it's implemented with this thing.
You don't need to worry about it, but you know, okay, sleep is the bomb diggity. That's what we need to be checking to look at. So what you do is you say, okay, where's the entry for sleep? And now you get this entry for sleep. You correlate them by saying this call entry for sleep. Now we look at primary entry for sleep. That's how we have it right here. And now we're getting somewhere, oh, look at this. We've got some co-workers code. We've got something called super girl factor. Huh? Well, what is that action? We got it. Now we're inside the co-workers code. Now we have a place to actually start looking at the code. So we go ahead and we find that method and we open it up and we find this. And you just, you probably want to shoot that, right? But now you've figured it out. You didn't have to go trucking through all their code. One method of trying to try and find this. You were actually able to pinpoint it. So now we're on to step three. You take that line out, you fix it. In this case, you take that line out and fire the guy. But, right, so you fix the problem, right? So now you, now you benchmark it again. And now, oh, now the line is, now everything is looking great, right? Now you get to go home, right? Now it's maybe a Thursday. You take a Friday off, boy. Oh yeah. So let's go back to our little mystery from before. Remember this guy? I'm sure you've all sort of figured out what it is, which is that sleeping is invisible to all but real time. So if in the process, if in the course of looking at your profile results, you see a total that is wildly different than your real. You've got really only two reasons that can be, one, your co-worker is fucking with you and has put a sleep in a method called super hair factor and then you want to get on the phone, right? Probably call somebody like that. Or number two, your server that is running this program is heavily loaded down. 
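You can watch the sleep case happen with nothing but Ruby's standard Benchmark library. A sleeping process burns wall-clock time but almost no CPU time, so user and system stay near zero while real grows. The method name here is just a stand-in for the co-worker's method from the story:

```ruby
require 'benchmark'

# Hypothetical stand-in for the co-worker's method: all it does is sleep.
def super_hair_factor
  sleep 0.3
end

t = Benchmark.measure { super_hair_factor }

# Sleeping is invisible to all but real time: user + system stay
# near zero while the wall clock keeps ticking.
printf "user=%.2f system=%.2f total=%.2f real=%.2f\n",
       t.utime, t.stime, t.total, t.real
```

If you see real dwarfing total like this and there is no sleep anywhere in sight, suspect the other cause: a heavily loaded machine.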
A heavily loaded server means that your wall clock kept ticking while your computer was emulating multiprocessing by swapping processes in and out. If you have a heavily loaded machine, even if your code is simple, even if the algorithm could run quickly from front to back, the kernel is so busy with everything else in the system that it can't get around to actually running your program. You see a massive skew between your real time and your total time. So that's actually a really good way to figure out whether you're running on a server that's having a hard time keeping up with the workload you're throwing at it. That's a bonus from knowing how those benchmark numbers work. So, improving an algorithm: solved. Go home, take Friday off, you deserve it.

Let's go on to the next one. Who here has ever had a slow or hung Ruby process? There we are. We'll leave that for later. What we've got inside of Rubinius is some specific functionality called the query agent that lets us ask questions of running Ruby processes. Let's see a little of how that works. Here we've got some random server; I don't know what it is, and in this case it happens to print out the time it started up. It's just a server doing some normal servery thing, but it's totally hung. So we want to use the query agent to inspect what's going on. We start it up again, but this time we've added the -Xagent.start option. We've told the system: oh yeah, by the way, go ahead and start up the query agent. There's no overhead for doing this; ideally you'd basically do this on every process, because you wouldn't pay anything for it.
Ideally you'd run this on all your server processes, so that when you run into this scenario you can fix it. The query agent is built in, and we've also built a simple console that can talk to other processes running the query agent to figure out what they're doing. In this case, we've opened up another terminal on the same machine and run rbx console. What it actually did was figure out: oh, look at this, there's another VM already running on this machine, connected to this port, some random port, and it automatically connected to it. It does this because it sees there's only one other Rubinius there, so that's probably the one you wanted to connect to. You can verify that by looking at the PID, and you can see it's the same PID we had over here. So you know: okay, I'm connected to my server now. Then you can just say "backtrace" and you get this large backtrace. Let's bump it up a little bit here. What we've actually asked the hung process for, and remotely retrieved, is a backtrace of exactly what every single thread is doing; in this case, it's only one thread. Now we can just look through this backtrace to figure out why it's hung. What is it doing right now that is causing it to (a) be slow or (b) be completely stuck? Here we really just want to look at these top few lines, because most of the rest is just Rubinius internal code loading that script. In this case, okay, we've got this important server that's getting a request and then doing a... oh, look at this. Hmm. Maybe it's stuck inside some IO request.
Because so much is written inside Ruby, these backtraces can be very detailed, and they can tell you exactly what's going on. In this case, we can see that we're actually trying to fill some IO buffer to get some output. Our process is hung just waiting on some IO.

Additionally, you can use the query agent protocol to script this. Say your process is running slow, maybe running in spurts; it's only doing five requests per second instead of, I don't know, 5,000, and you want to figure out why. You can easily script this to grab many backtraces, compare them over time, and get snapshots of what your process is doing as it progresses. Here we've added an extra option that outputs these QA lines, which give you a little information about what the query agent is doing. Now we can pass another option to console, saying: connect to that specific port. We ran with verbose, it gave us the port number, so now we can say connect to that port, give us a backtrace, and exit. So now you can actually script this. You can give this to your system administrators and say: if you've got slow processes, this is the tool you use. Run this thing, it'll grab, say, 30 backtraces over the course of 30 seconds, and you can use those to figure out why it's slow and what kind of information to give the developers to fix this specific problem. Additionally, like I said earlier, we're focused on how we can build tooling to make these things better. Console is actually just the simplest possible interface to the API that we've built; the query agent itself is an API.
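That grab-a-backtrace-repeatedly idea isn't Rubinius-specific. Here is a minimal sketch of the same sampling loop in portable Ruby, using Thread#backtrace in-process where the query agent would do it remotely; the method name, sample counts, and intervals are just placeholders:

```ruby
# Sketch: sample every thread's backtrace repeatedly and tally the
# innermost frames, so the hottest (or stuck) spots float to the top.
# The query agent lets you do this remotely; this does it in-process.
def sample_backtraces(samples: 30, interval: 0.1)
  tally = Hash.new(0)
  samples.times do
    Thread.list.each do |thread|
      frames = thread.backtrace
      tally[frames.first] += 1 if frames && !frames.empty?
    end
    sleep interval
  end
  tally.sort_by { |_frame, count| -count }  # hottest frames first
end

worker = Thread.new { loop { sleep 0.01 } }  # a pretend stuck server thread
sample_backtraces(samples: 5, interval: 0.02).first(3).each do |frame, count|
  puts "#{count}x #{frame}"
end
worker.kill
```

If one frame dominates every sample, that is your slow spot or your hang, which is exactly the comparison-over-time trick described above.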
So, this is the Ruby code that you would use to execute that backtrace. There's just a class called Rubinius::Agent; you give it a port and a host to connect to, so you can actually connect over a network to do this as well, and then you just say get "system.backtrace". The query agent is organized completely around variables, so the backtrace is just available as a variable called system.backtrace. Every time you read it, it gives you a live backtrace; these aren't static variables, they can be dynamically generated, like the backtrace is. So you can take this code and integrate it into your monitoring, so that when processes are being slow, you have tooling to go onto the machine and say: give me backtrace snapshots of all of, say, our Unicorn processes right now, I need to find out what all of them are doing at this exact moment. That is invaluable for figuring out why something is broken, or just why it's not performing the way you'd like.

So the query agent: what is it exactly? It's a socket-based API that's wired directly into the VM itself, and the VM is what you're talking to. That way, even while the Ruby side is hung, you can still talk to the process and get information out of it. Like I said, it's organized around simple get and set variables, so it's a very simple API to use, similar to SNMP if anybody did that in a former life. I'm sorry, but this is actually much easier than that. So anyway: slow and hung process, solved. You get to go home again. Good job, now you go home on Wednesday.

Let's move on to our last one, which is memory footprint. Who here has had a Ruby process with an unknown memory usage bug? It just keeps growing and you're like, what is going on with this craziness? You suspect there's some kind of memory leak, but you can't figure out where it might be.
We have a tool built in called heap dump that will let you figure out exactly where your problem is. The heap dump can be accessed through the query agent, so that's how we're going to use it in this case. We've started a process that's just sitting there doing nothing; imagine it's your server process, the one having this memory problem, and you want to figure out what's going on. We come back to our friend console, and rather than just asking for a backtrace, this time we're going to set a variable. The variable is called system.memory.dump, and the value is a file name. Like I said, these are dynamic variables, so the VM knows that when you set this one, it should initiate a heap dump right now. So now, not only can you get backtraces of hung processes; if you suspect their heaps have grown out of control, you can get the actual memory layout of those exact processes to help you track the problem down. We run this, jump back over to the other terminal, and see that it's printed that the heap has been dumped to this very creatively named heap.dump file.

This heap dump format is a very simple format that we came up with, and we've written a reference Ruby class for reading it, so you can use that to mine it for information. We've got a couple of simple tools on top of it. A really basic one that I whipped up for these slides gives you a histogram of what was going on inside that heap dump. This is similar, if anybody was in the last talk with Jerry, where they were talking about showing you interesting information about the heap; this is exactly the same thing.
What we've got here is all of the classes, the number of instances of each class, and the number of bytes those classes are taking up. In this particular case we can see, okay, there's lots of Rubinius internal stuff, which is fine. And then, oh, look at this big crazy class: there's a thousand of these. And we know, okay, I don't remember intending for there to be a thousand of these; maybe I intended for there to be one. So now we know exactly where these thousand things are, and we've got really solid information to go on in figuring out why we've got this memory problem.

Like I said, the heap dump is a complete dump of your object layout. It models all the references between all the objects, and you can use it to build whatever tools you want on top of it: a stable format with a reference reader, like I said. And I've been working on a Rails front-end for this, called Gage, that makes it much easier to visualize these things than the console does; it lets you browse around, and we'll do a little demo at the end of the talk if we've got time, which we might, depending on how hungry you guys are.

Future work on this is tracking allocation sites. This is probably the thing that's missing most from this feature. What we're working on for future releases is the ability to say: okay, I need a heap dump, but I also want you to track, for, say, the next 20 minutes, who is allocating all the objects. Then you can drill down even further, to which lines of code and which classes are actually allocating a lot of objects. That gives you even more introspection. It's not there yet; it's something we're working on shortly. So, memory footprint problem: mostly solved. Once we get that allocation tracking in there, it'll be much more useful.
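Both of these ideas have rough in-process analogues on modern MRI via the objspace standard library, if you want the flavor of them today. This sketch walks the live heap for a class histogram like the one described above, then uses MRI's allocation tracing, the same capability described here as future work, to ask where an object was allocated:

```ruby
require 'objspace'

# A histogram of the live heap: { class => [count, bytes] }, the same
# shape of report as the heap-dump histogram tool described above.
def heap_histogram(top = 10)
  stats = Hash.new { |h, k| h[k] = [0, 0] }
  ObjectSpace.each_object(Object) do |obj|
    entry = stats[obj.class]
    entry[0] += 1
    entry[1] += ObjectSpace.memsize_of(obj)
  end
  stats.sort_by { |_klass, (count, _bytes)| -count }.first(top)
end

heap_histogram.each do |klass, (count, bytes)|
  printf "%8d  %10d  %s\n", count, bytes, klass
end

# Allocation-site tracking: objects allocated inside the block remember
# the file and line that created them.
leak = nil
ObjectSpace.trace_object_allocations do
  leak = Array.new(3) { "payload" }
end
puts "allocated at " \
     "#{ObjectSpace.allocation_sourcefile(leak)}:" \
     "#{ObjectSpace.allocation_sourceline(leak)}"
```

A class with a suspiciously large count at the top of that histogram is exactly the "big crazy class, there's a thousand of these" moment from the slides.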
But as is, it's still incredibly useful. If you've got a big Rails app, you can take a heap dump of a running process, copy it off to another machine, and mine it for information. You might mine it and realize: okay, wow, look at this, we've got 100,000 ActiveRecord objects live in memory at a time. That's probably not what we want in each individual process. You can use that to guide your debugging, to guide your deployment strategy, all those kinds of things.

So those are the three big problems I wanted to highlight for you, and interesting ways that you can use Rubinius to solve them. What's the future direction for Rubinius? 1.9 support, that's what we're working on. We've actually got three big features running in parallel right now. The first one is 1.9: Brian has started working on the syntax and everything associated with 1.9.2 support. The second one is Windows. We've sort of eschewed Windows up to now, mainly because we didn't have any Windows programmers working on Rubinius. So we've bucked up, bought a license to Visual Studio or whatever, and started to get this working on Windows, because we know that there's a huge market out there of people who, you know, aren't necessarily here, or a lot of them aren't here, but getting it working on Windows is important for a lot of people. The other one is the idea of full concurrency. Right now, Rubinius has a global lock. In other words, if you're running multiple threads, those threads aren't allowed to run concurrently. So one of the big features that we've been working on for the last few months, which will be in some future feature release, is getting rid of that global lock, so you can actually run things fully in parallel. And again, we can demo that if you guys want to see it.
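You can see what a global lock costs on any Ruby that has one. The CPU-bound work below is split across two threads, but under a global interpreter lock the threads take turns, so there is no speedup; on a fully concurrent VM, the threaded version can use two cores. The workload is just an arbitrary busy loop for illustration:

```ruby
require 'benchmark'

# An arbitrary CPU-bound workload (no IO, so a global lock fully serializes it).
def busy_work(n)
  acc = 0
  n.times { |i| acc += i * i }
  acc
end

n = 2_000_000

sequential = Benchmark.realtime do
  2.times { busy_work(n) }
end

threaded = Benchmark.realtime do
  threads = Array.new(2) { Thread.new { busy_work(n) } }
  threads.each(&:join)
end

# Under a global lock these two numbers come out about the same;
# with true thread-level concurrency the threaded one can approach half.
printf "sequential=%.2fs  threaded=%.2fs\n", sequential, threaded
```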
So the big question is, seeing all these really interesting things that Rubinius can do, and it being a drop-in replacement: I hope that you're asking yourself, really, why not use Rubinius? Why not give it a shot and see how well it works for your workload? And with that, thank you very much.