My name is Denis Ushakov and I live in St. Petersburg, Russia, which is awesome because in summertime St. Petersburg looks like this. But I should warn you: if you go to St. Petersburg in any other season, you will probably be disappointed, because it looks like this. I've been working at JetBrains for six years and I do lots of different stuff, and one of the things I do is work on debugging support for RubyMine. So I decided this would be a good opportunity to tell you what I know about the debugging API and debugging internals.
If you want to write a debugger, you need to be able to answer two basic questions. The first one is: where are we? For the debugger to hit your breakpoint or give you any information, it has to know where in the code it is positioned, that is, the file and the line. The second question is: what's going on? We need to know whether an exception has been raised, what values the variables have, and so on.
Let me start with a super small demo. Here's a really, really small program, and this is what the simplest debugging scenario looks like. We put a breakpoint on some line, we execute our program, and it stops on that line. We can see the variable values, we can see the global variables, we can see the stack frames and switch between them, and we can also evaluate different expressions.
So those were the two big questions, but there's also one very, very important side question: speed. We don't want the debugger to be super slow. We want it to be as fast as possible, ideally as fast as the program without the debugger. So during the talk I'm going to do some measurements. I'll take a small, really simple program with lots of iterations, and in order to avoid input-output effects the program produces no output. I'm going to measure the execution using the default Ruby benchmark framework.
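For readers following along without the slides, here is a minimal sketch of what such a measurement could look like. The method names our_action and go_to_railsconf follow the names mentioned later in the talk; the class name Attendee and the iteration count are assumptions made for illustration, not the speaker's actual benchmark.

```ruby
require 'benchmark'

# Hypothetical stand-in for the talk's workload: many iterations, no output.
class Attendee
  def go_to_railsconf
    # deliberately empty, so only call overhead is measured
  end

  def our_action(iterations)
    iterations.times { go_to_railsconf }
  end
end

ITERATIONS = 100_000 # assumed value; raise it to get a run lasting seconds

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
elapsed = Benchmark.realtime { Attendee.new.our_action(ITERATIONS) }
puts format('plain run: %.3f s', elapsed)
```

Running the same block again with a tracing API switched on gives the numbers to compare against this baseline.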
So here is our really simple, really small program with lots of iterations. To set the baseline, I need to run it without any debugging API enabled, and on my machine it takes about five seconds. Not bad.
So how are we going to find out where we are at the moment? Ruby 1.0 introduces a really simple and handy function called set_trace_func. It takes a proc, and every time a Ruby VM event happens, your block gets called and you get the event, file, line, ID, binding and class name. File and line are pretty obvious. ID is the name of the method that is currently executing, and class name is the class of the object on which that method is executing. Let's take a deeper look at the two other arguments.
The first one is the event, and there are seven groups of events. The most basic and the most frequent is the line event. It gets called on almost every line of your code, and sometimes it gets called twice for a line, but usually it's one line, one line event. The second group is call and return. These two are generated when the Ruby VM calls a Ruby method or returns from one. C call and C return are basically the same, except that they are generated when you are executing a C function. Class and end are generated when the Ruby VM starts interpreting a class body and when the class body ends. The raise event is generated when an exception happens. And Ruby 2.0 adds further events: b_call and b_return, which are generated when execution enters or leaves a block, and thread begin and thread end, which are generated when you start or end a thread. I should note that set_trace_func doesn't know about these new events.
The other interesting parameter is the binding. Basically it's the same thing you would get from Kernel#binding: it captures the execution context, such as variables, methods and their values, so you can reuse it to perform evaluations later on.
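A small sketch of the callback described above, for those who want to try it. The method greet and the events array are made up for this example, and the exact list of events you see can differ between Ruby versions.

```ruby
events = []

# set_trace_func calls the proc with six arguments on every VM event.
# The binding argument is named `bind` here to avoid shadowing Kernel#binding.
tracer = proc do |event, file, line, id, bind, classname|
  events << [event, line, id, classname]
end

# Hypothetical method to trace.
def greet
  :hello
end

set_trace_func(tracer)
greet
set_trace_func(nil) # passing nil removes the tracer

events.each { |e| p e }
```

The event names arrive as strings such as "line", "call" and "return", and the ID arrives as a symbol.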
So let's add some output to see how our program looks from a debugging point of view. First of all we see the call of the our_action method on an object, then a line event is generated for that method. Then we get a C call as we enter the times method. It's a C call because times is a core method and it's implemented in C. Then we get a line event for the times block, then a call event for the go_to_railsconf method, a line event for that method, and after that we are basically returning from all of those.
So does anybody have an idea how long this would take? Yeah, definitely. It takes about two minutes. Two minutes, up from five seconds. What's going on? Basically the problem is the binding. Evaluating the binding is really, really expensive. To build it we need to walk the stack to gather all the available variables and store their values, and that takes lots of time. So what should we do? We can just keep calm and wait for Rails. To be honest, I was able to boot a simple Rails application with set_trace_func enabled once, and I haven't tried it again.
So what should we do? Ruby 1.8.3 introduces a new function called rb_add_event_hook. You may notice that the code below doesn't look like Ruby, because it's C. You specify a function, basically a callback, that will be called. And one big difference is that you can also specify the events. So for example, if you don't need the line event, you can ask for all the other events only, and that makes execution faster. So let's try executing the same program with an empty callback, but listening for all events. It takes ten and a half seconds. Not bad, not bad.
Generally, when we have a C API, there should be a wrapper gem that you can use. Unfortunately, that's not the case here. There is a gem, but it's only compatible with Ruby 1.8.
So at some point I thought, okay, I'll just do some work, fix the compatibility and get it ready for 1.9 and 2.0. But actually I'm quite a lazy person. I really love being lazy. And thanks to Koichi, I can be lazy, because he did all that for Ruby 2.0. He brought us new APIs called TracePoint and Debug Inspector.
So what's TracePoint? It's a wrapper around the event hook with a nice object-oriented Ruby API. All you need to do is specify the events you want to listen for, and your block gets called every time one of those events happens. From the TracePoint object you can get all the information you can get from set_trace_func. Let's try our program with TracePoint. It takes about 30 seconds, which is not as good as the event hook, but still pretty good. And that's true unless we access the binding, because that makes the program run as slowly as it does with set_trace_func. So the biggest difference between the TracePoint API and set_trace_func is that the binding is evaluated lazily, so we don't spend lots of time on it unless we ask for it.
The second API I mentioned is Debug Inspector, and it's also about being lazy. When you use set_trace_func or an event hook, you need to continuously maintain your own list of frames and continuously capture the state of the virtual machine, so that you understand what's going on and can present all the frames when you hit the breakpoint. With Debug Inspector that's not necessary. When we reach the breakpoint, we can just call rb_debug_inspector_open, and our callback receives an object with the backtrace and all the VM internal information about it. But you may have noticed that it's also a C API. And as we know from the previous slides, if there is a C API, then there is probably a wrapper gem for it, and it's obsolete. Well, no. There is a pretty handy debug_inspector gem, and if you want to access all those APIs from Ruby, it's quite easy to do so.
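Going back to TracePoint for a moment, here is a minimal sketch of the API described above. The method body and the calls array are made up for illustration. The point is that the block only pays for what it reads: here it reads the event, the method name and the return value, and never touches tp.binding.

```ruby
calls = []

# Listen only for Ruby-level method calls and returns.
trace = TracePoint.new(:call, :return) do |tp|
  # return_value is only available on return events
  value = tp.event == :return ? tp.return_value : nil
  calls << [tp.event, tp.method_id, value]
  # tp.binding is available here too, but it is only built if we ask for it
end

# Hypothetical method to trace.
def go_to_railsconf
  :arrived
end

# With a block, the trace is active only while the block runs.
trace.enable { go_to_railsconf }

calls.each { |c| p c }
```

Adding :line to the event list is what a debugger needs in practice, and it is also where most of the overhead comes from.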
So you just call RubyVM::DebugInspector.open, you get your object, and from it you get your locations, and you can get the bindings from the frames, or the classes, and so on. The only limitation of this API is that you cannot use the object you receive in the block outside of that block. So if you need to access the bindings outside the block, for example, you need to capture them and store them elsewhere.
So here comes a small performance summary. We get five seconds with a plain program run. set_trace_func is really, really slow: two minutes. The event hook is super fast, and TracePoint is in between. I should add that it's possible to get almost the same performance with TracePoint as with the event hook when the callback is written in C. Unfortunately, when we run a debugger, we have to watch for all events, and as you have seen, this small program generates nine events per two calls, and that's a lot. For Rails, it's even more.
So who's using what? set_trace_func is used by debug.rb, which comes with your Ruby interpreter, and that's why it's so slow. The event hook is used by the ruby-debug-base and debugger gems. ruby-debug-base is basically the debugger front-end for ruby-debug and ruby-debug-ide. The event hook was also used by rcov, which was used to capture coverage on 1.8 until we got the sweet coverage API that's used by SimpleCov. And TracePoint and Debug Inspector are used by the debase, byebug and binding_of_caller gems. So they are 2.0 only, and the debase gem is a front-end for ruby-debug-ide. But under the hood, everyone is using the event hook: set_trace_func registers a hook that is executed on every event and builds the binding eagerly, and TracePoint does the same, except that it creates an object that evaluates the binding lazily.
So let's continue speaking about being lazy. I love being lazy, but you know what? Your CPU also loves being lazy. If you don't push it too hard, it can do things faster.
With the TracePoint API, we still have to check on every event whether we are on a breakpoint line or not, and that's a pretty big performance impact. I was measuring everything with empty hooks, so you can imagine that if you do that check on every event, it's going to take even longer.
So here comes Rubinius. In Rubinius, you don't need to check explicitly for a line and a file. What you do is basically this: when your script is compiled, you get an execution block, and you get the instruction for that particular line. So here's how it looks, the debase section for Rubinius. This hook is called when the script is compiled. In it we check that the script matches our file path, and if it does, we locate the execution block and the instruction pointer for our line. If we find them, we just set a breakpoint there, and that makes Rubinius debugging super, super fast. Here's the comparison between a plain run and a run with the full debugger enabled: we basically don't have to check anything, and our breakpoints are called at the moment we reach them. So it would be cool if we had such an API in Ruby.
So basically, I think that's all I know about the Ruby debugger. You can find me on Twitter and on GitHub, and you can take a look at the examples and the benchmarks I used during this talk, and maybe experiment a little bit and see what's going on inside your Ruby.
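As a footnote to the point about checking every event against the breakpoint list, here is a rough sketch of that check using TracePoint. It is not how debase or byebug are implemented. The method sum_up_to, the breakpoints set and the hits array are invented for this example.

```ruby
require 'set'

# Hypothetical method to debug.
def sum_up_to(n)
  total = 0
  1.upto(n) do |i|
    total += i
  end
  total
end

# Put a "breakpoint" on the `total += i` line, three lines below `def`.
file, def_line = method(:sum_up_to).source_location
breakpoints = Set[[file, def_line + 3]]

hits = []
checker = TracePoint.new(:line) do |tp|
  # This lookup runs on every single line event, whether it hits or not.
  next unless breakpoints.include?([tp.path, tp.lineno])
  # Only on a hit do we pay for the binding and evaluate in it.
  hits << tp.binding.eval('i')
end

checker.enable { sum_up_to(3) }
p hits
```

The set lookup is cheap on its own, but it is paid on every line of the traced program, which is the overhead the Rubinius approach avoids.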