All right, so this is dispelling the dark magic inside a Ruby debugger. My name is Daniel, and I watched the eclipse this summer. Who here traveled to see the total eclipse? Anyone? Okay, a fair number of hands. I really felt like it was worth it. It was a great experience. I packed up my family. We drove a few hours south from where we live in Seattle, down to Central Oregon, and converged on a small town called Madras, Oregon, along with pretty much the entire West Coast, I think. It was an epic traffic jam getting out of there. It was amazing. But the eclipse was really awesome. This was a photo that I took right at the beginning of totality. Amazing, but a really surreal experience. I remember looking up at this, and it's a weird feeling. It's like your body is telling you that it's not supposed to look like that. Like there's something wrong. You're seeing a special effect or something like that. It's just this eerie experience. And I remember thinking that I'm starting to understand the reaction that some of our pre-scientific, pre-industrial ancestors might have had seeing something like this and concluding that, oh man, the world's about to end, or there's some sort of dark magic involved here. Because looking at this, I realized that the only thing that was really keeping me from having that same reaction was the knowledge of what was going on, right? The understanding of a little bit of the inner workings of what it was I was seeing. And I realized that as a developer, I often have a similar response to some of my tools. In particular, debuggers. I've been developing professionally since the late 90s or so. And throughout that career, pretty much that entire time, I've been very hesitant to use debuggers. And it's because they felt kind of eerie, kind of spooky. These tools are doing something bizarre to my program. They're casting weird spells on the virtual machine.
They're doing something that's stopping my program from running and causing the machine to do something else. It just felt eerie. It was almost intimidating. In general, I just wasn't comfortable with these tools. And so I spent pretty much my entire career debugging and troubleshooting using puts, or printf, or whatever the call happened to be in the languages I was using. I was a puts debuggerer. And I know I'm not alone in that. This was an article that some of you may recognize. One of our Ruby luminaries, a member of the core team, wrote this early last year. Puts debuggerers, we call ourselves. Well, it turns out that debuggers really aren't as spooky as they may seem at first. And to prove that, we're gonna spend this hour looking under the hood at a debugger. We'll see how it works, what it's doing. We'll see that it's not dark magic. It's just Ruby under the hood. And it's actually not that complicated either. In fact, we're gonna go ahead and just implement a debugger right here. We'll live-code a debugger. It's actually not that hard. It'll just take a few minutes. And then afterwards, if we have time, we'll spend some time looking at a real-life production debugger and some of the techniques that we used to implement it. We'll look at some of the debugging facilities of the Ruby VM. It'll be fun. So, let's see. Before I get started, a little bit about myself. As I said, my name is Daniel. I've been developing since the late 90s or so. I started with Ruby around 2005, right as Rails was starting to enter the picture. I joined a series of Rails startups. Then, about four and a half years ago, I decided to switch gears a bit. So, I joined a kind of a small company. I'm part of the cloud platform team, and I work on stuff to help Ruby developers use the cloud platform. So, it's actually been a really good gig. I've really enjoyed it.
But, because I work at a small company, there's a little bit of small company bureaucracy that I have to get through. So, all the code that you see here, sample code and live coding, is copyright the small company and Apache 2 licensed. Okay, so with that out of the way, let's write a debugger. We'll create something very straightforward. We'll create a library that will let you set breakpoints. It will let you open a command shell, a debugger shell, to inspect the program state, and it will also let you step through the execution of your program. So, the basic functionality that you'd expect in a debugger. So, let me go ahead and switch over to my editor here. And we'll start by taking a really quick look at a sample toy program that we'll use to demo our debugger features. So, this is just a quick program that prints out a little hello message. The sender and the recipient of the message are passed on the command line. The sender is set in an instance variable. The recipient gets passed to a method. We have a method that prints out a hello message, and another method that prints it out twice, because why not? And that's it. So, here's how that actually works. Read.read.rb. So, hello world from RubyConf, okay. That was fun. Let's go write a debugger. All right, so, we're gonna start by managing breakpoints. We'll be able to add breakpoints to a program, and we'll store breakpoints in a struct. Each breakpoint will have a name, and then the file and line that the breakpoint is on. We'll make all of the methods in this class class methods, just for simplicity's sake. So, we'll have a method to add a breakpoint: name, file, line. And we'll just store all the breakpoints in an array. We'll need an initialization method to initialize that array of breakpoints. And we'll make sure that that initialization gets called when we require our file. Okay, so, we have a simple library that lets you add breakpoints to an array.
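The breakpoint bookkeeping described so far could look roughly like the sketch below. The talk's live-coded source is not in the transcript, so the names here (`MiniDebug`, `Breakpoint`, `add_breakpoint`, `init`) are assumptions chosen to match the narration, not the speaker's actual code.

```ruby
# Minimal breakpoint registry (names are illustrative assumptions).
module MiniDebug
  # Each breakpoint has a name plus the file and line it is set on.
  Breakpoint = Struct.new(:name, :file, :line)

  # Set up the (initially empty) list of breakpoints.
  def self.init
    @breakpoints = []
  end

  # Record a breakpoint. Nothing acts on it yet.
  def self.add_breakpoint(name, file, line)
    @breakpoints << Breakpoint.new(name, file, line)
  end

  def self.breakpoints
    @breakpoints
  end
end

# Runs as soon as the file is required.
MiniDebug.init

# In the toy program: register a breakpoint on line 11 of this file.
MiniDebug.add_breakpoint("my breakpoint", __FILE__, 11)
```

At this stage the library only stores data, which is why running the toy program with a breakpoint set changes nothing.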
We'll go back to our toy program and we'll require that file. And we'll set a breakpoint. Let's see, so, we'll give it a name, my breakpoint. We'll set the breakpoint in this file. And we'll set it over here on line 11. All right. So, we've added a breakpoint to our program. Let's see what happens. And nothing happens, because we added a breakpoint to an array but we didn't actually do anything with it. We don't have any code that breaks at that breakpoint. So, how do you do that? How do you interrupt a running Ruby program to implement a breakpoint? Well, one thing that you can do is use a powerful class in the Ruby core library called TracePoint. It's part of Ruby core. It's been there since Ruby 2.0, I believe. It's a Ruby class that lets you listen for events that happen at the Ruby virtual machine level. It lets you register callbacks that will get called whenever those events take place. So, for example, this code listens for the method call event. Whenever a method is called, it executes this block, which prints out a little message. TracePoint also passes an object to the block that includes some information about the event that just happened. For example, the name of the method. A number of events are supported: method calls, returns from methods, exceptions being raised, threads starting and ending. There's even an event for moving to the next line in your Ruby application. And it's this event that we're gonna use for now to implement breakpoints. So, we'll register a callback that, each time we move to a new line in the program, searches through our breakpoints and asks, do we have a breakpoint here at this line? So, let's go back to our code and implement that. We'll create a trace point. Make sure that we call this method from my initialization. I often forget that when I practice this. So, TracePoint. Okay, we're gonna trace the line event.
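The slide showing the method call listener is not captured in the transcript. A minimal sketch of that kind of listener, using only the documented `TracePoint.new`, `#enable`, and `#method_id` calls, might look like this (the `greet` method and `SEEN_CALLS` constant are made up for illustration):

```ruby
SEEN_CALLS = []

def greet(name)
  "Hello, #{name}!"
end

# Listen for Ruby-level method calls. The block receives the TracePoint
# itself, which carries details about the event that just fired.
tracer = TracePoint.new(:call) do |tp|
  SEEN_CALLS << tp.method_id
  puts "Called method #{tp.method_id}"
end

# Tracing is active only while the block runs.
tracer.enable { greet("world") }
```

Swapping `:call` for `:line`, `:return`, `:raise`, and so on subscribes to the other events the talk lists.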
And each time we move to a new line, we'll search our breakpoints. We'll look for a breakpoint with the same file as where we are, and also the same line number. If we find a breakpoint, then for now, let's just print out a little message that says, hey, we found a breakpoint. We can print out the name of the breakpoint. And again, there's a bunch of information that's available in the TracePoint object. For example, the name of the class that we're in. There is also the method, as we saw. Let's also, just for kicks, print out the file name and the line that we're on. All right, so we set up a line trace point that should print this out when we hit a breakpoint. So, let's see if that works. And there we go. Remember, we set our breakpoint here on line 11. And it looks like we hit that breakpoint. My breakpoint, in the method hello twice, on line 11. So, there we go. We can detect breakpoints. Now, once we know we've hit a breakpoint, what do we do? A lot of debuggers provide a command line interface that lets you start interacting with the program at that point, so you can query the program, see what its state is, see what's going on, and get information about how the program is operating. So, if we wanted to add a command line to our debugger, an easy way to do that is to use IRB. Now, many of you are probably familiar with IRB as the Ruby REPL. You can invoke it from your shell, run Ruby code in it, and see what happens. But it's also a part of the Ruby standard library, and you can call it from within your Ruby program. So, what we're gonna do is use another method on the trace point, binding. This returns a binding object that provides a bunch of context information about where you are in your program and what the state is. When you've required IRB, it adds a method, an irb method, to the binding object that opens an IRB shell using that binding as the context.
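The line trace point and the binding it exposes can be sketched as follows. This is an assumed reconstruction, not the code from the talk: the module and class names are invented, and so that it runs unattended it reads a local variable through `tp.binding` instead of opening a shell. The comment marks where `tp.binding.irb` would go.

```ruby
module LineBreakDemo
  Breakpoint = Struct.new(:name, :file, :line)
  BREAKPOINTS = []
  HITS = []   # records each hit so the result can be inspected afterwards

  # Fires before every new line of Ruby code while enabled.
  TRACER = TracePoint.new(:line) do |tp|
    bp = BREAKPOINTS.find { |b| b.file == tp.path && b.line == tp.lineno }
    next unless bp

    puts "Hit #{bp.name} in #{tp.defined_class}##{tp.method_id} " \
         "at #{tp.path}:#{tp.lineno}"
    # tp.binding captures the paused frame. After `require "irb"`, calling
    # tp.binding.irb here would open an interactive shell in that frame.
    HITS << [bp.name, tp.method_id, tp.binding.local_variable_get(:recipient)]
  end
end

class Greeter
  def hello_twice(recipient)
    first = "Hello, #{recipient}!"
    [first, first]
  end
end

# Put the breakpoint on the first body line of hello_twice.
file, def_line = Greeter.instance_method(:hello_twice).source_location
LineBreakDemo::BREAKPOINTS <<
  LineBreakDemo::Breakpoint.new("my breakpoint", file, def_line + 1)

LineBreakDemo::TRACER.enable { Greeter.new.hello_twice("world") }
```

Looking up the line through `source_location` is only a convenience for this sketch. The talk hard-codes line 11 of the toy program.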
So, that's what we'll do. When we hit a breakpoint, we'll open an IRB shell. So, let's try that. We'll go back to our program, and we hit the breakpoint, and we've got an IRB shell. All right, now here's where things get interesting. We've got an IRB shell open with the current context here on line 11. What's going on here? Well, we have a local variable, recipient, that's in scope. So, we can look at its value, and we can see that it's world, right? We're in the context of an object. We have access to the instance variables, such as sender. So, we can see its value. Since we're in the context of this object, we can call methods. Let's try calling hello, and we can see what it does. We can even change the values of things. You can change the state of the program. So, we have the recipient local variable. Let's change its value to, say, New Orleans. Let's change the sender, also, to Unity, say. So, we can change the state of the program. And now, when we continue execution of the program by just exiting out of our IRB shell, the program will continue with that changed state. So, now we've got a new output for the program. So, that's pretty much it. You know, that's the basics of a debugger. Real production debuggers will often include additional features on top of this. Since we have some time, let's try implementing a few of those. So, notice that we continued program execution using IRB's exit command to just exit the IRB shell. Normally, a debugger shell will use a command like continue, or since continue is a reserved word, maybe cont, to continue execution of the program. So, if we wanted to offer that as a command, that is, add a command to IRB, the easiest way to do that is to add methods to a particular module, IRB::ExtendCommandBundle. Any methods present in this module get added to IRB as commands. So, let's create a command called cont.
We'll defer the implementation of this back to our mini debug module, and we'll pass in an object called irb_context. This is an object that's available to IRB commands, and it lets you interact with the IRB session. So, if we go back to mini debug and we implement this, cont, pass in the IRB context, and again, to continue the program, we just have to exit the IRB shell. So, we'll do that: irb_context.exit. Now, when we go back into our debugger, we should be able to use the cont command to continue the program, and there we go. Okay, what else can we do? Let's implement stepping. So, in our example, we break at line 11. Stepping means that you can execute one line of your program, and then go back into your shell so you can see what effect that line had. You can do this repeatedly, so you can go through your program step by step, line by line, and see how its state evolves over time. So, to implement a line by line process, well, we already have something that lets us do something line by line. That's our trace point here. So, let's modify this to implement stepping. Instead of opening our IRB shell just when we hit a breakpoint, we'll also open that shell if we're in a stepping mode. So, let's create a stepping mode, and we'll initialize it here in our initialization, stepping to false. When we hit a breakpoint, we can set it to true, and then as long as we're stepping, that's when we will open our debugger shell. Now, continue means you wanna continue your program and stop stepping, so let's turn it off here. Okay, so we've got a stepping mode here. How do we actually implement step? Well, first we need to create an IRB command for it, so we'll do that, step, and then we'll implement it.
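The `cont` command described above might be wired up roughly as follows. This is a sketch under assumptions: it relies on `IRB::ExtendCommandBundle` picking up methods as commands and on `irb_context.exit` ending the session, as the talk describes for the IRB of that era. Newer IRB releases may register commands differently, so check the IRB documentation for the version you run. The `MiniDebug` name is assumed from the narration.

```ruby
require "irb"

module MiniDebug
  # Continuing just means leaving the IRB session so the program resumes.
  def self.cont(irb_context)
    irb_context.exit
  end
end

module IRB
  module ExtendCommandBundle
    # Methods defined here were exposed as commands at the IRB prompt in the
    # IRB versions the talk targets. irb_context is supplied by IRB itself.
    def cont
      MiniDebug.cont(irb_context)
    end
  end
end
```

Keeping the command body in `MiniDebug` means the debugger can later reset its own state, such as the stepping flag, in one place before exiting the shell.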
Like cont, like continuing the program, stepping just involves exiting out of IRB, but unlike continue, stepping doesn't turn stepping mode off. We leave it on so that the next time we hit our line trace point, we'll go back and open the debugger shell again. So that's stepping, let's try that out. We'll go back and look at our program. Again, we are breaking on line 11, so there it is. Now if we step, we've executed a line, and now we're on line seven, which is here. So see what happened here. We were on line 11, we executed one line, and that was a call to the hello method, so the next line that gets executed is here, line seven. If we step again, now we're on line 12. So we were here, we exited the hello method, and we're back here on line 12. Okay, and again, if we cont, then the rest of the program will run. Notice what happened here. When we stepped that first time, we went into the implementation of this hello method. That's called a step in. You're stepping into the implementation of something that you're calling. Sometimes that's a useful thing to have. Other times you'll want to use a variation on this called step over. Step over means that if you make a method call, you'll step over the call to that method. So you won't break in the implementation of that method, but you'll break on the next line of the method that you're currently in. So for example, from line 11, step in puts you on line seven, while step over will step over this hello call and put you on line 12. So how would we implement this? We need to keep track of method calls and method returns so we know when we're back in our original method. Well, we have a way to keep track of things like method calls and method returns. Again, TracePoint. So we'll go back here and try implementing step over. And the way we'll do that is we'll just keep track of our current stack depth, and every time we hit a method call or a method return, we'll modify that.
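The stepping mode described above, a flag that `step` leaves on and `cont` switches off, can be sketched without an interactive shell. In this assumed reconstruction a scripted list of commands stands in for what a user would type at the IRB prompt, and all names (`StepDemo`, `script`, `stops`) are invented for illustration.

```ruby
module StepDemo
  class << self
    attr_accessor :stepping, :break_at, :script, :stops
  end
  self.stepping = false
  self.script = [:step, :step, :cont]  # scripted stand-in for typed commands
  self.stops = []                      # method names where the "shell" opened

  TRACER = TracePoint.new(:line) do |tp|
    # Hitting the breakpoint switches stepping mode on.
    StepDemo.stepping = true if [tp.path, tp.lineno] == StepDemo.break_at
    next unless StepDemo.stepping

    # A real debugger would open tp.binding.irb here and wait for input.
    StepDemo.stops << tp.method_id
    # "step" leaves stepping mode on; "cont" switches it off.
    StepDemo.stepping = false if StepDemo.script.shift == :cont
  end
end

class Greeter
  def hello(recipient)
    "Hello, #{recipient}!"
  end

  def hello_twice(recipient)
    first = hello(recipient)    # breakpoint goes here
    second = hello(recipient)
    [first, second]
  end
end

file, def_line = Greeter.instance_method(:hello_twice).source_location
StepDemo.break_at = [file, def_line + 1]

StepDemo::TRACER.enable { Greeter.new.hello_twice("world") }
p StepDemo.stops
```

The recorded stops go from `hello_twice` into `hello` and back to `hello_twice`, mirroring the line 11, line 7, line 12 sequence in the demo. That is step in behavior.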
So we'll initialize it at zero, and then we'll create another set of trace points. Once again, we'll make sure that this gets called when we initialize. So this time we're gonna trace the call event. And we'll also trace b_call. b_call is a call event for blocks, which for our purposes are kind of like methods. So each time we get one of those events, we will increment our depth. And then similarly, when we return from a method or from a block, we'll decrement our depth. So that will keep track of our stack depth. Now we know, at any point in our program, how deep we are in method calls. So to implement step over, let's go ahead and rename this so that we know what we're doing. Step over. To implement step over, we just have to record where we are, what our stack depth currently is, so that when we go and decide whether to open the shell again, we can ask, are we in that same method? So what we'll do is record this target, say target depth. We'll make sure that we initialize that also when we first hit our breakpoint. And then it's only when we're back at that target depth that we open our debugger shell. All right, so let's try that out. I hope I did that right. So again, we broke on line 11. Step over, and there we go, we're on line 12. So that's step over. We saw step in. There's also step out, meaning that you don't break into your debugger shell again until you've left the current method, so you're one level, one stack depth, above where you started. As you can imagine, you can use that same mechanism to implement step out. I won't do that here for time, but it should be pretty straightforward once we have a way to track our stack depth. So that's pretty much it. We have a debugger. We can set breakpoints. We have a command shell that we can use to look at the current program state, change things, and see what effect that has. We can step through the execution of our program.
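The depth tracking and target depth check described above can be sketched like this. As before, it is an assumed reconstruction with invented names, and a scripted command list replaces the interactive shell so that it runs unattended.

```ruby
module StepOverDemo
  class << self
    attr_accessor :depth, :target_depth, :stepping, :break_at, :script, :stops
  end
  self.depth = 0
  self.stepping = false
  self.script = [:step_over, :cont]  # scripted stand-in for typed commands
  self.stops = []                    # [method, line offset from breakpoint]

  # Track how deep we are in method and block calls.
  DEPTH_TRACER = TracePoint.new(:call, :b_call, :return, :b_return) do |tp|
    case tp.event
    when :call, :b_call then StepOverDemo.depth += 1
    else StepOverDemo.depth -= 1
    end
  end

  LINE_TRACER = TracePoint.new(:line) do |tp|
    if [tp.path, tp.lineno] == StepOverDemo.break_at
      StepOverDemo.stepping = true
      StepOverDemo.target_depth = StepOverDemo.depth  # remember where we broke
    end
    # Only stop when we are back at the depth we started from.
    next unless StepOverDemo.stepping &&
                StepOverDemo.depth == StepOverDemo.target_depth

    StepOverDemo.stops << [tp.method_id, tp.lineno - StepOverDemo.break_at[1]]
    StepOverDemo.stepping = false if StepOverDemo.script.shift == :cont
  end
end

class Greeter
  def hello(recipient)
    "Hello, #{recipient}!"
  end

  def hello_twice(recipient)
    first = hello(recipient)    # breakpoint goes here
    second = hello(recipient)
    [first, second]
  end
end

file, def_line = Greeter.instance_method(:hello_twice).source_location
StepOverDemo.break_at = [file, def_line + 1]

StepOverDemo::DEPTH_TRACER.enable do
  StepOverDemo::LINE_TRACER.enable { Greeter.new.hello_twice("world") }
end
p StepOverDemo.stops
```

Only the relative depth matters here, so it does not matter what the counter reads when the breakpoint is first hit. Step out would use the same counter but wait until the depth drops below the recorded target.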
This is a debugger that's usable. Let's see what it looks like. It's just about 60 lines of code or so. So it's not that hard. So let's see, since we have a little bit of time, let's go a little bit deeper. My team at my small company, among other things, implemented the Ruby version of a product that we have called Stackdriver Debugger. This is a debugger that operates a little bit differently from the one that we just implemented here. It's designed for web applications, live web applications, where you actually might have a user on the other end that's waiting for a response. So you can't set a breakpoint and then just open an interactive shell, because that stops your program and you have a user waiting for a response. You also have to be careful about what you do, because if you have a real user, you don't want to break things for that user. But it would still be really cool if you could still set breakpoints, if you could still interrogate the program state and see what's going on. Look at the behavior of your program, all without redeploying. So that's what Stackdriver Debugger is designed for. Google actually has been running an internal version of this tool for some time. We just fairly recently released a version of it for the cloud platform, so cloud customers can use it now. But as a production-caliber debugger, it's a little bit different. It employs a few different techniques, and some more advanced techniques. I'll go through some of those so that you can see some of the other things that debuggers can do. So first, threads. Typically in a Ruby web app, you might have multiple threads running. Each thread might be running a separate request. And so when you're in a debugging session, you often want to limit your analysis to one thread, one particular request that you're interested in. You don't want other requests to interfere with that. You don't want to interfere with other requests.
And so in particular, if you set a trace point, you want that trace point to be scoped to just one thread. Now, unfortunately, the Ruby TracePoint API applies globally. It applies to all threads at once. And so we kind of have to drop down to C. There is a C API for trace points that does happen to support thread-scoped trace points. But there's no way yet to access it from pure Ruby, so you have to write a C extension for this. So that's what Stackdriver's Ruby debugger does. Pretty much all or most of our trace points are thread scoped using this technique. In general, this is one of a number of advanced Ruby VM features that are exposed in C, but not yet in Ruby. Given how useful this particular one is for debuggers, it might be something that we should consider exposing at the Ruby API level. Second, speed. In the mini debugger that we built, we used a line trace to detect breakpoints. Now, this means that we're executing extra Ruby code for every line of code in our program, right? Of course, this can be slow. This can be a major performance hit, and I'm sure some of you were probably thinking that as we implemented this. We don't wanna hurt the performance of a user application in a live debugger like this. So we put a lot of optimizations into Stackdriver Debugger just to make sure that the impact on performance is minimal. One example: since line tracing is very slow, because it fires for every line, we actually selectively turn it on and off based on how close we are to a breakpoint. We can do this by using other trace points. This is a simplified example of what this kind of technique could look like. You can listen to certain events that could indicate that you're moving from one file to another or from one method to another, and then trigger off that to check, okay, am I close to a breakpoint now? Is there a breakpoint in this file or in this method? If so, then you turn on your line tracer.
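The slide with the simplified example is not in the transcript. One way to sketch the idea in pure Ruby, and explicitly not Stackdriver Debugger's actual implementation, is to let a cheap `:call`/`:return` trace point act as a gatekeeper that enables the expensive line tracer only while execution is inside a method known to contain a breakpoint. The `WATCHED` list is a hypothetical stand-in for a real breakpoint index.

```ruby
module SelectiveTrace
  WATCHED = [:hello_twice]   # hypothetical index: methods containing a breakpoint
  STACK = []                 # one entry per active method: is it watched?
  TRACED = []                # method names seen by the expensive line tracer

  LINE_TRACER = TracePoint.new(:line) { |tp| TRACED << tp.method_id }

  # Cheap gatekeeper: switch the line tracer on only inside watched methods.
  GATE = TracePoint.new(:call, :return) do |tp|
    tp.event == :call ? STACK.push(WATCHED.include?(tp.method_id)) : STACK.pop
    wanted = STACK.last == true
    if wanted && !LINE_TRACER.enabled?
      LINE_TRACER.enable
    elsif !wanted && LINE_TRACER.enabled?
      LINE_TRACER.disable
    end
  end
end

def unrelated
  x = 1
  x + 1
end

def hello(recipient)
  "Hello, #{recipient}!"
end

def hello_twice(recipient)
  first = hello(recipient)
  second = hello(recipient)
  [first, second]
end

SelectiveTrace::GATE.enable do
  unrelated
  hello_twice("world")
  unrelated
end
```

With this gating, lines in `unrelated` never reach the line callback, while lines in `hello_twice` do. A real implementation would also have to deal with blocks, exceptions that unwind several frames at once, and multiple threads, all of which this sketch ignores.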
So this is a bit of a hack. In an ideal world, we'd really love to have native breakpoint support in the Ruby VM. And it turns out that we actually kind of do, sort of, but it's a little trickier than it sounds. Different debuggers have different needs, and Stackdriver Debugger is a little bit unique. So for example, Ruby actually has an experimental C API called line_trace_specify, and it's meant for this kind of debugger that has line-based breakpoints. But it didn't quite work for the way that we want to manage our breakpoints in Stackdriver, and so we actually had to fall back to line trace points for our implementation. Overall, designing a good API for breakpoints, and for debugging in general, is difficult, because you have a number of different use cases, a number of different kinds of things that debuggers might want to do. We at Google have started thinking about how to do this for Ruby, and I'm sure the Ruby core team has been working on this for quite some time as well, so I hope that, working together, we can start coming up with how this should really look. Side effects. Stackdriver Debugger, again, is designed to run on live web applications. In such a case, it's very important that you avoid changing things. You avoid changing the behavior or the state of the program. So the developer operating the debugger should be able to observe things that are going on in their program, but not modify anything, not change things that could break a user. So to help developers protect against this, we actually wrote a side effect detector, and that's part of the debugger. It's not perfect, it's conservative, but you might find it interesting how it works. Side effects in Ruby tend to fall into two categories. First, there are object changes, so you have things like changes to the value of an instance variable, for example. You also have external side effects, so things like setting some data in a database or sending some bytes on a network connection.
We can detect these side effects by actually looking for these kinds of operations at the bytecode level. So we built a bytecode analyzer. Whenever we run Ruby code at a breakpoint, we compile it and we analyze the bytecode, and then if we encounter any state-changing bytecodes, such as setinstancevariable, we throw an exception rather than actually running that code. C methods are another common source of side effects. It's really difficult to analyze the behavior of a compiled C method, so what we do is we whitelist C methods that we know to be safe, and then we just prevent all others. So again, it's conservative. Now, preventing side effects seems kind of esoteric, a little bit of a bizarre use case. Actually, what we're doing is conceptually really simple. What we're really talking about here is immutability. Immutability for Ruby. Now, I've been to some of the other talks, and several people have been talking about immutability, often in the context of functional languages. It's a very common part of the functional paradigm, so languages like Haskell and Elixir use it. And it provides things like performance gains, safer code, code that's easier to reason about. Immutability makes a little bit less sense for an object-oriented language like Ruby, but for our debugger, we do have a use case for it. Either straight-out immutability or maybe something similar, like some kind of code jail that prevents one piece of code from modifying the state or resources owned by another piece of code, which, you know, come to think about it, sounds suspiciously like guilds, maybe? Kind of, sort of. So anyway, just some additional use cases to think about as we think about the future evolution of our language. So we've gotten pretty deep here. Let's come back and recap a little bit of what we've seen. Again, at the basic level, debuggers are quite straightforward, actually.
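The bytecode check described above can be illustrated in a few lines using MRI's `RubyVM::InstructionSequence`. This is a deliberately crude sketch and not how Stackdriver Debugger is implemented: it searches the disassembly text for a handful of instruction names, ignores C methods entirely, and the names `SideEffectCheck`, `mutating?`, and `safe_eval` are invented here.

```ruby
# MRI-only: RubyVM::InstructionSequence is specific to the reference interpreter.
module SideEffectCheck
  # Instruction names treated as state-changing (illustrative, not exhaustive).
  MUTATING = %w[setinstancevariable setclassvariable setglobal].freeze

  class SideEffectError < StandardError; end

  # Compile the source without running it and look for mutating instructions.
  # Matching on the text is conservative: a string literal containing one of
  # these names would also be rejected.
  def self.mutating?(source)
    listing = RubyVM::InstructionSequence.compile(source).disasm
    MUTATING.any? { |name| listing.include?(name) }
  end

  # Evaluate source in the given binding only if nothing suspicious was found.
  def self.safe_eval(source, bind)
    raise SideEffectError, "refusing to run: #{source}" if mutating?(source)
    bind.eval(source)
  end
end

puts SideEffectCheck.safe_eval("1 + 2", binding)

begin
  SideEffectCheck.safe_eval("@count = 1", binding)
rescue SideEffectCheck::SideEffectError => e
  puts e.message
end
```

Rejecting on a false positive rather than running on a false negative matches the conservative stance described in the talk. A fuller analyzer would also need the C method whitelist the speaker mentions.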
You set a breakpoint, and then you can observe what's going on, and for some debuggers, you can change things. You can step through the execution and see how things evolve over time. We implemented a simple but full-featured Ruby debugger right here. We did it from scratch, and it's not that hard. Then we talked about a real debugger, Stackdriver Debugger. We talked about some of the techniques that we used to implement it and some of the challenges that we faced. We saw that Ruby itself has support for some debugging capabilities, some useful APIs that can be used by debuggers, but there are some things that maybe could be added or changed to make things better, faster, or safer. Now, if you'd like to experiment with a debugger yourself, here are a couple of suggestions. Byebug may be familiar to a lot of you already. Rails brings it in by default. It's a traditional shell-based debugger, so it's good for Ruby command line tools and applications like that. For the web application case, you can try a tool like Stackdriver Debugger. These are both open source on GitHub, so you can look at the source and see what's going on, how things work. In both cases, the core of these debuggers is exactly the same as what we just implemented. They're based on trace points. So now that we've seen a bit about what's going on under the hood, I hope that interacting with a debugger maybe just feels a little less intimidating. It's not rocket science, it's not dark magic. It's just Ruby. So, thank you for coming. We'll give you something. I don't know if we have time. We probably have just a couple of minutes for questions, if there are any. Okay, so the question is, in fact, let's go back and look at that. In our example program, when we stepped into the hello method, and then when we stepped again, we kind of stepped over this puts method and exited the hello method. So why didn't it step into puts? Is that right? Puts is actually, as I understand it, a C method.
So you actually need to use a different trace point. There's a c_call trace point that will detect that, rather than line trace points. And then, of course, since it's a C method, it's hard to debug that using a Ruby debugger, so we didn't do that. Anything else? It's really hard for me to see, so shout if you have one. Okay, I think that's it then. Thank you.
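The answer above can be checked with a small experiment: a `:c_call` trace point sees the call to `puts`, which `:call` and `:line` trace points do not report as a Ruby method being entered. The `C_CALLS` constant is just an illustrative name.

```ruby
C_CALLS = []

# :c_call fires when a method implemented in C is invoked.
tracer = TracePoint.new(:c_call) { |tp| C_CALLS << tp.method_id }

tracer.enable { puts "hello" }

p C_CALLS.include?(:puts)
```

The list will usually contain more than `puts` itself, since it goes on to call other C methods to do the actual writing.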