Greetings, humans, from the 21st century. Today, I'm going to talk about debugging, which is a very good topic all the time. In this case, time travel debugging. Start off with the basics. My name is Brock Wilcox. There's some contact information way down here. I also work for Optoro. We help retailers with their returns. It's great. They paid for me to come here, so, plug for Optoro, yeah! They, however, do not pay me to do mad science unless I can make it practical, so, we'll work on that.
Debuggers. You all use debuggers? Yes? There are a lot of different styles of debugging. My classical favorite is log-based debugging. You know, print here, print here, print here, all the way down. And then you know it got to that line, right? But then there are debuggers where you have an interactive thing. You can step through your program. How many of you have ever used a debugger? I like to do a little poll here, you know. Ever. Wow. How many of you use it on a regular basis? Also impressive. Sweet. So, and this is Rubyland debugging. All right. My favorite debugger in Rubyland is actually pry-byebug. So you've got the byebug debugger, which is the new one in 2.0. You guys use that, yes? But mixed in with the magic of pry.
A little demo. Since you are all experts at debugging, this might be redundant, but I don't care. All right. So, it shows up. Excellent. I get two screens. It's magical. So here we are. Simple program. We pull in our debugger and this binding.pry, which hopefully you're all very familiar with, drops us into this nice REPL and tells us what line of code we have not yet executed. Line six here. And you can look around. You can do some math. You can, I don't know, JSON parse something. Why would you do that? Whatever. Okay. And you can look at x, which is not even born yet. So, step. Now, we haven't run this line yet, so x still doesn't exist. And now, finally, we have seven, which is great. You can step through and you can see it outputting things.
Here at 17, you can change it. And then you have your current value box. Pretty cool. Very handy. Highly recommended. Anyone who is not using this, add it to your Gemfile immediately. You can SSH with Wi-Fi. It's fine. So that's pry-byebug.
Anyone know how that works? No? Great. So, in YARV, there's an intermediate bytecode. And there's a great book, Ruby Under a Microscope, which goes into lots of details on this. Highly recommended if you're at all, even tangentially, interested. And this bytecode has a lot of things. It's stack-based. So, you know, you push self, and you push some string, and then you do a send, and pop off the result. Some stuff like that. In between these things, you have these trace lines. And anytime one of those is hit, that's an opportunity for the interpreter to do some sort of introspection on the current process. It could be looking at variables. It could be saying, hey, is this a breakpoint or not? Things like that. It then, in turn, exposes an API for third parties to tap into this. So, what byebug does is it has some C, which hooks into these, and then it can use the YARV API to say, hey, when line such-and-such comes up, and you happen to hit one of these trace things, hit my callback. Or if you're doing a full trace, every single one, you can hit the callback. Pretty neat. Simple. You could have lots of whole talks about how this works internally. Fortunately, this writes it out for you right here. You can see it very clearly. It's literally inserted into your code at all of the points where you can break. It adds in hooks. Neat. Yes. As mentioned, Ruby Under a Microscope, wonderful.
All right. It's pretty sweet. Right? Anyone? Yes? Okay. We can do better than this. It's primitive technology. It's horrible forward thinking. We can apply some mad science. Excitement? No? Woo! Mad science. All right. So there's this gem you can get, because I wrote it, called pry-timetravel. Let's have a little demo.
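The line-level hooks described above can also be reached from plain Ruby through the standard `TracePoint` class. The block below is a minimal sketch of that idea, not how byebug itself is written; the helper name `traced_lines` is made up for illustration.

```ruby
# Run the given block with a :line tracepoint enabled and collect the
# line number of every line event that fires while it runs.
# `traced_lines` is a hypothetical helper, not part of any debugger.
def traced_lines
  lines = []
  trace = TracePoint.new(:line) { |tp| lines << tp.lineno }
  trace.enable { yield }   # tracing is active only inside this block
  lines
end

if __FILE__ == $PROGRAM_NAME
  seen = traced_lines do
    x = 7
    x + 35
  end
  puts "line events fired on lines: #{seen.inspect}"
end
```

A debugger does the same thing with a smarter callback: instead of recording the line, it checks whether that line is a breakpoint and, if so, opens a prompt.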
This program looks awfully similar, but I added a nice little line here. And when we run it, we can do the same kind of thing. We can say next. See what x is. Next, whatever. But then, like, wait. Oh, I made a mistake. I meant to edit something before. And you can go from line 12 to 11 by typing back. And now this line has not been executed yet. So x will be 7. Not 42 at all. Okay. Here, 7. But when we go back, we're now on line 8. And so it hasn't been executed yet. And thus, we are, before x has even been born, we've gone back into the past and we can change everything. We can do horrible things. I don't know what. We can assign it to something which will then just get overwritten when we type next. It's amazing. So you can go back in the past. And that's it.
So there are other talks right now. I'm going to go into how this is done, the magic, unveil it a little bit. But if nothing else, you go home, add pry-byebug to your thing, and then you can add pry-timetravel, and your code will probably use itself. But you can do it anyway and we'll see what happens.
All right. So here's how it works. This is the idea. There's nothing new under the sun. I came up with this one morning and I was like, wow, this is a great idea, and I Googled it and other people have done it. So that's how it goes. It doesn't matter how big the clock is you're trying to hang. It doesn't help. Anyway, so what we'll do is every time you type next or, in my case, n or something, we'll just fork the universe. It's sort of the, you know, the multiverse theory, where an action causes another universe to split. Well, we're going to, like, embrace this idea and we're going to explicitly fork the universe upon an action. And then we'll take that new universe we peeled off and we're just going to suspend it, freeze it, stick it off in the cold block or something, cryogenic. It's great. And we'll just kind of save that. Maybe we'll do that several times.
Every time we type next, we'll put them in a big list and then later, on back, we'll just go grab one of those parallel universes, unfreeze it, maybe kill off our current universe. You don't want too many of yourself running around. It's very confusing. And then, poof, you're back. Time travel. Previous. It's wonderful. Simple. Very simple. You can apply this to all of your, I don't know, all of your problems probably. Just make other universes and then you can go back.
So, all right. So that was the outline. I'm going to dive into a few of these essential components. Two. Only two things are needed for time travel. Fork. A lot of you use fork, yes? Have any of you implemented fork? Me neither. Okay, just wondering. All right. So, whenever you fork a process, now, fork is a call that's provided by the kernel itself. It creates a whole other copy of your program. It gives it a new identifier, a new process ID. But then it copies all the memory. It shares some things like file handles. It turns out that's a really good thing. And just some other things. But the important thing is that they're now running separately. There are two complete copies. And in Unix, you know, you start off with init, and from init all other things spawn. One at a time through forks. Fork, fork, fork. It's wonderful.
One thing I mentioned is that it copies all the memory. That's a slight exaggeration. On modern OSs, it will kind of clone a pointer to the memory. And then as you're writing to it, it'll copy things just as it needs to. So, if you do a fork and then don't change anything, you haven't actually used up all of that memory. It reserves the memory, which is a little scary. But nonetheless, you're not actually using it all up. It makes it a lot better.
All right. So, first, I want to show you what this looks like at all. So, you have Ruby. You've heard of this Ruby thing. And you can sleep, I don't know, 10 seconds. And here I have pstree.
Y'all use pstree, anyone? All right. pstree is great. Run pstree sometime. I think on Mac it doesn't come with it, but you can run it anyway. It's in that brew thing. And you get a nice tree. Mine's too big, so we'll go... Okay, that's not too bad. This is my X session. All my shells, all kinds of things. Okay. So, I have this thing that's just watching for any Ruby processes and giving me the trees. So, we can do Ruby, and let's fork, and then sleep. So, there's our parent, and then it forks off our child, and you can see that they have sort of this relationship.
Mad science is no fun when you stop there, though. Okay. So, now we have a lovely 16 processes. It looks kind of fun because there's one, and then there's three under it. There's four under it. Oh, too fast. 100 seconds. Lots of seconds. And two, three, four. Interesting. Okay. Anyway, so you have lots of children, and the children have children, and everything like that. It's actually even more fun to have... Oh, wrong one there. Let's stick a little bit of sleep in between each one of these. Okay. So, what we're going to see is the first fork. It's two. And then this one here forks off one more, and this one forks off one more. And then this one here forks off one more, and its child forks off one more, and so on. You can see how it kind of grows out, and it doubles each time. Which is fun. If you do enough of these, bad things happen on your machine. Highly recommended. Those are called fork bombs, especially if you make it so they ignore all the signals, which we'll get into in a second.
All right. So, in Ruby, whenever you fork, and for those of you who have done this before, whenever you fork, you get the process ID of your child. Now, if you are the child, you don't get a process ID. So, this pattern happens a lot. You fork, and then if you got a process ID, you're the parent. If not, you must be the child. And they're both running simultaneously at that point.
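The fork-then-branch pattern just described can be sketched as follows. This is only an illustration: the helper name `fork_demo` and the pipe used to carry the child's message back are assumptions added here so the result can be inspected, and are not from the talk.

```ruby
# Fork once and report what each side sees.
# Returns [child_pid_seen_by_parent, message_written_by_child].
# `fork_demo` is a hypothetical helper name.
def fork_demo
  reader, writer = IO.pipe
  parent_pid = Process.pid        # same value as $$

  pid = fork                      # child's PID in the parent, nil in the child
  if pid
    # Parent branch: we got a process ID back.
    writer.close
    message = reader.read         # blocks until the child closes its end
    reader.close
    Process.wait(pid)             # reap the child so it is not left as a zombie
    [pid, message]
  else
    # Child branch: fork returned nil.
    reader.close
    writer.write("child #{Process.pid} of #{parent_pid}, ppid=#{Process.ppid}")
    writer.close
    exit!(0)                      # leave without running inherited at_exit handlers
  end
end

if __FILE__ == $PROGRAM_NAME
  child_pid, message = fork_demo
  puts "parent #{Process.pid} forked #{child_pid}"
  puts message
end
```

Saving `Process.pid` before the fork is the "remember who your parent is" trick; `Process.ppid` gives the same answer for as long as the parent is still alive.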
You notice that whenever they print, both of them are getting to my screen. It's because they both actually have the same standard out, which sounds somewhat dangerous, and it is quite dangerous, but handy in this case because we can actually see what's going on. The other thing you can do is keep track of the parent's ID. Dollar-dollar ($$) is a nice global that gives you your current process ID. So, if you keep track of your parent's ID in the child, you can actually know who your parent is. What is it? Like that. So, you can actually see what its parent's identifier is. And they match. And if we look at our tree, you can see there's a little relationship here. It's pretty cool. So, that is forking in Ruby. Very straightforward. You can do lots of horrible things with this. This is how a lot of things, like, say, Unicorn, for example, work: forking lots of processes. You can do parallel processing. Each of these could then go scraping internet websites in parallel and you can fill up all of your hard drives and all kinds of things.
All right. So, the second important ingredient to time travel is signals. This is actually where time travel falls apart in the real world, by the way. But anyway. So, a signal is a registered callback that you give to your operating system. At the beginning of your program, usually the beginning, you can say, hey, operating system, if something happens, please call this chunk of code, and you can hand it off. You've probably seen some of these signals. Here's a list. There are bunches of them. Different variants of Unix have different sets of signals. These ones you'll be familiar with if I label them. So, SIGINT is what gets sent to your program whenever you do Control-C. You can catch that and actually say no. We'll show that in a second. SIGTERM, however. Oh, SIGTERM is whenever you kill something. You could also catch that. SIGKILL, uncatchable. If you kill -9 a process, it dies. It can't stop it.
SIGHUP is whenever you have a modem disconnect, which I'm sure happens to you all the time, and has thus been repurposed as daemon reloading of configs. Similarly, SIGUSR1 is set aside specifically as user-defined, and you can send a SIGUSR1 to any process, and if it happens to have some code, then it will do something. The default that the OS provides is ignore. SIGALRM is also fun. You can set an alarm, and then it'll call your callback. Pretty nice.
So, Ruby comes with a nice module called Signal, and you can do trap, and give it, you can call this SIG, oops, you can call it SIGINT if you want, or just INT, Ruby knows all the things. So that's how you actually register your callback. Now, in Ruby, with this callback you have to be very careful, because while it's running whatever code is in this block, you could get another SIGINT, which will stop running that code and start running it again. So, don't do too much in here if you can avoid it. And then, the other thing you can do is you can send kill commands. This is just like kill on the command line. So, this would be kill, SIGKILL is the same as the dash nine, and then dollar-dollar is our current process. So, when we run this, it will take a nap, I hit Control-C, not so much. You can do this whenever someone leaves their terminal unlocked. You just do a one-liner, catches Control-C, tells them no, anyway. So, then after a nice 10 seconds, it killed itself, went away. Very fun. We used to do that in the Unix lab in college all the time. Never leave your screen unlocked, very bad. The other thing you do is you do xhost + so that just anyone can export things to your X display. You Mac people probably don't have that problem, but whatever.
All right. I'm sure you all have your favorite signals. A lot of people like kill -9 because, you know, power, I can kill, and they're dead forever. My personal favorite is actually SIGSTOP. Now, there's this SIGTSTP. What's going on? That was awesome.
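Registering a handler with `Signal.trap` and sending a signal with `Process.kill`, as described above, can be sketched like this. The helper name `trap_demo` is hypothetical, and the short polling loop is an assumption added so the result does not depend on exactly when the handler is run.

```ruby
# Install a SIGINT handler, send SIGINT to ourselves, and report whether
# the handler ran. `trap_demo` is a hypothetical helper name.
def trap_demo
  caught = false
  # Signal.trap returns whatever handler was installed before ours.
  previous = Signal.trap("INT") { caught = true }   # keep the handler tiny
  Process.kill("INT", Process.pid)                  # like `kill -INT <pid>` in a shell

  # Signal delivery is asynchronous in general, so poll briefly.
  50.times do
    break if caught
    sleep 0.01
  end
  caught
ensure
  Signal.trap("INT", previous || "DEFAULT")         # put the old handler back
end

if __FILE__ == $PROGRAM_NAME
  puts trap_demo ? "caught Control-C's signal" : "handler never ran"
end
```

The same two calls cover the prank from the talk: trap `"INT"` to refuse Control-C, then `Process.kill("KILL", Process.pid)` after a sleep to end the process with the signal that cannot be caught.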
SIGTSTP is one you can catch. So, you hit Control-Z. Am I bumping something? Okay. If you hit Control-Z, that's actually catchable and you can say, no, I'm not going to do that. But SIGSTOP is the same kind of thing and you can't stop it. Can't stop it. Yes. You can't intercept it and say no. And this is more powerful than kill -9 because it has some of the same effects. The process stops. Nothing happens. No in, no out, nothing, right? The process can't block it, but you can bring it back to life with SIGCONT, SIG continue. That will give you the power of sort of this cryogenic thing. You can freeze something and then wait a while and bring it back. So, it's not just the power of death. It's the power of life. You see? Okay.
All right. So, fork plus SIGSTOP and SIGCONT equals time travel, my friends. Here's the outline. We'll have two functions. One called snap, short for snapshot, because I snapshot the universe, anyway. Four letters better than... Okay. So, in snapshot, we're going to fork and get a child process. And if you're the parent, you're going to just stop yourself. Just freeze, don't interact, and stop processing. If you're the child, you should... I messed this up a little bit, but it's okay. You should push your parent's identifier in here. So, that trick we did earlier where we saved off the parent's identifier, that's what I meant to do here. So, you just save that, stash it away, and you have this nice list of processes that you can get back to later. So, that's the snapshotting. You've now frozen off a side universe, and it's just there, and you continue on as if nothing happened, even though technically you're somebody else now, but you don't know. Later, when you decide you want to go back to where you were, you can call this restore routine, which will go grab that old universe, that parent universe. Maybe it was the true you, I'm not sure, and resume it, and then you can stop yourself.
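The snap and restore outline above can be sketched in a few lines of Ruby. This is a sketch under stated assumptions, not the code from the talk or from the gem: the names `SNAPSHOTS` and `time_travel_demo` are invented, the abandoned universe exits instead of stopping itself, and the small acknowledgement pipe is an addition that guards against sending SIGCONT before the parent has actually stopped.

```ruby
# Stack of [pid, ack_reader] pairs, one per frozen "universe".
# (Hypothetical name; the talk only describes "a list of processes".)
SNAPSHOTS = []

# Freeze the current process and carry on in a fresh copy.
# Returns :snapshot_taken in the copy that keeps running, and :restored
# in the original once some later copy calls `restore`.
def snap
  ack_reader, ack_writer = IO.pipe
  frozen_pid = Process.pid
  child = fork
  if child
    # Original universe: go into the freezer.
    ack_reader.close
    Process.kill("STOP", frozen_pid)   # uncatchable; execution pauses here
    # ...and picks up again here after a SIGCONT.
    ack_writer.write("k")              # tell the sender we are really awake
    ack_writer.close
    Process.wait(child)                # reap the universe that woke us
    :restored
  else
    # New universe: remember how to get back, then keep going.
    ack_writer.close
    SNAPSHOTS.push([frozen_pid, ack_reader])
    :snapshot_taken
  end
end

# Wake the most recent frozen universe and abandon this one. Never returns.
def restore
  frozen_pid, ack_reader = SNAPSHOTS.pop
  raise "no snapshot to go back to" unless frozen_pid
  loop do
    Process.kill("CONT", frozen_pid)
    # Resend if the target had not stopped yet when we signalled it.
    break if IO.select([ack_reader], nil, nil, 0.05)
  end
  exit!(0)   # assumption: exit quietly; the talk's version stops itself
end

# Change a variable in the new universe, then go back to before the change.
def time_travel_demo
  x = 7
  if snap == :snapshot_taken
    x = 42     # rewrite history in the copy...
    restore    # ...then throw that copy away
  end
  x            # only the original universe reaches this line
end
```

Calling `time_travel_demo` returns 7 in the original process: the assignment of 42 happened only in the copy that was discarded. Run from an interactive shell, stopping the original process will confuse job control, which is the problem the dummy parent process in the talk works around.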
Originally I actually had this as SIGKILL yourself, and I had really good comments in my code, which were like, if you meet your previous self, kill yourself, which is a little ambiguous. But stopping, as I mentioned, is a little more powerful because maybe you could bring yourself back from the dead later. That was a good idea. Okay. So, I decided to... All right, well, yeah. I'll go over some limitations here first.
This is really a profound one. It should probably be obvious, but you never know. If you've seen the movie Primer, you know that in real time travel you can only go back to when you turned your time machine on. In this case, whenever you've set your checkpoint, whenever you've actually done one of these snapshots. You can't go back to yesterday and fix the bug then. You can't do things like that. Also, it means that if you didn't start a time machine, if you didn't take a snapshot as you went, you can't get back to it. So, if you really want to understand this in some depth, the movie Primer is a really good educational video. Highly recommended.
Another limitation, but also a power, anyway, is shared file handles. So, standard in and standard out are shared. This can really bite you if you are doing a lot of multi-processing. If you haven't yet, eventually you'll do some multi-processing daemon, and what will happen is, in your main process, you'll start losing your database handle. It will be really upsetting because it will happen randomly and you'll have no idea what the problem is, and you'll look at the code and there's nothing there and everything is fine. It happens in a different spot every single time and it makes you insane for hours if not days. Finally, you realize, oh, earlier, I actually forked off another process. That one read from the database and then exited, thereby closing the database handle in a completely different process, but it's the same handle and so it goes away.
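The sharing itself is easy to observe with an ordinary file, without a database. The sketch below assumes a hypothetical helper named `offset_after_child_read`; it shows that a read performed by a forked child moves the file position the parent sees, because both processes hold the same underlying open file. It illustrates the sharing only, not the specific database failure described above.

```ruby
require "tempfile"

# Returns the parent's file offset after a forked child has read `bytes`
# bytes through the inherited handle.
# `offset_after_child_read` is a hypothetical helper name.
def offset_after_child_read(bytes)
  Tempfile.create("shared") do |file|
    file.syswrite("0123456789")     # unbuffered write
    file.sysseek(0)                 # rewind; the offset lives in the kernel
    pid = fork do
      file.sysread(bytes)           # advances the offset both processes share
      exit!(0)                      # skip at_exit and ensure blocks in the child
    end
    Process.wait(pid)
    file.sysseek(0, IO::SEEK_CUR)   # ask where the parent's handle is now
  end
end

if __FILE__ == $PROGRAM_NAME
  puts "parent offset after child read 4 bytes: #{offset_after_child_read(4)}"
end
```

The parent never read anything, yet its offset has moved. A socket to a database behaves the same way, which is why opening a fresh connection in each child is the usual fix.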
And it will, it'll make you really happy when you figure it out because you'll think you're really clever. But then you'll be mad at the previous you for being clever in the first place, and then the new you will say, okay, as soon as I fork a new child, I'm going to make a new database handle and then I don't have to worry about this problem. Anyway, just so you know, it's going to happen if it hasn't already happened to you. So, but other than that, it's great.
Another big limitation, another both pro and con, is that anything that's outside of your program is not going to be snapshotted so much. So this is really why this type of time travel can't actually allow you to go back to yesterday and help win the lottery or, on a more local level, won't help you un-delete all of the records from your database. Which is kind of a good thing. This is basically why this type of time travel works. If we had no external world, if this wasn't a partial multiverse, you would never be able to communicate with your previous selves and bring them back to life. So you can imagine if I actually did build a device that would allow me to fork the universe in some sort of provable way. I'm not sure how I'd prove it. But it was the entire universe. Then I could never get to that other universe, probably. So anyway. Deep thoughts. Okay.
Another one is threads. Threads, while they're inside of processes, are relatively, somewhat, sort of independent. And whenever you fork, you don't get a copy of your threads. You just get your main thread. I have this pstree thing I did here, right? An interesting thing is that my pstree is kind of blinged out. I added a new option so that, by default, mine hides threads, which the real pstree shows by default. So it turns out a lot of things have threads, and it puts these in these ugly curly braces. And whenever I'm doing an example of a lot of Ruby forking things, there's a lot of these weird curly timer things.
Anytime you start Ruby, it actually starts at least this one thread that it labels the Ruby timer thread. I don't know very much about it, except that whenever you fork, it automatically makes it again, and you don't have to worry about it. Which is awfully nice for me. However, if you're doing other things in threads, and then you fork the universe, your child universes aren't going to have your threads immediately, and you'll need to spin them back up. So don't do that, I guess. So you have to manage your threads if you are really going to get into a lot of multi-process work. Or you could just use threads, but then it's harder to fork the universe.
In theory, whenever you fork, it does that copy-on-write thing. But after a while, you start touching those pages of memory, and it starts copying them. And if you were to do a lot of snapshots, you might run out of memory. Maybe.
All right, so I'll just wrap this up in a pry plug-in. We're done, no problem. Unfortunately, not quite. So I originally hacked this into byebug directly. Byebug didn't have an immediately accessible plug-in system when I was doing this, and I started to write one, but that was a yak that I decided should be free. Free yourself a little bit. Anyway, and I went to pry, which has a wonderful plug-in system. Unfortunately, the first time I set this all up and I'm like, ah, take a snapshot, I got "zsh: suspended (signal) pry". And my other process was still running, but my shell didn't care. And whenever I would type math, sometimes it would go to my shell, and sometimes it would go to my Ruby, and it was very confusing. So the trick for this is, at the very beginning, whenever you first snapshot, you just make a dummy parent process that just sleeps. It just doesn't do anything. So if we go to our example, low time travel, and we do a single, I just did one snapshot, and we actually have three processes.
And you can get a list, and you can see that the two that we care about are these two, and then this parent one is just a dummy one that's just holding on to our shell, pretty much. But, that's bad. Great. So, simple workaround.
Another problem is that whenever your child, your final one, exits, there are a lot of other processes still around, and you really don't want frozen, sleeping processes just hanging out all the time. So the trick I worked out was to send the ultimate parent a signal, and like I said, you can't do this in real life because we can't communicate between our universes. But anyway. And then it would go ahead and kill all the children. And I thought this was working wonderfully until I realized that actually it was my shell's process control that was cleaning up all of these things, and my slide software bypassed that, and then I realized I was wrong. Anyway, but the theory is still sound, which is you need one, maybe the first one that exits, to go and forcibly kill off everyone. And the reason I say forcibly is because whenever your Ruby process exits, it calls all these at_exit blocks. And that's when it can nicely close down its database, it can nicely disconnect things, it can be kind to the world. Which is great whenever there's only one of you, but when there's ten of you, and all of you are trying to write your will, it gets really confusing and bad things happen. So that's simple. Get all of the other universes to kill themselves in a really terminally fatal way, which is Kernel exit bang (exit!), different than regular Kernel exit or just exit. It will bypass all the shutdown things. It is about the equivalent of sending yourself a kill -9. You could just send yourself a kill -9. That would be fun too.
Zombies, I guess, you know. You never would have thought that time travel would, you know, as you're jumping from one universe to the other, you suddenly get zombies. It's a little annoying. It happens.
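The difference between a polite exit and the forcible one mentioned above can be checked directly. This is a small sketch with a hypothetical helper, `hook_output`, that runs a throwaway Ruby interpreter so no handlers from the calling program are involved; it assumes only the standard `at_exit`, `exit`, and `exit!` methods.

```ruby
require "rbconfig"

# Start a fresh Ruby process that registers an at_exit hook and then
# leaves via `exit` or `exit!`. Returns whatever the hook printed.
# `hook_output` is a hypothetical helper name.
def hook_output(leave_with)
  script = "at_exit { puts 'hook ran' }; #{leave_with}(0)"
  IO.popen([RbConfig.ruby, "-e", script], &:read)
end

if __FILE__ == $PROGRAM_NAME
  puts "exit  -> #{hook_output('exit').inspect}"
  puts "exit! -> #{hook_output('exit!').inspect}"
end
```

With `exit` the hook prints its line; with `exit!` nothing comes back, because the process ends without running any at_exit blocks. That is what keeps ten copies of a program from all closing the same database connection on their way out.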
So it turns out that zombies are whenever your child exited, but the parent didn't do a waitpid. That was one thing I kind of skipped in my forking process. So while I was doing this, I actually ran into a lot more orphans than zombies, which are processes that just hang out. Zombies are really dead. They're not doing anything. Orphans, however, are kind of less dead, anyway. Nonetheless, zombies are a better headline. So, fortunately, once your parent process exits, your zombie children end up being owned by init, by process zero, the source of all processes, and that knows how to kill zombies, which is pretty cool. So it's a self-solving problem.
All right. Now we can go back in time. We can go back and try to figure out what went wrong, change it. That's not good enough for me, at least. I don't know about you. I additionally built the ability to build a tree of snapshots. So if we have our same program here, and we step a couple times, we get this list. And I really wish I could go back to 1955 or whatever. Back. And now I'm all the way at the back. And now I can, you know, live life again and do things. And I actually end up with a whole other tree of execution and our current one here, which is pretty cool. These will keep growing forever right now. So if you do a lot of traversing, then you'll fill up a lot of things.
I've been typing n for next. Sorry, aliases. All right. So I actually have a snapshot command, and it then takes another command, so I alias n to snap next. I might just make it so next automatically does this. But anyway, so every time I'm typing n, it is actually making another snapshot. I can manually make a snapshot, a bunch of them. I don't know why I would do that. And they're all technically children of each other, even though they're all on the same line and they're just sitting there. But you can go back and try things over. I have actually used this to debug one bug so far. Yes, yes.
So, a Rails application, and somebody else told me this doesn't work at all in a Rails application, which is more surprising than the fact that I was able to get it to work in a Rails application. But anyway. And I throw a binding.pry in my controller, and I make a snapshot. And then step down, and a lot of times, whenever you're using byebug, you can either step or you can next. And you'll do next, and then you'll see there's a problem, and you're like, ah, I wish I had done step instead. Or you'll do step and you're like, man, that was a long route, and I shouldn't have done that, it was dumb. So what you can do is you set your snapshot at the beginning, and then you step in, change some things, step in, and then whenever you get mad at what you did or you have enlightenment, then you can just go back to your previous one and do that. And it took me like, I think, 10 repeats. And I could have just, you know, hit reload or something, but that's no fun. So instead I used time travel in order to do this. It's pretty great.
Okay. So, yes. So the ultimate, which I have not gotten to, ran out of time, always more to do, is I want to be able to auto-snapshot things and then maybe prune the tree. So with this auto-snapshotting, I think the ideal workflow would be you use pry-rescue. You all use pry-rescue, anyone? Yes. So for anyone who hasn't used pry-rescue, whenever there's an exception, all of a sudden you get a REPL. I think just before the exception happened, which is pretty neat, like a little mini time travel, you can change things. What I want is to snapshot, maybe at each stack level transition, say, so that you can not just walk back your stack, like pry-stack_explorer does, but you can keep a tree of the statefulness and you can walk back there and get all of the lexical variables as of that point in time, instead of as of now, all the globals, everything. So that is kind of the next step in this. We're ahead of schedule, so I'll do some question stuff. Oh no, one more.
There are other ways to do time travel in your debuggers. The Unix forking way is awfully simple. I was able to present it to you, and I did a lightning talk on this where I did it in five minutes. I don't know if anyone understood what I was talking about, but nonetheless, combine fork, signals, time machine. The other thing you could do, if you wanted to, is get into YARV and keep track of everything that ever changes independently, and that has the additional benefit that you might be able to log all of that out. Some of the Visual Studio stuff does this where you can almost get a slider and slide back your variable states and then look at things as of that time. With fork it's a little more live, but both are fun. I originally got turned on to this with the OCaml debugger, which has this built in from the beginning. I thought they were doing it the fancy way where they were doing individual tracking, checking what bytecode changes, what memory slots. No, they're using fork, it turns out. It's pretty neat.
I have a bunch of references. Anyway, Advanced Programming in the UNIX Environment is the best. It goes over fork and all of the things it does across lots of operating systems and it's very helpful. It's the only reason I was able to make this even tolerably work. And then there's a pretty good kind of rant or article set on why reverse debugging is rarely used. Mostly the answer, like with most debugging, is because it's not at your fingertips. It's not already there. Otherwise you'd use it. You'd use pry-rescue and it'd pop up where you are and then you would see, okay, that exception happened. Now take me back in time to where it actually was caused. And if this was at your fingertips, you would do it all the time. But because there's some setup, especially in more complex systems, it's hard. So if we can actually make this not crash your entire computer every single time you use it, then we can use this every day, put it in production. It would be great. No problems.
Elm, if you haven't heard of that one, does the incremental way where they're actually watching what has changed and they have some really cool demos on the internets with their time traveling debugger and a little slider. They have a little Mario jumping and you can slide him back in time, change the variable and see how his trajectory changes. Very cool stuff. And it's a lot less heavyweight but a lot less general purpose than the magical forking technique. Thank you very much.