I make such an intimidating sound coming up those stairs. Boy. All right everybody, it's story time. Let me tell you a tale. It's 2 a.m. and you get a page. Production is down. You're searching frantically for a bug in your code but you find nothing. Slowly, a realization begins to creep over you. The bug isn't in your code. You will need to venture into the belly of the beast. It's time to enter the lion's den. It's time for debugging Rails itself.
Anyway, TL;DR, make Aaron fix it. That's sort of his job. Thank you very much. If you have questions, I'll be down.
Okay, now, debugging Rails itself. My name is Sean Griffin. I am a 10x hacker ninja guru at Shopify. Please tweet at me while I'm giving this talk. I will appreciate it very much. I recently moved back to America after living in Canada for a few years and I work remotely now. I find it a lot harder to find somebody to pair with, so to remedy that, my wife and I made a pair for me. Her name's Ruby. She's the youngest attendee that RailsConf has ever had. If you don't like baby pictures and dad jokes, boy, you are just not in the right talk. So today, my pair and I are gonna walk you through the exact process that we use when we think we have a bug in Rails.
So first of all, I want you to know, anybody can work on Rails. Big open source projects like Rails can seem intimidating. It's complicated. It's not terribly well factored. Everything includes 15 million modules, sometimes twice. And it can often appear to be something that the average programmer wouldn't be able to comprehend. Well, I'm here to tell you, anybody can work on Rails. I know, right? It can be surprising to hear, but it's just not as scary as you think. It's just a large legacy code base that's had far too many people touching it. Most of the techniques that we use to debug Rails are the same techniques that you would use in your application.
The first step in this whole process is to determine if we even have a bug in Rails, or if it's a bug in our application, or if it's a bug in another gem. And really, we have to answer, what even is a bug? So we should probably define what the term bug actually means. The term originates from the days of vacuum tube computers. Sometimes a bug would fly into the tubes, causing the system to malfunction. So "there's a bug in the system" used to literally mean that a physical bug is located somewhere inside your computer. So unless you're running Rails on a vacuum tube system that has a fly in it, well, then you don't have a bug in Rails, I'm sorry.
If you wanted an actually useful answer, I generally go with this definition: a difference between the observed behavior and the intended behavior of a system. One thing that's important to understand is that the intended behavior doesn't necessarily always mean the behavior that you would expect. Saying "this violates the principle of least surprise" does not make it a bug. I'm going to immediately ask you, least surprising to who? It's a subjective assessment, and generally we won't make breaking changes because somebody found the API surprising.
A lot of bugs will crop up from a usage of a feature that we didn't anticipate, or some interaction between two features that we didn't test for. When these sorts of bugs come in, it's often difficult to assess whether they are bugs or not. Sometimes the answer is just going to be no, this is not supported usage, I'm sorry. Other times we do eventually change how we intend a feature to be used. Active Record's enum is a great example of this. One of the things that you have almost always been able to do with it, but that for a long time was never officially supported, was have an enum that was backed by a non-integer column.
The feature was designed very specifically to be backed by an integer column, with every single value mapped to an integer, since that's what we can most easily generate automatically. But we also gave you the ability to specify, for each enum variant, what the actual value is. And there was nothing to stop you from declaring the value to be a string. And if the backing column was a string column, then that would just sort of work. But it was never officially supported. And in Rails 5, or maybe it was 4.2, some version of Rails, there was a big rewrite of that feature to use the underpinnings of what eventually became the attributes API. And because this wasn't a supported usage, we didn't have tests for it. The type that we created for this inherited from the integer type, and so we inadvertently broke string columns. And so the bug report came in, and this was a case where initially our inclination was, well, we kind of wanted to make it work, but it's just never been supported, so let's close it. But then ultimately we decided that, no, this was a reasonable thing for people to expect to be able to do. So it can vary.
The easiest way to determine if something is a bug or not is to check the documentation. If the behavior differs from what's documented, I mean, that's a bug, that is 100% a bug. Might be a bug in the documentation, but there's a bug somewhere. Keep in mind though that if the method you are calling doesn't appear in our API documentation, it's not part of our public API. Rails does not define public API as public visibility in Ruby. We define it as "appears in our API documentation".
All right, so we think it's a bug. What's next? Well, let's see. Buckle up, put on your serious business glasses. We need a reproducible test case. This is something we insist on having for all issues opened on Rails. You can find templates for these in the Rails guides. We have one for each of our gems. This is what the one for Active Record looks like.
First thing we do is create an inline Gemfile to point at the version of Rails that we're reporting for. This is intended to be a single-file thing, so we use Bundler's inline Gemfile API to put it in that single file. Now, we have two templates, one for the latest release and one for Rails master. I usually just like to always point at master for these, because there's not a ton of value in reporting a bug that's already been fixed. Even if I am specifically reporting it for an older version of Rails, I'll usually point at, for example, the 5.2 stable branch instead of the actual 5.2 release, to make sure that there isn't just a fix on the branch that hasn't been released yet.
The next thing we do is create some sort of minimal isolated environment for that gem. This'll look very different depending on which gem you're reporting a bug in. For Active Record, this involves creating a database connection, creating a minimal schema, and maybe declaring a few models. We're using SQLite for the template because most of the bug reports won't be specific to any single database, and not requiring people to have an existing installation of Postgres with credentials is a good thing.
Finally, we have the test. In this template or in your reports, it doesn't have to be a literal test. It can just be some print statements and a few lines of code, but what's important is that the output of this script needs to be something that we can quickly use to identify whether the bug is occurring or not. This also does not have to be a single file. It's completely fine to just have a full Rails app that we can clone down to reproduce the issue. We prefer the single file if you can do it that way. But what's important, regardless of whether you use a single-file executable test case or a full Rails app, is that the bug needs to be demonstrated with no gems other than Rails. The reproduction script serves two purposes.
First of all, it's going to make tracking down the bug infinitely easier, since we just have an easy way to verify whether it's fixed or not. More importantly, though, this forces people reporting issues to actually verify that the bug they are trying to report is a bug in Rails and not a bug in their application or a bug in another gem. We will reject the report if the Gemfile contains gems other than Rails or pg, sqlite3, the things that you need to use Rails. If the test case uses private APIs, that's another reason we'll reject these reports. Bugs are differences in behavior in public API only. You'd be surprised how many reports come in where we ask for a reproduction script and then the issue gets closed because, as they wrote the reproduction script, they realized, oh, I'm sorry, this isn't a bug in Rails.
Okay, so we know we have a bug and we know it's in Rails. What next? Before we fix it, there are two things I like to find. I wanna find the code that is causing the bug, the line that we are eventually going to need to change, and then I also want to find the commit that introduced the bug. So put your serious business glasses back on because it's time to go bug hunting. We can find the commit and the problem code in either order, but if at all possible, I prefer to find the commit first. It's generally easier to do and there's a good process to find it, as long as we know the code was working in some previous version of Rails.
So to find the first commit that introduced the bug, we're going to use git bisect, the greatest tool that was ever gifted upon us unworthy programmers. If we know a commit where the bug happens, a commit where the bug doesn't happen, and we have a test script that we can run, bisect will tell us the commit that introduced the issue relatively quickly. If we're gonna be bisecting, the first thing we need to do is take our test script and point it at a local checkout of Rails.
Rails has internal dependencies like Arel. Arel now actually lives in the main repository, but up until 5.2 it lived in a different repo, and oftentimes any given commit of Rails will only work with a specific commit of Arel. So it's not enough to just have the inline Gemfile point at your local checkout of Rails. We wanna use the Gemfile that's in the Rails repository so that we know that any other internal dependencies of Rails are also pointed at the right commit. This is how you do that.
This is the basic process you'll use to bisect an issue. You go to a commit that has the bug. I think master is usually a perfectly fine place to start. And then you go to a commit that you know doesn't have the bug. If I don't know for sure which version of Rails the issue first appeared in, I'll just go to 4.2. Even if 5.1 was the last working version, bisecting all the way back to 4.2 actually only adds one or two extra commits that you have to test. From there, it's gonna take you through a binary search of the commits between those two points. That means it's gonna take you to the commit that is right in the middle of them and ask you if the bug is occurring or not. If it's occurring, it's gonna take you halfway through the half that's left, and so on and so forth. And usually it will take about seven steps or so to be able to tell you exactly which commit introduced the issue.
If you don't feel that you can fix the bug at this point, that's fine. If this is all you can contribute, please do that. This is one of the most helpful things a new contributor to Rails can do. If I go to an issue and there's a comment with a reproduction script, and then there's another comment that bisected and says this is the commit that introduced the issue, that makes my job significantly easier. This is all you have to do. Thanks, Claudio.
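
To make the arithmetic behind bisecting concrete, here is a small Ruby sketch of the binary search described above. It does not call git at all: first_bad_commit and the numbered histories are hypothetical stand-ins, and it assumes a linear history in which the oldest commit is good, the newest is bad, and the bug stays present once introduced.

```ruby
# Simulates the search git bisect performs. `commits` is ordered oldest to
# newest, and the block answers "does the bug reproduce at this commit?".
# Returns the first bad commit and how many commits had to be tested.
def first_bad_commit(commits)
  good = 0                 # index of a commit known to be good
  bad = commits.size - 1   # index of a commit known to be bad
  steps = 0

  # Keep halving until the good and bad commits are adjacent.
  while bad - good > 1
    middle = (good + bad) / 2
    steps += 1
    if yield(commits[middle])
      bad = middle         # bug present: the culprit is here or earlier
    else
      good = middle        # bug absent: the culprit is later
    end
  end

  [commits[bad], steps]
end

# A hypothetical range of 128 commits where commit 97 introduced the bug.
recent = (1..128).to_a
culprit, steps = first_bad_commit(recent) { |commit| commit >= 97 }
puts "128 commits: culprit #{culprit} found after testing #{steps} commits"

# Starting four times further back, with the bug introduced at commit 481.
older = (1..512).to_a
culprit, steps = first_bad_commit(older) { |commit| commit >= 481 }
puts "512 commits: culprit #{culprit} found after testing #{steps} commits"
```

In this sketch the first search tests 7 commits and the second tests 9, so quadrupling the range only costs two extra steps. That is why starting from a much older known-good version is cheap.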
This issue was able to get fixed in a few days. I mean, it would have gotten fixed eventually, but a bisect like that makes people much more likely to pick it up. There's less work for us to do.
This is great if somebody opens up a bug report and, yeah, this worked in 5.1 and it doesn't work in 5.2, but that's not all bugs. What happens when we don't have a version where this worked? What happens if the last version where this worked is so old that our test script doesn't actually run against it? Well, then it's time to put on your legacy code hat. Ruby likes to dress as Batgirl when she is working on legacy code. So we're just gonna have to do some old school debugging here.
Now, I'm a dinosaur debugger. A lot of you are gonna be rolling your eyes at me and saying that you could just do all of this with pry way faster, and that's probably true, and if you are good with pry, then you should continue to use pry. My strategies are not as efficient, because it is very hard to fight a decade plus of muscle memory of just adding more print statements, but here's a list of things I like to put in print statements.
So when I first started working on Rails, I knew nothing about its internals or how anything worked. So I'm gonna give you some examples of how I went about fixing bugs at that point, where I literally had no prior knowledge. I would start from whatever method is getting called in the test and work inwards from there. So to do that, we're going to call the method method to get the method object for our method. It's a useful method of debugging when all you have is a method name. Some objects override the method method. For example, a Rack request will return GET or POST or whatever the request method is. Some Active Record models have a column called method.
Somebody actually had that in a reproduction script they sent in once, and I spent longer than I should have trying to figure out why the method method was just telling me I had the wrong number of arguments. In these cases, we can still call the method method, but we have to use this method instead. We get an unbound instance method object for the method method. We call the bind method and then call our method. It helps to be methodical when working with the method method.
Let's talk about some other methods besides the method method. Once you have a method object, you can call useful methods like source_location to get the file and line number where the method was defined. If this method returns nil, that means that the method was defined in C, and what you are looking at is not a bug in Rails, because Rails does not have C code, or at least that method is not part of the bug. You can also call super_method to get the method that super would be calling in that method, which is very useful if the thing that you're looking at is save, which is defined 20 times. There's also the method method method, which isn't a real method. I just like saying method a lot.
Now, when you're using this form of debugging, when you're just going in from what you can see in your test file and trying to figure out the actual code path it's going through, it's very important to tunnel vision yourself on the problem at hand. You don't have to understand how the entire system works to figure out how to fix a single bug. For example, if you are looking for a bug in, say, callbacks, and you're digging in through save because that's the method that got called, when you get to the definition of save that has to do with pessimistic locking of Active Record models, it's pretty safe to assume that is not going to be where the bug is actually occurring, and just move on from there.
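
Going back to the method objects described above, here is a sketch in plain Ruby. Base, Locking, Record, and the method_object helper are made-up names for illustration, and Record#method stands in for a model with a column called method. It is not Rails code.

```ruby
class Base
  def save
    :saved
  end
end

# Stands in for one of the many modules that redefine save and call super.
module Locking
  def save
    super
  end
end

class Record < Base
  include Locking

  # Shadows Object#method, the way a column named "method" would.
  def method
    "POST"
  end
end

# Borrow the original `method` implementation from Object and bind it to the
# instance, so it works even when the object overrides `method`.
def method_object(object, name)
  Object.instance_method(:method).bind(object).call(name)
end

record = Record.new

begin
  record.method(:save)
rescue ArgumentError => e
  puts "record.method(:save) fails: #{e.message}"
end

save = method_object(record, :save)
file, line = save.source_location
puts "first save on the lookup path: #{save.owner}, defined at #{file}:#{line}"
puts "super inside it would call:    #{save.super_method.owner}#save"

# Methods implemented in C have no Ruby source location (nil on MRI).
puts "String#upcase location: #{''.method(:upcase).source_location.inspect}"
```

Calling super_method repeatedly walks the whole chain of definitions, and it returns nil once there is no further super to call.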
You don't have to try and understand every single thing you see if it looks like it's probably unrelated to the bug that you're trying to fix.
This isn't always the best place to start. If the bug is very clearly related to callbacks, for example, it's often much easier to just see what called the callback rather than starting at save. In this case, I'll often just stick a puts caller in the callback and it'll show you the file and line numbers of every caller in the stack at that point. Or if this is something that's getting called a lot and you just want the output from the first time it was called, you can stick a raise in there, which will also give you a call stack.
Finally, the p method is one of the most useful for this sort of debugging. It calls inspect on its argument, prints that out, and then just returns it, so you can sort of litter it all over your code and not have to restructure anything, because it returns whatever you pass to it. One thing to keep in mind with p, though, is that inspect can have side effects. And this can sometimes cause bugs to disappear. Shout out to obscure MRI bugs. I had a recent issue where calling String#start_with? was hanging forever, and calling inspect on an unrelated class somehow made that not happen. In Rails, though, the most specific place where inspect has side effects is if you call inspect on an Active Record model. That will show you all of its attributes, and that causes typecasting to occur. And we've had bugs where things were getting mutated that shouldn't have been, and the act of typecasting caused that to change.
So once we've tracked down the problem code this way, we still wanna find the commit that introduced it. To do this we're gonna use a tool called git blame, except we're not gonna do that, because this isn't about assigning blame, it's about additional context, so we're going to be using git context instead.
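
Git does not ship a context command, so this has to be an alias. A minimal sketch of what that could look like in a global ~/.gitconfig, assuming the alias simply forwards to blame:

```ini
# ~/.gitconfig
# Assumption: "context" is nothing more than another name for "blame".
[alias]
	context = blame
```

With that in place, git context <file> prints the same per-line history as git blame <file>, and git context <commit>^ -- <file> repeats it starting from the parent of a given commit.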
Blaming people is terrible. You should not blame people. If you would like to stop blaming people, this is some code you can stick in your global git config, and you will never have to blame people again.
So what git context does is show you the last commit that changed each line in the file. Now, sometimes the last change to that line will not be the commit that introduced the bug. Sometimes it will be a formatting change, or it will have moved something to a new file. We try to avoid these sorts of changes in Rails since we use git context so frequently, but sometimes they do appear, and so in this case we have to do what's called a re-context for that commit, which is where we call git context on the parent of that commit. If you use Vim, the Fugitive plugin that most people use for accessing git from Vim has some really helpful shortcuts for this. If you hover over a commit and then you press tilde, that will re-context it on the parent. GitHub also has a really nice UI for this. I'll typically just open the commit in GitHub whenever I need to do this.
So you might be wondering why we even care about the commit that introduced the bug. The answer comes back to the definition I used for bug earlier. It's when the observed behavior is different from the intended behavior. This means that we need to know what the intention was. Even if we are fairly certain that we for sure have a bug and we don't care what they were intending in this commit, seeing the intent behind the code that broke it is still very useful for writing a fix. One of the things that we want to make sure of is that we do not regress whatever they were trying to fix when they broke this code in the first place. Hopefully the test suite will catch that, but legacy code bases aren't always as well tested as you would like. Sometimes finding the commit is less helpful.
If you're lucky, it will have a very long commit message that talks about why they are doing it the way they are and what their intentions were, and generally brain-dumps anything else that was on their mind. This is why I always advocate for people writing very long commit messages. If you're less lucky, the commit will say "changed stuff". If you're really unlucky, the commit will say "initial".
So now that we've tracked down the code that's the problem and the commit that introduced it, it's time to fix the bug. And I actually don't have much advice to give on this one. Once you've found the problem code and you know why it broke, hopefully the fix becomes apparent fairly quickly. Not always, but what you're gonna do is going to be incredibly dependent on the bug that you're trying to fix. I know this can be not a great thing to hear, that there's just no set solution for how to fix a bug in Rails. But what I can tell you is that the process of fixing the bug is no different than in any other legacy code base. You try and understand the code path that's the problem, try and understand the problem, and hope that the fix becomes apparent. And sometimes that just takes a lot of time.
Sometimes you'll feel like you're stuck. Like Ruby. In these sorts of situations, take a break. Maybe sleep on it. Go home, play some video games. Come back at it with a fresh set of eyes. Another thing I like to do is find a pair. Having somebody else come help you work on your code can really change your context on a problem. And it's worth going through the struggles when it starts to feel rough. When you finally fix it, it feels like this.
Now, I know what a lot of you might be saying. This is not about anything specific to Rails. Almost everything you've said can apply to any large legacy code base. And yeah, because that's all Rails is. It's just a large legacy code base. Anyone can work on Rails.
We just use the same techniques we use in our applications, on a framework that's used by more people. But the same tools, the same processes, the same strategies, they all work. You just have to get past the fear of open source.
If you want more Ruby, if this wasn't enough baby pictures for you, my wife's also giving a talk. That's gonna be in room 315 at 2:30 p.m. I highly recommend it. Being married to her, I had the privilege of seeing it early, and it's a good talk. I wanna thank Shopify for sending me out here, even though none of you made it because you got buried in snow. Thanks to the Diesel core team for watching Diesel's channel while I'm at the conference. I know you all are watching on the live stream. So hi guys.
I have some Ruby stickers. If you would like a Ruby sticker, come get one afterwards. I will be taking questions down off the stage as soon as I get off. So come ask me questions if you have any. That's my contact information. If you'd like to reach out, thank you very much.