I apologize from the start: there's a ton of content in this, so I'm going to run through sections of it pretty fast, but hopefully we can get through it. To start out, I want to talk about what a failure is. What do we mean by failure? One good way of looking at that is the design by contract view of programming, as espoused by Bertrand Meyer. In that view, every method has a contract with its caller. The contract says: given certain inputs, I promise to produce certain outputs or certain side effects. Meyer created a language called Eiffel, and I bring him up because Eiffel's exception handling system was strongly influential on Ruby's. So in this view of programming as contracts, a failure is simply when a method is unable to fulfill its contract, and this can happen for several reasons. There could be a mistake in the way the method is called. Technically this isn't a failure of the method, but it may be the method's responsibility to report the fact that it was called wrong. There may be just a plain old mistake in the way the method is coded, like substituting a string for a symbol. There may be a misunderstanding: maybe there was a case that the programmer didn't know about. Or there may be a failure in some element that's completely external to the program. For the next few minutes I want to go through a whirlwind tour of Ruby's exception handling system and go over some things that you maybe do know, and then some things that you might not know. Every exception starts with a raise. Well, actually it starts with a call to raise or a call to fail. These two are synonyms and they do exactly the same thing. In recent years I've noticed that raise has become a little more common than fail, and it's really just a matter of taste.
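As a small illustration of the point that raise and fail are synonyms, here is a minimal sketch. The method names and the capture_error helper are made up for this example; only raise, fail and RuntimeError come from Ruby itself.

```ruby
# raise and fail are the same Kernel method under two names.
def signal_with_raise
  raise "disk is full"
end

def signal_with_fail
  fail "disk is full"
end

# Helper for this example only: run a block and return the exception it raised,
# or nil if nothing was raised.
def capture_error
  yield
  nil
rescue StandardError => e
  e
end

[:signal_with_raise, :signal_with_fail].each do |name|
  error = capture_error { send(name) }
  puts "#{name}: #{error.class} (#{error.message})"
end
```

Both calls end up with a RuntimeError carrying the same message, so the choice between the two names is purely a matter of convention.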
I was talking to Jim about this and he has a convention that I thought was kind of interesting. He uses fail for most cases, to indicate a failure, and he uses raise only when he is explicitly re-raising an exception. I kind of like this convention, and I'm thinking about using it more in my own code. There are a bunch of ways to raise exceptions. We can call raise with no arguments, which just raises a RuntimeError. We can call it with a string. We can call it with a specific class. If you supply the third argument to raise, you can customize the backtrace, and this is handy for things like assertions, where you really want the backtrace to point to the place where the assertion was made, not to the place where the assert method was defined. Now, raise is not actually a Ruby keyword. raise is just a method defined on Kernel, and as we all know, in Ruby if something is a method we can redefine it. So there are some things you can do with that which are kind of fun. Here's just a silly example: we could make a program where, instead of exceptions bubbling up the call stack like they normally do, they just instantly exit the program, and here's an implementation of that. A possibly more useful application of this fact: I've got a gem called hammertime, which is basically a Lisp- or Smalltalk-style error console for Ruby. Instead of the program just ending and giving you a backtrace, when the exception is first raised you can actually look at the environment it was raised in, and you can debug right there where it was raised. So what does raise actually do? raise goes through four steps. The first step is to get the exception object, the second step is to set the backtrace, the third step is to set the global error variable, and finally the fourth step is to throw that exception up the call chain. So let's take a look at those in a little more detail, starting with getting the exception object.
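To make the custom backtrace argument concrete, here is a minimal sketch of the assertion case. The assert and check_positive methods are hypothetical names for this example; passing caller as the third argument to raise is what moves the backtrace to the call site.

```ruby
# Hypothetical assertion helper. The third argument to raise replaces the
# backtrace, so the error points at whoever called assert, not at assert itself.
def assert(condition, message = "Assertion failed")
  raise RuntimeError, message, caller unless condition
end

# Hypothetical caller that makes an assertion.
def check_positive(number)
  assert(number > 0, "expected a positive number")
  number
end

begin
  check_positive(-1)
rescue => e
  puts e.message
  puts e.backtrace.first # a frame inside check_positive, not inside assert
end
```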
Now, when you look at the way you call raise, you might think that what it's doing internally is something like the first example here, where it's calling .new on the class that you pass it. But what it's actually calling is a method called exception to generate that exception object. On Ruby's built-in Exception class, exception is defined at both the instance level and the class level. At the instance level, calling it with no arguments just gives you the same object back, and calling it with arguments gives you a kind of duplicate of that exception. At the class level it's basically equivalent to calling .new. This is interesting because it means that we can define our own exception methods, and you can almost think of exception as an equivalent to to_s or to_a: it's almost a way of saying to the object, convert yourself to an exception. I haven't really seen this used in practice much, but this is just a little example where you could tell an HTTP response, for instance, to generate its own exception instead of deciding yourself what exception to raise. Step two: we set the backtrace. This is either going to be set from whatever the current backtrace is or, if you supply a custom one, from that instead, and it's set on the exception object separately from creating it. In step three, it sets the global error variable, which is the $! variable. It's not really a global; it's actually a thread-local variable, but it looks like a global. What this little piece of code demonstrates is that as long as an exception is active, as long as it hasn't been handled, that global error variable is set, but once it's been handled the variable goes back to nil. What this also demonstrates is that if you find the $! syntax a little too inscrutable, you can require the English library and then you can call it $ERROR_INFO.
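The two ideas above, an object that converts itself to an exception and the lifetime of the $! variable, can be sketched roughly as follows. FakeResponse and error_variable_states are invented for this illustration and are not the code from the slides.

```ruby
require "English" # provides $ERROR_INFO as a readable alias for $!

# Hypothetical stand-in for an HTTP response that can turn itself into an exception.
class FakeResponse
  attr_reader :status

  def initialize(status)
    @status = status
  end

  # raise calls #exception on whatever object it is handed.
  def exception(message = nil)
    RuntimeError.new(message || "request failed with status #{status}")
  end
end

begin
  raise FakeResponse.new(503)
rescue => e
  puts e.message
end

# Records the value of $! before, during and after handling an exception.
def error_variable_states
  states = [$!]
  begin
    raise "boom"
  rescue
    states << $ERROR_INFO
  end
  states << $!
  states
end

p error_variable_states.map { |state| state.class }
```

Called when no other exception is being handled, error_variable_states sees nil before the raise, the RuntimeError inside the rescue clause, and nil again afterwards.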
And then finally, raise tosses that exception up the call chain, where it continues to go up the stack until something either handles it or it reaches the top of the program. So now we've raised an exception. Hopefully we're going to handle it, and rescue is how we do that. You can call rescue a number of ways as well. You can call it with no arguments, which is equivalent to rescuing StandardError. Notably, it won't catch a whole bunch of exceptions that aren't descended from StandardError. You can give a name to the exception. You can also provide specific classes, or lists of classes, to define what you want to rescue. Now, when you look at the syntax of rescue, you might think that it looks a little bit like Ruby's case statement. In fact, the way Ruby decides whether a rescue clause matches an exception is very similar to the way it does case matching. What it's actually doing is calling the threequals operator (===) between that class and that exception to see if it matches. And that means that we can have a little fun with that too. Instead of providing a class, we could create a little custom matcher function so that we can then say something like this: rescue errors with a message matching this regular expression. One real gotcha here, though, is that for some reason Ruby requires that whatever we pass to rescue must be either a class or a module. Then it just calls the threequals operator on it. It doesn't actually do anything with its classiness or moduleness, but it has to be a class or module, which is why in this code I'm creating a new module. Can it just pretend to be a class or module? I'm not certain. In MRI, probably not. I think it may be doing a lower-level check on the type. After the rescues, you can put in an ensure clause.
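Here is one way the custom matcher idea for rescue might look. This is a sketch under assumptions: errors_with_message and classify are hypothetical names, and the anonymous module exists only because rescue insists on a class or module.

```ruby
# Hypothetical helper: returns an anonymous module whose === matches any
# exception whose message matches the given pattern.
def errors_with_message(pattern)
  matcher = Module.new
  matcher.define_singleton_method(:===) do |error|
    pattern === error.message
  end
  matcher
end

# Example use: sort failures into categories by their message.
def classify(message)
  raise message
rescue errors_with_message(/timeout/)
  :timeout
rescue
  :other
end

p classify("connection timeout")
p classify("disk full")
```

Because rescue only ever calls === on the object it is given, the module never needs to be included anywhere.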
It will always execute, error or not, so it's a good place to put cleanup code. One gotcha with the ensure clause, which was documented by Les Hill, is this: if something raises an exception and then something in the ensure clause explicitly returns, the exception will effectively be thrown away. It will not be propagated up. This may not be what you're expecting, so it's probably best to just avoid using explicit returns in ensure clauses. Ruby is one of the few languages that gives us a retry for exceptions. When rescuing an exception, retry gives us a way to say: go back to the enclosing begin, or go back to the beginning of that method, and try again. The one thing you want to be careful of when you're using retry is not to get into an infinite retry loop, so you have to have some kind of counter or some other way of deciding that you've retried enough times and you're going to give up. So what happens when we raise a new exception during the handling of another exception? Well, if we just raise a brand new exception, the original exception is thrown away. There is no record of its existence. There's no way to find out what it was. Several times I've found out the hard way that Rails code does this, because I'll be tracking an exception back to its source and then I'll realize it was actually generated while another exception was being handled, and I have no idea what that original exception was. So please don't do this. Instead, use the nested exception pattern. It's just an exception object with an extra slot on it to refer to the original exception that was being handled when this exception was generated. It's not part of Ruby, but it's very simple to define your own nested exception class. This one is being a little bit clever by using the global error variable as the default for the original exception.
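A nested exception class along the lines described above could be sketched like this. NestedError and load_config are illustrative names, not the code shown in the talk. Note that Ruby 2.1 and later also record the exception being handled automatically as Exception#cause, so on a current Ruby the hand-rolled slot is mostly useful as an illustration of the pattern.

```ruby
# Sketch of the nested exception pattern: an error with an extra slot for the
# exception that was being handled when this one was created.
class NestedError < StandardError
  attr_reader :original

  # $! is the exception currently being handled (or nil), so it makes a
  # convenient default for the original exception.
  def initialize(message = nil, original = $!)
    super(message)
    @original = original
  end
end

# Hypothetical method that translates a low-level failure into a domain error.
def load_config
  Integer("not a number")
rescue ArgumentError
  raise NestedError, "could not load config"
end

begin
  load_config
rescue NestedError => e
  puts "#{e.message} (caused by #{e.original.class}: #{e.original.message})"
end
```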
So it basically auto-detects whether there was an active exception at the time it was raised. You can take the error that you caught and re-raise it. If you just call raise on that, it's going to raise the exact same object. It's not going to generate a new one, which is what this little piece of code demonstrates: it's the same object. You can call raise on the exception that you caught but provide a new message. This is useful, for instance, when you know a little more context and you want to add some clarification to the exception message. You can provide a custom backtrace as well. If you call raise with no arguments, it will re-raise the currently active exception. The question from the audience is: if you explicitly re-raise the caught exception, as opposed to calling raise with no arguments, does that preserve the backtrace? I believe they're semantically identical. So here's some more fun with redefining raise. In some languages it's considered not permissible to do this double raise thing, where you raise an exception while handling another one. There are reasons for that, because you can get into trouble with it. It can be hard to debug. You can also wind up not cleaning up resources when that happens. So if you wanted to mimic one of those languages, you could do it pretty easily in Ruby. You could redefine raise so that it checks to see if there's currently an active exception and says, no, you can't do that, I'm just going to exit the program. In this code I've also defined a little helper method that lets us explicitly say, I have handled this exception and now I'm raising a new one. What that does is just set the global error variable back to nil.
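The re-raising behavior described above can be checked with a short sketch. The method names are made up for the example; the behavior shown is that of raise when it is handed an existing exception object.

```ruby
# Re-raising the caught exception raises the very same object.
def reraise_unchanged
  first = nil
  begin
    begin
      raise "disk full"
    rescue => e
      first = e
      raise e
    end
  rescue => again
    [first, again]
  end
end

# Re-raising with a new message produces a copy of the exception that carries
# the clarified message.
def reraise_with_message
  begin
    raise "disk full"
  rescue => e
    raise e, "while saving report: #{e.message}"
  end
rescue => again
  again
end

first, again = reraise_unchanged
puts first.equal?(again)
puts reraise_with_message.message
```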
There's an example of using it and explicitly saying, I've handled this exception. If an exception continues to bubble up the call stack and nothing rescues it, eventually Ruby will catch it and terminate the program. But before it terminates the program, it will execute the various exit hooks. This is useful because in some contexts we might want to be able to capture some information about a crash, but it might not be convenient to wrap a big old begin/rescue/end around our entire program. For instance, where would you put that in a Rails app? We can still attach a handler which will do some crash logging. Here's a simple little crash logger. It's in an at_exit hook, which will be executed when the program ends. It just checks that global error variable to ask, is there an active exception? If so, it logs some information about it. It will log a timestamp, the message, the backtrace. In this case I'm also logging the versions of all the gems that were loaded at the time of the crash. I'm sure you can think of lots of other information that you would throw into a crash log like this. Once we've rescued an exception, there are various ways we could handle it. One of the simplest ways, if we decide it's not a fatal issue, is to just return some kind of error value instead of the expected return value. In Ruby, typically the error value is nil. Sometimes this is all you need, but nil can be pretty uncommunicative. A related approach is to return, rather than nil, some kind of benign value. This is kind of an underused technique, I think, but it's really helpful when you've ascertained that an exception doesn't represent an issue that's worth ending the program or ending the request over. You can just substitute some sort of known safe value which indicates that there was a problem but satisfies the expectations of the caller in a way that doesn't break the caller.
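A benign value might look something like the following sketch. The sidebar scenario and the sidebar_html name are assumptions made up for illustration: the idea is that a failing page fragment is replaced by harmless placeholder markup rather than by nil or a crashed request.

```ruby
# Hypothetical page fragment renderer. If building the fragment fails, return a
# known-safe placeholder that still satisfies the caller's expectation of a String.
def sidebar_html
  yield
rescue StandardError => e
  warn "sidebar failed: #{e.message}"
  "<!-- sidebar unavailable -->"
end

puts sidebar_html { "<ul><li>news</li></ul>" }
puts sidebar_html { raise "feed server unreachable" }
```

The caller can concatenate the result into the page either way, and the placeholder is visible in the output to anyone looking for what went wrong.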
Something else you might want to do in handling an exception is report it: log it to a file, send an email, report it to some external exception reporting service. What you want to be careful of here is inadvertently making the problem worse. I'll give you an example of this. I worked on a project once where we had a bunch of systems processing jobs. Occasionally the jobs would crash. We had implemented a very simple exception notifier where, when a job crashed, it would send us an email using our Gmail account. Real basic, but it worked. Then one day we rolled out a version which caused the jobs to crash a lot more often. In fact, they crashed so often that the notifications being sent through Gmail to us developers became so frequent that Gmail started throttling our account, which came through to the program as an SMTP error. Now, the crash logging code had not been built to handle SMTP errors. Instead of logging the exception and then moving on to the next job, the workers started crashing hard. This was bad enough, because now we had a bunch of dead workers, but it didn't stop there. We had other, unrelated systems that also used this basic email notification system, all using the same email account, and they were now getting SMTP errors as well. They also had nothing written to handle SMTP errors. Now we had unrelated systems going down because of this. This is a classic example of a failure cascade. It's something to really look out for when you start getting fancier with your error reporting. A useful pattern for dealing with failure cascades is the circuit breaker pattern, which is described in detail in Michael Nygard's book, Release It!
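As a rough sketch of the circuit breaker idea (count failures, trip once they pass a threshold, and refuse calls until a timeout has elapsed), something like the following could work. This is a simplified illustration and not the design from the book: the class, its parameter names and the injectable clock are all assumptions made for this example.

```ruby
# Simplified circuit breaker: after `threshold` consecutive failures the
# breaker opens and rejects calls until `cooldown` seconds have passed.
class CircuitBreaker
  class OpenError < StandardError; end

  # clock is injectable so the example can be exercised without sleeping.
  def initialize(threshold:, cooldown:, clock: -> { Time.now })
    @threshold = threshold
    @cooldown  = cooldown
    @clock     = clock
    @failures  = 0
    @opened_at = nil
  end

  def call
    raise OpenError, "circuit is open; skipping call" if open?
    begin
      result = yield
    rescue StandardError
      record_failure
      raise
    end
    @failures = 0 # a success resets the failure count
    result
  end

  def open?
    return false if @opened_at.nil?
    return true if @clock.call - @opened_at < @cooldown
    # Cooldown elapsed: close the breaker and allow calls again.
    @opened_at = nil
    @failures = 0
    false
  end

  private

  def record_failure
    @failures += 1
    @opened_at = @clock.call if @failures >= @threshold
  end
end
```

In the email story, wrapping the notification call in one breaker would have meant that after a few SMTP failures the workers stopped trying to send mail for a while, instead of each one crashing on every attempt.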
I'm not sure I have time to go over the pattern in detail right now, but basically the idea is to implement your system in such a way that it counts the number of failures, and when they go over a threshold, the circuit breaker trips and that system is no longer allowed to operate for a while, either until a timeout runs out or until some kind of human intervention. That was our little whirlwind tour of exception handling in Ruby. For the rest of this talk, I want to talk a little more philosophically about some advice and some ideas for how to structure your app's or your library's failure handling strategy. First up, just a general rule, and a lot of you have probably heard this before: exceptions should be for exceptional circumstances. Another way of stating that is that exceptions should not be expected. Things like invalid user input are typically something that you can expect, so that might not warrant an exception. You can see that line of thinking in ActiveRecord: the default save method doesn't raise an exception if there is invalid user input, because that's the sort of thing that you expect to happen. One rule of thumb for deciding whether to raise an exception is expressed in The Pragmatic Programmer. The question to ask yourself is: in the best-case scenario, if I removed all of the exception handling code in my app, would it still run? Assuming that nothing broke, assuming that it's not running into any problems, would it still run? If its running depends on exception handling, then maybe you want to revisit how you've structured your failure handling strategy. Occasionally you run into situations where you really want to do something like an exception, where you want to break out of multiple levels of execution, but it's not an exceptional circumstance. For that, Ruby gives us catch and throw, which are really confusing to people coming from other languages, because those are the terms that those languages use for exceptions.
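A minimal sketch of catch and throw used for a non-exceptional early exit, here breaking out of two nested loops as soon as a value is found. The method name and data are made up for the example.

```ruby
# Returns the first negative value in a list of rows, or nil if there is none.
# throw unwinds straight out of both loops to the matching catch.
def first_negative(rows)
  catch(:found) do
    rows.each do |row|
      row.each do |value|
        throw :found, value if value < 0
      end
    end
    nil # reached only when nothing was thrown
  end
end

p first_negative([[1, 2], [3, -4, 5], [-6]])
p first_negative([[1, 2], [3]])
```

No exception object is created and nothing is treated as a failure; the second argument to throw simply becomes the value of the catch block.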
Catch and throw are for non-exceptional circumstances, but they basically give you the same semantics, where you can break out of multiple levels of execution. Sinatra and Rack have a pretty good example of this built in, because you can do this thing with the last_modified call, which works with the browser to short-circuit the execution of an action based on whether the browser already has the most recent version of that resource. The way that's implemented internally is that it checks some browser headers and then uses throw to terminate that action before it goes any further, and then Rack catches that further up the chain. There's sort of an eternal debate about what exactly constitutes an exception. Is an end of file a failure? Is a missing hash key a failure? What about a 404? The answer to all of these, really, is: it depends. It really depends on the circumstances, but when you raise an exception you force the issue; you say, I know that this is a failure. Whenever there is one of these it-depends kinds of questions in programming, I always look for a way to punt the question to somebody else, to punt it down the line. One way to do that with failures is something I think of as the caller-defined failure strategy, or the caller-defined fallback strategy, pattern. A great example of it in Ruby is the fetch method. How many people know what fetch does? I have a personal mission. I put this thing about fetch in all of my talks, and some day I'm going to ask that question and everybody in the room is going to raise their hands. Fetch is great. Fetch is defined on Hash and Array and various collections. The way it works is that you pass the key, and if the key exists in the collection then it just gives you the value back, just like using square brackets. But if the key isn't there, then it executes whatever code you provided in the block to decide what to do for a fallback.
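Here is a short sketch of fetch and of a method that follows the same caller-defined fallback convention. The settings hash and the read_setting method are hypothetical; Hash#fetch and its block behavior are standard Ruby.

```ruby
settings = { "host" => "example.test" }

# Key present: same result as settings["host"].
p settings.fetch("host")

# Key missing: the caller decides that a default is acceptable.
p settings.fetch("port") { 8080 }

# Key missing: a different caller decides this is a failure, and picks the exception.
begin
  settings.fetch("port") { |key| raise ArgumentError, "missing setting: #{key}" }
rescue ArgumentError => e
  puts e.message
end

# Hypothetical method of our own that hands the same decision to its caller.
def read_setting(settings, key, &fallback)
  settings.fetch(key, &fallback)
end

p read_setting(settings, "port") { 8080 }
```

When no block is given and the key is missing, Hash#fetch raises KeyError, so read_setting still fails loudly for callers that did not supply a fallback.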
That's a way of delegating the decision about whether this is a failure case up to the caller, because the caller decides whether to just return some kind of benign value or whether to raise an exception. The caller also decides what exception that should be, which is really helpful. In feedback on earlier versions of this talk, people came to me and said, well, that's all fine and good, but really, when should I raise an exception? So I thought about it a little bit more, and I came up with five questions that you might find useful to ask yourself when you're trying to decide whether this is a good place to raise an exception. Number one, and we talked about this already: is the situation truly unexpected? Is it something that we can reasonably expect, like invalid user input, or is it something that we really don't expect? Number two: am I prepared to end the program? Remember, any exception can potentially end the program, or in the case of a web application it can end the request. Sometimes when you think about it from that perspective, you realize that this really isn't worth ending the request over. Maybe I should just return some kind of benign value, so that I can see in the output that something went wrong but it doesn't actually kill the request. Number three: can I punt the decision up? Is there some way of delegating the call about whether or not to raise an exception here up the call chain? Number four: am I throwing away valuable diagnostics?
Say you have some sort of long-running, expensive operation, or an operation that's difficult to run again; maybe it's not idempotent and you can't really run it again easily. If you call that method, something goes wrong, and you raise an exception for some trivial reason, you wind up throwing away all of the context and all the information collected by that long-running operation, and that might not be such a good plan. So in a case where you have some kind of expensive operation, it might be good to find a way either to attach more of that information to the exception before you throw it or to execute in a degraded mode rather than raising an exception. And question number five: would continuing forward result in a less useful exception? This is a case where sometimes it's better to fail fast. If you have bad input, it's often better to detect that as early as possible and bomb out rather than going several lines further and getting some mysterious "can't convert nil into String" exception. Exception handling, by its nature, can complicate code, and this is one of the complaints that you see from people who aren't fans of exceptions: that it really is kind of like goto. It can turn code into something resembling spaghetti code, particularly if you've ever seen Ruby code that has multiple levels of nested begin blocks, where it's try this, and if that doesn't work try this, and if that doesn't work try this. It can be really hard to follow. I run across pieces of code like this sometimes, and invariably they are some of the buggiest, some of the hardest to test, and some of the hardest to understand pieces of code in the system.
So I actually think of begin as something of a code smell in Ruby, something that I try to avoid, and Ruby gives us a really elegant way of getting away from the begin/rescue/end block, because in Ruby every method has an implicit begin block that starts at the beginning of the method. When you use the implicit begin block and then put in a rescue and whatever your failure handling is for that method, you get this great dividing line down the middle of the method: here's the business logic up here, and here's the failure handling down here. It's a great organizing principle, and I find that when I structure my code to take advantage of that, and to only have that one level of nesting as far as exception handling goes, it really leads to more understandable code and more testable code. One way of refactoring code that uses a lot of begin blocks towards this use of the implicit begin is something that I'm calling contingency methods. It's the best name I've come up with for it. Basically, the idea is that if you have some kind of failure policy, you extract that policy into a method of its own, and all that method does is yield to the block and then do whatever that failure handling policy is. If you have some library where every time you call it you have to handle certain exceptions, you can factor that exception handling out into a contingency method and then use that everywhere. Certain methods in a program are critical. An example of a critical method is the crash logging method that I was talking about earlier, where we were logging exceptions via email. That's an example of code that you really want to work well; it's code you want to be reliable. A useful concept when you're thinking about critical methods like this is to evaluate them in terms of their level of exception safety.
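The implicit begin block and the contingency method idea can be sketched together like this. The method names and the particular failure policy (log the message and return nil) are assumptions chosen for illustration.

```ruby
# Contingency method: the failure policy lives in one place. It yields to the
# block and, if an I/O problem occurs, records it and returns nil.
def with_io_errors_logged(log)
  yield
rescue IOError, SystemCallError => e
  log << "I/O failure: #{e.message}"
  nil
end

# Business logic stays free of begin/rescue; it just names the policy it wants.
def read_first_line(path, log)
  with_io_errors_logged(log) do
    File.open(path) { |file| file.gets }
  end
end

log = []
p read_first_line("/nonexistent/path/for/example.txt", log)
puts log
```

Note that with_io_errors_logged has no explicit begin: the method body is the implicit begin block, with the business logic (the yield) above the rescue and the failure handling below it.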
Exception safety just describes how a method will behave in the presence of exceptions, and classically there are three levels of exception safety. The lowest level is the weak guarantee. The weak guarantee says that if an exception is raised anywhere in this method, the object will be left in a consistent state. It's not necessarily the same state that it was in, but consistent basically means that it won't have, for instance, dangling references to other objects that don't exist, or to database records that don't exist, or something like that. Then you have the strong guarantee. The strong guarantee says the object will be rolled back to its beginning state, so it's transactional: if an exception is raised, it'll be rolled back as if the method was never called. And then finally, the no-throw guarantee says that no exceptions at all will be propagated out of this method. So if you look at this little chunk of code here, how many places in this code could raise an exception? Just go ahead and yell some out if you think of something. Two, three, eight. It's actually kind of a trick question. Ruby makes no guarantees about where exceptions might be raised, and there are some exceptions that could be raised literally anywhere. If somebody presses Control-C while that program is running, that signal exception is going to bubble up wherever the Ruby interpreter decides to bubble it up. If the program runs out of memory, that exception could come from anywhere in your code. Ruby has no guarantees that say this operation will raise exceptions and this operation won't. So we have kind of a conundrum. We know that certain methods are critical, we want to make assertions about their exception safety, and we know that it's good to understand their exception safety characteristics, and yet we know that in Ruby there's no way of knowing where in the code exceptions will propagate. So we really need a way of just testing to see if a method meets the exception safety guarantees that we want
it to. There's actually a technique for doing this, and the way it works is that we take some code under test (here's a little method that swaps keys in a hash) and we put it in a test harness. We give that test harness some code that will exercise the code under test. What the harness does is run that code once in record mode and record each point where the code calls an external method. In this case it has recorded four calls to external methods, calls to Hash methods, and it keeps that recording around. Then we make some assertions about the exception safety of the code. In this case we're asserting the strong guarantee: the hash will either be in its original state or in the fully swapped state, but it won't be left in some sort of intermediate state. What the exception testing harness does then is play that test script back once for every call point that it recorded in the record phase, and each time it picks a different external call and forces an exception to be raised from that call, so it behaves as if that call failed and generated an exception. I don't have the time or space to describe the internals of how you can implement this in Ruby, but I have put a proof of concept up as a Gist, and that'll be one of the links in the talk notes that I link to at the end, so you can check that out. It's very easily doable in Ruby. Here's a handy pattern. Being excessively vague in rescuing exceptions can lead to problems, because invariably you wind up catching some exception you didn't expect and throwing it away, and you don't know why the code isn't doing what it's supposed to be doing, but you never see an exception because it's just being thrown away. If some code isn't raising a sufficiently specific exception, the least you can do is try to match on, maybe, the message that the exception has, rather
than the class of the exception. Here's another guideline, something that a lot of people I've talked to have said they wish libraries would do: base all the exceptions that the library might raise on a single base class, so that when you're using that library you can rescue that single base class and not worry about other exceptions coming out. So that's something to think about when building libraries. Now, I think I'm just about out of time, is that true? Well, we got pretty far through it. I'll just skip past that last bit and say I'm not going to do a recap, because I tried to summarize all that stuff and I couldn't figure out how. So anyway, hopefully you got something out of all that, and hopefully I didn't move through it too quickly. There are notes on all the references that I made, books and slides, at that URL. There's also a longer recording I did of this talk, so you can go check that out. And a tiny bit of self-promotion: I've also turned this talk into an ebook, which is currently in beta, and which has all the stuff that I wasn't able to cover in the time that I had. So that's it. Thank you very much.