Cool, so first of all, thank you all for coming out here today to spend your morning talking about failure, which is a light topic. I think DHH sort of set you up, and now I'm just gonna knock you down. As was mentioned, I'm Jessica Rudder, and if you're the tweeting kind, that's my Twitter handle. On to the show.

All right. When I was 26 years old, I decided that I wanted to learn how to fly a plane, and my friends and family were very skeptical. They had such words of wisdom as: you don't even know how to drive a car, you get motion sick, you're afraid of heights. All of this was 100% true, but I didn't see the relevance. You see, no one was going to clip my wings.

So we're now six months in, and I was on final approach for runway 21 in Santa Monica after a routine training flight. I eased back on the throttle, tilted the nose of the plane up a tiny bit, entered a flare, and floated gracefully, more or less, down to the ground. Suddenly the plane jerked to the side, and my instructor said, "I've got the plane." I was like, oh, you've got the plane. He brought the plane to a stop on the runway, and as our hearts calmed down, we got out and looked at the damage. It was a flat tire.

Now, with my heart rate finally calm and the plane safely stopped, a flat tire really isn't a big deal. You just have to tow the plane back to a maintenance garage and change out the tire. The runway was gonna be blocked for less than five minutes. So I was really surprised when my instructor said, "Hey, I'm gonna drop you back off at the classroom, and then I'll come back to fill out the FAA paperwork." My heart rate jumped right back up. Whoa, whoa, whoa. It's just a flat tire, no one got hurt. Why does there have to be any paperwork? You see, in addition to not being the biggest fan of paperwork, I was also really worried that I had just gotten my instructor in a lot of trouble.

But it turns out that the FAA collects details on all events, big or small, even a tiny tire blowout during a landing in my little four-seater plane. They wanna get as much data as possible so that they can work out patterns that help them implement safer systems. They know that more data means they'll be able to draw better conclusions, but they also know that people really don't like paperwork or getting yelled at. So to make sure that pilots are willing to fill out these reports, they have a policy: if there were no injuries, nothing you did was illegal, and you fill out a report, there's not gonna be any punishment.

Now think about the very different approach we have to automobile accidents. When I was 12 years old, I was riding home from the Saturn dealership in a shiny brand-new car. It was the first brand-new car my parents had ever purchased. We're sitting at a stoplight, and suddenly we lurch forward. We'd been rear-ended. My dad got out to check on the other driver, an incredibly nervous 16-year-old boy. The other driver was fine, everyone in our car was fine, and the only damage was a small puncture in the bumper from the other car's license plate. My dad looked it over and said, "Well, I guess that's what bumpers are for." He told the kid to be careful, and then we all piled back into our slightly less shiny car and drove home. No paperwork was filed, no data was gathered. In fact, there's not a single agency out there collecting data on car issues.
It's usually handled by local agencies like the police, and they do not like it if you call them up to let them know about something as trivial as a flat tire. Heck, you can have an accident where two cars actually bump into each other, and as long as no one's injured and no one wants to make an insurance claim, it will never end up in any records anywhere.

So these two different approaches have led to very different outcomes. I looked up the most recent stats available, which were for 2015. For every one billion miles people in the US travel by car, 3.1 people die. And for every one billion miles people in the US travel by plane, there are only 0.05 deaths. Now, if you're like me, decimals, especially when you're talking about a fraction of a person, can be a bit hard to wrap your mind around, so this is a bit easier: if you hold the miles traveled steady, 64 people die traveling in cars for every one person that dies traveling in a plane.

Now, there is something very interesting going on here. We have two different approaches that lead to two very different outcomes, and the key difference is how each approach deals with failure. You see, it turns out that failure isn't something you should avoid; it's a way to learn.

Before we go much further, it's probably a good time to make sure we're all on the same page when we talk about failure. So what is failure? I think for some of us it might be that sinking feeling you get in the pit of your stomach when something didn't go right, and the person is yelling at you with the angry face, and you're like, why did I even bother getting out of bed this morning? I can absolutely relate to that. As I was prepping for this talk and looking at what different people said, I found a lot of people that were like, oh, failure, that one's simple: it's the absence of success. And I was like, sweet, what's success? And they were like, psh, so easy: it's the absence of failure. Not really helpful. But researchers actually have a very specific definition of failure: deviation from expected and desired results. And that's not bad. I honestly think there's some truth in all three of these definitions, but that last one, deviation from expected and desired results, is something you can actually test and measure. So we're gonna stick with that one for now.

Now, I couldn't find any definitive data on this, but I think that as developers, we have more results that deviate from our expectations than just about any other group of people. So you'd think that programming would be the perfect place to learn from failure. But one of the few places I could find people routinely, most of the time, capitalizing on failure was in video game development.

One of my favorite examples of this is the game Space Invaders. Do you know Space Invaders? It's the old arcade game where you control a small cannon at the bottom of the screen that's firing at a descending row of aliens. And as you defeat more aliens, they speed up, making them harder to shoot, right? No, that actually was not the game. That's not what it was supposed to be. The developer, Tomohiro Nishikado, had planned for the aliens to remain at a constant speed the entire time. No matter how many aliens you destroyed, there would not be a speed increase until the next level. And he wrote the code to do exactly that. There was just one problem.
He had designed the game for an ideal world. And I don't know how much you know about 1978, but 1978 was far from ideal. He'd placed more characters on the screen than the processor could handle, and as a result, the aliens sort of chugged along at first and only reached their intended speed once enough of them had been destroyed.

Now, Nishikado had a few ways of dealing with this. He could shelve the project until processor speeds were fast enough. That might seem silly, but maybe he had a vision and he was not gonna compromise. He could also have modified the game design, putting fewer aliens on the screen so that it could run at the constant speed he wanted. But instead of being rigid and insisting on his original vision, he decided to have people test it out. And they absolutely loved it. They got so excited as things sped up that they would actually project emotions onto the aliens. They were like, oh, these guys are getting scared. I'm taking them out, and they're trying to speed up because they know that I am about to kick their butt. It was so popular that he kept it in the game. And the failure was actually responsible for creating an entirely new game mechanic: the difficulty curve. Before this, games would always be the exact same difficulty for an entire level, and it wasn't until you got to the next level that things would get more difficult. After this, all bets were off. Things could get difficult at any point, whenever the developer pleased.

Now, I don't know if the developer here had read the studies, but he was capitalizing on a lesson that I see time and again in the research about failure: failure is not something to hide from, it's something to learn from. In fact, it turns out that failure presents a greater learning opportunity, because there's more information encoded in failure than in success. Think about it. What does success usually look like? A check mark, a thumbs up, a smile from a manager. And what did you actually learn? Well, there's research on this. Research shows that people and organizations that don't experience failure become rigid, because the only feedback they get tells them: just keep doing the exact same thing you're doing. Don't make any changes, because you're already winning, buddy. Don't change anything.

Failure, on the other hand, looks a whole lot more like this. Look how much information there is available in this error message. If we read it closely, we can figure out exactly what went wrong. We know which line in the code has an issue, and if we have some experience with this particular error message, we probably know what that issue most likely is. And even if we've never seen it before, we're just a quick search away from pages and pages worth of information about this particular failure. Now that we've had an experience with an approach that didn't work, with a bit of effort we can figure out how to write something that does work.

Video game development has a long and honored history of grabbing hold of mistakes and wrestling them into successes. In fact, the concept of exploiting your failures to make your program better is so important that it has a name: the good-bad bug.
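By the way, the Space Invaders mechanic is simple enough to sketch in code. This is a hypothetical Ruby illustration of the idea, not Nishikado's actual code (the original was written in assembly, and the hardware reportedly only moved one alien per frame): the fewer aliens left alive, the more often each one gets its turn to move, so the whole wave speeds up on its own.

```ruby
# A rough sketch of the accidental difficulty curve: the "processor"
# can only move one alien per frame, so the fewer aliens remain,
# the more often each individual alien gets moved.
class AlienWave
  def initialize(count)
    @aliens_alive = count
  end

  # With 55 aliens on screen, each alien moves once every 55 frames.
  # With one alien left, it moves every single frame: 55 times faster.
  def frames_between_moves
    @aliens_alive
  end

  def destroy_alien
    @aliens_alive -= 1 if @aliens_alive > 0
  end
end

wave = AlienWave.new(55)
puts wave.frames_between_moves # => 55 (the sluggish opening)
50.times { wave.destroy_alien }
puts wave.frames_between_moves # => 5 (the frantic endgame)
```

Notice that nothing in that sketch asks for a speed-up; it falls out of the hardware constraint. Which is exactly why playtesting, rather than forcing the behavior to match the spec, was the only way to discover it was fun.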
Now, having space to learn from their failures came in very handy for a group of developers working on a street racing game in the 90s. The concept for the game had players racing through city streets while being chased by cop cars, and if the police caught up with you and pulled you over before you got to the finish line, you lost. There was just one problem. The developers got the code for the algorithm just a tiny bit wrong, and instead of law-abiding police officers trying to pull you over, you had these super aggressive cops trying to slam right into your car, and they would do it at full speed, no matter what you did. The beta testers had way more fun avoiding the cops than they'd ever had with the racing game. As a result, the entire direction of the game was switched up, and the Grand Theft Auto series was born.

Just think about that for a moment. The core concept of the best-selling video game franchise of all time, in history, ever, would have been lost if the developers had panicked and tried to cover up the fact that they'd screwed up the algorithm. They made a mistake, but instead of freaking out, they thought, all right, let's see what happens. And they cashed in.

Now, there are some larger programs where hundreds, if not thousands, of hours of work have already been done by product leads and designers and business people before a developer ever gets to write their first line of code. In game development, that work is encapsulated in a document called the game design document. The GDD is considered a living document; however, it's a pretty big deal for changes to be made late in the game. It means that tech requirement pages need to be redone. Art pages need to be redone. Release dates have to be pushed back. Budgets might be off. You get the picture. It's a big deal to change this.

But that was the unhappy reality the Silent Hill developers were facing. They'd started out building the game to the GDD specs, but there was one tiny problem: pop-in. You see, the PlayStation's graphics hardware couldn't render all of the buildings and textures in a scene. So as you walked forward, buildings would suddenly pop into existence and blank walls would magically gain a texture. As you can imagine, that sort of "oh hi, trees, suddenly" distracted people from the game. And a horror game is very dependent on atmosphere that pulls the player into the game's universe. So this was kind of a game-breaking issue.

Now, it would have been easy for every single person involved to start pointing fingers at everyone else. After all, everyone had sort of played their part: from the designers who put just one or two more buildings in the background to make it interesting, to the tech team that decided to make it for the PlayStation instead of the more powerful 3DO, to the business team that determined the release date. There was not a single individual anywhere along the line that had made an obviously bad call. There were just a bunch of tiny issues that snowballed into a big problem. The entire system had failed.

But instead of running from the failure, the Konami team sidestepped it. They found a workaround. They filled the world with a very dense, eerie fog. It turns out that fog is pretty lightweight for a graphics card to render, and it obscures distant objects, which means you couldn't really see buildings and textures popping in on the horizon anymore. And as an amazing added bonus, it is really, really, really creepy. In fact, it was so creepy that this fog became a staple of the Silent Hill series long after graphics cards had caught up and become powerful enough that pop-in wasn't an issue anymore.
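It's worth spelling out why fog is such a cheap fix. Here's a minimal, hypothetical sketch of the idea (my own illustration, not Konami's actual renderer): dense fog lets you pull the draw distance way in, so anything beyond the fog wall is simply never drawn, and objects fade in behind the fog instead of popping in.

```ruby
# Hypothetical culling sketch: with a fog wall at 30 units, the
# renderer can skip everything farther away, which is exactly the
# distant geometry the PlayStation couldn't handle in the first place.
FOG_DISTANCE = 30.0

def objects_to_render(objects, camera_position)
  # The distance_from method is assumed on these hypothetical objects.
  objects.select { |obj| obj.distance_from(camera_position) < FOG_DISTANCE }
end

# Fog thickens with distance, so the hard cutoff is invisible:
# 0.0 is perfectly clear, 1.0 is completely fogged out.
def fog_factor(distance)
  [distance / FOG_DISTANCE, 1.0].min
end
```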
So that's another example of success being ripped from the jaws of defeat simply by embracing your failures.

Now, these examples from the programming world help to illustrate what was happening in our higher-stakes examples of aviation and automobile accidents. The aviation system saves so many lives because accidents are treated like lessons we can learn from. Data is gathered and aggregated, and patterns are identified. If an accident was caused by a pilot being tired, they never just stop right there. They look at pilot schedules and staffing levels and flight readiness checklists to determine what contributed to that exhaustion. In contrast, who do we usually blame for road accidents? Yeah, the driver. Oh, she was reckless. That dude does not know how to drive. In other words, airplane accidents are treated as failures of the system, while car accidents are treated like failures of individuals. And with all that judgment going around, it's no wonder that people spend so much time trying to cover up their errors. I definitely stopped at that stop sign; that's the guy who went through it. They spend time covering it up rather than just acknowledging the failures and learning from them.

Now, not everyone in this room is a pilot, but I think we have a lot to learn from how aviation handles failure. If we're willing to use a system to track and learn from our failures as we write code, we're gonna be much better off. But that raises the question: what should that system look like? In broad strokes, I think there are three very important pieces this system would need. The first is to avoid placing blame. Then we're gonna need to collect a lot of data. And finally, we're gonna have to abstract patterns from that data.

So step one: make sure that you understand that you are not the problem. Cool, that is much easier said than done, right? Learning not to beat yourself up over failures and mistakes is probably a whole talk in and of itself, or a whole lifetime of self-discovery and work. But with aviation failures, one thing to note is that they never just stop at the top level of blame. There was actually a case where the pilot made a critical error by dialing the wrong destination into the flight computer, and it caused a wreck. On the cockpit recording, they could clearly hear the pilot yawn and say, "I'm super excited to finally get a good night's sleep." Now, it would have been very easy for the investigators to stop there and blame the pilot for being tired. But for them, it wasn't enough to know that he was tired; they wanted to know why. So they verified that he had a hotel during his layover. But that wasn't enough. So they verified that he had checked in. And then they looked at the records of every single time his door had opened and closed, so they could establish the maximum amount of time the pilot could possibly have slept. And even then, they didn't just say, okay, we've shown that the pilot could not possibly have had more than four hours of sleep that night.
They looked at the three-letter flight computer readout and they were like, wow, now that we're thinking about it, that's an incredibly confusing interface if you're tired or distracted, and a cockpit is a pretty distracting place. Now, they're always willing to point out where individuals contributed to a failure, but they also wanna focus on what went wrong with the entire system. So if your takeaway from failure, in code or anywhere else in life, is anything like "I'm dumb, I just can't learn this, this probably just isn't my thing," you are absolutely missing out on all the best parts of failure. I understand not everyone is gonna be at the point where you can quiet that inner critic yet, but if you just spend some time trying to ignore it and work the rest of the system, given enough time you're probably gonna find that the voice in your head starts to contribute helpful insights rather than just criticism.

Now, step two: document everything. Even things that seem small. Heck, I think you should document especially the things that seem small. My flat tire on the runway in Santa Monica was a very small error. But just as we saw with the Silent Hill example, a lot of small errors and missteps can start to roll up into a major problem. Catching those problems early on and course correcting is going to help you avoid major meltdowns.

So how should we document things? I'm a big fan of paper documentation, but as long as you have some sort of record you can refer back to, the form of documentation is really up to you. You should definitely include details about what you were trying to do, the resources you were using, whether you were working alone or with other people, how tired or hungry you were, and obviously what the outcome was. Get specific when you're recording the outcomes. If you're trying to get data from your Rails backend out of your Alt store into your React components, and it keeps telling you that you cannot dispatch in the middle of a dispatch, don't just write down "React is so stupid and I can do all of this with jQuery, so why is my boss torturing me?" Because that does not help. Trust me, I tried.

Now, look, the final step is to make use of all that data you've been diligently gathering. Imagine how powerful that data is as you go through it and start extracting patterns for when you do your best work and when you do your worst work. Instead of vaguely remembering that you struggled the last few times you tried to learn how to manipulate hashes in Ruby, you can actually see that you were only frustrated in two out of those three sessions, and the difference between the one where you felt good and the other two is that you were well rested for that one. Or maybe you notice that you learn more when you pair with someone, or when you have music playing, or when you've just eaten some amazing pineapple straight from Kona, Hawaii. On the flip side, you might discover that you don't learn well past 9 p.m., or that you're more likely to be frustrated when you're learning something new if you have not snuggled with the puppy for at least 20 minutes prior to opening your computer. And that is a very good thing to know, because it's a lot easier to identify the parts of the system that do and don't work for you when you actually have a paper trail and you're not guessing.
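If you want a concrete starting point for that kind of record, here's one hypothetical shape for a log entry. The fields mirror the list above; the Ruby-hash-to-YAML approach is just my suggestion, not something prescribed in the talk.

```ruby
require "yaml"

# One failure-log entry: what you tried, what you used, how you felt,
# and, very specifically, what actually happened.
entry = {
  "date"      => "2018-04-17",
  "goal"      => "Get data from the Rails backend into my React components",
  "resources" => ["Alt documentation", "a blog post on Flux stores"],
  "pairing"   => false,
  "state"     => "tired, skipped lunch",
  "outcome"   => "Error: cannot dispatch in the middle of a dispatch",
  "notes"     => "Seems to happen when one action handler fires another"
}

# Append to a running log you can mine for patterns later.
File.open("failure_log.yml", "a") { |f| f.write(entry.to_yaml) }
```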
And you're also gonna have a really nice log of all the concepts you're struggling with, which, if anyone in here has ever said, "Oh, I'd love to write a blog post, but I just don't have any idea what to write about": this log of all the things you're struggling with is your blog post source right there.

Now let's say you go back and read this data, and you see that in your last epic coding session you were trying to wire up the form for your Rate This Raccoon app, and it worked. Sort of. The data got where you were sending it, but it kind of ended up in the URL, which was a bit weird. Cool. You have a very well-defined problem to research, and it won't be long at all, after reading some form documentation, before you realize you were using the GET method on that form, and GET requests put the data in the URL. POST requests are the ones that keep it hidden in the request body. (There's a quick sketch of that fix at the end of this section.) So now you just need 20 minutes of puppy cuddle time and you're ready to go fix that form.

Now, I've been focusing today on how individuals can learn from failure, but the thing is, this is also incredibly important for teams. In the research on failure, there's a pretty famous study that looks at patient outcomes at demographically similar hospitals, and they found a very strange thing. At the hospitals that had nurse managers focused on creating a learning-oriented culture instead of a blame-oriented culture, there were actually higher rates of error, but they also had better patient outcomes. And the researchers were like, that's weird. Here we have hospitals where people are encouraged to be open about mistakes, and they make more mistakes, but patients are better for it. So they dug in, because that was not what they were expecting to find, and what they found was that people were simply more willing to report their mistakes, which meant the hospitals could find what was causing the mistakes and correct it, which meant patients had better outcomes. At the blame-oriented hospitals, people were afraid of losing their jobs over even tiny mistakes, and they would spend a lot of time covering them up.

And maybe some of you have been on programming teams where that's the situation: if you break production, everyone's gonna yell at you, and you have to wear a stupid hat, and people are gonna make fun of you. So you spend a lot of time like, oh crap, I just pushed a debugger up, all right, maybe I can do a hotfix before anyone finds out. And the underlying issues never get dealt with. If you show me a dev team that has a zero-tolerance policy for mistakes, I'll show you a dev team where engineers spend a good portion of their time covering up mistakes rather than writing good code. If you focus on blameless postmortems, rewarding experimentation, and just not being a dick to people, because humans make mistakes, you are gonna have very different outcomes, and probably more longevity and less turnover on your teams.

Now, look, like everything else you try, the process I'm proposing may not work perfectly for you the first time around. At the risk of going a tiny bit too meta: just figure out what about the process isn't working for you and see how you can adjust it. That's right, you can learn from the failures that you're learning from while you're trying to learn from your failures.
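Here's that Rate This Raccoon form fix as a minimal sketch, assuming a Rails-style ERB form (the app, URL, and field names are all hypothetical): switching the method from GET to POST moves the submitted data out of the URL and into the request body.

```erb
<!-- Before: GET submits the data as a query string in the URL,
     e.g. /raccoons/rate?name=Bandit&rating=5 -->
<%= form_with url: "/raccoons/rate", method: :get do |form| %>
  <%= form.text_field :name %>
  <%= form.number_field :rating %>
  <%= form.submit "Rate this raccoon" %>
<% end %>

<!-- After: POST keeps the data in the request body, out of the URL -->
<%= form_with url: "/raccoons/rate", method: :post do |form| %>
  <%= form.text_field :name %>
  <%= form.number_field :rating %>
  <%= form.submit "Rate this raccoon" %>
<% end %>
```

Worth noting: Rails form helpers like form_with default to POST, so in practice this bug most often comes from a hand-written HTML form tag with no method attribute, which defaults to GET in HTML.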
And as you get more comfortable gleaning info from failures, you're gonna find that every single bug is actually a feature, as long as you can learn from it. Even if you end up deleting every single line of code and starting over again. Thank you.

So, that's a great question: how do you communicate that failure is something to learn from, from the engineering side to the business side, which might have a different perspective? That's absolutely gonna be tough, because if you're at a corporation that sees a bug in production as the end of the world, and you might lose your job because of it, then that's gonna be a problem. It's gonna be very hard for engineers to be willing to fail publicly if they're gonna get punished, even if it's not by their managers. So part of it is that it's up to everyone to try to educate the other people on the team about how this is part of the engineering process. But obviously, your work is building things and fixing bugs, and maybe not fixing people, so it's unfortunate. I would say: if you're in management, working hard to help business people understand that is great. If not, then buffer your team as much as you can from being punished for mistakes while allowing them the space. So maybe it's: fail internally, be open about it, have blameless postmortems, but put a nice veneer on top for the C-level people that need to think that everything's perfect. And that's the best way to do things.

Sure, so the question is about the difference between an early-career developer making a mistake and being visible, and a late-career developer making a mistake and being visible. I actually had a similar question where someone said: as a woman, part of my career growth has been looking like I know ten times more than the next person applying for the job, and being open about failures could tank my career. What do you recommend? And I was like, I'm not an expert. I recommend you protect your fucking career. But I would say that I think it's great when late-career developers are open about failures, because that gives room for early-career developers to be open about theirs. I'm not embarrassed by not knowing something, so when I've been in code reviews or anything like that and they're like, oh, are there any questions, I'm always more than happy to raise my hand and say, I didn't understand that thing. For me it's not embarrassing not to know something, so I'm willing to take one for the team, and afterwards people will come up and say, oh, thank you so much for asking, I thought I was the only one that didn't know. So I think part of it is: if you're late in your career, and you're comfortable with where you are, and you're willing to show that it's okay to fail, then take that one for the team. If you feel secure in where you are in your career, take that one for the team, because all of us are going to be much better engineers by being willing to fail and learn from it than we will be by covering it up. And if you think it might cost you your job, cost you your livelihood, cost you the ability to go down the career path you want to, don't lose out on the value you get from at least acknowledging your mistakes internally. You could keep a private diary of what's gone wrong and how you've learned from it.
You don't have to publish it publicly, but I would say: don't let other people keep you from being the engineer that you wanna be. And if the way you get there is by trying stuff out, writing the code, breaking the code, and then going back and fixing the code, don't be afraid to do that. Absolutely don't let other people's weird ideas about everything needing to be perfect from the first key press on the keyboard keep you from learning. Ultimately, you do you. You be good engineers. Don't be afraid.

So the question is about being so cool with failure that people no longer worry about making mistakes; they're just like, whatever. One thing is, I think most programmers want to build stuff that works, so it's gonna be very rare that you find someone that's like, I don't care that it's not working for people, whatever, it's lunchtime. Most of us are like, my baby is broken, my code doesn't work. We put all that pressure on ourselves. In the research, at least in Western cultures, and I can't speak for other cultures, there's such a stigma on failure that it's very rare you'd find someone that's gone too far the other way. It's much more likely that you're gonna end up in a place where people are covering up mistakes and making things even worse because of it. I think the tell, if someone is getting a little too comfortable with mistakes, is if they keep making the same one over and over again. And that's not what this is about. This isn't about being like, oh, stuff breaks, whatever. It's about saying: when stuff breaks, what can I learn to do it better next time? So if you've gone too far on the laissez-faire side of things, where you're like, no, nothing's a problem, you just reel it back in: hey, I noticed that you've pushed a debugger up to production three times in the last week. Let's come up with a way that we can make sure that doesn't happen anymore. And yeah, so you just course correct there. Yes?

Yeah, so the question is about finding the balance between the extra time cost of documenting things versus the value you get when it works out well. At the risk of this being a cop-out, I'd say you're gonna kind of have to experiment and find what works best for you. I don't know if anyone in here has had the experience of learning something, writing a blog post or some notes so you don't forget it, and then six months later being very confused on a topic, Googling, and finding the blog post you wrote six months ago when you had the problem. I mean, what a love letter that is to yourself. It's like past you saying: Jessica, you were gonna struggle with this six months from now. I know how to explain it to you. Here you go. In those moments, the time you took to write that blog post means you either learned it better and were able to do it more fluently the next time, or you have this great resource, written by you, for you, for the next time you're confused about it. And I think in those instances, people really see the value in the documentation and the learning. Certainly, if you spend a couple of months and you're like, I write a lot of stuff down and nothing's getting better for me, then yeah, scale it back and find a different thing that works better.
Sure, so: concrete steps that senior developers can take to help junior developers document and learn from their failures. It's probably gonna vary by individual and vary by team. I think certainly writing tests is great. Without getting too pedantic about it, having at least some test coverage forces you, especially as a junior, to think about the system you're about to architect, and it forces you to have a safety net for when you make a change later and don't realize it's gonna break things. Tests are amazing. I'm also a big fan of having junior developers and senior developers alike write blog posts and documentation. In this room, I think most of us are Ruby and Rails developers, probably working with other open source libraries, and maybe we're not all at the point where we can sit down and write a gem or a framework that's going to help tens of thousands or millions of people. But we can write a blog post that says: I struggled with this, this is how I got around it, this is what really made it clear to me. That helps the person that's learning solidify what they've learned, and then it's just this beautiful gift for the rest of the community. I work at a school where we teach developers, and there've been plenty of times where students have Googled an issue and it's been resolved by a blog post that a student two, three, four semesters earlier wrote when they were going through that same issue. So it's both a great way for juniors to learn and a great gift to the community from people that can't necessarily give through writing code yet.

Oh, sorry, the question is whether I've experienced a situation where I thought covering up a mistake was better than admitting it. I mean, all the freaking time, because I don't like being yelled at. I have such an irrational fear of authority, which has zero to do with how my manager actually interacts with me, but every time he's like, oh, do you have a minute to talk, I'm like, oh my God, this is it, I'm being fired and everything's the worst. And he's just like, oh, I wanted to tell you you did a great job on that feature launch. And I'm like, so when is my last day? But the thing is, I've never actually found a situation where it would have been better for me to have hidden it. I've found situations where I was really scared, and the thing I thought was going to be bad, which is a manager yelling at me, actually did happen, because I worked at a crappy place where that's how they treated small mistakes, like a typo in a report that the client had said, oh, we never look at that because there's just too much data in there. And my manager's like, you made a typo, you don't care enough. And I was like, yeah, it's not the 18-hour workdays, it's my level of caring. So I think there are situations where it feels like the better option is to hide the mistake, but I've never actually seen one where there's long-term value in hiding it. And I may be privileged and lucky in that it's never killed my career. I'm sure being seen as someone who makes typos in reports certainly didn't help me at that ad agency, but I left there and learned to code, and my life is better, so screw those guys.

Anyone else? Awesome, you guys have been fantastic, thank you so much. Thank you.