 Good morning everybody. Welcome to RailsConf. I'm glad we all made it here despite the weather. I'm curious, how many of you would say you have absolutely no idea how a nuclear reactor works? Raise your hand. Awesome, we're gonna clear that up today. So, when I was a kid, my parents gave me this four volume set of books called How Things Work. My dad is a mechanical engineer by training and he had great patience with all my questions about how various complicated things in the world worked. I think he got tired of answering those questions all the time and so he bought me this set of books in part so that I could find those answers on my own but in part to continue inspiring that curiosity in me. I don't have a chance to look at them very often these days thanks to the wonders of the internet and Wikipedia but they still have a treasured place on my bookshelf because they're a big part of why I am who I am and why I'm curious about the things I'm curious about. I distinctly remember when it was all over the news in 1990 that Comanche Peak Unit One came online outside of Dallas. It was the first nuclear power plant that I could remember in my lifetime that had come online and I remember turning to here, page 68 and 69 of volume one of How Things Work to try to understand how it was that a nuclear reactor made electricity. I think that's a good place for us to start today as well with one of the reactor diagrams at the bottom of the page. As it turns out, the basic mechanics of a nuclear power plant are very similar to any other power plant. You have a heat source. In this case, it's a carefully controlled nuclear chain reaction fueled by uranium but in the case of a combustion power plant it would be natural gas or coal burning. High pressure water circulating through the reactor core carries the heat to a steam generator where it's used to boil water converting it to steam. 
That steam is used to turn a turbine, basically a giant fan in a tube, and that turbine turns a generator, which is where the electricity comes from. The steam is then piped into a condenser where it's cooled and turned back into water, and it makes another trip through the steam generator. There are two primary kinds of nuclear reactors in operation in the United States: the boiling water reactor and the pressurized water reactor. This is a pressurized water reactor, because that's the kind that was in operation at Three Mile Island. The components that I just walked you through are on two cooling loops, the primary loop in orange and the secondary loop in blue. The primary loop consists of the water that flows through the reactor core gathering heat and then through the steam generator, boiling water in the secondary loop. The water in the secondary loop boils to steam, and the expansion of water to steam drives the turbine to create the electricity. And the interesting thing is that the water in these two loops never combines; they are completely isolated from one another. So what makes it pressurized? Well, in a boiling water reactor you have a much larger reactor pressure vessel. And the reason for this is that water actually boils in the core of a boiling water reactor. So you have to allow room in the pressure vessel for that phase change to occur. In a pressurized water reactor, the primary coolant loop is held at about 2,000 PSI. And the effect of that is that the water in the primary coolant loop, because it's at such high pressure, won't boil even at the plant's operating temperature of 600 degrees Fahrenheit. Or at least it's not supposed to boil. And that brings us to March 28th, 1979. Three Mile Island Nuclear Generating Station is a two-unit nuclear power plant in Londonderry Township, Pennsylvania. It's about 200 miles from where we're sitting right now. 
It's built on a three-mile-long sandbar in the middle of the Susquehanna River, about 10 miles south of the capital of Pennsylvania, Harrisburg. Unit Two is a 906 megawatt pressurized water reactor designed by Babcock and Wilcox. It went into commercial operation on December 30th, 1978. And early on the morning of March 28th, 1979, it's running at 97% of capacity. And it has been for the three months since it came online. In the slang of the industry, this reactor is running hot, straight, and normal. These men are the ones who are at the controls of Three Mile Island Unit Two for the overnight shift on March 28th. Bill Zewe is the shift supervisor for Units One and Two. He's the most senior person on site. Fred Scheimann is the shift foreman for Unit Two. He's Bill Zewe's second in command. Ed Frederick and Craig Faust are the control room operators on duty. They're the ones that are actually sitting at the controls of the reactor this night. And everything at the plant is running perfectly normally this night, except for a small problem in the condensate polishers that the previous shift hadn't been able to solve. And what are the condensate polishers? They're a set of eight filtration tanks that filter the water coming out of the condenser before it goes back into the expensive and delicate steam generators. Now these, I should note, are not the actual steam generators or the actual condensate polishers from Three Mile Island. As you can imagine, it's pretty difficult to find a picture of a specific component of a specific nuclear power plant. But this is what condensate polishers look like. Now these tanks are filled with sticky resin beads that absorb everything but water. So any flecks of rust or dirt that happen to be circulating in the coolant water will stick to the resin beads. The problem is these tanks like to clog, and they have to be periodically backwashed, much like a pool filter if you know how a swimming pool works. 
At Three Mile Island, the backwash system wasn't quite powerful enough to do the job it was intended to do. And so the swing shift the night before, faced with a clogged polisher tank, had turned on a secondary system using pressurized air to try to break the clog in the number seven condensate polisher tank. And so at 3:59 in the morning, Fred Scheimann is down in the basement of the turbine hall. He's climbed up on a ladder, looking in the viewing port of the number seven condensate polisher tank to see if they're making any progress on this clog, and everything gets incredibly silent. As you can imagine, thousands of tons of water pushing through pipes makes a bunch of noise. So the silence was disconcerting. Shortly after that there was a rumble, and Fred Scheimann barely jumped free as a water hammer came through and knocked the feed lines free of all eight of these condensate polisher tanks. Now what had happened is that over the 10 hours or so since the swing shift had started using the pressurized air system to try to break this clog, a leaky one-way check valve had allowed water to force its way from the condensate polisher tank into the air supply line. And at 3:59 in the morning, that water finally made its way to the manifold that fed the pneumatic control valves for these condensate polisher tanks. All eight valves closed simultaneously. Obviously this is not good. But to help us understand why, here's a schematic of Three Mile Island Unit Two. Now it looks a little bit more complicated than the diagram we were looking at a minute ago, but it's the same thing; all the same components are here. I've colored the primary loop in orange and the secondary loop in blue. Let me run through the components real quick. Here in the center is the reactor core, where the nuclear chain reaction takes place and generates heat. Next to it are the two steam generators, where water from the primary loop boils the water in the secondary loop to create steam. 
Over here in the turbine building are the turbine and generator, where the electricity is actually generated. Below them is the condenser, where the water is turned from steam back into liquid water. And right below that is the condensate polisher. And it's completely blocked. And what that means is that there is no water to be pumped through the secondary cooling loop. And so the main feedwater pumps trip offline. It's 36 seconds past four in the morning: the official start of the accident at Three Mile Island. Two seconds after the main feedwater pumps trip, the turbine senses it's not gonna be getting any more steam. So the turbine and the generator both trip offline as well. And the plant's main safeties open, venting all the remaining steam in the plant out into the early morning sky. Now this steam is not radioactive, it's completely safe, but it makes a noise that can be heard from miles away. It's a tremendous amount of steam. In the control room, Ed Frederick and Craig Faust are getting their first indications that something has gone wrong. An alarm horn announcing the turbine trip starts going off, and several alarm indicators start to flash. A few seconds after the turbine and generator alarms go off, the pressure in the reactor vessel starts to climb rapidly. Now this pressure spike is expected. Without the secondary loop to remove heat, the primary loop is heating up, and as water heats up, it expands. The good news is the plant is designed for exactly this thing to happen, and it's taking action to resolve the situation automatically as soon as the alarms go off. The reactor's pressure control system is the first thing that kicks into action. There are two components to the system, and they're both important to the accident. The first is the pressurizer. The pressurizer does three jobs at Three Mile Island. The first is that it regulates system pressure. 
Because the primary coolant loop is a sealed system, changing pressure in the pressurizer changes the pressure of the entire system. It works essentially like a piston. There's a steam bubble at the top and water at the bottom, and as the water expands, it increases the pressure on the rest of the system. The second thing the pressurizer does is allow the operating crew to measure the water level. When Babcock and Wilcox designed this reactor, they designed it without any water level instrumentation in the reactor core. They did that to save money, and they could do it because the pressurizer is the highest point in the system, and so if there's water in the pressurizer, you can infer that there's water in the primary coolant loop. The third thing it does is absorb pressure shocks. Steam is significantly more compressible than liquid water, and so that steam bubble at the top of the pressurizer is able to absorb any quick pressure spikes that happen in the system, much like the one that's happening right now. So the steam absorbs the initial shock, but the pressurizer itself is really only designed for small pressure adjustments. It's not designed to handle situations like this, where the reactor is more than 100 PSI over its standard operating pressure. It would take the pressurizer a few minutes to make that large of an adjustment. So what does the system do about big pressure changes like this one? That's where the pilot operated relief valve comes in, and if you've heard anything about the Three Mile Island accident before, you've probably heard of this component. This is the one that gets all the press. In the event of a big pressure spike, the pilot operated relief valve will open and release coolant into a drain tank on the floor of the containment building. And by releasing a certain amount of coolant, it lowers the pressure of the primary coolant loop. So the pilot operated relief valve opens four seconds after the turbine and generator trip offline. 
A few seconds later, the reactor computer senses that the reactor pressure is still continuing to rise even with the pilot operated relief valve open. And so it takes another defensive action. It scrams the reactor. Now the chain reaction that's taking place in the core of a nuclear reactor consists of uranium atoms and neutrons flying around. And these free neutrons like to bond with uranium nuclei. And when they do, the uranium nucleus splits, or fissions. When it does that, it releases a tremendous amount of heat energy along with two other free neutrons. And those two free neutrons that it releases go and bond with other uranium nuclei, and that continues the chain reaction. So the primary means of controlling this chain reaction is a set of cadmium rods that can be inserted into the core of the nuclear reactor. Now normally they're raised and lowered smoothly to make small adjustments in power. But in the event of a scram, the control mechanism that raises and lowers them actually releases the rods and lets them free fall into the reactor core. This happens in about three seconds. And it shuts down the chain reaction instantly. The problem is that shutting down the chain reaction doesn't entirely stop the production of heat in the reactor core. Immediately after a scram, the reactor is still producing about six and a half percent of the heat it was producing before the scram, in the form of decay heat. Now over the first hour after the scram that'll decrease to one and a half percent. But for that entire first hour, the reactor is still generating plenty of heat to damage the core if it's not carried away. So it's essential to continue cooling the reactor core in the hour after a scram. A few seconds later, back in the control room, a light on the console turns from red to green to indicate to the staff that the pilot operated relief valve has been signaled to close. So all the defensive actions that the reactor has taken have worked. 
The pressure spike has been contained and everything's back to normal. At this point, everything feels very much under control to the operating staff at Three Mile Island. And that sense of control would last them about two minutes. Because two minutes later, the world is thrown into chaos when the automatic emergency core cooling system kicks in, specifically the high pressure injection system, which dumps about a thousand gallons of water per minute into the core of the reactor. This was unexpected and very confusing to Bill Zewe and his crew. The plant had gone from a state that they understood very well to one that they had never encountered before as soon as high pressure injection kicked in. The reason this confused them so much is that they were watching the water level in the pressurizer. And the water level in the pressurizer was continuing to rise. Seeing the water level in the pressurizer rise told them that there was plenty of water in the system. So why did the system think it needed more water? And so Fred Scheimann made the call to turn off high pressure injection after it had only been running for about two and a half minutes. Had he not done this, the accident would likely have been a minor inconvenience. The plant would have been back online later that week. But he didn't leave it running. He turned it off. We're now five minutes into the accident. And there's something at this point that's perplexing Bill Zewe. You can see him in the polka dot shirt, bent over the table. The water level in the pressurizer is continuing to rise. So obviously the primary loop has plenty of water in it, but the pressure of the primary loop is continuing to drop. Now this is a problem, because if the pressure drops too far, the water will start to boil. And if the water starts to boil, it won't be able to carry heat away from the reactor core efficiently. He has a hunch about what's going on. He suspects that the pilot operated relief valve might be stuck open. 
And that's why the system's having trouble maintaining pressure. So he double checks the indicator on the control panel, and it shows that the pilot operated relief valve is closed. And then, just to be sure, he checks the temperature of the outlet pipe of the pilot operated relief valve. That reading comes back at 228 degrees Fahrenheit. And so Bill Zewe moves on. There's a problem with that decision though. The plant's operations manual indicates that any temperature over 200 degrees Fahrenheit on the outlet pipe indicates that the pilot operated relief valve is open. And the procedure for dealing with that is to shut the manual block valve ahead of the pilot operated relief valve. Had Bill Zewe closed the block valve, he would have stopped the incident in its tracks. But he doesn't. He leaves it open. We're now six minutes into the accident. Five minutes later, at 4:11 in the morning, another alarm goes off. This one is for the sump in the reactor containment building. Now the sump is a big pit at the bottom of the containment building. It's meant to catch any water that happens to leak or is vented from anywhere in the system so that it can be safely contained in case it's radioactive. Now what's happening is that so much water has been released from the stuck-open pilot operated relief valve that it's overflowed the drain tank on the floor of the containment building and filled up the sump. Now enough water in the sump to fill it up is a very clear indication that there is a leak in this plant, but the crew miss it. They don't catch what's going on. The core is in serious trouble at this point, but the operators aren't done. Just after 5 a.m. the floor of the control room starts to rumble. It's subtle at first, but it quickly becomes impossible to ignore. What's happening is that the primary coolant pumps of the reactor are really designed only to pump liquid, and steam bubbles have started to form in the core. 
And so as these pumps start pumping steam in addition to the water they were designed to pump, they start vibrating, subtly at first, but it gets worse and worse. And the operators know what their training says to do when this happens. These pumps are very large and very expensive, and in order to keep them from tearing themselves apart and creating a coolant leak when they start vibrating, you're supposed to shut them off. And they hold off on this as long as they can, but finally, 15 minutes in, Bill Zewe cannot stand it any longer. And so they shut off the first set of pumps. This helps for a little bit, but 30 minutes later the vibration has returned and they shut off the second set of pumps. It's less than two hours since Three Mile Island Unit Two was running at nearly full capacity, and it now has no coolant circulating through its core. It doesn't take very long for the effects of no circulation to make themselves known. At six a.m., precisely two hours into the accident, a radiation alarm like this one in the containment building goes off. And that tells us a couple of things. The uranium fuel in Three Mile Island was contained in fuel rods like these. They're sealed. This allows the water to circulate around them in the reactor core but not to absorb any of the uranium, so the water itself doesn't become radioactive. Well, a radiation alarm going off tells us that at least one of these fuel rods is damaged. So the water is able to get to the fuel somehow, and there's fuel leaking into the water. And if one of these fuel rods is damaged, the other thing that tells us is that the water level in the core is now below the top of the nuclear fuel. Enough water has boiled off that the core is now exposed. By this point plant leadership has started to make its way into the plant. Gary Miller is the station manager. He is the chief executive of Three Mile Island. This is his plant. George Kunder is the technical support manager for Unit Two. 
He's the one that's in charge of all the technical personnel: the nuclear physicists, the health physicists, the chemists, et cetera. Almost as soon as they walk in the door, they join a conference call with Leland Rogers, the site representative for Babcock and Wilcox, the designers of the reactor. As they're talking through what they know of the plant, Leland Rogers says, "They closed the block valve, right?" The block valve: the valve that Bill Zewe decided not to close earlier. And so George Kunder yells to someone in the control room and asks, "Is the block valve shut?" And you can hear some commotion in the background of the call, and a couple of seconds later the answer comes back: yeah, it's shut. And so at 6:22 in the morning, in response to a question from Leland Rogers, the crew finally closes the block valve, sealing the leak at Three Mile Island Unit Two and stopping it from losing further coolant. Now this would have been a great decision 20 minutes into the accident, but doing it now actually makes things worse, because at this point the only way this reactor has of cooling itself is by boiling coolant off out through the pilot operated relief valve. By closing the block valve, the system is now sealed and all of the heat contained in the system has nowhere to go. With the block valve closed, the heat in the core intensifies rapidly, and it takes about eight minutes for the top of the core to begin to collapse. Subsequent calculations would show that by seven in the morning the core is two thirds uncovered and temperatures in the hottest part of the core are around 4,000 degrees Fahrenheit. Hot enough to melt not only the cladding around the fuel but the uranium itself. At 7:20 in the morning the radiation alarm in the dome of the containment building goes off, and it indicates a reading of 800 rem per hour. 
To give you some context for that, if one of the operators of Three Mile Island had been standing in an 800 rem per hour radiation field, they would get their maximum legal yearly dosage of radiation in about 20 seconds. So this is a big radiation field. The crew had largely been in denial about core damage up to this point, but this alarm finally shakes them to their senses. They finally understand what's going on with the plant at this point. Immediately after the alarm they try to turn the high pressure injection pumps back on, but they turn them off after about 18 minutes because they're filling up the pressurizer and the crew are still worried about having too much water in the system. It isn't until 8:26 in the morning, after the situation continues to worsen, that they finally re-enable high pressure injection, largely out of desperation. They're not sure what else to try at this point. It would take until 10:30, more than two hours, for high pressure injection to finally fill the pressure vessel back up and get the core covered with water, ending the initial accident sequence. Over the next few days there would be continued worry about a nuclear release at the plant, and so they'd keep monitoring the situation on the ground and they'd keep flying helicopters over the plant with radiation detection equipment on board, but the redundant containment built into the plant did its job. There was never a radiation release from Three Mile Island. There would be public worry about a hydrogen explosion from the hydrogen that was released as the fuel cladding melted, but that turned out to be unfounded as well. On Sunday, April 1st, four days after the accident, President Jimmy Carter and his wife Rosalynn would visit the plant to reassure the American public about the safety of nuclear power and that the situation at the plant was under control. 
He would later convene an investigatory commission that would generate this report on the accident, and that's where I've drawn a lot of the facts for the story. Three Mile Island Unit Two would be written off as a total loss, less than three months after it went online. About 20 tons of melted uranium ended up in the bottom of the core. Another 10 tons ended up suspended in the middle of the core, and this is what they found when they began the cleanup in 1983. You're looking at severed, melted fuel rods that ended up at the bottom of the reactor vessel. The final cost of that initial cleanup was just over $1 billion, billion with a B, and it took 14 years, and they're still not done. This is a picture of Three Mile Island today, and you can see Unit One on the right, still puffing out billowy steam clouds. It's still operating, still generating electricity. Unit Two is the one on the left. It will finally be decommissioned and dismantled when Unit One is decommissioned and dismantled, currently scheduled for 2034. So what happened? How did these four men miss so many signs that their reactor was in the midst of a loss of coolant accident? The accident that reactor operators trained for. Why didn't they just leave the emergency cooling system enabled? Why didn't they close the block valve sooner? Did they not know what they were doing? Maybe we're looking at this the wrong way. Sidney Dekker's wonderful book, The Field Guide to Understanding "Human Error", is an in-depth guide to investigating and understanding what happened when things go wrong. In it, he introduces the concept of first stories and second stories. The story I've just told you is very much a first story of Three Mile Island. First stories focus on the humans in a story and what they should have done differently. A first story almost always lays the blame for an accident at the feet of the humans involved and the decisions that they made. 
They're called first stories because they're the first story that comes to mind. They're easy to find. There are a couple of problems with this though, in the form of biases that we all have. The first is hindsight bias. This is the phenomenon where, when you review an event after it occurred and you know the outcome, you exaggerate your ability to have predicted and prevented the outcome. This is just something that your brain does automatically without you even knowing it. It's sometimes referred to as the "I knew it all along" effect. An example of that here is: well, all that water in the sump had to be coming from somewhere. I don't know anything about nuclear reactors and I would have figured out that there was a leak there. The second is outcome bias. Once you know the outcome of a situation, you carry the full weight of that outcome into evaluating every decision that led to it. It makes you more willing to judge those decisions and more likely to judge those decisions harshly. A good example here is that turning off the emergency core cooling system early in the accident is obviously a stupid decision when you know that the outcome is a partial meltdown. So what should we do instead? We should look for the second story. In a second story, human error is seen as an effect of systemic vulnerabilities deeper within the organization, not a result of bad decision making or failure to follow instructions. So how do we get there? We have to dig into decisions from the perspective of the people that made those decisions. We have to work to consider the messy reality that they were faced with, not the clean room conditions that our mind gives us in hindsight. And we have to look through the lens of positive intent, with the belief that everyone involved made the best decisions they could with the information they had. So let's see if we can find some second stories from Three Mile Island. Let's start early in the accident sequence. 
Why did Fred Scheimann make the call to turn off the emergency core cooling system five minutes into the accident? We'll find our answer in the pressurizer. In his deposition to the presidential inquiry, Fred Scheimann says that he turned off the emergency core cooling system because it was causing the water level in the pressurizer to rise and he was afraid that it was going to, quote, go solid, unquote. Now what does that mean? Well, remember that one of the jobs of the pressurizer is to absorb pressure shocks in the system. And steam is significantly more compressible than liquid water. And so letting it go solid is to allow the pressurizer to fill with water and eliminate the steam bubble at the top, which removes the shock absorption capability of the pressurizer. Okay, so he didn't want the pressurizer to go solid. Why does his concern for the pressurizer overcome his concern for what's happening in the core? Well, to find that answer, we have to go all the way back to Admiral Hyman Rickover and the nuclear navy. Because it turns out Bill Zewe, Fred Scheimann, Ed Frederick and Craig Faust were all former naval nuclear reactor operators. And the naval reactor training created by Admiral Rickover had drilled into these men that keeping the pressurizer from going solid was the single most important thing for a reactor operator to focus on. And the reason for that is that a 1960s era submarine reactor produced 12 megawatts of thermal energy. Three Mile Island Unit Two, in order to produce its 906 megawatts of electricity, has to produce 2,841 megawatts of thermal energy because of losses and inefficiencies in the system. This is pretty common for nuclear reactors. They're all about this ratio. When you scram a reactor, the primary heat production stops immediately, but like we talked about, there's still decay heat that you have to cool. In a submarine reactor, that decay heat is trivial. It works out to about 780 kilowatts. You can essentially ignore it. 
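The decay heat figures in this part of the talk can be checked with a few lines of arithmetic. This is only a rough sketch in Python, assuming the numbers quoted in the talk: about six and a half percent of pre-scram heat output immediately after a scram, about one and a half percent an hour later, and each reactor running at the full rated thermal power given above. The constant and function names are made up for illustration.

```python
# Rough check of the decay heat numbers quoted in the talk.
# All inputs are the talk's figures; the names are illustrative only.

DECAY_FRACTION_AT_SCRAM = 0.065  # ~6.5% of pre-scram heat right after a scram
DECAY_FRACTION_AFTER_1H = 0.015  # ~1.5% about an hour later

SUBMARINE_THERMAL_MW = 12.0      # 1960s era submarine reactor
TMI2_THERMAL_MW = 2841.0         # Three Mile Island Unit Two, thermal
TMI2_ELECTRIC_MW = 906.0         # Three Mile Island Unit Two, electrical


def decay_heat_mw(thermal_mw: float, fraction: float) -> float:
    """Decay heat in megawatts for a reactor that was producing thermal_mw."""
    return thermal_mw * fraction


def thermal_efficiency(electric_mw: float, thermal_mw: float) -> float:
    """Fraction of thermal output that ends up as electricity."""
    return electric_mw / thermal_mw


if __name__ == "__main__":
    sub_kw = decay_heat_mw(SUBMARINE_THERMAL_MW, DECAY_FRACTION_AT_SCRAM) * 1000
    tmi_mw = decay_heat_mw(TMI2_THERMAL_MW, DECAY_FRACTION_AT_SCRAM)
    tmi_1h_mw = decay_heat_mw(TMI2_THERMAL_MW, DECAY_FRACTION_AFTER_1H)
    efficiency = thermal_efficiency(TMI2_ELECTRIC_MW, TMI2_THERMAL_MW)

    print(f"Submarine decay heat at scram: {sub_kw:.0f} kW")
    print(f"TMI-2 decay heat at scram:     {tmi_mw:.0f} MW")
    print(f"TMI-2 decay heat after 1 hour: {tmi_1h_mw:.0f} MW")
    print(f"TMI-2 thermal efficiency:      {efficiency:.1%}")
```

Under those assumptions the submarine reactor comes out at about 780 kilowatts and Unit Two at roughly 185 megawatts, a difference of more than two hundred times, which is why a habit that was safe in the navy was dangerous here.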
That little heat isn't gonna do any damage to a submarine's reactor core even if you lose circulation completely. Three Mile Island Unit Two, immediately after the scram, is still producing 185 megawatts of heat. That's plenty of heat to melt the nuclear fuel. In a submarine, a water hammer with no shock absorption is literally the worst case scenario because it could strand your boat. In a power reactor, far worse things can happen. So carrying that mentality into the operation of a power reactor is a very dangerous mindset. And it's something that was unrealized before the accident at Three Mile Island. So Fred Scheimann, faced with a rising pressurizer, inferred that the system was already full of water and that allowing the emergency core cooling system to continue injecting water into the system would overfill it, risking a full pressurizer. And so in an effort to keep the reactor safe, he turned emergency core cooling off. Let's look at another decision. Why did Bill Zewe not close the block valve when he checked the outlet temperature of the pilot operated relief valve? You'll remember the outlet temperature was 228 degrees Fahrenheit, and the operations manual for the plant called for the block valve to be shut for any reading over 200 degrees Fahrenheit. It turns out that at Three Mile Island Unit Two, the pilot operated relief valve had been leaking ever so slightly since the plant went into service. Now this was considered a minor problem and something they weren't going to address until the next refueling shutdown. And so it was something that the operators had just learned to work around. They had to constantly adjust the pressure of the primary loop ever so slightly, but it wasn't a big deal. But the consequence of this is that the outlet temperature of the pilot operated relief valve was regularly over 200 degrees Fahrenheit. And so they had completely desensitized themselves to this rule that they should close the block valve. 
Bill Zewe saw the 228 degree reading and he thought about the fact that the pilot operated relief valve had just been venting scalding hot water. That seemed like a perfectly reasonable temperature to him, and so he didn't shut the block valve. But there was another factor at play as well. He also had confidence that the indicator on his control panel told him that the pilot operated relief valve was closed. What Bill Zewe didn't know is that the indicator on his control panel was merely indicating the signals that had been sent from the computer to the pilot operated relief valve, not the status of the valve itself. When the light turned red, it showed that the computer had told the valve to open. When it turned green, it said that the computer had told the valve to close. But the only way to know the actual position of that valve was to infer it from the temperature of the outlet. There was no direct way to know if it was actually open or closed. And so Bill Zewe, assimilating all the information he had at his disposal and considering the full pressurizer, left the block valve open so that in case the pressurizer did go solid, the pilot operated relief valve could still respond if there was a pressure spike. He left the block valve open in an effort to keep the plant safe. Let's do another one real quick. Why did the crew not respond when the sump alarm went off? How did they not know from a full sump that they had a coolant leak? The answer to that one is really simple. They missed the alarm. The control room relays alarms to them in two ways. First, you've got this series of alarm lights around the perimeter of the room. There's this set, and then there's a matching set on the other side of the room. But there are a few problems with them. First, they're really noisy. There are 600 lights in total. And when the plant's operating perfectly normally, when everything's running as it should be, between 40 and 50 of these lights are illuminated. 
So there's constant background noise of having 50 lights on the alarm panel always lit up. Second, there's no rhyme or reason to their placement. One of the most important alarm lights, the one for reactor coolant pressure, is right next to an alarm that indicates a stuck elevator in the containment building. And third, they don't indicate any chronology. You can't tell from looking at them when they went off or what's new since the last time you looked at them. They did have an answer for that though in the alarm printer. Every time an alarm goes off, it's sent to this printer so that there's a log of the alarms. The only problem is it's connected to the computer with a 300 baud serial connection. Very, very slow. And so less than an hour into the accident, there's more than 100 alarm lights illuminated and it would take the printer more than two and a half hours to get caught up with all the events that had occurred. There's no way the operators can prioritize the flood of information coming at them. And so they miss the sump alarm. So how do we implement this idea of first and second stories with our teams? Dr. Dekker has some helpful advice for us and the first piece is actually hidden here on the cover of the book. You'll notice the scare quotes around human error. And the reason for those is that human error is never the cause. When we're trying to figure out why something went wrong, we agree on a baseline rule that human error is not the cause. Human error is always a symptom of some underlying systemic problem or problems. So blaming an issue on human error does nothing but keep us from figuring out what actually went wrong. A good way of helping frame the conversation in these terms is to ask what is responsible for an outcome, not whose fault it is. Second, understand why it made sense. The people that you work with don't come to work intending to do a bad job.
Chances are when they make a decision that you don't understand or agree with, or they miss something obvious, there's a good reason they did what they did. Take the time to understand why it made sense to them, because if it made sense to them, it'll make sense to somebody else later too. So the only way to prevent it from happening again is to understand why it happened in the first place, why they made that decision. Third, seek forward accountability, not backwards. Our instinct when things go wrong is often to find who is responsible and punish them. When we try to move our organizations away from blame and towards finding second stories, one of the most common objections is, but what about holding people accountable? Well, it turns out that removing punishment actually frees people up to candidly share their stories of what happened so that you can learn from them instead of them being tempted to sweep them under the rug to avoid punishment. This is where we get the idea of blameless postmortems. I'm sure a lot of you are familiar with that idea. But also, the act of telling the story of what happened and giving their account and owning their part in it is often the only accountability well-meaning people need. Any punishment you might dole out doesn't actually help reinforce the lesson that you're trying to teach. It's far better to give people the opportunity to tell their story, to understand what happened. Backwards accountability looks to blame someone for past events. Forward accountability seeks to help people focus on the work necessary for change and improvement going forward. The beauty of this technique is that it's so broadly applicable. There's always a second story if you're willing to do the work to look for it. It works when someone drops the production database, when the team misses an important deadline, when a key team member chooses to leave, or even when sales misses their quarterly target.
There's always a second story to be found that will help you understand the situation more deeply. It requires honesty and building trust, but it's worth it because finding the second story is such a powerful way for your team to grow and improve. And it allows you to treat your teammates with the humanity and the dignity that they deserve. It turns out that "who destroyed Three Mile Island" isn't even a fair question. "What destroyed Three Mile Island" is a much more helpful question for us to ask. And thankfully that's the question that the President's Commission asked. Check out the subtitle of their report. This report is full of second stories. And those second stories revealed weaknesses in reactor design and operator training in the nuclear industry throughout the world. By getting past human error to the real causes, the President's Commission made the world a tangibly safer place. If you take the time to find second stories for everything, not just when you have an outage, you'll make your organization a safer place for the people who work there to do their best work. And you'll fix the things that impact your delivery speed and delivery quality. Best of luck.