Resilience In Complex Adaptive Systems: Operating At The Edge Of Failure
Systems seem to run at the very edge of failure much of the time. The combination of high workload, limited resources, pressure for additional features and capability, and inherent software, hardware, and network fragility is a noxious kettle of stuff always about to boil over in the form of outages, degraded response, or functional breakdowns. For insiders, the surprising thing about our systems is not that they fail so often but that they fail so rarely! This good performance in the face of adverse conditions is called resilience. An important conclusion from resilience studies is that resilience depends critically on human operators and their ability to anticipate threats, monitor the system, react to threats, and sacrifice some goals to protect others. This talk will introduce resilience and a model of system dynamics useful in analyzing failed and successful event management, and will offer an explanation for why our systems run at the edge of failure.