I use R to measure information security risk in a way that takes available data into account and systematically reduces bias in estimates obtained from subject matter experts. The resulting risk reports suggest which data breach mitigation strategies are most effective, as well as how much money is reasonable to spend on them. In other words, I try to answer the question on this slide: I have a new technology, what's the risk?

The industry standard method for performing risk assessments in the absence of appropriate data is to list the risk scenarios you're trying to avoid and then assign each of those scenarios a score representing the probability of the bad thing happening, as well as a score representing the potential impact of the bad thing happening. The probability score is then multiplied by the impact score, resulting in a risk score. Shown in this slide is an example worksheet where risk scenarios are on the left and probability and impact scores are on the right. Risk analysts then sort the scenarios by risk score and suggest mitigating them from highest to lowest score. Sometimes they create what is called a risk matrix to visualize where their current risk is versus where it will be once you do things like install antivirus or enforce strong passwords. Shown in this slide is such a risk matrix, where risk scores are translated into words like high, medium, and low. Also in the slide is someone shrugging. This is to illustrate the very dubious nature of what I've just described. Most people watching this talk know enough about math, science, and logic to realize that these industry standard practices are bogus. This slide shows the risk matrix from the previous slide being dumped into a trash can under the heading "not real risk assessment."

When performing a real risk assessment, we collect ranges instead of scores, so that experts can include in our models their degree of uncertainty as well as any variability they can characterize.
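The contrast can be sketched in a few lines of R. Assuming illustrative numbers that are not from the talk — say an expert estimates a 2%–15% annual probability of a breach and a $50k–$500k impact if it happens — simulating over those ranges yields a loss distribution rather than a single opaque score:

```r
set.seed(42)
n <- 10000  # simulated years

# Expert's range estimates (hypothetical values for illustration)
prob   <- runif(n, min = 0.02, max = 0.15)  # annual probability of occurrence
impact <- runif(n, min = 5e4,  max = 5e5)   # dollar impact if it occurs

# In each simulated year, did the event occur? If so, record the loss.
loss <- ifelse(runif(n) < prob, impact, 0)

# Summarize as quantiles of annual loss instead of a probability-times-impact score
quantile(loss, c(0.50, 0.95))
```

The uniform draws here are a placeholder; the talk's method refines the sampling distribution later with rpert.R.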
This slide shows an arrow pointing away from the bogus ordinal scoring methods and toward an example of the more appropriate range estimates. We can use these ranges collected from the experts to simulate data sets that become an approximation of our risk reality. Then we can integrate new information as it becomes available and update the models as the world changes.

This slide shows the collection of the estimates, and in the same tradition, each row of the spreadsheet represents a risk scenario. On the left side of the spreadsheet are the risk scenarios; on the right side are the probability and impact estimates. But in this case, critically, the probability estimates are fractions and the impact estimates are dollar amounts. Because the method translates risk perceptions into data sets, you can use the vast and ever-expanding universe of R to further reduce expert bias, to integrate more complex data and more sophisticated analyses, and even to automate the process.

I'm working on an R package called unsure that facilitates the whole process as well as generates this dashboard. This slide is the dashboard I provide to stakeholders who ask for risk assessment reports. On the left is an executive summary of the risk. In the middle is a ggplot illustrating the relative risk of different scenarios. Across the top are tabs for more visualizations as well as commentary explaining each visualization.

At its core, the unsure package consists of three functions and a flexdashboard R Markdown template. The first function, called combos.R, generates risk scenarios based on the pieces of technology that stakeholders provide. For example, you might say your project involves three laptops, a server, and a backup server. The combos script then takes those, combines them with threat words like "hacker" or "malware," and that creates the scenarios we're trying to avoid or to mitigate loss from.
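Text manipulation along these lines would do the scenario-generation job described above — the asset names, threat words, and "compromises" phrasing here are my own illustrative choices, not the actual contents of combos.R:

```r
# Stakeholder-provided pieces of technology (illustrative)
assets  <- c("laptop 1", "laptop 2", "laptop 3", "server", "backup server")
# Threat words to cross with each asset
threats <- c("hacker", "malware")

# Every threat/asset combination becomes a risk scenario
grid      <- expand.grid(threat = threats, asset = assets,
                         stringsAsFactors = FALSE)
scenarios <- paste(grid$threat, "compromises", grid$asset)

head(scenarios)
```

With five assets and two threats this produces ten scenario strings, one per row of the estimation spreadsheet.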
This slide shows a code snippet from combos.R, and it's really just doing text manipulation. The second function is called montecarlo.R. It is a loop that takes your probability and impact estimates and outputs a simulated data set. This slide shows the montecarlo function code. It is just a loop that generates random numbers within the estimate ranges provided by the experts in the spreadsheet shown previously. The third function, called rpert.R, is the default mathematical distribution function that further refines the random number generation to yield numbers within the ranges provided by the expert while following the beta-PERT distribution. Finally, the unsure package includes an R Markdown template that, when rendered with flexdashboard, provides an informative summary of the risk assessment findings.
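A beta-PERT sampler of the kind described can be written in a few lines of base R. This is my own sketch of the standard beta-PERT formulation (with the usual shape parameter λ = 4), not the package's actual rpert.R:

```r
# Beta-PERT: rescale a beta distribution onto [min, max], with the expert's
# most-likely value (mode) controlling where the probability mass concentrates.
rpert <- function(n, min, mode, max, lambda = 4) {
  range <- max - min
  alpha <- 1 + lambda * (mode - min) / range
  beta  <- 1 + lambda * (max - mode) / range
  min + range * rbeta(n, alpha, beta)
}

set.seed(1)
# e.g. an expert's probability estimate: at least 2%, most likely 5%, at most 15%
draws <- rpert(10000, min = 0.02, mode = 0.05, max = 0.15)
summary(draws)
```

Unlike a uniform draw, this concentrates samples near the expert's most-likely value while still respecting the stated bounds, which is why it makes a sensible default for the simulation loop.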