Hi, I'm Carlos Cinelli from UCLA, and this is a talk recorded for useR! 2020. In this talk, I'll present the R package sensemakr, which implements a suite of sensitivity analysis tools for ordinary least squares. But before going into the details of the package, I would like to motivate the importance of sensitivity analysis with a well-known real example: the debate on cigarette smoking and lung cancer that took place in the late 1950s and early 1960s. Back then, observational studies found a strong association between smoking and cancer: smokers had nine times the risk of non-smokers of developing lung cancer. This naturally leads to the question: is this association causal? Well, not everyone agreed with this hypothesis, and in fact one of its fiercest opponents was Sir Ronald Fisher. Fisher argued that we could not rule out that an unobserved common cause explains the observed association, and theoretically Fisher is right: observational data alone cannot distinguish between the first model and the second model. So how can we move this debate forward? An important piece of the smoking and cancer debate was a sensitivity analysis, and it consisted of the following hypothetical exercise. Let's suppose for a moment that Fisher's hypothesis were true, that is, that cigarette smoking does not cause lung cancer. Now we can ask ourselves: how strong would this unobserved confounding need to be to explain all of the observed association? That is what Cornfield computed, and he concluded that if smokers had nine times the risk of non-smokers of developing lung cancer, and this is not because cigarette smoking is a causal agent, then an unobserved confounder would need to be at least nine times more prevalent in smokers than in non-smokers. The opinion of the experts at the time was that no gene or hormone could possibly be that tightly linked to smoking, and Fisher's hypothesis was judged to be impossible.
So the logical conclusion of the sensitivity analysis is that unobserved confounding cannot explain all of the observed association, and there must be at least some causal path between cigarette smoking and lung cancer. So, to sum up, why do we need sensitivity analysis? Most causal inference with observational data makes untestable assumptions about the absence of unobserved confounders, and the truth is that hardly anyone believes that those assumptions hold exactly. So we need tools that make it easy to routinely discuss the sensitivity of our estimates when our assumptions are called into question, as happened in the smoking and cancer debate. Our goal today in this presentation is to learn how we can routinely perform these types of analysis for all our estimates using the R package sensemakr. sensemakr implements the tools developed in our JRSS-B paper, "Making Sense of Sensitivity: Extending Omitted Variable Bias". Now let's see another example, and this time the example comes from political science. The sensemakr package comes with a real data set from a survey of Darfurian refugees in eastern Chad. First we need to load the package with library(sensemakr), as usual, and then call the command data("darfur") to load our data. Details of the data set can be found in the help documentation. Our research question here is to understand how exposure to violence changed individual attitudes towards peace during the Darfur conflict of 2003 to 2004. Specifically, did direct exposure to violence make individuals angry, and thus more likely to ask for revenge, or did it make them weary, and more likely to ask for peace? Now, this is a causal question, so we cannot answer it without assumptions. And the main assumption here is that government bombings and attacks by the militia were indiscriminate within a village, with one major exception: there was targeting based on gender, due to sexual assaults.
In other words, village and gender are sufficient to control for confounding, and we can estimate the causal effect of direct harm with the following regression, where we have the outcome, peacefactor, then the treatment indicator, directlyharmed, and two confounders, which are the gender and village dummies. We can run this regression in R using the lm function and save the results in an object called darfur.model. According to this model, we find a large and statistically significant effect, suggesting that those who were directly harmed became on average more pro-peace, not less. The previous estimate, however, relies on the assumption of no unobserved confounders, and of course, as usual, not all investigators may agree with this hypothesis. For instance, after you write up your results and submit them to a journal, you may find out that reviewer 2 thinks that, although within-village bombing was largely random, you still should have adjusted for whether individuals lived in the center or the periphery of the village. That is, you should have run a regression that further adjusts for the covariate "center". Or, to make things worse, reviewer 3 argues that not only should you have adjusted for center, but also for wealth and prior political attitudes, as these are likely confounders as well. And the problem here is that none of these variables, center, wealth, or political attitudes, were measured. So the question we want to answer is: how different would our inferences have been had we been able to include them in the analysis, as we wish we had included them, or at least as the reviewers wish we had included them? And this is what sensemakr can compute for you. In particular, sensemakr can help you answer the following questions. First, how strong would a particular confounder, or group of confounders, have to be to change the conclusions of a study? Second, in a worst-case scenario, how vulnerable is the study's result to many or all unobserved confounders acting together, possibly non-linearly?
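As a sketch, the loading and regression steps just described look roughly like this (the variable names peacefactor, directlyharmed, female, and village follow the talk; the packaged darfur data also contains further covariates not discussed here):

```r
# Load the package and the Darfur survey data that ships with it
library(sensemakr)
data("darfur")

# Regress attitudes towards peace on the treatment indicator,
# adjusting for gender and village dummies
darfur.model <- lm(peacefactor ~ directlyharmed + female + village,
                   data = darfur)

summary(darfur.model)  # large, statistically significant pro-peace effect
```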
Third, are these confounders, or scenarios, plausible? Or, more precisely, how strong would they have to be relative to observed covariates, for example female, in order to be problematic? And finally, how can we present the sensitivity results concisely, for easy, routine reporting? So let us go back to our Darfur example and answer those questions using sensemakr. Here I'm repeating the code we already had, loading the package, loading the data, and running our original linear model, and now the only extra step is to begin the sensitivity analysis by applying the function sensemakr to our original model, darfur.model. The most important arguments of the function call are: first, of course, the model, and here we include our regression, in our case darfur.model. Second, we need to tell sensemakr what the treatment variable is, in our case directlyharmed. Next, we need to include the names of the covariates that will be used to bound the plausible strength of unobserved confounders. In this case we put female, because, due to the nature of the attacks, we know there was targeting based on gender, and it is hard to imagine unobserved covariates as strong as gender. And finally, we tell sensemakr how many times stronger the confounder is related to the treatment and to the outcome in comparison to the observed benchmark covariate, in this case female. Here we are saying that we want to investigate the maximum strength of a confounder once, twice, or three times as strong as female in explaining treatment and outcome variation. There are other arguments to the sensemakr call; here I'm making use of the default arguments, and you can check the details in the help documentation. After calling sensemakr, we can explore all the sensitivity results with the print, summary, and plot methods. The print method of sensemakr provides a quick review of the original estimates along with three sensitivity statistics suited for routine reporting.
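The call just described can be sketched as follows, assuming the argument names of the sensemakr function (model, treatment, benchmark_covariates, and kd for the multiples of the benchmark's strength):

```r
library(sensemakr)
data("darfur")
darfur.model <- lm(peacefactor ~ directlyharmed + female + village,
                   data = darfur)

# Begin the sensitivity analysis: benchmark unobserved confounding
# against the observed covariate "female", at 1x, 2x and 3x its strength
darfur.sensitivity <- sensemakr(model = darfur.model,
                                treatment = "directlyharmed",
                                benchmark_covariates = "female",
                                kd = 1:3)

darfur.sensitivity  # print method: original estimates plus sensitivity statistics
```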
But here, instead of showing you the print output in R, I'm showing you a nice LaTeX table that you can obtain with the command ovb_minimal_reporting. And these simple sensitivity statistics already tell you a lot about the sensitivity of your estimate. Let's start with the partial R-squared of the treatment with the outcome. This quantity measures how much residual variance of the outcome your treatment explains, after taking into account what the observed covariates explain. But what is perhaps not that well known is that this quantity is also a sensitivity statistic, just like we had in the smoking and cancer debate. A partial R-squared of 2.2 percent here means that, in an extreme scenario, even if confounders explained all the remaining variation of the outcome, they would need to explain at least 2.2 percent of the residual variation of the treatment to bring the estimated effect down to zero. So this quantity shows the bare minimum strength that confounders need to have in order to explain away the observed association. But the partial R-squared may be too extreme a scenario. This leads us to the second sensitivity statistic, which is the robustness value. Here, our robustness value of 13.9 percent means that if confounders explain 13.9 percent of both the residual variation of the outcome and of the treatment, this is sufficient to explain away the effect. On the other hand, it also tells you that if you think confounders cannot explain 13.9 percent of either the residual variation of the treatment or of the outcome, then you're safe: such confounders cannot explain away the point estimate. We can also compute robustness values that account for sampling uncertainty, which in this case reduces the value to 7.6 percent. So note that in this table we have already answered the first and second questions we had posed: these metrics quantify how strong confounders need to be in order to change our original conclusions.
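Assuming the sensitivity object was created as in the previous step, the reporting table mentioned above can be produced with a one-line call; ovb_minimal_reporting emits it in LaTeX, ready to paste into a paper:

```r
library(sensemakr)
data("darfur")
darfur.model <- lm(peacefactor ~ directlyharmed + female + village,
                   data = darfur)
darfur.sensitivity <- sensemakr(model = darfur.model,
                                treatment = "directlyharmed",
                                benchmark_covariates = "female",
                                kd = 1:3)

# Minimal sensitivity reporting: partial R2, robustness values, and bounds
ovb_minimal_reporting(darfur.sensitivity)
```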
But now we have the hard part of sensitivity analysis: the plausibility judgment. We need to judge whether the values of 2.2 percent and 13.9 percent are good news or bad news. And the important thing to note here is that statistics itself cannot answer this for us. This is where expert knowledge needs to come in. But what sensemakr can do is help you leverage expert knowledge regarding the relative importance of variables. In particular, we can compute bounds: these bound the strength of an unobserved confounder with the same strength as an important observed covariate that you have measured. For instance, in our case, the bounds here in the lower corner of the table show that a confounder as strong as female can at most explain 12 percent of the residual variation of the outcome and 1 percent of the residual variation of the treatment. Since both of those numbers are below the robustness value of 13.9 percent, we conclude the point estimate is robust to a confounder as strong as female. In addition, since the 1 percent is less than the partial R-squared of 2.2 percent, we conclude the point estimate is robust to a worst-case confounder as associated with the treatment as female. Now note that all these results are exact for single confounders and are conservative for multiple, possibly nonlinear, confounders, including misspecification of the functional form of the observed covariates. So, as we have seen, this minimal sensitivity reporting answers most of the questions we had posed, but we can further refine our sensitivity analysis with visual tools to explore the whole sensitivity range of point estimates and t-values, and sensemakr can help us with that as well. The first plot I want to show you here is the default plot, a sensitivity contour plot of the point estimate. To obtain this, we can simply run the command plot(darfur.sensitivity).
Here, on the x-axis we have how strongly the confounder is associated with the treatment, and on the y-axis how strongly the confounder is associated with the outcome. These two axes are measured in terms of partial R-squared, which indicates the percentage of residual variance of the treatment or of the outcome that the confounder explains. Now, for each pair of partial R-squared values we have a contour line indicating the adjusted estimated effect, and this is the exact point estimate you would have obtained had you been able to run the regression with that confounder. Starting from the bottom left, we have the original estimate of 0.098, which assumes no confounding, and as we move along the diagonal, confounding is assumed to be stronger, to the point of eventually flipping the sign of our estimate, which is represented here by the red contour line at zero. The red diamonds indicate the maximum strength of a confounder were it once, twice, or three times as strong as female. As we can see, such confounders are not strong enough to bring the estimate down to zero, although of course they could substantially reduce the effect size. Now, we can also make the sensitivity contour plot for the t-value for testing the null hypothesis of zero effect, and to obtain this you can simply add the option sensitivity.of = "t-value" to the plot command. In this plot the axes are defined as before, but now the contour lines indicate the adjusted t-value. And note here that statistical significance is still robust to confounding once or twice as strong as female; however, we cannot rule out that confounding three times as strong as female would make the estimate statistically insignificant. Finally, the last plot I want to show you is an extreme scenario plot, and you can obtain those plots by adding the option type = "extreme" to the plot call. So here, the x-axis still shows the partial R-squared of the confounder with the treatment, but now the y-axis shows the adjusted estimated effect.
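The three plots just walked through can be sketched in code as follows, assuming the sensitivity object from the earlier sensemakr call and its plot method with the sensitivity.of and type options described in the talk:

```r
library(sensemakr)
data("darfur")
darfur.model <- lm(peacefactor ~ directlyharmed + female + village,
                   data = darfur)
darfur.sensitivity <- sensemakr(model = darfur.model,
                                treatment = "directlyharmed",
                                benchmark_covariates = "female",
                                kd = 1:3)

# Default: sensitivity contour plot of the point estimate
plot(darfur.sensitivity)

# Contours of the adjusted t-value instead of the point estimate
plot(darfur.sensitivity, sensitivity.of = "t-value")

# Extreme scenario plot: adjusted estimate under worst-case
# confounder-outcome partial R-squared scenarios
plot(darfur.sensitivity, type = "extreme")
```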
We then consider different extreme scenarios for the partial R-squared of the confounder with the outcome, represented by different curves: for example, the solid line here represents assuming a partial R-squared of 100 percent, then the next line 75 percent, and the other line 50 percent. The red tick marks on the bottom show the bounds for confounding once or twice as strongly associated with the treatment as female, and as we can see, confounding once or twice as strong as female would still not explain away the point estimate, even in these extreme scenarios. So, in this presentation I only showed you the basic functionality of sensemakr, and there is a lot more you can do. If you're interested in applying these tools to your own work, I suggest reading the papers: there is the "Making Sense of Sensitivity" paper, which is the theory paper, and the paper "sensemakr: Sensitivity Analysis Tools for OLS in R and Stata", which is the software paper. And finally, I want to point out that we also have a Shiny app where you can explore these tools on the web. Thank you.