I'm gonna get rolling here. Okay, this meeting is being recorded. Great. So welcome to the Causal Inference in R 2020 workshop. We're very happy that we were still able to do this. It's great that the organizers did so much work to get everything rolling online. And we were really lucky to work with our colleagues at R-Ladies LA. I'm one of the organizers over at the LA R users group, so I really pushed for that, because they're amazing, and they've done a really awesome job getting everything set up while Lucy and I figured out how to do this online. So we are very, very grateful for them. Thank you to R-Ladies LA for hosting us, and welcome to the workshop. We're gonna get rolling here. So, Lucy, you wanna introduce yourself real quick? Sure. I'm Lucy D'Agostino McGowan. I'm an assistant professor at Wake Forest University in North Carolina in the mathematics and statistics department. My training is in biostatistics, and I do a lot of causal inference work, in particular with propensity scores, propensity score weighting, and different methods in that regard. So I'm really excited to be here. I love R and I also love causal inference, so this is kind of the perfect world for me. I also have a podcast called Casual Inference that talks about some of these causal inference techniques, but in a hopefully casual way that tries to define the terms for a lay audience. Yeah, highly recommended, by the way. So I'm Malcolm. I'm a clinical research data scientist at Livongo Health, where I work on both classical academic-type research and business-oriented goals, and we use several causal inference techniques there. My training is in epidemiology; I'm an epidemiologist.
And in epidemiology, "does something cause something?" has always been a big question, so I've long been interested in causal inference methodology from that perspective as well. I also love R and causal inference, so this worked out quite well, and that's why Lucy and I wanted to do this. So let's get rolling. We're really gonna focus today on number three here: explain. I like this idea that there are really three practices of analysis, and these practices are all interrelated but also distinct. When we're setting out a goal for each of these, we need to make sure that we're clear about those goals. The main activities that we're doing in analysis are describing, predicting, and explaining. And you may do one of these more than the others in your work. As an epidemiologist, explaining is a big thing, but describing and predicting are important too. All three are important for most people, but maybe your job requires that you focus on number two: you're making machine learning models that do prediction and that sort of thing, but you're curious about number three and how these tools all intersect. One of the things we want to encourage you to do is really think about, when you're sitting down to do an analysis, which of these three you're trying to do, and how you can use the other two to support that goal. Unfortunately, what happens in the scientific literature is that these goals get all mucked up together. Somebody sits down and they're just kind of describing their data, picking around, cherry-picking their data, and they're like, oh, this is kind of interesting. So they fit a model to see, okay, what's associated with this and that? And, oh, that's interesting, now I have a prediction model. And then they write up their model and they say, well, this is just a descriptive study, but maybe this actually causes that.
Maybe I can interpret it that way. So these three ideas often get really mucked up in our actual day-to-day practice. Today we're gonna be focusing on number three, explain, and we're gonna make a big distinction between our usual tools and what kind of answer those are giving us, versus the actual answer that we want about whether or not something causes something else. A big part of this is that we use regression for all three of these tasks, but a normal regression model estimates associations. Regression is amazing, one of the most amazing tools that we have, but it can't help you distinguish between causation and association. It doesn't know how, and it's not supposed to; that's not what it's for, right? So when we're doing number three, trying to explain, what we're really trying to do is estimate a causal effect, not an association. And in the causal inference world, one of the most useful frameworks is the counterfactual framework. The counterfactual framework asks a much more specific scientific question than association. We're gonna talk a lot about quitting smoking today. Unfortunately, when I was a young, foolish man, I smoked cigarettes, and eventually I wised up and quit. So in this world, in this life, I quit smoking, and maybe that has had an effect on my health. There's also a counterfactual world, a world that doesn't exist, where I didn't quit smoking. And if that's the only thing that's different between those two worlds, then if something happens, I can be pretty sure it's because of the fact that I did or did not quit smoking, right? So we're gonna be talking a lot about that example today. But that's hard to measure, because that counterfactual world doesn't exist. I did quit smoking; the world where I didn't quit smoking, luckily, doesn't exist.
So in general, when we're actually doing this with data, we're gonna ask this question at the population level. What would happen if everybody in this study, this dataset, was exposed to something, versus what if everybody in the study was not exposed to it? The difference in those two counterfactual worlds, that's our causal estimate, right? And there's a whole complicated literature on counterfactuals, so this is a very simplified version; I encourage you to check it out if you're interested. All of this stuff has a lot of mathematical basis, a lot of statistical research behind it, so I encourage you to look into it if that's something you're interested in. But that's really the heart of the question we're getting at. And so the big difference between one, two, and three here is the assumptions that we need to make for them to be correct. Unfortunately, for number three, we need to make a lot of assumptions that are unverifiable. We can do some things to get closer to verifying them, but there are some things that we can never know. And there are several important ones; Lucy will talk a little bit about this later on. The one that we're gonna really focus on today is the assumption that there's no confounding in our study. If you haven't heard this term before, the idea here is that the effect that we're seeing of X on Y is only due to that relationship, X on Y. It's not being distorted by some other association, that is, some other variable that is associated with both X and Y, which can unfortunately give us this classic situation where correlation does not equal causation. And we'll talk a little bit about, wait, why not? What's causing that to go wrong? Why doesn't correlation equal causation?
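To make that counterfactual idea concrete, here is a small simulation sketch (this is illustrative code I've added, not part of the workshop materials): in a simulation we can see both counterfactual outcomes for every person, so the true average causal effect is just a difference in means, and we can watch a confounder distort the naive observed comparison.

```r
# Hypothetical simulation: we get to see BOTH counterfactual outcomes,
# which is never possible with real data.
set.seed(2020)
n <- 10000
confounder <- rbinom(n, 1, 0.5)                    # related to exposure and outcome
exposure   <- rbinom(n, 1, 0.3 + 0.4 * confounder) # confounder raises exposure odds
y0 <- 2 * confounder + rnorm(n)                    # outcome if unexposed
y1 <- y0 + 1                                       # outcome if exposed; true effect = 1
y  <- ifelse(exposure == 1, y1, y0)                # in reality we only see one of the two

true_effect  <- mean(y1) - mean(y0)                # difference in counterfactual worlds
naive_effect <- mean(y[exposure == 1]) - mean(y[exposure == 0])

true_effect   # exactly 1 by construction
naive_effect  # noticeably larger than 1 because of the confounder
```

The gap between `naive_effect` and `true_effect` is exactly the "correlation is not causation" problem the confounding assumption is about.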
So the tools that we're gonna be focusing on today are really one and two here: causal diagrams and propensity score weighting. These are two tools of many. A third, closely related tool is propensity score matching, which we'll touch on a little bit, but we're mostly gonna be focusing on these first two tools, which are really useful. These three are more common than some of the other tools that you'll see, with the exception of maybe this first one, which is something we're not really gonna talk about today but is one of our best tools for causal inference: a randomized trial. Without getting into too many details, that's mostly because randomization deals with these assumptions that we need to make really, really well; it makes it much more reasonable to make those assumptions. There's also a set of methods called G methods, which inverse probability weighting is related to, and another set of methods called instrumental variables, which has a family of related methods like difference-in-differences and regression discontinuity. Those are very common in econometrics, program evaluation, and that type of work, whereas randomized trials and G methods tend to show up more in individual-level health research in particular, but in other studies as well. But these are all closely related. They're kind of like brothers and sisters that don't get along; there's the one group that thinks, oh, we're the best, we do this right. But they're all very closely related, and if you take a bigger-picture view, you see that they come from the same basis of causal inference. We just picked these two because we like them and they're very useful, but those other tools are all very useful as well.
So as I mentioned, we're gonna be using RStudio Cloud for these exercises. The link is here; it's also in the chat. So if you haven't opened that up, I encourage you to while I'm switching over to the next section. I also wanted to bring up a few resources that we have available for you that are broader resources on causal inference. This is not our stuff; this is more about causal inference generally. We're gonna go over quickly these elements of what's different about causal inference versus other aspects of analysis, but there are several great books. These are a few that I like; there are many more. Maybe Lucy has some additional suggestions on top of this list; I'm sure she does, because we actually come from slightly different schools of thought, which is great. I really like the causal inference book, called Causal Inference, by Miguel Hernán and Jamie Robins. It's really good and very epidemiology-oriented, if that's interesting to you. The Book of Why is a great introduction to causal diagrams; it's interesting, but it's meant to be very friendly as well. And then Mastering 'Metrics is one that I read recently, which is very focused on econometrics methods, a very good, friendly introduction to instrumental variable methods, if you think that might be of interest to your work. Lucy, do you have any other recommendations that you like to give out for these types of books? Yeah, that's a great question. So, as Malcolm mentioned, we do come from kind of different schools of training. For those of you not as familiar with the causal background: the training that Malcolm comes from uses DAGs and things like what you're going to learn today, and my training comes from the Rosenbaum and Rubin line of causal inference. We're all doing the same thing; we just kind of use different words.
So I think that the recommendations that you made are great. I reference a couple of things from Rosenbaum and Rubin in my slides. Those recommendations are really good, and I think, especially coming at it from an introductory level, Miguel's book is one of the best out there. Yeah, and while it's very linked to this g-method school of thought, it doesn't just look at those methods, although it does emphasize them. So that is a really great introduction. And a nice thing is, if you use other programming languages besides R, their website has code for a lot of this stuff in SAS and Stata and a bunch of other tools, so that's really nice as well. Okay, so for this first section, I do have an exercises R Markdown file if you open up the exercises folder in the cloud space. It doesn't actually have any exercises in it, though; it's complete as it is, because this part is gonna be a little less interactive. We're calling this the causal modeling whole game, which is a term that I learned from Hadley Wickham. The idea here is that we wanna take a step back for a second, before we get into the nitty-gritty of the code, and show you the broad strokes of what we're looking for. So I will have code here, but I'm not gonna dive too much into it, because we're gonna talk about it a little later in the workshop. I'm really gonna be focusing on: okay, what am I doing? What am I looking for? And most importantly, where are we going? How do we win this game, so to speak? What's the end game here? This is a little bit of a spoiler alert, because you'll see a lot of the examples that we will use throughout the workshop, but it's to give you the bigger picture. We will dive into almost all of these examples a little more, and interactively.
So if something doesn't quite make sense, don't worry about it; just make a little note, and you'll have an opportunity to dive in a little later in the workshop, okay? The broad strokes that I'm going for here: first, I need to define my actual causal question. Then I'm gonna draw my assumptions with causal diagrams, which is one of many methods of dealing with these assumptions that we need to make, but it's very useful. Then I need to actually model my assumptions. Today we'll be doing that with propensity scores; again, there are other ways of doing it, but this is what we're gonna use today. We're gonna do a little diagnostics. It's always good to do sensitivity analyses of your work, kind of a sanity check, to make sure everything makes sense. And then finally, where I'm going is that I wanna get not an association but an estimate of the causal effect that I'm interested in. Often in the epidemiology world, we end up writing papers that say, well, this is associated with that; we kinda skirt around the language. But here I'm saying explicitly: I'm gonna make these assumptions, and if these assumptions are correct, I'm gonna estimate my causal effect. And that's very interesting. I'm not saying definitively that this causes that; I'm saying, if my assumptions are correct, this is an estimate of that effect. And that lets you criticize my assumptions, which is a good thing. So the main scientific question in this example is: do people who quit smoking gain weight? We know smoking causes lung cancer, thanks to observational studies using causal inference. We know lots of other health impacts of quitting smoking. We know that even if you are a smoker and you quit, it does reduce your risk of lung cancer, which is good news for me, because I did smoke when I was younger. And another thing that has been observed is that people who quit smoking tend to gain weight.
And so the question here is: is this causal? Is this actually an effect of quitting smoking on gaining weight, or is something else to blame? I'm gonna hone in on this question a little more in a bit, but first I'm gonna jump right into my data, which comes from the cidata package. This is actually an example from the book Causal Inference; they use this example a lot. And it's from the NHEFS study, which is a longitudinal study. First of all, a number of people dropped out of the study, and rather amazingly, there's actually a causal inference technique to deal with that. I encourage you to look into it; it's really cool, and it's discussed in that book. But we're not gonna deal with that today; I'm just gonna remove those people for now. And this dataset has what I need in it. It's two time points, and it has this variable qsmk, the second column here, that is zero if you kept smoking and one if you quit smoking. So all 1,566 people here are smokers that were followed over time. We also have some demographic variables and some health variables, and we have their weight. So we can look at this question with this dataset. The first thing I wanna do is use number one: I wanna describe. I wanna actually know, what is the real observed difference here? Even though I'm not getting a causal effect, this is real, right? This is what really happened for these people. Whether or not it's statistically meaningful or clinically meaningful is another question, and whether or not it's a causal effect is another question. But describing this here, what I'm seeing is: okay, zero is kept smoking, so it looks like maybe a few more people kept smoking, and that peak there is a little bit to the left of the people who did quit smoking.
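As a concrete sketch of this descriptive step, here's some base R with simulated stand-in data (this is my illustrative code, not the workshop's; the variable names qsmk and wt82_71 follow the real NHEFS dataset, and the numbers are made up):

```r
# Simulated stand-in for the NHEFS data: qsmk = 1 if the person quit smoking,
# wt82_71 = change in weight between 1971 and 1982 (values here are invented).
set.seed(1)
n <- 1566
qsmk <- rbinom(n, 1, 0.26)
wt82_71 <- rnorm(n, mean = 2 + 2.5 * qsmk, sd = 7)

# Describe: the observed (NOT causal) difference in mean weight change
group_means <- tapply(wt82_71, qsmk, mean)
observed_diff <- unname(group_means["1"] - group_means["0"])
round(group_means, 2)
```

The workshop does this with dplyr grouped summaries and a ggplot2 density plot, but the description itself is just this grouped comparison.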
So this is what I was mentioning: sometimes you observe, in these groups, that it kind of looks like the people who quit smoking gained a little more weight, but there's a lot of variation, so it's really hard to tell. If I dive into the numeric summaries, that looks about right. The mean is about two and a half pounds heavier for people who quit smoking, which is one, right? So this is a description. This did happen in this population; we know that, unless there's a significant data or coding error. The question we wanna get at, though, is whether this is actually causal. Maybe there are other things that are associated with both quitting smoking and gaining weight that are distorting our effect. And that's where drawing your assumptions becomes really useful. So this is the main question that I'm getting at out of this analysis: does quitting smoking affect change in weight? And this is the start of my causal diagram, which is the next section, so we'll dive into this a bit more. What I'm gonna assume, drawing from the confounders that the book used, is that there are, in fact, other variables that are associated with both quitting smoking and change in weight. Some of these are demographics, some are health variables, and some are related to how much you smoked or how much you ate at baseline, that sort of thing. For instance, exercise has an arrow going to quitting smoking and an arrow going to change in weight. Each of these variables that's linked to both of them is something that's gonna mess up my effect. So if I just do a regression of quitting smoking on change in weight, it's gonna give me the wrong answer; if this diagram is correct, if these assumptions are correct, it's gonna give me the wrong answer.
So, why might I want to actually draw out my assumptions like this? And by the way, I'm calling these assumptions, right? I'm not saying that this is correct. In fact, I encourage you to think about whether or not this diagram actually makes sense, because I think that there are probably some things that don't make sense here: some things that maybe are not connected that should be, maybe some variables that should be there that aren't. I encourage you to think about that, because this is one of the best things about drawing your assumptions: people can criticize them, and that's a good thing for getting the right answer. The best thing about these types of diagrams is that they can help me answer this question: what do I actually need to account for in my analysis to get the right effect? We'll dive into what exactly is happening in these diagrams that makes that possible, but the tool that we'll learn today, ggdag, and the underlying tool, dagitty, can actually answer this question. If this diagram is correct, I know exactly what variables I need to control for, and in this case it's all of them, because each of these is individually associated with both the exposure and the outcome. That's not always gonna be the case; in fact, it's very rare that every variable in your diagram needs to be controlled for, and we'll see that with the examples in the next section. But I will take this diagram at its word for now. I think I could probably improve it, but I'll take it at its word, and I'll take what we call an adjustment set at its word. So I'm gonna include all of these variables, up in the label here, as what I actually need to model. So I'm gonna start where I would often start, which is linear regression. I have this variable wt82_71, which is each participant's difference in weight between 1971 and 1982; that's my outcome.
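To make "the software can answer this question" concrete, here is a minimal sketch with the dagitty package (which ggdag builds on). This is a toy diagram with just two illustrative confounders, not the full workshop DAG:

```r
# Sketch: ask dagitty which variables must be adjusted for,
# given the assumptions drawn in the diagram. Requires the dagitty package.
library(dagitty)

dag <- dagitty("dag {
  qsmk -> wt82_71
  exercise -> qsmk
  exercise -> wt82_71
  age -> qsmk
  age -> wt82_71
}")

# The minimal adjustment set(s) implied by this diagram
adjustmentSets(dag, exposure = "qsmk", outcome = "wt82_71")
# here: { age, exercise }
```

In this toy diagram both confounders must be adjusted for; with the full workshop DAG, the same call returns the adjustment set shown on the slide.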
I put my question in here, quitting smoking; that's the answer that I want. And I put all these other variables in as well. So this is just a normal multivariable linear regression, and it's gonna give me an association: quitting smoking, adjusting for all these factors, is associated with about a 3.46 increase in weight compared to people who didn't quit smoking in this population. But to model my assumptions, this isn't quite what I'm getting at. What I'm getting at is the counterfactual: what if everybody in the study quit smoking, versus what if no one in the study quit smoking? To model that, I can't use this approach. I'm gonna use an approach called the propensity score. This is one of many approaches to deal with this, but it's a nice one. So I'm actually gonna switch up my model. I'm gonna switch to a logistic regression model, where my outcome is now quitting smoking. Before, the outcome was my study outcome, which is weight gain, but now what I'm modeling is the probability that you quit smoking. All of those variables that I had before are included in this model. So this model is gonna give me the probability that you quit smoking, which is the basis for the propensity score that we're gonna use. One subtlety, which Lucy will talk a little more about, is that what we actually want is the probability that you would get the treatment that you got, which is a subtle difference. So it's not just the probability that you quit smoking: if you quit smoking, it's the probability that you quit smoking, and if you didn't quit smoking, it's the probability that you didn't quit smoking. And if you know the broom package, you might recognize this function: augment. It's taking my model and running predict on the inside, getting my probabilities, which broom calls .fitted.
And then I'm using mutate from dplyr to make an inverse probability weight. So I'm taking one and dividing it by my propensity score, which is the probability of the treatment you actually got: if you quit smoking, the probability that you quit smoking, and if you didn't, the probability that you didn't. We'll dive into this a bit more later on and talk about, okay, what is this effect? What other options do we have for modeling this, and that sort of thing? But this gives me a dataset that now has this variable, the weights, in it. So I'm gonna take a look at my weights. One simple thing is to look at a density plot of the weights. There are lots of ways of looking at this; I could also separate this by treatment group, whether you quit smoking or not. But I might look for extreme weights: is somebody really unusual? They really seemed like they should have quit smoking and they didn't, or vice versa. And I have a few methods of dealing with that. There's definitely some skew here; there are some people with higher weights, where the mean looks like it's more around two. But I actually think this is not too bad. I do have some techniques for dealing with this: you can truncate your weights, setting the high weights to some percentile, and there's an approach called stabilization that can have positive effects on your model. But I actually think this isn't too bad, so I'm gonna keep going and keep it simple for now. Now, if we go back to our assumptions here, what I'm trying to do with this model is eliminate all of these arrows going into quitting smoking. I wanna change it so that there's actually no association between all of these variables and quitting smoking. A good way to check this is to look at the differences in those variables between the two groups.
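Put together, the propensity score step is a logistic regression followed by one mutate. Here's a base-R sketch with simulated data (my illustration; the workshop does the same arithmetic with broom::augment and dplyr::mutate, and x1, x2 are stand-in confounders):

```r
# Sketch of the propensity-score step in base R, with simulated data.
set.seed(42)
n <- 2000
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.5)
qsmk <- rbinom(n, 1, plogis(-0.5 + 0.8 * x1 + 0.6 * x2))

# 1. Model the probability of the EXPOSURE, not the study outcome
ps_model <- glm(qsmk ~ x1 + x2, family = binomial())
propensity <- predict(ps_model, type = "response")  # what broom calls .fitted

# 2. Inverse probability weight: 1 / P(the treatment you actually got)
ipw <- ifelse(qsmk == 1, 1 / propensity, 1 / (1 - propensity))

summary(ipw)  # look for extreme weights before moving on
```

Every weight is at least 1, and a handful of large weights (people who got a treatment their covariates made unlikely) is exactly the skew to look for in the density plot.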
So, between the two groups, people who quit smoking and people who didn't, we can use a nice measure called a standardized mean difference, which Lucy will dive into later. That's exactly what it is: a standardized way of looking at differences between groups. And in the observed data, we have quite a few differences in these variables. We have all the variables that we modeled on our y-axis, and these are the unadjusted differences. The good news is that in our weighted model, when we look at these differences, they're substantially reduced. That's what we're looking for. And in a randomized trial, ideally, this is what you would see as well, because randomization to a treatment means that there's nothing else associated with it, so there should be no confounding by anything else. That's the experience we're trying to replicate here. One quick note, and again, Lucy will dive into this a bit more: this does not help me with things that I didn't measure, that aren't in my model, or that I don't know about. Randomization actually does help me with that. One common mistake that people make with propensity score methods is thinking that they create exactly the conditions of a randomized trial, which they do not. We cannot assume that these differences are eliminated for important variables that we didn't include in our model, or for things that we just didn't measure. We only have access to what we observed for these observational models. Okay, so finally, I can go back to my linear regression model and actually model the difference in weight, but now I'm gonna try to get my causal effect, that counterfactual question: if everybody quit smoking versus if nobody quit smoking, what's the difference in change in weight?
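The balance check itself is a small calculation. Here's a base-R sketch of an unadjusted versus weighted standardized mean difference for a single simulated confounder (my illustration; the workshop computes this across all confounders and draws a Love plot):

```r
# Sketch: standardized mean difference before and after weighting.
set.seed(7)
n <- 2000
x <- rnorm(n)                                  # a stand-in confounder
qsmk <- rbinom(n, 1, plogis(-0.5 + x))         # exposure depends on x

ps <- predict(glm(qsmk ~ x, family = binomial()), type = "response")
w  <- ifelse(qsmk == 1, 1 / ps, 1 / (1 - ps))

# SMD: difference in (possibly weighted) group means of x,
# scaled by the pooled unweighted standard deviation
smd <- function(x, g, w = rep(1, length(x))) {
  m1 <- weighted.mean(x[g == 1], w[g == 1])
  m0 <- weighted.mean(x[g == 0], w[g == 0])
  (m1 - m0) / sqrt((var(x[g == 1]) + var(x[g == 0])) / 2)
}

smd(x, qsmk)     # clearly nonzero: the groups differ on x
smd(x, qsmk, w)  # near zero: weighting balances x
```

This is the before/after pair that each row of the Love plot summarizes for one variable.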
So, going back to my linear regression, I still have weight as my outcome and quitting smoking as my first variable, because that's my question of interest, but you'll notice that all those other variables are gone. It's just these two variables in this model, the outcome and the exposure: gaining weight and quitting smoking. But I do have this new argument: weights. This is the set of weights that we just modeled with our propensity score, our inverse probability weights. These weights are going to adjust our effect such that it's answering that actual counterfactual question. Then we just run the regression as normal, and I can get that effect. It's actually pretty close to what we got before, and for a simple analysis you sometimes see that, which is good news for those analyses that should be using causal methods but aren't. In more complex causal questions, though, you often get quite different answers, so you have to think about that. We do have an additional problem that we're gonna dive into a little here, which is that because we have these weights, our standard error is now artificially small. So if I were to look at my p-value and my confidence interval here, they're too small, artificially low. One way to deal with this is to bootstrap. You can also fit a robust estimator and that sort of thing, but today we're gonna talk about bootstrapping, because it's nice, and some recent improvements in the R ecosystem have made it quite easy to work with. So we're gonna fix these confidence intervals with the bootstrap. And bootstrapping in R assumes you're wrapping your whole analysis in a function. So it may be tempting to do this, right? We already fit our model, we already fit our weights, our propensity score, so we'll just run our regression over and over again on some bootstrapped datasets.
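The outcome model drops the confounders and keeps only the weights. A base-R sketch with simulated data shows the payoff (my illustration, not workshop code; I set the true effect to 3.5 so the numbers echo the workshop's estimate):

```r
# Sketch: the weighted outcome model recovers the causal effect.
set.seed(99)
n <- 5000
x <- rnorm(n)
qsmk <- rbinom(n, 1, plogis(-0.5 + x))
wt82_71 <- 1 + 2 * x + 3.5 * qsmk + rnorm(n, sd = 4)  # true effect = 3.5

ps <- predict(glm(qsmk ~ x, family = binomial()), type = "response")
ipw <- ifelse(qsmk == 1, 1 / ps, 1 / (1 - ps))

# Unadjusted model is confounded; the weighted model gets back near 3.5
coef(lm(wt82_71 ~ qsmk))["qsmk"]                 # biased upward by x
coef(lm(wt82_71 ~ qsmk, weights = ipw))["qsmk"]  # close to the true 3.5
# (the weighted model's default standard error is too small; bootstrap it)
```

Note that the only confounder adjustment left in the outcome model is the weights argument; the covariates themselves appear nowhere in the formula.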
And if you don't quite know what a bootstrapped dataset is, we'll talk about that in a little more detail later. But this actually isn't quite right: we need to bootstrap our entire modeling process. This is an important distinction. Our modeling process now has two steps; before, it had just one step, which was to fit the final regression model. We have to bootstrap both of these steps: our propensity score model and our causal effect model, the inverse probability weighted model. So today we're gonna show you how to use rsample to bootstrap these. This is a package from the tidymodels ecosystem. There's also a package called boot in R, which is what I've historically used, but it's got some quirks, and rsample is a pretty nice improvement over that experience for me. It's got a function called bootstraps, and it pretty much takes care of these details for you. There's a great vignette, which I'll point you to, that guides you through how to actually get your estimates and then your confidence intervals, which is pretty straightforward: we have this function bootstraps, and then we have these int_ functions. I'm using int_t, which is a t-statistic-based confidence interval, and this will get me some proper confidence intervals. The problem before was that we were getting artificially narrow confidence intervals that were wrong. The whole issue we're trying to deal with here is that if we use things the way we normally do, we actually get the wrong answer, and this is one of the additional steps we need to take to get the right answer. So often you'll see something like this, where your bootstrapped confidence interval, or your robust confidence interval if you use a robust estimator, ends up actually being wider.
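rsample automates this, but the idea fits in a few lines of base R (my sketch, with simulated data): resample rows, then refit both models, propensity score included, inside every replicate.

```r
# Sketch: bootstrap the WHOLE two-step pipeline, not just the final model.
set.seed(123)
n <- 1000
x <- rnorm(n)
qsmk <- rbinom(n, 1, plogis(x))
wt82_71 <- 2 * x + 3.5 * qsmk + rnorm(n, sd = 4)
dat <- data.frame(x, qsmk, wt82_71)

fit_ipw <- function(d) {
  # Step 1: refit the propensity model on the resampled data
  ps <- predict(glm(qsmk ~ x, family = binomial(), data = d),
                type = "response")
  w <- ifelse(d$qsmk == 1, 1 / ps, 1 / (1 - ps))
  # Step 2: refit the weighted outcome model on the same resample
  coef(lm(wt82_71 ~ qsmk, data = d, weights = w))["qsmk"]
}

boots <- replicate(200, fit_ipw(dat[sample(n, replace = TRUE), ]))
quantile(boots, c(0.025, 0.975))  # a simple percentile confidence interval
```

Skipping step 1 inside the loop, i.e., reusing the original weights, would miss the uncertainty from estimating the propensity score and reproduce the too-narrow intervals. rsample's bootstraps() plus int_t() does the same thing with better interval machinery.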
Normally, when you look at an analysis like this, you might think, oh great, my OLS, my linear regression model, is more precise, right? That's not the case here, unfortunately. It's that it doesn't know how to account for the weights correctly, so it's artificially low: the weighting creates a dependence in the data that the regression model doesn't know about. So you do need to use a bootstrap for these types of models, or a robust estimator. I don't think we have any examples of the robust estimators anymore; we're just using the bootstrap for this, but you've got plenty of options for that as well. So to wrap up this super quick introduction to the whole process: we started from this question, does quitting smoking increase weight, and we now have an actual causal effect estimate, not an association. We're saying, if our assumptions are correct, we're estimating the causal effect. Feel free to criticize those assumptions, but if they're correct, this is an estimate of our causal effect, which is that if you quit smoking, you had about a three and a half pound gain in weight compared to if you didn't quit. It just goes to show that smoking is a complex subject; it's associated with a lot of other things. Gaining a little bit of weight is way better than getting cancer, so I recommend quitting smoking; it's got way, way more negatives than positives. But this is an interesting thing about the complex nature of health research. So I'm going to wrap up this section and take a couple of seconds to see if we have any big questions in the chat. While we do that, I wanna note again that we have a very detailed version, with all of the code that you need, of everything that I just went through. I went through it intentionally quickly here, because we're gonna dive into all of these aspects throughout the workshop.
But if you wanna review this later, or today, or now, or whatever, you have that option, right? If you come back next week thinking, wait, what do I actually need to do again to get this causal effect? It's all really concretely laid out for you for a simple analysis. So I encourage you to go and review that later. Oh, and I wanted to note: again, Causal Inference, the book, is a great resource, and there's also a resource called the Causal Inference Notebook, which has a lot of R code associated with the book's analyses, including the NHEFS analysis. And then rsample has a great vignette on how to actually get bootstrap confidence intervals. It's very straightforward, and I really like this improvement in the ecosystem. So let's take a moment and see if there are any big questions before we move on to a more interactive session. Anything important come up? Let's see, it's a little bit hard for me to see the chat, so if somebody... Yeah, we had some questions, but I started answering them, and I actually think they're gonna be better answered as we progress. Yeah, yeah. So I think we can keep them on hold for now. Yeah, and part of this was to entice you a little bit, to see like, wait a minute, what's that? What's the deal with that, right? We're gonna dive in a little bit more. Okay, great. So that was a broad overview of the kinds of things we're gonna do. For the rest of the workshop, we're gonna bounce between a little bit of learning and a little bit of interacting with exercises. So we're gonna move on to the second section, which is causal diagrams in R. This should be clearly labeled in the exercise folder: it's 02-dags-exercises.Rmd. At this point in the workshop, I'll give a little bit more information.
I'll talk a little bit more, but then we'll actually start interacting with the code, right? So the subject of this session is causal diagrams in R. I already showed you, very briefly, a diagram. Maybe you've seen something like that, a theoretical model that somebody might include in their research. Maybe you've actually seen a causal diagram. Maybe you've seen related methods like structural equation modeling or Bayesian networks that have a very similar layout. But you might still be wondering, wait a minute, why did we do that again? How did that actually help us in our analysis? The big thing here is that, as I mentioned, we have to make some big assumptions to actually get a causal effect. And causal diagrams are also called causal directed acyclic graphs, or DAGs. They're called directed because you'll see there are arrows in them, they're called acyclic because they don't go in circles, and they're graphs. That's where DAG comes from. I often like the simpler term causal diagram, but we're gonna be using packages that use the term DAG a lot as well, so you should know that term. The key benefit of causal diagrams is that you can explicitly lay out your causal assumptions about how the two factors you're studying are related, and how other factors are related to them, and then you can analyze those assumptions. A really important thing about DAGs is that, while they're quite simple to use, they have a rigorous mathematical background where, by making certain assumptions, you can make a lot of progress in understanding how you actually need to model your question. And sometimes the question is simply: can you actually answer your question? Sometimes you can't. Sometimes you write a DAG and you go, oh no, I can't actually answer this question. I don't have that variable, or it's impossible, or some other issue is going wrong here, right?
But luckily, in many, many cases, it can tell you exactly what you need to do if your assumptions are correct, right? So the basic idea here is that you need to actually specify your question. Does quitting smoking during a specific time period, in this study between 1971 and 1982, cause an increase in weight by the time we get to the second period? It's a very specific causal question. You may be wondering, okay, wait a minute, where did that diagram come from? That's the biggest question I get: wait a minute, where did this model come from? It comes from domain knowledge. Some of it you know and some of it you don't. Some of it you need to talk to your colleagues about, you need to read the literature, you need to iterate. I have never successfully presented a diagram where everybody goes, yep, that's it, you got it. It always gets some constructive criticism, usually along the lines of, wait a minute, I don't know that that's quite correct. So usually you have to iterate on your domain knowledge a little bit, and then you actually have to draw it. Each variable in our analysis is drawn as a node; in the diagrams we'll use, it'll literally be a point with a piece of text. And then we also have the causal pathways we're making assumptions about, and each of those causal pathways is drawn as an arrow in a causal diagram. So when you look at my causal diagram, you know exactly the assumptions I'm making: this variable is associated with that variable because there's an arrow going from one to the other. Whether or not the assumption is correct is another question, but you know that's the assumption I'm making. So today we're gonna use the ggdag package, which is a package for using ggplot2 to draw and analyze DAGs.
So briefly, I won't go too much into the design of this package, but ggdag really connects two sets of packages. One is dagitty, which is an amazing tool for analyzing causal diagrams; it has these great, robust algorithms that work really, really well. And then we have ggplot2. This is a group of people I probably don't have to convince, by and large, that ggplot2 can make some really incredible, beautiful, flexible plots. ggdag also makes use of another ggplot2 extension called ggraph, which is for making network diagrams. And, I won't talk too much about this, but one thing ggdag does is turn your DAG into data, using a data structure called a tidy DAG that you can use with ggplot2 and ggraph, but that you can also analyze with the normal tools you use. For instance, if you use dplyr, you can apply it to this tidy DAG data. So step one is to actually specify your DAG, and for that ggdag has this function `dagify()`, where you write your assumptions as R formulas, the same types of formulas you would write for `lm()` or `glm()` or other modeling tools. Here I'm specifying a DAG where I'm telling `dagify()` I'm making two assumptions. The first assumption is: smoking causes cancer. Specifically lung cancer, but I just wrote it as cancer. I feel pretty confident in that assumption; it's pretty well established. I'm making another assumption, too, which is that smoking causes coffee drinking, which maybe I'm not so certain about; we'll deal with that in a little bit. The idea here is that I think smoking is associated with coffee drinking. Maybe being a smoker doesn't actually cause you to drink coffee, but we can deal with that in a second.
We can clarify that assumption in a second, but we'll take it for now: smoking causes coffee drinking, right? And this is related to a specific health question that has appeared in the literature many, many times, which is: is there an association between coffee and lung cancer? Because if you just look at the descriptive data, very often there is, right? However, I actually have no reason to believe that's true, so I'm not making that assumption here. In this diagram I'm just assuming that smoking causes cancer and that smoking causes coffee drinking. So I'm making those two assumptions, and then I'm gonna use the `ggdag()` function, which is a quick-plot function in ggdag, to actually visualize this diagram. So these are the causal assumptions I'm making: smoking is associated with both of these things, coffee and lung cancer, and coffee is not associated with lung cancer. If I wanted to show a DAG where coffee does cause lung cancer, it would look something like this, right? Now my first assumption has changed: I'm saying both smoking and coffee cause lung cancer. And if I didn't know whether that was the case, if I really didn't know whether coffee caused lung cancer, I would probably put it in like this, because that's my scientific question here. But here I'm much more skeptical, so I'm not actually putting that in. But if I were to draw it, it would look like this. So we still have an issue here: even if coffee does cause lung cancer, we have this additional variable, smoking, that is associated with both of them. And this is a classic confounder. This is a very simple example of confounding, because smoking is associated with both of these things, and it's going to distort the estimate we get when we try to ask the question, does coffee cause cancer?
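For reference, the two-assumption DAG described here can be written with `dagify()` and quick-plotted with `ggdag()`. This is a sketch; the variable names are just the ones from the spoken example:

```r
library(ggdag)

# Two assumptions: smoking causes lung cancer, and smoking causes
# coffee drinking. Note there is no coffee -> cancer arrow.
coffee_cancer_dag <- dagify(
  cancer ~ smoking,
  coffee ~ smoking
)

ggdag(coffee_cancer_dag)  # quick plot with ggplot2 defaults
```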
And it's made even worse by the fact that, in this situation, there actually is no effect, right? If we do an analysis here, the answer we should get is that there's no association; coffee does not cause lung cancer. But because of the smoking variable, that's not the answer we're gonna get from our data if we just observe it. And we can't randomize people to smoking; that's unethical, we know smoking is dangerous. We can't put you in a trial and say, okay, you're gonna smoke cigarettes for five years and you're not, right? We can't do that. So let's dive right into an exercise. We've got 02-dags-exercises.Rmd; under your turn one, what I want you to do is clarify this DAG a little bit. As I said, smoking causing coffee doesn't really make sense, so we're gonna clarify that, work with `dagify()` a little bit, and then actually draw it. And if you get done quickly, this might be a simple exercise for you if you're familiar with these types of tools, there are a couple of what we call stretch goals. Those are for if you finish a little early and want a little more challenge. So I'm gonna start a timer here. These are a little simple, so I set these for three minutes, but if we need a little more time it's no problem; we can take more time and I'll just talk less. Malcolm, while folks are working on that: we just had a question about the x and y axis scales, and my understanding is those are actually sort of arbitrary and don't mean anything, but I thought maybe you would... Yep, yep, that's correct. That's a really good question. Often when you look at DAGs they don't have axes, because the nodes are in the place that they are because there are algorithms for placing them in a way that's relatively readable. So the positions actually don't have any meaning.
It's an algorithm that's deciding their placement. Every once in a while the axes do have meaning, maybe the x-axis is time and you organize your nodes that way, but very often there's no meaning to them. You'll see later on that one of the stretch goals, if you get there, is to add themes to your DAG, and many of the themes for DAGs just completely eliminate the axes because they don't convey any information. It's a really good question. By the way, these countdown timers are more for us than for you, so if you need more time, it's no big deal. This is more so that we give you enough time and keep on time ourselves. Yeah, I always find that when I don't use timers I underestimate the time, and I'll give like five seconds and be like, okay, are we done? Yeah, me too. It's a problem when you already know the answer. So one important thing to note is that with DAGs we aren't actually working directly with the data yet. We're drawing our assumptions, and in these exercises we're using the same variable names as the data set, but right now we're not yet working with it. A very, very common question with DAGs is, wait, how do I actually get my estimate now? And that's not what a DAG is gonna do for you. Yeah, and on this question about terminology: when I say associated, I'm using a slightly vaguer assumption, that the variables just have a statistical association, and we're not making any assumptions about whether that association is due to a cause, due to some kind of bias, or something else. That's a benefit of saying I'm very specifically interested in a causal effect rather than an association. Okay, oops, I restarted the timer. Oh, I added more time to the timer, oops.
Okay, I'm gonna run over to the solution here, but if we need a little more time for the next exercise, just throw a message in the chat and I'm happy to pause, right? The other place is the participant window, where you can say go faster or go slower, and we can be monitoring that too. Yeah, happy to. It's tempting for us to go fast for many reasons, and to talk too much; we love both R and causal inference, so we'll talk about this all day if we're left to our own devices. So what I'm doing here is adding the first assumption, which is that smoking causes cancer; we talked about that a little bit. We've also added this new node, addictive behavior. The idea is that there's something behind these two, something about a person, their addictive behavior. This again needs more clarification for a really good DAG, but we're at least recognizing that smoking doesn't cause coffee drinking; there's something else that's associating the two. And then we're also adding some labels, mostly for visualization purposes, and then we're actually going to visualize it. So we now have this extra node in our DAG, but again, we're assuming that coffee does not cause cancer. Addictive behavior, having these qualities that cause addictive behavior, whether that's something genetic or something else, maybe we have some kind of psychometric scale that measures pretty well how prone you are to addictive behavior, or maybe we need to clarify this more; that's probably the case here. But we have both of these now: smoking and this addictive behavior variable, whatever that actually is in our data. So let's dive into this a little bit more and think about dealing with causal effects and associations. So, getting back to this question, right?
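A sketch of what the clarified DAG might look like, with the new addictive behavior node and labels; the exact variable and label names in the exercise file may differ:

```r
library(ggdag)

coffee_cancer_dag <- dagify(
  cancer ~ smoking,
  smoking ~ addictive,  # addictive behavior causes smoking...
  coffee ~ addictive,   # ...and coffee drinking; smoking no longer causes coffee
  exposure = "coffee",
  outcome = "cancer",
  labels = c(
    cancer = "Lung Cancer",
    smoking = "Smoking",
    coffee = "Coffee",
    addictive = "Addictive Behavior"
  )
)

# Show labels instead of the raw variable names
ggdag(coffee_cancer_dag, use_labels = "label", text = FALSE)
```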
An association, we're assuming, is just a statistical association; we don't know anything about the causal element of it, right? And you hear this all the time: correlation does not equal causation. All right, great. Everybody loves to say it; it makes people feel like, oh, I'm being thoughtful about this question. But wait a minute, why is that, right? Why doesn't correlation equal causation? Why can't we just get this answer? And the real issue is that if we wanna know this path, does X cause Y, that path is only one of many pathways that are what we call open in a diagram. We have these other paths that cause associations. So, actually, let me talk about this function, `ggdag_paths()`. `ggdag_paths()` helps identify the open paths that are distorting your effect estimate. If you're assuming that X causes Y, that should be one path; if you're not, then that path doesn't exist in your diagram. And it will also show you the other open paths, and those are the things you need to deal with in your model. So luckily ggdag has this function, `ggdag_paths()`, that will visualize this for you. If I go back to my smoking and weight gain diagram, remember I had a diagram that looked like this, with a bunch of variables around my two variables of interest, quitting smoking and gaining weight. The real problem is here. I care about panel one, the one labeled 1 up there. If you remember from the whole game, that's my causal question: does quitting smoking cause weight gain? Here's the problem: those other nine variables are causing nine other open paths between these two. And the math behind this works out such that this is what causes the statistical association between these two variables, even if path one doesn't exist, right?
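The path functions mentioned here look roughly like this; as a self-contained sketch, the coffee-cancer DAG is respecified first, with exposure and outcome set so the paths are between those two:

```r
library(ggdag)

# The coffee-cancer DAG from the example, with exposure and outcome set
coffee_cancer_dag <- dagify(
  cancer ~ smoking,
  smoking ~ addictive,
  coffee ~ addictive,
  exposure = "coffee",
  outcome = "cancer"
)

ggdag_paths(coffee_cancer_dag)  # plots each open path in its own panel
dag_paths(coffee_cancer_dag)    # the underlying tidy DAG data structure
```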
If path one does exist, we still have a problem, because all of these other nine pathways are gonna distort the effect we get. All of these paths mix together, and you actually don't know what you're getting; you're just getting an association at some population level, and maybe you're not even getting that. So we need to deal with every single one of these open pathways if we wanna get a causal effect. So we're gonna dive right in here. We're gonna look at the diagram we just created with `ggdag_paths()`, and also the underlying function, `dag_paths()`, that I want you to look at. We'll take another few minutes here; just let us know if you need a little more time, or if you finish in one minute and wanna get on to the good stuff, fitting your actual model, and we'll get there soon. This is good stuff too. Yeah, this is good stuff. I have a research life before DAGs and a research life after DAGs; it's really changed the game for me. We had a question before that I briefly answered in the chat, but I thought it would be worth talking about while everybody's working, Malcolm. Someone was asking how you prove that what you've done in a DAG is true. And I think this is a fundamental thing that people first learning DAGs think about: what are you proving, is it provable? And something to mention is that the whole concept of a DAG is that you're putting your assumptions down on paper, and then you're assuming those are true, and given that assumption, the causal effect you estimate, if you estimate it appropriately according to your DAG, is true. And that is all provable mathematically, given that those assumptions are true. But you can't really prove a DAG. It's not possible to prove that you, for example, measured all of the important confounders.
That's not a measurable assumption. So the DAG doesn't really serve as something that's provable. The DAG serves as your baseline, a picture of your baseline assumptions: if those are true, then your downstream results are provably true. Does that sound about right? Yeah, and in practice you have to deal with that, which usually means, as an epidemiologist, I research health questions, so for me that often means I go and read the literature. I wanna understand more about what we know about this question. I talk to physicians; I wanna know their experience. I talk to other researchers; I wanna know their experience. I show people my DAG, let them criticize it, and iterate. We have to do the best we can with it, because it's an unverifiable assumption. There are some statistical techniques, the dagitty package has a tool where you can fit your DAG as a structural equation model and estimate how well it fits your data, but that has problems too. That's only one way of looking at it, and it's very limited in scope in what it can help you with. Ultimately, it's an unverifiable assumption. It's great, though, because it makes that explicit. I think we make these unverifiable assumptions all the time in causal inference, and all this is doing is giving a very clear picture of what the unverifiable assumptions you're making are. If you just looked at my propensity score model and my outcome model, you could glean from them what my unverifiable assumptions are, but this makes it much clearer to anybody. Absolutely, yeah. And what happens in the research is, you look at a model and someone says, oh, we controlled for this, this, and that. And that's actually hard to critique. You can kind of think, oh, well, maybe there are other things associated with the two things you're studying, but it's really hard to critique.
And that's why people end up in a situation where they throw the whole kitchen sink at the model: anything they think might even possibly be associated, they just throw at the model. And if you dive deeper into the world of DAGs, you'll quickly find that that's actually a really big problem; it can bias your estimate just as much as not including anything. So yeah, making DAGs is a practice, and it's something you can never verify, so you need to iterate with your colleagues and with what's known. Okay, so I wanted to show you that DAGs in ggdag are actually represented as data. `dag_paths()` is the function that `ggdag_paths()` calls to make its plot, and it actually produces this tidy DAG data structure that you can work with. It tells you a little bit of information: it's telling me my scientific question, my exposure and my outcome, because we set those in exercise one. And then it has this data set; this is actually what gets plotted by `ggdag_paths()`. What's different about this compared to a normal tidy DAG is that now we have rows for each of the paths as well. In this case it's actually pretty simple; there's only one path. And you'll note that that path is not from coffee to cancer, because we are not assuming that coffee causes lung cancer. There is an open path between coffee and cancer, through smoking and this addictive behavior variable, and that is a confounding pathway. So when we talk about confounding, this is the issue: it's these open pathways. This is confounding in action. So the real thing we need to do now is, when we have this setup, we need to close these paths. And what does closing a path actually mean? It means we need to account for it in our analysis, such that we're only estimating the path we care about, right?
So one of the best ways to do this is to randomize, because randomization breaks all of those paths, even the ones you don't know about, which is incredible, right? The problem is that we can't randomize everything. We can't randomize you to smoke; that's unethical because we know it's dangerous. And some things aren't practical: some things would take 20 years, would cost way too much money, and it's just not feasible to keep people in a study that long. So some things aren't practical and some things aren't ethical, and we can't always use this tool. So the tools we use are stratification, filtering your data down to a subset; adjustment, putting something into a regression model; weighting and matching, which we'll talk about today; and many, many other methods. What we're dealing with is an observed data set where those pathways aren't broken in our DAG. Randomization breaks them for us, so we need to try to get to the same place in our model. And note, we're not trying to replicate randomization, right? You often hear that from people; they'll say, oh, well, if I match, I'm emulating a randomized trial, which is not quite right, unfortunately. What we're trying to do is get to the same place randomization gets us, which is that our assumption of no confounding is reasonable. And in a DAG, this is called an adjustment set: the set of variables we need to account for. Any given DAG may have many equally valid adjustment sets, or it may have none. Occasionally you will find that if you run `ggdag_adjustment_set()`, ggdag will tell you that you cannot get an answer to this question with this DAG; there's no way to block all these paths. That does happen, right?
So before, in the whole game, we ran this function and we got this: there was only one adjustment set, and it was all of these variables. When I include all of those in my model, it closes all nine of the other pathways and leaves me with just this one. So we're gonna take a look at this with our smoking DAG now, and try to figure out what we need to do to close these paths. And then I also want you to write what you would need for a model: if you were writing a linear regression model or a GLM or something like that, what formula would you have to write in R for those adjustment sets, assuming the variable names are the same as in the DAG? Malcolm, can we add another question? I thought we could address it while everyone's working on this. Terrence asked if a backdoor path and confounding are the same thing, if those mean the same thing. Right, yes. And I think one definition of confounding that I've seen is that, basically, a confounder is a variable that can be used to block a backdoor path between an exposure and outcome. Yes, right, which is a little bit of a technical definition, but I like it, and it leads you to an interesting place, which is that one of the problems with these kitchen-sink models in the literature is that they say, oh, this might be a confounder, that might be a confounder. The problem, and we'll see something interesting about this in this exercise, is that there actually isn't a confounder that is solely responsible for confounding. The real issue we're trying to deal with is confounding itself, those backdoor pathways that are distorting our estimates. And the set of confounders you use to close those backdoor paths is often not fixed; often there are actually many valid ways you can do it. Yeah, that's a good point.
You know, one thing I didn't really talk about here: these backdoor paths are also related to a technical term, the backdoor criterion, which I see several people in the audience know about. It's directly related to this assumption of no confounding: the criterion is that we need to be able to close all of these backdoor paths. So that's what we're dealing with when we're dealing with these pathways. All right, so let's check this out. Let's see what we get here. So `ggdag_adjustment_set()` wraps a function in dagitty, an amazing little algorithm that can figure out all the viable, what are called minimal, adjustment sets in your DAG, assuming your DAG is correct, which, again, is the difficult part. We also add labels, because we thought this might be a little hard to read. Maybe you got to the stretch goal for this on the last exercise, but I often like to add labels, because once you have a bunch of variables it gets hard to read. So what is this telling us? Before, we only had one panel, and it had all of our variables in it, but now we have two panels with two different variables in them. What this is telling us is that we have one pathway that we can close in two ways: we can control for this addictive behavior variable, or we can control for smoking. And one of the stretch goals for this one was to examine: what if I actually can't measure this addictive behavior variable? Then what does ggdag give me? I encourage you to check that out. Where this leads us is that controlling for either of these blocks the pathway. You may be tempted to say, oh, both of those are along the pathway and I need to control for both. It turns out that if your DAG is correct, that's not actually true, right?
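The adjustment set lookup described here might look like this as code; the DAG is respecified so the sketch is self-contained, and since `dagify()` returns a dagitty object, the underlying `adjustmentSets()` function can be called on it directly:

```r
library(ggdag)

coffee_cancer_dag <- dagify(
  cancer ~ smoking,
  smoking ~ addictive,
  coffee ~ addictive,
  exposure = "coffee",
  outcome = "cancer"
)

# One panel per minimal adjustment set: {addictive} or {smoking}
ggdag_adjustment_set(coffee_cancer_dag)

# The same answer as plain text, straight from dagitty
dagitty::adjustmentSets(coffee_cancer_dag)
```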
And there may be certain data situations, maybe you measured smoking better than you measured the other variable, or the other way around. Maybe you have several adjustment sets, and you feel better about one because it addresses some criticism you know you're gonna get, which is kind of a political decision about putting your research out there, but that's important too. So if they're equally valid, that's great. And this is a different way of thinking about our models than we usually have, because we're saying there are equally good ways of controlling for these pathways. So the formula we put in our actual model would look like this. Assuming our DAG is correct and our variables are equally well measured, these will both give us the same correct effect, which should be nothing: we're saying there's no causal effect of coffee on cancer. So if we control for either of these, it should give us the right answer, which is that there's no association, no causal effect, in this case. Okay, so that's a quick introduction to DAGs. DAGs can do a lot of other things in terms of understanding your analysis. ggdag has several vignettes that I encourage you to check out: a deeper introduction to the package, but also a deeper introduction to DAGs, which includes a lot of resources for learning DAGs that I have found very, very useful, as well as a vignette on ways that different types of bias can present themselves in your analysis, whether that's a research question or a data science question or whatever. So I think I will stop sharing here. Let's take a five-minute break. I stole some time from Lucy. We're actually really short on time; I don't know if we have time for a five-minute break. Yeah, sorry about that, Lucy, I stole some time from you.
No, it's okay, I think we gotta keep rolling, but we had a couple of questions from people who only had one panel. So I'm gonna start my slides, but maybe you wanna troubleshoot those while I'm talking? Yeah, yeah. I think you can see what they did in their RStudio Cloud instances, maybe. Right, right, okay. Sorry for not giving everybody a break. Well, you'll get many small breaks throughout my section too, to work on code, but we're not gonna be able to finish by two if I don't get started. Sorry about that. It's okay, it's good stuff. Okay, so we've talked about these causal diagrams, and Malcolm gave you a great overview of how we do this whole process, and one piece in the middle, for how we're able to adjust for all these different variables, is propensity scores. So I'm gonna talk a little bit about that. I've got a brief introduction to some of the differences between observational studies and randomized studies, which Malcolm alluded to a little bit at the beginning. Often the goal for observational studies is the same as for randomized studies; it's just achieved in a slightly different way. For example, in the work that Malcolm and I do, very often we're trying to answer some medical question: how does some exposure or treatment affect some outcome? In a randomized controlled trial, an individual would get randomized to either the treatment or control group, usually by something like a coin flip, so everyone has equal probability of being in either group. So in this case, the coin is flipped, and this person either gets assigned to the treatment group or gets assigned to the control group, with equal probability.
So an observational study is slightly different because now instead of these probabilities being equal between the two groups due to something like a coin flip, we're dealing with something different. So now in the real world, people are assigned to whatever treatment they get by something like their doctor. So they may go to the doctor and their doctor might look at all of the variables that they have that describe them and may either decide to give them some treatment or not. And so we no longer have this nice setting where everybody was equally likely to get either a treatment or not, and we have to adjust for that in some way. And so here, taking a step back a little bit, I just want to give a visual of what's going on here. So I've got my observational study. In the orange circle, I have the people who went to the doctor and were given a treatment. And in my green circle, I've got people who went to the doctor and they did not get assigned to whatever this treatment is that my study is interested in. And all of these people have different characteristics. For example, I've got sick people. I have smokers. I have people that wear glasses. I have people with funny hair. I have people with mustaches. And each of these characters, I have Harry Potter over here. Each of these characteristics may or may not impact their likelihood of getting assigned to treatment, and those characteristics might also impact their final outcome. And that's kind of where this concept of confounding comes in. So for example, if I look at these, I have three people in my treatment group that were smokers and I only have two people in my control group that are smokers. And so because I've got this imbalance between my treatment and control group, if smoking was an important characteristic for our outcome, that's gonna be something important to control for.
And so this is kind of a different picture that's similar to what Malcolm was showing, but a confounder is basically something that distorts that relationship between the exposure and outcome, for example, something like smoking. So something that is associated both with the exposure of interest and the outcome of interest. So as I mentioned, we come from kind of different schools that do similar things in terms of causal inference, but Rosenbaum and Rubin showed back in the 80s that in observational studies, if you condition on the propensity score, you can get unbiased estimates of an exposure effect given that two key assumptions are true. So the first being that there are no unmeasured confounders. And so that basically means that the DAG that you draw is correct and you've measured all of the important confounders that you've indicated in your DAG. And then the second important assumption is that every subject has a non-zero probability of receiving either exposure. And so this basically just means that everyone in your control group could have received the treatment, at least in some kind of counterfactual world, with some probability that's greater than zero, and likewise, those in the treatment group could have potentially not received the treatment. Okay, so to do this, as Malcolm showed before, the easiest way to fit a propensity score is to fit a logistic regression that predicts the exposure using the known covariates. So this is just the mathematical formula for a logistic regression. In R, we use the glm() function. And each individual's predicted values from that are the propensity scores. It's super easy. Propensity score is kind of a fancy word for something that most analysts do on a regular basis: fit a logistic regression and get out the predicted probabilities. And so in R, we can do this. I'm gonna load both the tidyverse and the broom package to do this.
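To give the flavor before the code walkthrough, here's a minimal self-contained sketch of that recipe. The data is simulated and the variable names (exposure, conf1, conf2) are made up, not the workshop's NHEFS data; and it uses base R's predict() in place of broom::augment(), which returns the same fitted probabilities.

```r
# Simulated stand-in data: two confounders and a binary exposure
# whose probability depends on them (all variable names are made up).
set.seed(1)
n <- 500
dat <- data.frame(conf1 = rnorm(n), conf2 = rbinom(n, 1, 0.5))
dat$exposure <- rbinom(n, 1, plogis(0.3 * dat$conf1 + 0.5 * dat$conf2))

# Step 1: model the exposure (not the final outcome) on the confounders.
propensity_model <- glm(exposure ~ conf1 + conf2,
                        data = dat, family = binomial())

# Step 2: the fitted probabilities are the propensity scores.
# broom::augment(propensity_model, type.predict = "response", data = dat)
# would add these same values as a .fitted column.
dat$.fitted <- predict(propensity_model, type = "response")

summary(dat$.fitted)  # every score is a probability between 0 and 1
```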
And basically all I'm doing is fitting this generalized linear model using the glm() function. And what we're predicting is our exposure. And so because this is kind of like a pre-step before we do our final outcome model, the thing we're trying to get at is the probability of having this exposure. So my Y variable here is not my final outcome, it's my exposure of interest. And then I'm going to adjust for all of the different confounders that I've identified in my DAG. And then of course I'll specify my data frame, family equals binomial because this is logistic regression. And then I wanna actually get out those propensity scores, those predicted probabilities. I can use this augment function, which is in the broom package. And I'm gonna do type.predict equals response. And so that gets a probability instead of the logit. And then I specify my data frame in order to add that value to the data frame. So first I'm fitting my propensity score model and then I'm augmenting my original data frame to basically add those fitted probabilities to my data frame. So now I can use them to do things like weighting. So now each of these different individuals with all of their different characteristics has a single number summary that describes their probability of getting the exposure or being in the treatment group. And then we can use these probabilities to build weights, which I'm gonna talk about in the next section. So here is a simplified DAG of what Malcolm had shown before. And so what I've done here is I've proposed a different DAG that's much simpler and likely much less true, but will be easier for us to work with. And so here I'm still interested in the relationship between quitting smoking and change in weight. So these center nodes here are still the primary nodes of interest. So quitting smoking is my exposure, change in weight is my outcome. And I have four confounders here.
I have sex, I have the baseline weight, which in your data frame is gonna be wt71. I have age, and I have years of smoking, or smokeyrs in the data frame. And so what I want you to do is, based on this DAG, I want you to build a propensity score model for whether or not someone quit smoking. And so if you go into exercise three in your RStudio Cloud space, you can find that. So I want you to build that model, and I'm giving you five minutes to do this. And as a stretch goal, if it was super simple to fill in those pieces, I want you to try to create two histograms, one for the propensity scores for those that quit smoking and one for those that did not. You can kind of compare what those propensity score distributions look like. And then we'll talk about it. But while you're working on that, I see one person asked the difference between propensity scores and propensity score weighting. And I think maybe Malcolm answered this a little bit in the chat, but I thought I'd just mention it. So basically- I've certainly sent my message to the wrong place. Yikes. Oh. So the propensity score is always gonna be the probability of getting the exposure or getting the treatment, whatever that thing is that you model. So that's fixed, that's gonna just be that. And then there are all different ways that you can incorporate that propensity score into your final outcome model in order to get at this causal effect. And so weighting is one way to do that. And in the next section we're gonna talk about a couple of different ways that you can weight that will have a different impact on the causal estimate that you're getting at. But matching is also a way that you can incorporate the propensity score. You can do a direct adjustment with the propensity score itself. There are several different ways that you can do it. You can use the G-formula that Malcolm had mentioned before.
So there's a lot of different ways that you can incorporate those propensity scores into your final model. So the propensity score weight uses the propensity score. It's a function of the propensity score. And the propensity score is just kind of its own constant thing. Yeah. That's great. So one other message that I responded to, that I sent to the wrong place by accident because Zoom's chat's not the best, was in response to the people in the last exercise for ggdag that got different panels than I showed. The reason that you got different panels is likely that you did the stretch goal. You set things as latent in your DAG. And so ggdag and dagitty have a way of saying, I don't actually have this variable. I can't measure it or I didn't measure it or whatever. And dagitty and ggdag are smart enough to know not to give you an adjustment set that includes that variable, because you can't do it. So if an adjustment set requires that variable, it's not a valid adjustment set because you can't actually incorporate it. So if you marked addictive as latent, you'll only get one adjustment set: smoking. If you marked them both as latent, it will tell you, yikes, bad news, buddy. You actually can't get your causal effect because you can't block this pathway. So you have to redo your study and measure these things next time. Yeah, right, no, so that's not a typo. That's an explicit declaration of what variables you have and what you don't. That's very useful because you may know that something actually is important that you don't have measured, right? And so people might criticize you for that, but you can say, look, if I make these assumptions about my DAG, even though I haven't measured this variable, there actually is a valid adjustment set that will give me my causal effect. So that could be very useful. All right, so we've got about one more minute.
And so while you're wrapping up, I'm gonna start running some of this code here so you can see what you should be getting. So the first thing I'm gonna do is just load all those packages: tidyverse, broom, cidata, and ggdag. And here I'm using the nhefs_complete data. I didn't remove the censored individuals in this particular analysis, although you certainly can if you'd like to, but just for example purposes, we're just sort of using the whole data set here. And so I'm gonna run this just so we can see that DAG again to remind ourselves of what confounders I've proposed as important in this model. And so we've got, oh, they disappeared. We've got smokeyrs, sex, baseline weight, and age. I'm gonna wanna make sure I include those. So in my propensity score model, I'm gonna use the glm() function to fit that. And my outcome for this model, because this is the propensity score model, is my exposure, which is quit smoking, so qsmk. And then my confounders, as per my DAG here, are sex, age, smokeyrs, and baseline weight, so that's wt71. So I'm gonna fill those in. My data is this nhefs_complete. And then my family is gonna be binomial because I am doing a logistic regression here. So I'm gonna run this. So I got my propensity score model. So now that propensity_model is gonna be my model object. And I'm gonna change this eval to true here, so if I knit my whole thing, it'll evaluate that. And now I'd like to actually add those propensity scores to my data frame so that I can use them later. And so to do that, I'm gonna use the augment function, which is from the broom package. And I'd like to do type.predict equals response. And so what that does is it makes sure that I'm gonna actually get a probability instead of a logit. And now my data is gonna be this nhefs_complete, because that is the original data set that I'd like to add this to.
So I'm taking my propensity score model and I'm augmenting it to get that fitted value, the propensity score. And what I'm doing is I'm adding it to the nhefs_complete data and I'm calling this new data set df. So now down in my console, I can run that to see what it looks like. And I get my full data set, but I notice, if I look past these dot, dot, dots down here, I have this variable .fitted, which is my actual propensity score. Okay, so now for the stretch goal. So there's several ways to do this, and in fact, I'm gonna show you a slightly different way in the next set of slides, but I can fit these histograms by taking that data frame that includes the propensity score. And I wanna plot that .fitted variable, and I wanna plot it separately for my two groups, the quit smoking and non-quit smoking groups. So my group is gonna be this qsmk. And I'm gonna change the fill so that you can see a different color for each of these. And then I'm gonna do geom_histogram. So this is gonna give me overlaid histograms of my quit-smoking propensity scores and my non-quit-smoking propensity scores. And I can see a couple of things from just looking at this immediately. The first thing I see is that I have far more people that did not quit smoking than did. So this blue is my people who quit smoking and the red is people who did not. And then I can also just look at this distribution. I can see that the people who quit smoking tend to have propensity scores kind of higher up here in the tails, and the people who did not tend to have more propensity scores kind of down here compared to those who did. And that of course makes sense because the propensity score is gonna be the probability that you quit smoking. And so you'd expect that people who did quit smoking probably had a higher probability of quitting smoking.
And you would also expect that people who didn't quit smoking might have a lower probability of quitting smoking compared to those that did. So this just gives us a little bit of insight into that. We're gonna delve into this type of plot a lot more right now, actually. So let me see that. Oh yeah, so the data set comes from this cidata library, exactly. And it should be loaded already in your RStudio Cloud session. So you shouldn't have to download it, but if you are working in your own R session, like on your personal computer, you actually can get it from Malcolm's GitHub. All right, let's switch to the next set. Okay, so now we're gonna talk about propensity score weighting. All right, so as we've sort of mentioned a couple of times, there are a lot of ways to include propensity scores in your final model. Some common ones are weighting, matching, stratification, direct adjustment. And so today we're gonna focus specifically on weighting. And so there are all different types of weights that you can use to be able to estimate different target estimates. And so this is something that I think is not always super well understood in the causal inference framework, but I think it's really, really important. So I'm gonna spend a little bit of time on it, because the target estimate that you're trying to get at actually makes a huge difference and it totally changes how you do your analysis. And so the most common target estimate, and the one that Malcolm was talking about when he was doing his slides earlier, is the average treatment effect. And the reason why this is the most common is likely because this is what is typically estimated in randomized trials. And so in this case, your target population is the whole population. So both your treated and your control group.
And of note, this is often declared the population of interest, but actually it's not always the most medically or scientifically relevant, kind of depending on your underlying question. And so you do wanna be conscientious about the thing you're estimating and how that applies broadly. So the average treatment effect assumes that every participant can be switched to the opposite treatment. So that basically everybody in the control group with some probability could have been in the treatment group, and similarly, everyone in the treatment group could have been in the control group, and basically that you can make these switches, which isn't always exactly a plausible assumption. And so it's estimating this average effect across your whole population. And so when you're doing that, this is the mathematical equation to estimate those weights: w_i = Z_i / p_i + (1 - Z_i) / (1 - p_i). This is the same as when Malcolm talked about inverse probability weights; that's exactly what this is estimating. It's written in a slightly different way just so that mathematically it's easy to translate between all the different types of weights that I'm about to show you. But if you look at this, the Z_i here is whether or not you're in the treatment group. So Z_i is one for people in the treatment group and it's zero for people in the control group. And p_i is your propensity score. And so if we just take a second looking at this equation, among people who are in the treatment group, their weight is gonna be one over the propensity score: because Z_i will be one, you get one over the propensity score, plus one minus one, which is zero, over one minus the propensity score, so that second term drops out. So among those in the treatment group, it's just one over the propensity score. Among those in the control group, where Z_i is zero, the first term drops out because Z_i is zero, and you end up with one minus zero.
So one over one minus the propensity score. So this ends up being exactly what Malcolm showed you before, where it's one over your probability of getting the treatment that you actually got. So among the treated people, it's one over the probability of getting the treatment. Among the control group, it's one over the probability of getting the control. So this is the most common weight; when people talk about inverse probability weighting, this is usually what they're referring to. But it's important to note that there is a specific causal estimate that this is estimating. It's estimating that average treatment effect across the entire population. Another really common one that is used is the average treatment effect among the treated. And so now our population of interest is the people who got the treatment. And so this ends up being slightly different and it gives you a slightly different weight. So now the people in the treatment group end up just getting a weight of one, and the people in the control group end up getting a weight of the propensity score over one minus the propensity score. And so Malcolm mentioned a little bit about stabilizing. This is one way that can sometimes help stabilize your weights a little bit, depending on your distribution of propensity scores. And so for example, if you have lots and lots and lots of people in either your treatment or control group, you might wanna consider some of these other causal estimates that might be easier to estimate. Similarly, the target population could actually be the control group. So you might wanna know how the control group would be if they had all switched over to the treatment, or vice versa.
And so the average treatment effect among the controls is similar, except now the controls are gonna have a weight of one and you're gonna basically be weighting all of the treatment group to look like the control group. The next is the average treatment effect among the evenly matchable. And so this one's a little bit more complex, but basically what this does is it weights your population in a way that's very, very similar to if you did a matching, like a propensity score matching. And so this actually has some very nice properties, because in particular it doesn't have the possibility to blow up in the tails. And so if you can imagine, all of these weights, because they have the propensity score or one minus the propensity score in the denominator, if those values are very, very, very small, you can end up with basically infinite weights. And so then you can have one or two individuals that end up having a lot of pull in your final outcome model. And as Malcolm mentioned, there's things you can do like trimming and things like that. But one thing that I sometimes like when I'm in a situation like that is to actually be intentional with my causal estimate. And so if the population I have really can't estimate things well in the tails, then maybe I need to change the causal estimate that I'm going for. And the ATM is a nice way to do that. And then finally, the average treatment effect among the overlap population. This has this very elegant weight. These overlap weights are kind of a newer formulation in the past couple of years. And they're very similar to these matching weights, but they just have some improved variance properties. And so basically the overlap population are people who were kind of in the middle, they were pretty likely to have gotten either the treatment or the control, and you're kind of estimating how they would do. So that was kind of abstract and a little bit mathematical.
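Before the pictures, here is a hedged base-R sketch of all five weights as one-line formulas. The values of z (the 0/1 exposure indicator) and p (the propensity scores) are toy numbers, not the NHEFS data, and the ATM and ATO formulas are the matching and overlap weights as they're usually written:

```r
z <- c(1, 1, 0, 0)          # exposure indicator (toy values)
p <- c(0.8, 0.6, 0.3, 0.1)  # propensity scores (toy values)

# ATE: 1/p for the treated, 1/(1 - p) for the controls.
w_ate <- z / p + (1 - z) / (1 - p)

# ATT: treated keep weight 1; controls get p/(1 - p).
w_att <- z + (1 - z) * p / (1 - p)

# ATC: controls keep weight 1; treated get (1 - p)/p.
w_atc <- z * (1 - p) / p + (1 - z)

# Probability of the treatment each person actually received.
denom <- z * p + (1 - z) * (1 - p)

# ATM (matching weights): min(p, 1 - p) over that probability.
w_atm <- pmin(p, 1 - p) / denom

# ATO (overlap weights): treated get 1 - p, controls get p.
w_ato <- z * (1 - p) + (1 - z) * p

# ATM and ATO never exceed 1: they only downweight,
# so they can't blow up in the tails.
round(rbind(w_ate, w_att, w_atc, w_atm, w_ato), 3)
```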
And so I'm gonna show you what this looks like graphically, and hopefully that can help drive it home. And then I'm gonna let you all estimate these weights in R. So I've redrawn this histogram of propensity scores that we just created, except instead of overlapping them, I've made them mirrored, such that on the top here, this is my histogram of the propensity scores for the treated population. So this is for people that quit smoking. And on the bottom here, it's my control group. So these are people who did not quit smoking. And I like this mirror because it makes it much easier for me to compare directly: yes, I have more mass over here in my control group on the side of having low propensity scores, which as we mentioned makes sense, because the propensity score is your probability of getting treatment, and people that didn't get treatment probably were less likely to. And I can also see that over here I have a little bit more mass, a little bit of a longer tail, for the people who did quit smoking. So it looks like there were a couple of people who quit smoking who also, based on their characteristics in my model, were very likely to quit smoking. And so now what I've done is I've created these pseudo populations that are based on the weights that I've used. And so these are using ATE weights. And so you can see, in the colored part underneath, I still have the histogram of the original propensity scores. And then you can see how those were up-weighted or down-weighted in different ways to be able to get them to represent a different pseudo population. So the whole point of the average treatment effect is that basically I wanna be able to compare two comparable populations. And so I want my treatment and my control group to pretty much look the same.
And so what's being done here is that both the treatment and the control folks are getting up-weighted such that basically these distributions look very, very similar. And so that's how the ATE works. The ATT, that's the average treatment effect among the treated. All of the treated folks just get to stay themselves. So the distribution for the treated folks looks exactly like it started. But then it downweights or upweights, so it basically changes the control population to mirror that. So this is the full control population in gray and the blue is the weighted population. And so some people are down-weighted. People who have characteristics like over here, that are kind of less likely to look like my treated population, they get down-weighted. And then people over here end up getting a little bit up-weighted. So this basically makes the control population look like the treatment population, and I can estimate the difference between these two populations in my outcome, and it'll give me this average treatment effect among the treated. Similarly, I can do the same thing with my average treatment effect among the controls. And so here my population in the controls is exactly what I observed in the real data. But now I've weighted my treatment population to look more like this control group. And so now when I take the difference between these two weighted populations, I'm gonna get an average treatment effect among the controls. So if my treated people all looked like the control people, what would that average treatment effect look like? So kind of thinking in that counterfactual framework that Malcolm mentioned, what I'm doing here is essentially building counterfactuals. And so in the very first one, we're building counterfactuals where we're taking both the treatment and the control group and we're shifting them slightly. So we end up with this average effect across both.
In this case, my counterfactual, I'm taking the treated group and I'm building a counterfactual for each of those treated participants from my control group. And in this case, I'm taking my control group and I'm building counterfactuals from my treatment group. The ATM, so now this ends up, in this particular population, looking very similar to the ATT, and that's because we have far fewer treatment versus control folks. And so you can very easily have a good match for each of them. So if we were doing a matched analysis, this is kind of similar to what you would see there. You would probably take almost all of your treated folks and you'd match them to a subset of your control group. Then the ATO, you see, is just slightly attenuated. So it's similar, the distribution looks quite similar to the ATM, but it's a little bit attenuated, which is what gives it a little bit better variance properties. But again, these ATO and ATM are never gonna be upweighting. They're always downweighting. And so you never end up with the problem of having these exploding weights that you would have to adjust in some way. Okay, so how do we do this in R? So here's our formula for the ATE. So all you're gonna do is, after you've augmented to get that propensity score into your final data frame, we're just gonna do a mutate and we're gonna stick this formula directly into our mutate function. So I'm calling this weight w_ate, and I just take the indicator that you're in the treatment group, so that's qsmk in the case of quit smoking, and I divide by the .fitted, which is my propensity score. So that's just the Z_i over p_i. And then I add one minus Z_i, so one minus qsmk, divided by one minus the propensity score. So now it's your turn. I want you to go ahead and use the propensity scores that you created in the previous exercise to add the ATE weights to your data frame. And then if that seems way too easy, I want you to try to add a different type of weight.
For example, you could try to add the ATT weight or the ATO weight or the ATM weight or something like that. I'm gonna give you a couple of minutes to do that and then we'll re-gather and I will show you how to do it. And just based on time, I'm actually gonna give three minutes for this instead of five minutes. So we cut a little bit short, but I am happy to stick around for a couple of minutes after the end of our session. I just wanna make sure we're respectful of everyone's time. So if people have further questions, I'm happy to stick around. Yeah, me too, especially since it's really my fault. I take responsibility. No, no, it's nobody's fault. There's a lot to talk about. There's a lot to talk about. Yeah. We originally had a three hour workshop for the useR! live version. We decided, oh, you know, maybe two hours is better for online. Yeah. The stuff's too interesting to us. It's true. It's too exciting. Oh, no, yeah, okay. Yeah, no, sorry if that made it sound like we took the same content and squeezed it into two hours. That's not quite right. We actually did a couple of things. Yeah, we removed a few things just to simplify. Yeah. Yeah, so if that leaves you itching for more, please do let us know, because we may offer this again in the future and we'd love that feedback from you all. Someone asked if in practice I normally report the ATE or lean towards the ATM and ATO. So I personally, it depends on the clinical question of interest. I have used the ATE, I've also used the ATT, and I've also used the ATM in papers that I've done. I have not yet published a paper besides like a theory paper that used the ATO, and that's only because it's relatively new and so people are still getting on board with it, but there have been several theoretical papers that have used it and I think it should get traction in the future. Do you use any of these, Malcolm?
I actually have found that this is not something that many people think about too much, so. I pretty much use the basics, ATE mostly and occasionally ATT as needed, but yeah, ATE is my go-to for sure, because that's actually usually the effect that I want, right? That you're interested in. Exactly, exactly. Yeah, but these other ones are so interesting, especially their connection to matching. So the next question was about my favorite matching algorithm using the ATM. So the interesting thing is, when you use the ATM you don't do any matching, it's just a weight. So you take the propensity score and you weight it using this formula here. So you literally plug in the minimum of the propensity score and one minus the propensity score, divided by Z times the propensity score plus one minus Z times one minus the propensity score. You don't have to do any matching. But if you're comparing it to a matching algorithm, a lot of times people do exact matches or they'll do a caliper match or something like that, but there are a lot of different ways to do matching. The ATM is inherently a propensity score weight, but the effect ends up being very similar to what you would get for a propensity score matched analysis, yeah. Okay, I'm going to share screen and run through this real quick, so okay. So the ATE, so I've taken this data frame that we created, where we augmented it to add those fitted values. And so the ATE ends up being qsmk divided by the propensity score, plus one minus the exposure divided by one minus the propensity score, so I'm just going to add that. And then of course the ATT or any of the others will just be using the formulas from the previous slides. Here, this is how I created those plots, in case this is something that anybody is interested in doing. And so the only thing you have to change here is the weight that you want to look at.
And so just run this real quick, and make sure you change all those evals to true so that you can see them. And this I think is kind of nice, it shows you that distribution of those weights, which I think is a nice way to visualize it. I like to do this definitely if I'm using the ATE, ATT or ATC, because it lets me very quickly see if I have any weights that are really kind of blowing up, that are going to end up affecting my estimates or my standard errors. Go back to slides. Okay, so now we're going to talk about diagnostics, and this is one of the most important parts. So it's pretty easy to fit the propensity score model and it's pretty easy to create the weights. But then once you've done that, you want to make sure that you've actually gotten good balance. And so this is where there's kind of an iterative process involved. So you've proposed a DAG that you think is the correct DAG, and then you fit your propensity score model, and after you fit that propensity score model, you want to check to make sure you actually have good balance between your exposed and unexposed groups across these different covariates, the confounders. And if you don't, then you might want to update your propensity score model accordingly. And so there's two ways that I like to check these. One is something called a love plot. It's called that because someone named Thomas Love came up with this idea. Most people might call it just a standardized mean difference plot, but I like the term love plot. I think that's very nice. And the other is an ECDF or empirical CDF plot. And so I like my love plot as basically just one big nice summary across all of my potential confounders in my model. And it's one that Malcolm showed in his big picture overview. And then the ECDF plots, those are really nice for your continuous confounders.
And basically what the ECDF plot does is, instead of just looking at a single summary measure, you can look at the entire distribution of that confounder and make sure that you end up achieving balance after you do your propensity score weighting. So the standardized mean difference is a very simple formula. It's just the average of whatever your confounder is in your treatment group, minus the average of that confounder in your control group, divided by the square root of their pooled variance. And so this is a very common metric that's used to do post-weighting balance analysis for propensity scores. There's a rule of thumb that you try to get this to be less than 0.1 for your confounders. It's not definitive, but it's a nice guideline: you don't want the standardized mean difference to be more than 10%. So how do we do this? Okay, there are several steps here, and some of this is not totally intuitive. The very first step, I think, is not super intuitive, and that has a little bit to do with the fact that the R ecosystem is just not quite up to date on best practices for propensity score weighting. So we're sort of patching together a lot of different tools that get us to the right answer, but they're not intuitive. One of them is that, for propensity score weighting, a very common package that's used is the survey package. I think Thomas Lumley made this package, and it is for survey weighting, although the way that surveys are weighted is actually the same as how we would be weighting for a propensity score model. So the package name itself is not totally intuitive, but what we use it for is exactly what you would need. Malcolm and I have talked a bit about trying to create some tools that maybe have a little more intuitive design behind them, but for now we're gonna work with this.
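That standardized mean difference formula can be sketched as a tiny base-R function. This is an illustrative, pooled-variance version, not the workshop's own code.

```r
# Standardized mean difference: (mean_treated - mean_control) / pooled SD
# x is a confounder, z is a 0/1 exposure indicator (illustrative names)
smd <- function(x, z) {
  m1 <- mean(x[z == 1]); m0 <- mean(x[z == 0])  # group means
  v1 <- var(x[z == 1]);  v0 <- var(x[z == 0])   # group variances
  (m1 - m0) / sqrt((v1 + v0) / 2)               # difference over pooled SD
}

# toy example: confounder values 1..4, the last two observations exposed
smd(c(1, 2, 3, 4), c(0, 0, 1, 1))  # 2 / sqrt(0.5), well above the 0.1 rule of thumb
```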
So the first step for creating these Love plots is to create a design object that incorporates the weights. It's not a data set, it's a design object, but it's kind of like creating an object that contains all the information about your initial data set and the weights that you're incorporating. To do that, you load the survey package and then you run this svydesign function. ids is just gonna be a tilde with a one; you don't have to worry about that. data is gonna be your original data frame. And then weights is gonna be a tilde with the name of the variable in your data frame that has the weights. So where I put wts here, in the example that we just did I would have put w_ate, since that was the name of the weight that I created. So this creates this design object, survey_design. That's step one. Step two is calculating the unweighted standardized mean differences. This gives us a baseline to compare to: it will tell us what the differences were between each of these confounders, between the exposed and unexposed groups, before we did any weighting. And for that, I like to use the tableone package. So I'm going to load that tableone package, and the main function is CreateTableOne. And then I'm gonna set the variables. This vars parameter is just gonna contain strings naming each of the confounders that you included in your original model. Whoops, sorry. And then the strata here is gonna be the exposure. So if we were doing this with the quit-smoking data, I would put qsmk in quotation marks here. That just dictates what I'd like to split these standardized mean differences by. And then data is gonna be your data frame. And then I'm gonna set test = FALSE, because this is kind of important. A lot of people try to do hypothesis testing on these, to try to show that the standardized mean differences are significantly different.
And that's actually really not appropriate in this setting. We really don't wanna be doing hypothesis testing. We're not trying to get p-values; we're just trying to check the balance. So that's an important distinction. Okay, so that's the unweighted table. Then I wanna make a weighted table to look at the weighted standardized mean differences. This is gonna let me look at my balance post propensity score weighting, to see how well the propensity score did at balancing all these different confounders. And for that, I use the very intuitively named function svyCreateTableOne. Again, this is not totally intuitive, because a lot of this was built for survey methodology, which is not exactly what we're doing, but it works the same as what we're doing, so just bear with me on some of that terminology. So, svyCreateTableOne. The next two arguments are exactly the same as in my unweighted version: I'm gonna list all my confounders as my variables, and I'm gonna list my exposure variable for my strata. But now, instead of using my data frame for data, I'm gonna put in that survey design object, which is basically the object that includes all the information about my data frame plus the weights that I'm incorporating. And again, I'm gonna set test = FALSE, because I'm really not interested in doing any kind of hypothesis testing; I'm just trying to get a descriptive measure of my standardized mean differences. That's step three. So: step one, we create the design object; step two, we create the unweighted table; step three, we create the weighted table. The big difference between steps two and three is that in step three we're using svyCreateTableOne and the survey design object instead of the original data frame. Okay, step four. Now we're sticking this into a data frame.
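Putting steps one through three together, a sketch might look like this. This assumes a data frame `df` with confounders `age` and `sex`, exposure `qsmk`, and a weight column `w_ate` (all illustrative names), and that the survey and tableone packages are installed.

```r
library(survey)    # weighted "design" objects
library(tableone)  # CreateTableOne / svyCreateTableOne / ExtractSmd

# Step 1: a design object bundling the data and the weights
des <- svydesign(ids = ~1, data = df, weights = ~w_ate)

# Step 2: unweighted SMDs (test = FALSE: balance checks, not p-values)
smd_unweighted <- CreateTableOne(vars = c("age", "sex"), strata = "qsmk",
                                 data = df, test = FALSE)

# Step 3: weighted SMDs, computed on the design object instead of the data
smd_weighted <- svyCreateTableOne(vars = c("age", "sex"), strata = "qsmk",
                                  data = des, test = FALSE)

# ExtractSmd pulls out the numbers, ready for the step-four data frame
ExtractSmd(smd_unweighted)
ExtractSmd(smd_weighted)
```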
This looks a little gnarly, but what I'm basically doing is trying to create a nice data frame that I can then feed into my ggplot to make that nice Love plot. So the first variable in this data frame, I'm calling it variable, and it's gonna be the row names of ExtractSmd of my SMD table. If I ran that alone in the console, I'd get a vector of all of the variables that were included in my standardized mean differences table: confounder one, confounder two, and so on. My second variable is called unadjusted, and for this I'm using as.numeric to pull out the numeric values, with ExtractSmd pulling the standardized mean differences out of the unweighted SMD table. So this is pulling out those unweighted standardized mean differences from the object I created before; if I ran just this in the console, I would get a numeric vector of all the standardized mean differences from my unweighted table. And then weighted is gonna do the exact same thing, except for my weighted table. Hopefully these standardized mean differences are smaller, because we're hoping the weighting gets those two distributions to look closer. And then the final part, pivot_longer, is basically just pivoting that data frame so it's nice and workable with ggplot2. Don't worry too much about this code; I provide it for you in the example, and you can just copy and paste it. Okay, so the next step is actually plotting this as a Love plot in ggplot. My data is gonna be that plot_df that I just constructed. Then I'm going to have the variable on the x-axis and the standardized mean difference on the y-axis, and I wanna group by method, with my color also being method. Basically what that's doing is creating two separate lines, one for the unadjusted and one for the weighted model.
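For intuition, here is a self-contained base-R sketch of the same idea: computing an unadjusted and a weighted SMD for one confounder and arranging them in a long data frame ready for plotting. The workshop code does this via tableone's ExtractSmd and tidyr's pivot_longer instead; all names and the toy data here are illustrative.

```r
# Weighted mean, variance, and SMD helpers (illustrative implementations)
w_mean <- function(x, w) sum(w * x) / sum(w)
w_var  <- function(x, w) sum(w * (x - w_mean(x, w))^2) / sum(w)
w_smd  <- function(x, z, w) {
  m1 <- w_mean(x[z == 1], w[z == 1]); m0 <- w_mean(x[z == 0], w[z == 0])
  v1 <- w_var(x[z == 1], w[z == 1]);  v0 <- w_var(x[z == 0], w[z == 0])
  (m1 - m0) / sqrt((v1 + v0) / 2)
}

# Toy data: exposure depends on age, so the groups start out imbalanced
set.seed(2)
df   <- data.frame(age = rnorm(200, 50, 10))
df$z <- rbinom(200, 1, plogis((df$age - 50) / 10))
ps   <- fitted(glm(z ~ age, family = binomial, data = df))
df$w <- df$z / ps + (1 - df$z) / (1 - ps)  # ATE weights

# Long data frame: one row per variable x method, ready for ggplot2
plot_df <- data.frame(
  variable = "age",
  method   = c("unadjusted", "weighted"),
  smd      = c(w_smd(df$age, df$z, rep(1, 200)),  # unadjusted = weight of 1
               w_smd(df$age, df$z, df$w))
)
```

Running this, the weighted SMD for age comes out much closer to zero than the unadjusted one, which is exactly what the Love plot is there to show.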
Then I wanna add a line for each of those, and a point. I'm adding this h-line, a horizontal line, at 0.1, because that's that rule of thumb. So we're hoping to give it a quick eyeball and make sure that the weighted version has shifted below that 0.1 mark. And then I'm gonna flip the coordinates, so the x-axis is the y-axis and the y-axis is the x-axis, because I think it's easier to look at that way. Okay, so here's what this ends up looking like. In the red, we have the unadjusted model; this is before we did any adjustment. And you can see that the standardized mean difference across these different variables is pretty different between my two groups, in particular age. So age is quite different between the people who quit smoking and those who did not. But in the blue, I see my weighted analysis, and it's nicely all below my 0.1 rule of thumb, basically much closer to zero. So the differences between these groups are much, much smaller. This is a nice way to visualize that, and you wanna do something like this to get an idea of balance. So I'm gonna give you a couple minutes to do that. Actually, we only have three minutes left, so I'm not gonna give you a couple minutes to do that; I'm gonna move on, and we'll let you all try some of this at the very end if you still want to. Yeah, maybe we can jump to the outcome model at this point. And of course, if you need to go, we totally understand. We'll hang out after and answer questions. But if you do need to go, remember these materials are all here. The exercises are guided, so you can work through these last two or three (I think there are two or three more total) on your own. Feel free to email us, ask questions, post them on the meetup comment board, that kind of thing. We're happy about that. Yeah, or tweet at us, we're happy to answer. Or tweet at us, yeah.
I'm actually taking a Twitter break right now. Oh, okay. You can tweet at Lucy. Yeah. So briefly, the ECDF is basically looking at the whole distribution instead of just a summary measure like the mean, and so this is a nice way to look at pre- and post-weighting as well. There's some code here, which you've got available in your slide deck, so you can look at that. Okay, let's jump to the final outcome model, just so we can put it all together. This is actually a very short slide deck, because most of the heavy lifting is done before this. So, fitting the outcome model. Once we have the weights, the outcome model is actually quite simple. All you're gonna do is regress your outcome on your exposure, if your outcome is continuous. In this case, our outcome was change in weight, which is a continuous variable, so you can just use a basic linear model, lm. You set your data and then you set your weights here, and you can use tidy to pull the results out in a nice way. Now, this is gonna give you the correct point estimate, so that's great, but it is not gonna get you the correct confidence interval. And this is where the rsample package comes in; that's what Malcolm was mentioning at the beginning with bootstrapping. To do that, you're gonna create a function that runs your entire analysis on, basically, a sample of your data. This function is gonna have one argument, split. What that is: rsample has a way to basically split your data frame into a subset of it, or even just to resample from it, so it could be exactly the same size as your data frame, but with some of your observations resampled multiple times. So I'm gonna take my data frame and split it, and then this code here is all exactly what we've already done.
So here, this is my exact same propensity score model that I originally fit. If I were doing this with the quit-smoking data, I would replace this word exposure with qsmk, and then I'd have a tilde, and then age, sex, whatever I was putting into that model. The family is gonna be binomial, and my data is gonna be .df, because that's what I named my data frame up here. Then this next chunk of code is also exactly the same as the code we've done before: I'm gonna augment my .df data with my propensity score, and I'm gonna add my weights exactly as I did before. So this is just copying that same exact code. And then my outcome model is gonna be the exact code that I just showed you on the previous slide: my outcome tilde exposure. The data is gonna be .df, this data frame that we've augmented here, and my weights are gonna be the weights that I've calculated. Then I'm gonna use the tidy function from the broom package to tidy it up and pull out the right estimate. Okay, so that's the function we're gonna create. And how do we actually bootstrap this? Well, there's a function called bootstraps. You can plug your original data frame into this bootstraps function, and you can say how many samples you'd like; in this case, I'm gonna do a thousand. Then I'm going to map the splits from these bootstraps into this fit_ipw function. That fit_ipw function is the whole function that runs my entire analysis, so what I'm doing is effectively running this analysis a thousand times. Then I'm gonna take those results, which I've saved as ipw_results, and I'm using int_t, because I wanna get a t-statistic-based confidence interval, and I'm going to pipe that and just pull out my exposure term. Okay, so on this final one: we're only two minutes over, that's not too bad.
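The workshop does this with rsample's bootstraps() and int_t(); as a rough base-R sketch of the same idea, on simulated toy data with illustrative names, and a normal-approximation interval rather than a t-based one:

```r
# Whole-analysis function: propensity model, ATE weights, weighted outcome model
fit_ipw <- function(df) {
  ps <- fitted(glm(z ~ age, family = binomial, data = df))  # propensity scores
  w  <- df$z / ps + (1 - df$z) / (1 - ps)                   # ATE weights
  coef(lm(y ~ z, data = df, weights = w))[["z"]]            # weighted estimate
}

# Toy data with a true exposure effect of 2 and confounding by age
set.seed(3)
n    <- 500
df   <- data.frame(age = rnorm(n, 50, 10))
df$z <- rbinom(n, 1, plogis((df$age - 50) / 10))
df$y <- 2 * df$z + 0.1 * df$age + rnorm(n)

# Resample rows with replacement and rerun the entire analysis each time
boots <- replicate(200, fit_ipw(df[sample(n, replace = TRUE), ]))

est <- fit_ipw(df)
se  <- sd(boots)
c(estimate = est, lower = est - 1.96 * se, upper = est + 1.96 * se)
```

The key point the speakers make is captured here: the propensity model is refit inside every bootstrap replicate, so the interval reflects the uncertainty of the whole pipeline, not just the final outcome model.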
So in this final one, basically, this just walks you through that exact same process, but plugging in the pieces that you have already done so far. And so what I think we're gonna do now is take a couple of minutes for folks to ask questions, and then if you wanna work on these examples from five and six on your own and send us any questions that you might have, we'd be happy to answer those. I'm gonna check the chat, see if we have any questions. So, sorry, Terrence, I actually missed your question twice. Terrence, I guess, didn't hear the answer to the question, which is a great question, about the difference between the propensity score and propensity score weighting. So maybe you could re-summarize that. Oh yeah, yes. So the propensity score is the probability of getting the exposure, or the probability of getting the treatment. That's always gonna be fixed; that's just the probability that you get out of the model. Usually it's a logistic regression model, although you can use a different type of model to estimate it: any model that can estimate a probability can estimate a propensity score. So that's what the propensity score is. Propensity score weighting is basically a function of the propensity score, and it's one way to incorporate the propensity score into a final outcome model to be able to get a causal estimate. So hopefully that helps. We've talked about this several times; I'm gonna let you answer this one. Okay, so it looks like Jaylee has a question about when we do propensity score matching, should we also be using the bootstrap? Yeah, I'm not gonna talk about this a lot. I will say that currently, in the literature, most people who do propensity score matching do not bootstrap.
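That distinction can be made concrete in a couple of lines. A toy base-R sketch, with illustrative names: the propensity score is just the fitted probability of exposure, and the weight is a function of it.

```r
# Toy data: exposure z depends on age
set.seed(4)
df   <- data.frame(age = rnorm(100, 50, 10))
df$z <- rbinom(100, 1, plogis((df$age - 50) / 10))

# The propensity score: fitted probability of exposure given confounders
ps <- fitted(glm(z ~ age, family = binomial, data = df))

# Propensity score *weighting*: one particular function of that score
w_ate <- df$z / ps + (1 - df$z) / (1 - ps)

range(ps)  # probabilities, so always inside (0, 1)
```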
They basically get their matched sample, and then once that matched sample has been achieved, they treat it as a fixed population and do their final analysis just on that population, getting the confidence intervals as you would in a normal setting. Now, I actually think there are circumstances where that will not get you quite the right confidence interval, and so best practice would be to wrap the whole process and bootstrap it. But it is more complicated, because if you have a matching algorithm, for example, that's not always gonna give you the same matched person, you can end up estimating effects in slightly different populations, and then you could end up with kind of a weird result. So it's somewhat of an open question. But the real answer, I guess, is that no, people usually do not bootstrap when they're matching. And I feel like there's a paper by Peter Austin that suggested you really don't have to, that the variance is okay when you do matching. Is that right? Am I remembering that correctly, Malcolm? That sounds right. Yeah, all right. By the way, that's actually a good suggestion if you're interested in the more practical side: Peter Austin has written a lot of articles about these causal inference techniques, about practical questions of using them day-to-day in your work, so that body of literature is really, really informative for this type of thing. Yeah. And matching is also not as obviously distorting your associations as weighting, because weighting is upweighting people, right? They count more toward the analysis, and so it inherently distorts the associations between those people. It's just happening more visibly in the weights.
Although with matching, see, this is something I sometimes push back on, because with matching it's almost more extreme: you're downweighting people to zero, or not. Yeah. That's what matching does: basically, anybody who doesn't get matched gets downweighted to zero. And if you do an algorithm that's a one-to-many match, then you end up upweighting some people to five or six or whatever your "many" is. Yeah. So I don't know; I don't think it's as simple as it's sometimes made out to be. Yeah, that's where I lean as well. Let's see, are there any other questions from the audience? Well, thank you all for joining us. I'm sorry we ran a little bit... oh, I see "this is really useful, can't wait to try it." Oh, great. Oh, a long question. Okay, we'll wait for this next question, but thank you all for joining us. This was really enjoyable. Hopefully we'll be able to offer it again, maybe a little bit longer, or with shorter examples or something. Yeah, and please really, really think of yourselves as collaborators with us on that. If you have suggestions on anything you saw, treat it like open-source code; this is open-source teaching. Help us by contributing, whether it's suggestions or whatever. If there was anything you found confusing, or you found a typo or whatever, we'd love to know about it, because it helps future students. So help us help them by doing that, and collaborate with us on improving this course. Yep, thank you so much. Somebody asked about a book. I think Malcolm mentioned a couple of books right at the beginning. Miguel Hernán has a book that's really good. Malcolm and I are working on a book that hopefully will be good; maybe it's too soon to say that. Yeah, we're in the early stages of working on it. It's more about this question of doing causal inference in R, which is nice, because that resource doesn't currently exist.
Yeah, and the Causal Inference book was a big one for me, but it's not afraid to dive into the technical stuff. So you may like that if you come from a statistics background and you wanna see the formulas and that sort of thing. But I think, as far as those books go, it strikes a relatively nice balance with that stuff. Even though I'm not a biostatistician, so I wasn't diving so much into the technical formulas, I find it very useful; I refer to it often. That's really my go-to. Yeah, so there's a comment that Hernán might not be basic and that Pearl might have an intro book. I actually find Pearl's writing a little bit hard. It's very conceptual, but I find it hard, from an analytic perspective, to get to an understanding sometimes. So I do think there's a gap in the literature for a very basic explanation of some of this stuff. It's something, actually, that Ellie and I talk about on Casual Inference; I think we'll have that episode out this week. One thing that's tough in the causal inference world is that, up until now, most of the training has been in this mentor-mentee, apprenticeship kind of model, where basically you learn it because, as a PhD student, your advisor is someone who's steeped in it. My advisor's advisor was Rosenbaum, and his advisor was Rubin, and my framework was only that framework; I only knew about causal people in that framework. And I learned all the basics because there wasn't even a class in my department at the time. I just did an independent study with my professor and learned these things through this one-on-one apprenticeship. I think it's only very recently that that's no longer the case, but we haven't very efficiently written down the in-between parts. And so, yeah. I am not from a situation like that; that was really not the case for me.
Although to some degree I have a little bit of a lineage, going back to Sander Greenland, who's an epidemiologist very interested in causal inference. But in a lot of ways, I feel like I'm stealing sugar from the castle, like I'm just a peasant who's snuck on in, because it has been that way, really. But luckily it's opening up; it's getting better. It's great, yeah. So we're hoping... I do think there's a lot of room for this kind of introductory material to get written down in a really clear way, because there's a lot of high-level stuff out there. What I'm noticing in the causal space is that when people try to enter it from a different background, like in machine learning, where a lot of people are trying to get into causal work, they'll sometimes make mistakes in their mathematical assumptions, and the causal people jump all over them. But there wasn't anywhere to learn those assumptions; we all just sort of knew them because we were steeped in it, but they're not actually written down anywhere. So this is a real problem. Anyway. Okay, there's a question: are DAGs by themselves causal? So this is, I think, a conceptual thing. DAGs describe assumptions; they're basically a visual depiction of assumptions. And if those assumptions are true, and you estimate your effects based on what you've drawn in your DAG, the effect you estimate will be causal. So the DAG itself, I wouldn't say it's causal, but it is describing causal relationships: you're drawing relationships that you're assuming to be true and assuming to be causal. And these are not testable assumptions.
And so basically the thing the DAG is good for, I think, is that it gets you to write down those explicit assumptions, so you can say: I fit this model because I believe this is the underlying relationship between all of these variables. Does that answer the question? You're welcome. Yeah, I think that's a great answer, yeah. Yes, so: if we were using these techniques in the wild, would you advise using the ATM as the default target estimand? I like the ATM a lot as a target estimand, but it's not always the appropriate estimand of interest. It does depend on your audience, but for the most part, I think the ATM is a very good target estimand. And what was the next question? Is there a way to derive how many bootstrap replications should be enough? Yes. Basically, you can bootstrap your bootstrap if you want to; you can do a double bootstrap and look at the variation between your samples, like in your confidence intervals, and when your variation is sufficiently small, you can say you've had enough. Now, in practice, what I usually do is basically a leave-one-out kind of situation, where I'll run my model the number of times that I have samples, basically resampling from my same data set that many times. But you can check, by doing this double bootstrap, that your estimates aren't still wiggling. You know, one nice thing about the rsample package is that it actually gives you a little bit of advice when you try to get the confidence intervals, about whether or not you may have enough samples. If you have too few, it will warn you: we don't know for sure, but this may be too few bootstrap samples to get this confidence interval. Oh, that's nice, it warns you automatically. Yeah, yeah, very nice. I bumped into that because, guess what? I didn't do enough. That's awesome.
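The stability check being described can be sketched with a toy base-R example (not the workshop's code): recompute the bootstrap standard error at increasing replication counts and watch it settle.

```r
# Toy data: the quantity we bootstrap is the standard error of the mean
set.seed(5)
x <- rnorm(200, mean = 10)

# Bootstrap standard error using n_boot resamples
boot_se <- function(n_boot) {
  sd(replicate(n_boot, mean(sample(x, replace = TRUE))))
}

# Rerun at increasing sizes; when successive values barely move, you have enough
sapply(c(100, 500, 2000), boot_se)
```

For this simple statistic, the values converge toward the analytic answer, sd(x) / sqrt(200); for an IPW analysis the same idea applies, just with the whole pipeline inside the resampling loop.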
Somebody asked if our book has a URL. Unfortunately, it does not. We can share it, I guess, if we ever get one together. And then someone else asked if I could share how I calculated the weights column, so I thought I would just show that code again real quick. So this is the code for the weights, for example the w_ate weights. I just use mutate to add the column, I'm calling it w_ate, and then I use the formula for whatever weight you're using. In this case, because I'm using an ATE weight, the formula is the indicator for whether or not you're in the treatment group, divided by the propensity score, plus one minus that indicator, divided by one minus the propensity score. And so when I run that... In practice, that ends up being the same as what I had. In my earlier code, I had an ifelse as a wrapper, and that ends up being the same, right? It's dependent on the exposure. Yeah, exactly. Yeah, I think you maybe did it like: if qsmk equals zero, then it's one over one minus fitted, and otherwise it's one over fitted. Right, right, right. And as you said, one of those parts drops out. Exactly. So then we can just prove to ourselves that these things are the same. But do these ever come out not equal? Never. Good, that's good news. That's good news. It looks like the questions are kind of slowing down. So, I'm on Twitter at LucyStats; here, I can send that in the chat in case people want it. That's a good way to reach me if you have further questions, or if you're working through some of these examples and have more that you're interested in. Malcolm, what is the best way... oh, there we go. And Malcolm is taking a break from Twitter. Just until the end of summer; I've got a few busy things that are distracting. Unfortunately, I don't have the brain space for Twitter right now. No, that's totally...
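The equivalence being discussed is easy to verify in base R (illustrative names again): because the exposure indicator is 0 or 1, one term of the ATE formula always drops out, so it matches the ifelse version exactly.

```r
ps <- c(0.2, 0.6, 0.8)  # illustrative propensity scores
z  <- c(0, 1, 1)        # exposure indicator

# The single-formula version
w_formula <- z / ps + (1 - z) / (1 - ps)

# The ifelse version
w_ifelse <- ifelse(z == 0, 1 / (1 - ps), 1 / ps)

all.equal(w_formula, w_ifelse)  # TRUE
```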
Yeah, and I will say, actually, that this particular group is probably a select group that's a little more involved in the Twitter #rstats sphere than most people, so you probably already know this. But just as Twitter is a great place to learn R, it's also a great place to talk directly with the experts on causal inference. I've learned so, so, so much being on Twitter and interacting with the causal inference community. Yeah, me too. Well, thanks everyone. Thank you to R-Ladies LA for organizing; this was great. I'm really grateful that we could do this. Do our hosts have anything they wanna address? Any meetups coming up, or anything they wanna announce? We don't have our August meetup scheduled yet, but look forward to something from us. And we wanna thank Andrew again for doing the closed captioning; I hope that was helpful for anyone who was using it. We are still recording, but I'm gonna shut that off shortly, and I will let everyone know when the video is hosted by the useR! 2020 team. So if you missed the first 20 minutes or something, you can always watch it again. Great, thank you. Thanks so much everyone. Bye. Bye.