The other videos that I have recorded provide lots of detail about multi-level models and their use in research applications. In this video, I will make a couple of important points to conclude the discussion on multi-level models.
The first thing about multi-level models is that they are applied to data where you have multiple levels. So we have level one observations typically nested in level two units, which are typically nested in level three units. We could also have crossed random effects, so that, for example, year observations would belong to both firms and industries: each firm would have multiple year observations, each industry would have multiple year observations, and there is no hierarchical relation between firms and industries. For example, if level one was investments, level two startups, and level three venture capitalists, then that kind of crossed structure would be useful, because startups are not nested within a single venture capitalist.
In the nested structure here, the multi-level model can be expressed as a mixed model. So we have the fixed part and the random part. The random part is the beef in multi-level modelling. One important thing that you need to understand is that if you are only interested in the fixed part, you don't actually need to use multi-level modelling, and using it could actually be undesirable. The reason is that if you are only interested in the fixed part, you can just use cluster-robust standard errors. In a normal regression model, the assumptions about the error term basically affect how the standard errors are calculated, and cluster-robust standard errors are consistent for any structure of the random part within clusters. So as long as you have a sufficient sample size, which here means enough clusters, cluster-robust standard errors don't care what kind of structure you have here. You can of course use cluster-robust standard errors in a multi-level model as well.
In multi-level modelling, on the other hand, the correctness of the standard errors, and in some cases of the estimates as well, depends on the random part being specified correctly. So multi-level modelling imposes quite a lot of structure on the error terms, whereas cluster-robust standard errors don't assume any structure for these errors. So if you are only interested in the average effect of year, the fixed part gamma 00 of the effect of year, then you don't actually need the random part, and you can get more robust results more easily by not using a multi-level model.
There's a nice paper by McNeish in Psychological Methods, where he goes through applications of multi-level modelling in psychology and concludes that most studies that applied multi-level models did so without a real need for them, because they were not looking at the variance components in the random part. They just wanted to control for unobserved effects, and the interest of the study was in the fixed effects. In that case, cluster-robust standard errors do the job, in some cases better, and at least more easily.
If you want to use multi-level modelling, then your interest should really be in the variance components. In proper applications of multi-level models, you see tables that contain two sections, one for the fixed effects coefficients and one for the random effects, and the random effects are actually interpreted. For example, the Hausknecht paper focused on how much trends in absenteeism vary between individuals and between work groups, and on what level the variation is. Are there trends that differ between work groups, or is the trend the same for all work groups? That was one of their research questions.
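For reference, the fixed part and the random part discussed above can be written out for a simple two-level case, with yearly observations t nested in units i. This is only a sketch in conventional notation: the symbols and subscripts are assumptions and may not match the ones on the slide. In this labelling, the average effect of year is gamma 10, and the tau and sigma terms are the variance components.

```latex
\begin{align*}
y_{ti} &= \underbrace{\gamma_{00} + \gamma_{10}\,\mathrm{year}_{ti}}_{\text{fixed part}}
        + \underbrace{u_{0i} + u_{1i}\,\mathrm{year}_{ti} + e_{ti}}_{\text{random part}} \\
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix}
       &\sim N\!\left(
          \begin{pmatrix} 0 \\ 0 \end{pmatrix},
          \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
        \right),
\qquad e_{ti} \sim N(0, \sigma^2)
\end{align*}
```

A question such as "do trends differ between units?" is then a question about the slope variance, tau 11. Cluster-robust standard errors, in contrast, leave the whole random part unspecified and only report on the gammas.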
So if this is not of interest to you, or if you don't know what this means, then you probably should not be using multi-level modelling; just use cluster-robust standard errors and focus on the fixed part.
Multi-level modelling is used like regression analysis. The interpretation is the same, and random slopes are interpreted like moderation. Then you have spaghetti plots that you can apply. Diagnostics are basically the same: you have plots of residuals and predicted random effects, added variable plots can be done, and various influence plots can be done. The difference is that some of these diagnostics are slower to calculate, because they require re-estimating the model many, many times, or they require predicting random effects, which can sometimes be slow. Another thing to know is that the software support for these diagnostics is not as developed as for normal regression models. So for example, if you use Stata, you can't just type rvfplot after a multi-level model; you have to do the predictions yourself using predict and then construct the plots manually. But once you have done that, you can reuse the code in different papers, so it's not that big of a deal. Not all plotting can be automated, as I just said. I think that is something that prevents people from applying diagnostic plots or visualizations: they really need to think through what the random effects are, how they are related, and how to actually do the plot, instead of pushing a button or using an Excel sheet that they found online.
There are two important takeaways. The first is that all longitudinal data sets are also multi-level data sets. When I teach, I sometimes hear econometrics-focused people telling me that they don't need to care about multi-level data because they only work with panel data.
Well, panel data is a special case of multi-level data, and understanding multi-level modelling will allow you to expand your toolbox so that you are not limited to the econometrics techniques. As documented by the McNeish paper that I cited before, there tends to be a trend that people or disciplines focus on either multi-level modelling or panel data econometrics and are largely unaware of the other tradition. So if you know both, you have a larger toolbox.
The second takeaway is the most important question to ask if you are thinking about applying a multi-level model, for example with lme4 in R or mixed in Stata: do you actually need a multi-level model, or would a regression model with cluster-robust standard errors do? Quite often the answer is that yes, regression would actually do the job.
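To make the regression alternative a bit more concrete, here is a minimal sketch of what cluster-robust standard errors compute for a regression with one predictor. It is written in plain Python so the arithmetic is visible; it is not how Stata or R implement it. The function name `ols_cluster_robust` and the example data are hypothetical, and the finite-sample adjustment is an assumption, since software differs in the details.

```python
import math
from collections import defaultdict


def ols_cluster_robust(x, y, cluster):
    """Regress y on an intercept and x; return (b0, b1, se_b0, se_b1).

    Standard errors use the cluster-robust sandwich estimator.
    Hypothetical helper, for illustration only.
    """
    n, k = len(y), 2
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))

    # Inverse of X'X for the design matrix X = [1, x] (the "bread").
    det = n * sxx - sx * sx
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]

    # OLS coefficients: (X'X)^-1 X'y.
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy

    # Sum the scores x_i * e_i within each cluster. No structure is
    # assumed for how residuals correlate inside a cluster.
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, g in zip(x, y, cluster):
        e = yi - b0 - b1 * xi
        scores[g][0] += e
        scores[g][1] += xi * e

    n_clusters = len(scores)
    if n_clusters < 2:
        raise ValueError("need at least two clusters")

    # Diagonal of bread * (sum over clusters of s s') * bread,
    # accumulated as squares so it can never go negative.
    var = [0.0, 0.0]
    for s in scores.values():
        for i in range(2):
            var[i] += (bread[i][0] * s[0] + bread[i][1] * s[1]) ** 2

    # Assumed finite-sample adjustment: G/(G-1) * (N-1)/(N-K).
    adj = n_clusters / (n_clusters - 1) * (n - 1) / (n - k)
    return b0, b1, math.sqrt(adj * var[0]), math.sqrt(adj * var[1])


if __name__ == "__main__":
    # Made-up data: two yearly observations for each of three firms.
    year = [0, 1, 0, 1, 0, 1]
    outcome = [1.0, 2.5, 0.5, 1.0, 2.0, 4.0]
    firm = ["a", "a", "b", "b", "c", "c"]
    print(ols_cluster_robust(year, outcome, firm))
```

Note that nothing in the calculation specifies random intercepts, random slopes, or their variances. That is why this route is more robust when only the fixed part matters, and also why it tells you nothing about the variance components.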