Random slope models are some of the more interesting applications of multilevel modeling. These kinds of models also pretty much require the use of multilevel modeling software. To understand random slope models, let's first take a look at random intercept models.

The idea of a random intercept model, which I cover in another video, is that the intercept varies between firms (level two) and between industries (level three). So instead of having one intercept, we have a different intercept for each firm within each industry, defined by the random effects u00k and r0jk in this example. Importantly, the effect of year here is simply a fixed effect: there are no random effects associated with the coefficient of year, and all the random effects affect only the intercept of the model.

When we expand this into a full random slope model, we add random effects for the slopes as well. We add r1jk, the random slope on the firm level, and u10k, the random slope on the industry level, and we still have the fixed effect of year.

When we write this out as a mixed model, we can see that the random part gets a bit more complicated. Instead of simply capturing the effect of clustering, which makes the observations non-independent, the random part now involves year: because r1jk multiplies year, the variance of the error term depends on year. A random slope model can therefore also be understood as a model that allows a certain kind of heteroscedasticity. That interpretation may not be useful for estimating the model, but it helps you understand that the variance of the error term is no longer constant; it varies as a function of year.
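As a concrete illustration, here is a minimal sketch of fitting a two-level random intercept + random slope model with Python's statsmodels. The data are simulated, and every name ('y', 'year', 'firm') and number is illustrative, not from the slides.

```python
# A minimal sketch of a random intercept + random slope model, fitted
# with statsmodels on simulated firm-year data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_firms, n_years = 50, 6
firm = np.repeat(np.arange(n_firms), n_years)
year = np.tile(np.arange(n_years), n_firms)
u0 = rng.normal(0, 1.0, n_firms)   # random intercepts, one per firm
u1 = rng.normal(0, 0.3, n_firms)   # random slopes, one per firm
y = (2.0 + 0.5 * year              # fixed part: intercept + year effect
     + u0[firm] + u1[firm] * year  # random part: note it multiplies year
     + rng.normal(0, 0.5, firm.size))
df = pd.DataFrame({"y": y, "year": year, "firm": firm})

# re_formula="~year" adds a random effect to both the intercept and the
# slope of year; statsmodels lets the two covary by default.
model = smf.mixedlm("y ~ year", df, groups=df["firm"], re_formula="~year")
result = model.fit(reml=True)
print(result.summary())
```

The summary shows the fixed effects plus the estimated variance components, including the intercept-slope covariance.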
So if one case goes up and another case goes down over time, the variation increases, and that variation must be accounted for by the random part, which is assumed to be uncorrelated with the fixed part, as we always assume in this kind of model.

Let's take a look at an empirical example to understand how this kind of model is reported and how we can visualize the modeling results. Our empirical example is Hausknecht's article, which uses a two-level model. Their data have more levels, but they decided that the other levels are not relevant. So they have the individual level and the work-unit level, and they model the effects of time. Their interest is in whether absenteeism depends on the individual level or the work-unit level, and particularly in how the trend develops at the individual level and the work-unit level.

Typically, when you do random slope models, you start with a random intercept model. Quite often with multilevel modeling you take this bottom-up approach: you start with a simpler model, you add things to the model, and you perform likelihood ratio tests to check whether the things you add explain the data better than what could be attributed to chance; only if yes are those additional things required in the model. So here we have a model with random intercepts, and then a model with random intercepts and slopes, and we can compare the models. We can see that the second model fits a lot better than the first one, because its deviance statistic is substantially smaller.

How do you then interpret this kind of model? You can look at the fixed effect of the slope, which gives you the average slope, but to understand how the work units vary, you would look at the slope variance component, and that is difficult to interpret directly. Again, a simple way to interpret this kind of model is plotting.
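Before moving to plots, the bottom-up likelihood ratio comparison described above can be sketched as follows, again on simulated data with illustrative names. The models are fitted with ML (reml=False) so that their likelihoods are comparable.

```python
# A sketch of the bottom-up comparison: fit a random intercept model,
# add the random slope, and test the addition with a likelihood ratio test.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(7)
n_units, n_obs = 40, 8
unit = np.repeat(np.arange(n_units), n_obs)
year = np.tile(np.arange(n_obs), n_units)
u0 = rng.normal(0, 1.0, n_units)
u1 = rng.normal(0, 0.4, n_units)  # true slope heterogeneity
y = (1.0 + 0.3 * year + u0[unit] + u1[unit] * year
     + rng.normal(0, 0.5, unit.size))
df = pd.DataFrame({"y": y, "year": year, "unit": unit})

m_int = smf.mixedlm("y ~ year", df, groups=df["unit"]).fit(reml=False)
m_slope = smf.mixedlm("y ~ year", df, groups=df["unit"],
                      re_formula="~year").fit(reml=False)

# The slope model adds two parameters: the slope variance and the
# intercept-slope covariance. A plain chi-squared reference with df=2
# is slightly conservative here, because a variance parameter sits on
# the boundary of its parameter space under the null.
lr = 2 * (m_slope.llf - m_int.llf)
p = stats.chi2.sf(lr, df=2)
print(f"LR = {lr:.1f}, p = {p:.3g}")
```

A small p-value means the random slope explains the data beyond what could be attributed to chance, so it is retained.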
We can use our computer to predict, or estimate, the values of the random effects, and then we can draw a different line for each work group; this is called a spaghetti plot of random slopes. In this kind of plot, the bold line is what the fixed part predicts, and then we calculate the predicted line for each unit based on its estimated random effect values, and we get this kind of spaghetti. We can see that all the lines go in roughly the same direction. There is some variation, some extremes, but generally they all change in the same direction; there is some difference, but nothing really extreme, and that is expected because the variance of those slopes was rather small.

Now, an important question: each of these lines has a unique slope and a unique intercept, so how are the slope and the intercept related? Let's assume that the grand line, the bold line, goes through the middle of the data; that is the fixed-effects line. If we change the slope of a unit's line, the line must still go through that unit's data. So if we increase the slope, one end of the line goes up and the other end goes down, and the intercept decreases. In practice, when the slope increases, the intercept will go either down or up depending on whether the data are located on the positive or the negative side of zero, but the intercept and slope of a regression line are typically correlated. And now the question is: does this model include that correlation? The authors don't report the correlation, but they actually do estimate the correlation between these two random effects, the random intercept and the random slope. So they actually estimate four variance components.
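The spaghetti plot described above, and the estimated intercept-slope correlation, can both be sketched as follows. The data are simulated and all names and numbers are illustrative, not from the paper.

```python
# A sketch of a spaghetti plot of random slopes, plus the estimated
# intercept-slope correlation from the random effects covariance matrix.
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; just writes the file
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_units, n_obs = 30, 6
unit = np.repeat(np.arange(n_units), n_obs)
year = np.tile(np.arange(n_obs), n_units)
y = (0.5 + rng.normal(0, 0.8, n_units)[unit]
     + (0.4 + rng.normal(0, 0.2, n_units)[unit]) * year
     + rng.normal(0, 0.3, unit.size))
df = pd.DataFrame({"y": y, "year": year, "unit": unit})

fit = smf.mixedlm("y ~ year", df, groups=df["unit"],
                  re_formula="~year").fit(reml=True)

b0, b1 = fit.fe_params["Intercept"], fit.fe_params["year"]
xs = np.array([0, n_obs - 1])
fig, ax = plt.subplots()
for ranef in fit.random_effects.values():
    # Each unit's line = fixed part + that unit's estimated random effects.
    ax.plot(xs, (b0 + ranef["Group"]) + (b1 + ranef["year"]) * xs,
            color="grey", linewidth=0.5)
ax.plot(xs, b0 + b1 * xs, color="black", linewidth=2)  # fixed-effect line
ax.set_xlabel("year")
ax.set_ylabel("predicted y")
fig.savefig("spaghetti.png")

# The intercept-slope correlation, from the estimated covariance matrix:
cov = fit.cov_re
corr = cov.loc["Group", "year"] / np.sqrt(cov.loc["Group", "Group"]
                                          * cov.loc["year", "year"])
print(f"estimated intercept-slope correlation: {corr:.2f}")
```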
They have the within-unit error variance at level one, and they have three variance components for level two: the slope variance, the intercept variance, and the covariance between slope and intercept. How do we know? We know by looking at the deviance statistic and the AIC statistic, because we can calculate the number of estimated parameters from them: it is half the difference between the AIC and the deviance. That gives four parameters for the first model and six for the second model, so they actually estimate the correct model; they just don't report the covariance between the intercept and slope for some reason.

That covariance should, of course, be reported for transparency, and also because the default setting in some statistical software is to constrain the random slopes and random intercepts to be uncorrelated, which doesn't really make sense for most scenarios. In fact, it is very difficult to think of a scenario where you would want to constrain the regression slope and regression intercept to be independent, but for some reason some software does constrain them. So for transparency, and to ensure that readers have full confidence that you have done your analysis correctly, these correlations should be reported in the table.

Otherwise, the table is a pretty good example of how to report multilevel modeling results. You report the fixed part and the random part, and then you interpret both parts, because if you were not interpreting the variance components, you probably wouldn't need a multilevel model anyway.

So, to summarize random slope models: they are used when we want to model a scenario where the effect of an observed variable can vary between clusters. They are useful when we have heterogeneity of an effect that is of theoretical interest.
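As an aside on the parameter count above: since AIC = deviance + 2k, where k is the number of estimated parameters, we have k = (AIC - deviance) / 2. A tiny sketch with made-up fit statistics:

```python
# Recovering the number of estimated parameters from reported fit
# statistics. The AIC and deviance values below are made up.
def n_parameters(aic: float, deviance: float) -> float:
    """Number of estimated parameters implied by AIC and deviance."""
    return (aic - deviance) / 2

# Hypothetical random intercept model: 2 fixed effects + intercept
# variance + residual variance = 4 parameters.
k_int = n_parameters(aic=1540.0, deviance=1532.0)
# Hypothetical random slope model: the same plus slope variance and
# intercept-slope covariance = 6 parameters.
k_slope = n_parameters(aic=1512.0, deviance=1500.0)
print(k_int, k_slope)  # 4.0 6.0
```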
So if you want to understand how much trends vary between units, then a random slope model is a useful technique for that kind of question, as demonstrated by the Hausknecht paper that I showed on the previous slide.

You can also think of these as moderation models where the moderator is not observed. The idea of a moderation model is that the regression coefficient of one variable depends on another observed variable. In a random slope model, we have a regression coefficient of an observed variable that depends on an unobserved variable. So this is kind of like a moderation model without the moderator being observed.

Also, you typically would not use this kind of model if you have a few clusters with a large number of observations per cluster. Let's say we have five countries and a thousand companies in each country: estimating a country-level random effect wouldn't make much sense. We could just run a separate regression model for each country, and that would give us better estimates than the random slope model. So this is for scenarios where you have a large number of groups but a small number of observations within each group.

It is important to remember that when you do this analysis, you allow the random effects that are on the same level to be correlated. You don't constrain the intercept and slope of a regression line to be uncorrelated; that doesn't make any sense. Then we have the random effects assumption: everything in the random part must be uncorrelated with the fixed part. If not, you have an endogeneity problem. If you are concerned that the random effects may be correlated with the fixed part of the model, you can always apply the correlated random effects approach. We discuss how to apply correlated random effects models to random slope models in this article, and the article also provides references to other literature that explains how that is done.
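The moderation analogy can be written out as a pair of equations (generic notation, not from the slides): with an observed moderator, the slope of x is a function of an observed variable m; in a random slope model, it is a function of an unobserved, cluster-level random effect.

```latex
% Moderation with an observed moderator m: the slope of x is
% (\beta_1 + \beta_3 m_i), a function of an observed variable.
y_i = \beta_0 + (\beta_1 + \beta_3 m_i)\, x_i + \beta_2 m_i + e_i
% Random slope model: the slope of x is (\beta_1 + u_{1j}), a function
% of an unobserved cluster-level random effect.
y_{ij} = \beta_0 + (\beta_1 + u_{1j})\, x_{ij} + u_{0j} + e_{ij}
```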