Random effects are latent variables, so their values are not calculated during estimation. However, after estimation, when we do diagnostics or visualize the results using graphics, calculating estimated values for the random effects can be very useful. For example, if we are studying companies and we have a random intercept, we know that the mean value of, let's say, performance, conditional on the observed values of the predictors, varies between companies. Sometimes we want to estimate the base level of performance for each company, for diagnostic purposes and plotting purposes. There are a couple of ways of doing this. This is more of a nice-to-know thing, but it is easy to get confused because of one particular term used in these prediction techniques. So assume we have this model: we have a random part here, with three random effects. It is a random intercept model with three levels, and we have the effect of year. How do we predict the values of the random effects u00k and r0jk? There are two main techniques that we normally apply. We know that in the population, based on the model assumptions, the means of all these random effects are zero. So a simple strategy to estimate the random effect r0jk, let's call it the firm random effect, is simply to take the residuals that the random part is estimated from and take the mean of the residuals for a company; that mean is the estimated value of the random effect for that company. This is called the maximum likelihood approach. You take each cluster, you take the mean of the residuals within that cluster, and that is the estimate of the random effect for that cluster. It is a specific score, and it is the best guess of the score for that cluster.
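The cluster-mean recipe above can be sketched with simulated data; the firm count, sample sizes, and variance values here are made up for illustration, not taken from the lecture's example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated two-level data: 5 firms, 20 observations each (hypothetical numbers).
n_firms, n_obs = 5, 20
u = rng.normal(0, 2.0, n_firms)                 # true firm random intercepts
firm = np.repeat(np.arange(n_firms), n_obs)     # firm id for each observation
y = 10 + u[firm] + rng.normal(0, 1.0, n_firms * n_obs)

# Residuals after removing the fixed part (here just the grand mean).
resid = y - y.mean()

# ML-style prediction of each firm's random effect:
# the mean residual within that cluster.
ml_pred = np.array([resid[firm == j].mean() for j in range(n_firms)])
print(ml_pred)
```

With balanced clusters these predictions average to zero by construction, mirroring the model assumption that the random effects have mean zero in the population.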
However, this technique has a problem: if we take the mean for a firm and then use the same data to take the mean for the industry, that will inflate the estimated variance of the sum of the industry effect and the firm effect, because, let's say, one extreme observation is counted twice. To account for this variance inflation, we have another technique called empirical Bayes prediction. This sometimes confuses people: they see the word Bayes, and while Bayes' theorem is indeed used in the calculation, they wonder whether this is actual Bayesian estimation, where they have to define priors and interpret a posterior distribution, and so on. The answer is no. It is called empirical Bayes because the technique is based on Bayes' idea, but it does not affect your model estimates in any way. The model estimates are what they are; this is just a way of calculating the predictions. So how does empirical Bayes work, and what is the idea? The idea is that the maximum likelihood estimates will have too much variation. If one company has very extreme performance, then that large residual will inflate both the estimate of the industry effect and the estimate of the firm effect for that firm; it is counted twice. I will not go into the math, but the idea of empirical Bayes prediction is that you take the distribution implied by the model, you take the actual observed distribution of the residuals, and then you shrink, or move, the distribution of the residuals towards the estimated distribution. That gives you estimates with more desirable statistical properties than the maximum likelihood estimates have. So if you want to characterize an individual case, an individual industry or an individual observation, then the maximum likelihood approach is the best.
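The shrinkage idea can be illustrated with the standard random-intercept shrinkage factor, lambda_j = sigma_u^2 / (sigma_u^2 + sigma_e^2 / n_j), which pulls each cluster-mean residual towards the population mean of zero. The variance values and cluster sizes below are hypothetical, chosen only to show the behavior.

```python
import numpy as np

# Hypothetical variance components from a fitted random-intercept model.
sigma_u2 = 4.0   # estimated random-intercept (between-cluster) variance
sigma_e2 = 9.0   # estimated residual (within-cluster) variance

n_j = np.array([2, 10, 50])        # cluster sizes
raw = np.array([3.0, 3.0, 3.0])    # identical ML (cluster-mean) predictions

# Shrinkage factor: the reliability of each cluster mean.
shrink = sigma_u2 / (sigma_u2 + sigma_e2 / n_j)
eb = shrink * raw
print(eb)  # small clusters are pulled hardest towards zero
```

The same raw value of 3.0 shrinks least for the large cluster and most for the small one: the less data a cluster contributes, the less we trust its own mean and the more we pull it towards the model-implied distribution.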
But normally we want to see how the observations behave relative to one another. For example, if we want to draw a spaghetti plot, plotting many observations at once, then inflating the variance of the estimates would be bad and empirical Bayes predictions would be better. In practice, in research applications, when we write a journal paper we are almost always interested in trends in the data rather than in individual observations. So empirical Bayes estimation of the random effects is superior for that purpose, and it is in fact the default in many statistical software packages. To get the ML estimates, you typically have to ask for them separately, but it is good to know that these two alternatives exist, and to verify from the documentation which one is the default before you use the prediction command with its default settings.
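As one concrete software example, Python's statsmodels exposes the empirical Bayes predictions of the random effects on the fitted mixed-model results object; this is a sketch with simulated data (the data and variable names are hypothetical, and only a random intercept is fitted).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Simulated firms with a random intercept (hypothetical data).
n_firms, n_obs = 8, 15
firm = np.repeat(np.arange(n_firms), n_obs)
u = rng.normal(0, 1.5, n_firms)
df = pd.DataFrame({
    "firm": firm,
    "y": 5 + u[firm] + rng.normal(0, 1.0, n_firms * n_obs),
})

# Random-intercept model grouped by firm.
model = smf.mixedlm("y ~ 1", df, groups=df["firm"])
result = model.fit()

# random_effects gives the model-based (empirical Bayes) predictions,
# one entry per firm -- not the raw cluster-mean residuals.
eb = result.random_effects
print(eb)
```

The point of the example is the lecture's closing advice in miniature: what the prediction attribute returns by default is the empirical Bayes version, so if you want raw cluster means you have to compute them yourself.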