REML estimation, or restricted maximum likelihood estimation (also called residual maximum likelihood estimation), is an estimation technique that is commonly used as an alternative to maximum likelihood estimation. This technique has some advantages and disadvantages. Let's take a look at what maximum likelihood estimation does in small samples when we try to estimate the variance components. I'm using Stata for this example.

So here we have an artificially generated data set with three observations of y: 0.9, about 0, and about 0.5, and we want to calculate the mean and standard deviation. We are particularly interested in the standard deviation, which in this case is 0.481. So what happens when we run a mixed model and get the variance component for the error term? When we run a mixed model of y and present the result in standard deviation metric, we can see that the estimated standard deviation is 0.39. But in our actual data the standard deviation was 0.48. So what's the deal here? Why is the maximum likelihood estimate of the variance smaller than the actual sample variance? This is the bias of maximum likelihood estimation.

The reason for the bias is that you can think of maximum likelihood estimation, conceptually, not mathematically, as a two-step procedure: we first estimate the mean, and then we estimate the variance assuming the mean is known. We are estimating a normal distribution here, so we have a mean and a standard deviation. The estimation problem of finding the mean is fairly straightforward, and ML gets that right, but estimating the standard deviation or variance is a bit more complicated. What we would really like to know is the expected value of the squared differences from the population mean; the expected squared difference from the population mean is the definition of variance. We are interested in knowing how much these observations vary from the population value, so that's the quantity of interest. What maximum likelihood estimation gives us instead is the expected squared distance from the estimated mean. So it kind of pulls the distribution toward the observations before estimating how far those observations are spread out.

So if, let's say, the population mean is zero and we observe five, six, and seven, then we would have lots of variation: we square five, six, and seven and take the average, which gives us about 37. If we take the maximum likelihood approach instead, we first calculate the mean, which is six, then take the differences from that mean, which are minus one, zero, and plus one, square them, and we get less than one. So the problem with maximum likelihood estimation is that it doesn't take into account the degree of freedom that we lose when we estimate the mean.

Fortunately, we have an alternative estimation technique called REML, or restricted maximum likelihood. Restricted maximum likelihood estimation basically corrects for this bias in the variance components of the maximum likelihood estimates. If we apply REML estimation here, we get the estimate 0.481, which is exactly our sample value. And the sample variance, calculated with the n minus 1 denominator, is an unbiased estimator of the population variance. So that's great: REML gives us less biased estimates than maximum likelihood.
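To make the example concrete, here is a minimal sketch of that Stata session. The three data values are my reconstruction from the rounded numbers quoted above, so the output should be close to, but not exactly, the figures in the transcript.

```stata
* Artificially generated data: three observations of y
* (values approximate, reconstructed from the transcript)
clear
input y
0.9
0.0
0.5
end

* Sample mean and standard deviation (n - 1 denominator)
summarize y

* Mixed model fit by maximum likelihood (the default);
* -stddeviations- reports the variance components in
* standard deviation metric
mixed y, stddeviations

* The same model fit by restricted maximum likelihood;
* the residual standard deviation now matches the sample value
mixed y, reml stddeviations
```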
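In formulas, for the single-mean case discussed here, the two estimators differ only in their denominators, and the ML estimator is biased downward by a factor of (n - 1)/n:

```latex
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2,
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2,
\qquad
\mathrm{E}\!\left[\hat{\sigma}^2_{\mathrm{ML}}\right] = \frac{n-1}{n}\,\sigma^2
```

With n = 3 this matches the numbers above: 0.481 times the square root of 2/3 is about 0.39, which is the ML standard deviation we saw. In a model with more fixed effects, REML corrects for all of the estimated coefficients, not just the mean.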
I will not go into the technicalities of REML estimation; it basically applies a slightly different variation of the normal likelihood formula to this problem. But what you need to understand is that restricted maximum likelihood estimation corrects for the degrees of freedom lost in estimating the mean before estimating the variance components. This has the advantage that REML estimates are less biased. They are not unbiased in small samples, but they are typically less biased than the maximum likelihood estimates, which are typically too small in small sample sizes. There is a disadvantage, though: when you apply REML estimation, the resulting likelihood values are not proper likelihoods, so you can't use REML estimates in likelihood ratio testing, at least not for comparing models that differ in their fixed effects.

So how people typically apply ML and REML is that you do the model testing sequence and model selection using ML, and then, when you switch to your final model where you interpret the variance components, you get the REML estimates. In practice, in large samples both of these techniques produce very similar results, but in small samples the differences can be substantial, and REML is recommended for the final set of estimates if you care about the variance components. As a general practice, it is a good idea to always run REML alongside ML for the final results: if the results are the same, you can go with ML; if they are not, the REML results should be trusted more for the variance components.
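As a sketch of that workflow in Stata (the predictor x, the grouping variable id, and the stored model names are hypothetical, just for illustration):

```stata
* Model selection under ML (the default for -mixed-),
* because likelihood ratio tests require proper likelihoods
mixed y || id:
estimates store m0

mixed y x || id:
estimates store m1

* LR test comparing the two candidate models
lrtest m0 m1

* Refit the chosen model with REML for the final,
* less biased variance-component estimates
mixed y x || id:, reml
```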