In this video I will explain quasi-maximum likelihood estimation and its basic principle. The term sounds a bit complicated, but the idea is very simple, and it has some very powerful implications that you need to understand. Let's take a look at the Bernoulli distribution. Here we have some observations, and the dependent variable consists of 1s and 0s. We have a model with some predictors and a link function that gives predicted probabilities between 0 and 1, and then we calculate the likelihoods. If the observed value is 1, the likelihood is the predicted probability; if the observed value is 0, the likelihood is 1 minus the predicted probability. So if the predicted probability is 0.1, then observing a 0 is much more likely than observing a 1, and the likelihood of a 0 is 0.9. When the computer calculates these likelihoods, it doesn't actually use this kind of if-construction. Instead it uses a single function: the predicted probability raised to the power of the dependent variable, times 1 minus the predicted probability raised to the power of 1 minus the dependent variable, that is, p^y × (1 − p)^(1 − y). The idea is that if the dependent variable is 1, the second exponent becomes 0, that term becomes 1, and only the predicted probability enters the likelihood; if the dependent variable is 0, only 1 minus the predicted probability enters the likelihood. The key insight in quasi-maximum likelihood estimation is that this equation actually works even if the dependent variable is not only 1s and 0s. We can calculate the result for observed values of 0.1 or 0.4, for example. These are not proper likelihoods, but the estimates obtained by maximizing the product of these likelihoods are consistent very generally. So this is a really nice idea, and it is called quasi-maximum likelihood estimation.
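As a quick illustration (my own sketch, not shown in the video; the function name is mine), the if-construction and the single formula can be written in Python. Working on the log scale, p^y × (1 − p)^(1 − y) becomes y·log(p) + (1 − y)·log(1 − p), which evaluates without any problem for fractional outcomes as well:

```python
import numpy as np

def bernoulli_loglik(y, p):
    """Bernoulli (quasi-)log-likelihood: y*log(p) + (1-y)*log(1-p).

    For y in {0, 1} this reproduces the if/else rule exactly; it is
    also well defined for fractional y, which is what quasi-maximum
    likelihood estimation exploits."""
    return y * np.log(p) + (1 - y) * np.log(1 - p)

p = 0.1  # predicted probability
# Binary outcomes: matches the if-construction.
print(np.exp(bernoulli_loglik(1, p)))  # 0.1 when the observed value is 1
print(np.exp(bernoulli_loglik(0, p)))  # 0.9 when the observed value is 0
# Fractional outcomes: not a proper likelihood, but still computable.
print(np.exp(bernoulli_loglik(0.4, p)))
```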
The key feature of quasi-maximum likelihood estimation is that certain important estimators — such as the Bernoulli quasi-maximum likelihood estimator, for dependent variables between 0 and 1, and the Poisson quasi-maximum likelihood estimator, which uses an exponential model for the expected value of the dependent variable — are consistent very generally, regardless of the distribution of the dependent variable. This is the same property that you have in least squares regression analysis: least squares provides consistent estimates of a linear model regardless of how the dependent variable is distributed. That's the reason why, when we have a linear model, we always start with least squares regression analysis, even though we could get somewhat more efficient estimates by using weighted least squares and so on. The same holds here. The Bernoulli quasi-maximum likelihood estimator is consistent; an alternative such as beta regression analysis for fractional outcomes could be more efficient, but beta regression would be inconsistent unless the distribution of the dependent variable is correctly specified. There are two important limitations to these quasi-maximum likelihood estimators — which in practice just means applying Poisson regression to variables that are not counts, or logistic regression to variables that are not only 1s and 0s but can also take values between 0 and 1. First, the conventional standard errors will be inconsistent if you do it that way. The upside is that we can always use robust standard errors, which will be consistent; they are less efficient, but they are consistent. Second, we can't use the likelihood ratio test for model comparisons, because the quasi-likelihood is not a proper likelihood. Overall this is very useful, because it gives us some very robust analysis techniques. If we have a large sample size, then our estimates are going to be precise no matter what, so efficiency is not as important.
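To make this concrete, here is a minimal sketch (my own construction, not from the video) of Poisson quasi-maximum likelihood with robust sandwich standard errors. The simulated outcome is continuous and positive, so it is clearly not Poisson-distributed, but its conditional mean is exp(Xβ), which is all the estimator needs to be consistent:

```python
import numpy as np

# Simulated data: continuous, positive outcome with conditional mean exp(X @ beta).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 1.0])
mu = np.exp(X @ beta_true)
y = mu * rng.lognormal(mean=-0.125, sigma=0.5, size=n)  # mean-one multiplicative error

# Poisson quasi-ML: solve the score equations X'(y - exp(X b)) = 0 by Newton's method.
beta = np.array([np.log(y.mean()), 0.0])  # simple starting value
for _ in range(100):
    mu_hat = np.exp(X @ beta)
    score = X.T @ (y - mu_hat)
    hess = X.T @ (X * mu_hat[:, None])  # negative Hessian of the quasi-log-likelihood
    step = np.linalg.solve(hess, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break

# Robust (sandwich) standard errors: consistent even though y is not Poisson,
# whereas the conventional Poisson standard errors would not be.
mu_hat = np.exp(X @ beta)
A_inv = np.linalg.inv(X.T @ (X * mu_hat[:, None]))
B = X.T @ (X * ((y - mu_hat) ** 2)[:, None])
robust_se = np.sqrt(np.diag(A_inv @ B @ A_inv))
print("estimates:", beta)        # close to beta_true despite the non-Poisson outcome
print("robust SE:", robust_se)
```

In practice you would get the same result from standard GLM software by requesting a Poisson family with a log link and robust (heteroskedasticity-consistent) standard errors.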
If we want robust estimates, then instead of trying to figure out whether the dependent variable is negative binomial, Poisson, or something else, we can just apply Poisson regression analysis and trust that the results are consistent. This has been applied in the literature, but before I go into that, there is a nice quote from Nick Ho's presentation at a data conference in 2010. He points out that not many people know that you can apply Poisson regression analysis to non-count variables. Then there is a somewhat tongue-in-cheek statement at the end of his presentation: if you apply Poisson quasi-maximum likelihood estimation, maybe you should call it "GLM with a log link" instead of Poisson quasi-maximum likelihood estimation, because otherwise your reviewers may tell you that you can't use Poisson for non-counts — which is exactly what you just did in your paper. These techniques are becoming more common, but this is still something of a new idea, so here is an example. In this paper from 2016, the authors say that they use quasi-maximum likelihood (QML) Poisson regression analysis, they explain that it can be used even if the dependent variable is not a count and that there is no problem with doing so, and they cite a paper from 1984 that provides a proof that that is actually the case. This is a pretty good explanation, and you can use it as an example of how to convince your reviewers that using a quasi-maximum likelihood estimator is a good idea.