In this video I will explain how the maximum likelihood estimation principle can be applied to estimate a logistic regression model. Our data are on girls: the dependent variable is whether a girl has had menarche or not, and the independent variable is age. We are fitting the logistic curve. We can see here that when a girl's age is close to 10, which is the minimum in the sample, the predicted probability of menarche is about zero. And when the girl is 18, which is about the maximum of the sample, the predicted probability of menarche is about one. We want to estimate this logistic curve, which tells us the relationship between age and menarche. We apply the probability calculations to values that are ones and zeros, because that is what the dependent variable contains. To do that we use the Bernoulli distribution. The idea of a Bernoulli distribution is that we only have ones and zeros. In this example the zeros are twice as prevalent as the ones. The population is always assumed to be very large in maximum likelihood estimation, so that when we take a sample of one value, a zero or a one, away from the population, the ratio of ones to zeros in the population stays the same. The probability of getting a zero from this population is 67% and the probability of getting a one is 33%. So when we have this set of observed values, our sample, with seven zeros and two ones, they happen to be in this order at random and the order doesn't have any significance. We calculate the individual probabilities, and then we calculate the total probability by multiplying all these individual probabilities together. So when we know what the population is, we know the probabilities of getting particular values from that population. In maximum likelihood estimation the population is not known; instead we have to estimate what the effect of age on menarche is in the population and what the base level is.
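As a minimal sketch of the probability calculation just described, the following Python snippet multiplies the individual Bernoulli probabilities for a sample of seven zeros and two ones. The function name `sample_probability` and the particular ordering of the sample are assumptions made for illustration; only the counts and the two-to-one ratio come from the example.

```python
from math import prod

def sample_probability(sample, p_one):
    """Probability of an observed 0/1 sample when every value is an
    independent Bernoulli draw with P(1) = p_one (hypothetical helper)."""
    return prod(p_one if y == 1 else 1.0 - p_one for y in sample)

# Seven zeros and two ones; where the ones sit is arbitrary.
sample = [0, 0, 1, 0, 0, 0, 1, 0, 0]

# Known population: zeros are twice as prevalent as ones, so P(1) = 1/3.
print(sample_probability(sample, 1 / 3))
```

Because the result is a plain product, reordering the sample does not change it, which is why the order of the observed values carries no meaning.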
And so we don't talk about probabilities, we talk about likelihoods. The idea of maximum likelihood estimation is that we try to find the population that has the maximum likelihood of having produced these values here. We don't know what the mean is or what the ratio of ones to zeros is; we only know the data, and we assume that the model holds for the population. Then we make some guess for this ratio, calculate the individual likelihoods, calculate the cumulative likelihood, and maximise the cumulative likelihood by changing our model parameters to find the maximum likelihood estimate. For example, we could guess that the ratio of ones to zeros is two to seven, which gives us probabilities of 78% and 22% for zeros and ones. We calculate the cumulative probabilities by multiplying everything together, and the result is the likelihood of the sample given our estimated population. The maximum likelihood estimate is simply found by changing our guess of the ratio of ones to zeros so that this value here becomes as large as possible. This principle is applied in logistic regression analysis. Using this logistic curve and the known ages and known menarche status of these girls, we calculate the individual likelihoods for the observations, and then we use those individual likelihoods to find the best possible logistic curve for the data. How it works in practice is that we start with some kind of guess. We guess that the probability of menarche is a linear function of age plus an intercept, transformed using the logistic function. Let's say that the intercept is minus 20 and the effect of age is 1.54. We apply the logistic function to the linear prediction, and that gives us the expected probabilities. Then we check how likely each particular observation is given the fitted probability. For example, the first girl here is 13.6 years old and she has had menarche.
The linear prediction for that girl using this equation here is 0.94. The fitted probability, obtained by applying the logistic function to this linear prediction, is 73.6 percent. So if the probability is 73.6 percent and the girl has had menarche, then the likelihood for that observation is 73.6 percent. Then we move on to the next girl. She is 11.4 years old and she has not had menarche. The linear prediction, calculated using this equation here, is minus 2.44, and applying the logistic function gives us an 8 percent predicted probability. Because it is only 8 percent probable that this girl would have had menarche given her age, and she hasn't, the likelihood for this observation is 1 minus 8 percent, which is 92 percent. We calculate the likelihood for all the girls in the same way, and that gives us the product, 6.4 percent. For computational reasons, we don't typically work with these raw likelihoods and multiply them together. Instead we work with logarithms. We calculate the logarithm of the likelihood, called the log likelihood, for each individual observation, and we take the sum of these log likelihoods, which gives us the full log likelihood of the sample. We adjust the values of the intercept and the coefficient for age to make this full-sample log likelihood as large as possible. In practice this is almost always a negative number, so we try to make it closer to zero, that is, a negative number of smaller magnitude.
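The per-observation calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the software used in the video: the helper names `logistic` and `log_likelihood` are hypothetical, only the two girls mentioned above are included, and the intercept (minus 20) and age coefficient (1.54) are the rounded guesses quoted in the walkthrough, so the printed values may differ slightly from the figures read out above.

```python
import math

def logistic(z):
    """Logistic function: maps a linear prediction to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(intercept, slope, ages, had_menarche):
    """Sum of the log likelihoods of the individual observations
    (hypothetical helper for this sketch)."""
    total = 0.0
    for age, y in zip(ages, had_menarche):
        p = logistic(intercept + slope * age)   # fitted probability of menarche
        total += math.log(p if y == 1 else 1.0 - p)
    return total

# Only the two girls described in the walkthrough (1 = has had menarche).
ages = [13.6, 11.4]
had_menarche = [1, 0]

# Guessed parameter values from the walkthrough (rounded).
intercept, slope = -20.0, 1.54

for age, y in zip(ages, had_menarche):
    linear = intercept + slope * age
    p = logistic(linear)
    likelihood = p if y == 1 else 1.0 - p
    print(f"age={age}: linear={linear:.2f}, fitted={p:.3f}, likelihood={likelihood:.3f}")

print("log likelihood:", log_likelihood(intercept, slope, ages, had_menarche))
```

In a full analysis, a function like `log_likelihood` would be evaluated over all girls in the sample and handed to a numerical optimiser that adjusts the intercept and the age coefficient until the sum is as close to zero as it can get.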