So, this is my second lecture on generalised linear models. Here is the content of this module: the exponential family of distributions, fitting generalised linear models, and logistic regression models. In a simple or multiple linear regression model we make several assumptions on the error term: the errors have mean zero and variance sigma square, and they are uncorrelated. We also assume that the errors epsilon follow a normal distribution with mean zero and variance sigma square. In the topic called transformation and weighting to correct model inadequacy, we learnt how to deal with the situation when the constant variance assumption or the uncorrelated errors assumption is violated. What we learn in this topic is how to deal with the situation when the normality assumption is violated, that is, when the response variable or the error term follows some distribution other than the normal. As I told you in the previous class, generalised linear model analysis comes into play when the error distribution is not normal. If the error distribution is not normal, then the distribution of the response variable is also not normal. But let me make one fact clear: the error distribution, which is the same as the distribution of the response variable, must be a member of the exponential family. As we learnt in the previous class, the normal distribution of course falls in the exponential family, and so do the binomial, Poisson, gamma, exponential and negative binomial distributions. So what I will do is repeat the fitting of generalised linear models once more, because this is very important.
Suppose we have a set of independent observations (y_i, x_i'), for i from 1 to n. Here y_i is the response variable and x_i' is a vector with p components, x_i' = (x_i1, x_i2, ..., x_ip), associated with the p regressor variables. The response variable is not from the normal distribution; it is from some exponential type distribution of canonical form. The marginal pdf of y_i is then of the form f(y_i; theta_i) = exp[ y_i b(theta_i) + c(theta_i) + d(y_i) ], and since the distribution is of canonical form, a(y) is equal to y. Because the observations are independent, you can find the joint pdf just by multiplying the marginal pdfs, so f(y_1, y_2, ..., y_n; theta, phi) is the product of the marginals for i from 1 to n. As I wrote in the previous class, this can be written as exp[ sum y_i b(theta_i) + sum c(theta_i) + sum d(y_i) ], where each sum is over i from 1 to n. Here theta is the vector of parameters of interest, theta = (theta_1, theta_2, ..., theta_n), so the i-th observation comes from an exponential type distribution with parameter theta_i, and phi is the vector of nuisance parameters. So next, what we want is this: we are given y_i and x_i, like in the previous cases.
So we are given the response variable and the values of the p regressors, and we want to explain the variability in y in terms of the x_i's. If y_i is from the normal distribution, then we know how to fit a model between y_i and x_i. The only problem here is that the response variable does not follow a normal distribution, so how do we fit an appropriate model between the response variable and the regressor variables? That is the main objective here. Now, the expectation of y_i is nothing but theta_i, and theta_1, theta_2, ..., theta_n can all be different. So we would hope that the variation in y_i, or in the theta_i values, could be explained in terms of the x_i values, and we would hope that we could find a suitable link function, say g(theta_i), such that the model g(theta_i) = x_i' beta holds. Let me just complete my writing here and then I will try to explain this part a little bit. Here beta = (beta_1, beta_2, ..., beta_p)' is the vector of regression coefficients, and the link function is often the natural parameter. I am sure you may face some difficulty in understanding this part, so let me try to explain it. In the usual case, when y_i is from the normal distribution, the model we fit is y_i = x_i' beta + epsilon. This is the simple or multiple linear regression model, and I can write it as expectation of y_i equal to x_i' beta, because the expectation of epsilon is equal to 0. So I can also write this model as theta_i = x_i' beta.
So finally the model is theta_i = x_i' beta when the response variable is from the normal distribution. Now I hope you can recall that the normal distribution is also in the exponential family, and the natural parameter for the normal distribution is theta, where theta is the mean. The link function is associated with this natural parameter. So the link function I am talking about here is g(theta), and g(theta) is equal to the natural parameter theta. When the distribution is normal, g(theta) is equal to theta, and that is why I fit the model theta = x_i' beta. Now take the other case: suppose y does not follow the normal distribution, and it follows some other distribution from the exponential family, say binomial. Then the natural parameter for the binomial is ln(theta_i / (1 - theta_i)), where theta_i is the probability of success in the i-th trial. This is the natural parameter for the binomial with parameters n and theta_i. So g(theta) is ln(theta / (1 - theta)), and in the case of binomial we go for the model g(theta_i) = x_i' beta, that is, ln(theta_i / (1 - theta_i)) = x_i' beta. You can write this in a compact form also: theta_i = exp(x_i' beta) / (1 + exp(x_i' beta)). So this is the model in the case of binomial, and the final model is nothing but expectation of y_i = exp(x_i' beta) / (1 + exp(x_i' beta)).
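The step from the link function to the compact form can be checked numerically. What follows is only a minimal sketch, assuming hypothetical coefficient and regressor values; the function names `logit` and `inverse_logit` are illustrative and not from the lecture.

```python
import math

def logit(p):
    """Logistic link: the binomial natural parameter ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def inverse_logit(eta):
    """Inverse of the link: exp(eta) / (1 + exp(eta))."""
    return math.exp(eta) / (1.0 + math.exp(eta))

# Hypothetical values, for illustration only (not from the lecture).
beta = [-1.0, 0.5]   # assumed regression coefficients
x_i = [1.0, 3.0]     # first entry is the dummy variable equal to 1

eta_i = sum(b * x for b, x in zip(beta, x_i))  # linear predictor x_i' beta
theta_i = inverse_logit(eta_i)                 # modelled probability of success
```

Applying `logit` to `theta_i` gives back `eta_i`, which is the statement g(theta_i) = x_i' beta read in the other direction.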
So that means y equal to this plus epsilon is the model when the response variable follows a binomial distribution. I will talk about this case in detail right now. So let me consider fitting a generalised linear model in the case of the binomial distribution. Let me write it clearly: suppose we have data (y_i, x_i') from a binomial distribution with parameters n_i and p_i. Instead of theta_i, I am writing p_i. So p_i is the parameter of interest and the n_i's are nuisance parameters. I have a set of observations from the binomial distribution, so how do we fit a model? I have already talked about this, but I will write it very clearly here. Now the single observation y_i is of the form r_i by n_i, where r_i is the number of successes in n_i trials, each having probability p_i of success. So here y_i is not really the number of successes in n_i trials; it is the proportion of successes. And x_i' is basically (x_i1, x_i2, ..., x_ip), a set of observations of the p regressors associated with y_i. We know that the binomial distribution is a member of the exponential family, and I will also give an example to illustrate this part. So what I will do first is write down the joint pdf, because from that we get the natural parameter and also the link function.
So the joint pdf of y_1, y_2, ..., y_n, let me write it as f(y_1, y_2, ..., y_n), is equal to the product over i from 1 to n of (n_i C y_i) p_i^y_i (1 - p_i)^(n_i - y_i). It is not difficult to verify that this can be written as the product over i from 1 to n of exp[ y_i ln(p_i / (1 - p_i)) + n_i ln(1 - p_i) + ln(n_i C y_i) ], and then finally you can write it as exp[ sum y_i ln(p_i / (1 - p_i)) + sum n_i ln(1 - p_i) + sum ln(n_i C y_i) ], where each sum is over i from 1 to n. So this is the joint pdf of the observations we have, which are from the binomial distribution. Now we want the same thing as before: we are given y_i and x_i', and we try to explain the variability in y in terms of x_i. So we would hope that the variation in the response variable y_i, or in the expectation of y_i, could be explained in terms of the x_i values. Generally the expectation is n p, but here we are again assuming that the observation y_i is the number of successes divided by n_i, so the expectation is basically p_i. That is, we would hope that we could find a suitable link function g such that g(p_i) = x_i' beta, and this link function is obtained from the natural parameter. For the binomial distribution the natural parameter is ln(p_i / (1 - p_i)), so this is basically g(p_i). That is why we fit the model ln(p_i / (1 - p_i)) = x_i' beta, and x_i' beta is nothing but beta_1 x_i1 + beta_2 x_i2 + ... + beta_p x_ip.
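The claim that the binomial probability can be rewritten in exponential form is easy to verify for a single observation. This is a small sketch under stated assumptions: the numbers are made up, and the argument `r` is treated as the number of successes in `n` trials, which is how the binomial coefficient in the formula reads.

```python
import math

def binomial_pmf(r, n, p):
    """Direct form: (n C r) p^r (1 - p)^(n - r)."""
    return math.comb(n, r) * p**r * (1.0 - p) ** (n - r)

def binomial_pmf_exponential_form(r, n, p):
    """Exponential form: exp[r ln(p/(1-p)) + n ln(1-p) + ln(n C r)]."""
    natural_parameter = math.log(p / (1.0 - p))
    return math.exp(
        r * natural_parameter + n * math.log(1.0 - p) + math.log(math.comb(n, r))
    )

# Hypothetical numbers, for illustration only.
direct = binomial_pmf(3, 10, 0.3)
via_exponential_form = binomial_pmf_exponential_form(3, 10, 0.3)
```

The two values agree up to floating point rounding, and the coefficient multiplying `r` inside the exponential is the natural parameter ln(p / (1 - p)).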
And this can finally be written, as I wrote before also, as p_i = exp(x_i' beta) / (1 + exp(x_i' beta)). This is the same as writing expectation of y_i equal to this expression, which is equivalent to saying that y_i is equal to this expression plus epsilon. So instead of fitting y_i = x_i' beta + epsilon, which is the case for the normal distribution, we are fitting the model y_i equal to this expression plus epsilon, and the expectation of y_i is this expression. So the model we got is finally p_i = exp(x_i' beta) / (1 + exp(x_i' beta)). Let me call it star. Now, when x_i' beta is equal to beta_1 + beta_2 x_i2, that means there is only one regressor, and the other one, x_i1, is of course the dummy variable, which is one for all observations. When this is the situation, star is called the logistic function. So we have the model with us now; this is the model we have to fit when the response variable is binomial: expectation of y_i, which is equal to p_i, is exp(x_i' beta) / (1 + exp(x_i' beta)). Then how do we fit the model? Fitting the model means estimating the regression coefficients beta_1, beta_2, ..., beta_p. To estimate beta we use the method of maximum likelihood.
So, first we construct the likelihood function L, which is nothing but the joint probability of y_1, y_2, ..., y_n. We computed it just now: L = exp[ sum y_i ln(p_i / (1 - p_i)) + sum n_i ln(1 - p_i) + sum ln(n_i C y_i) ], where each sum is over i from 1 to n. So this is the likelihood function, and it is convenient to work with the log likelihood, which is nothing but the log of the likelihood function. It simply becomes ln L = sum y_i ln(p_i / (1 - p_i)) + sum n_i ln(1 - p_i) + sum ln(n_i C y_i), with i from 1 to n in each sum. Now, ultimately we have to estimate the parameter beta, and the model we have to fit is p_i = exp(x_i' beta) / (1 + exp(x_i' beta)), so what we will do is write this log likelihood in terms of beta. The first term becomes sum y_i x_i' beta, because ln(p_i / (1 - p_i)) is equal to x_i' beta. In the second term, it is not difficult to check that ln(1 - p_i) is equal to minus ln(1 + exp(x_i' beta)), so you have to put a minus sign there. So ln L = sum y_i x_i' beta - sum n_i ln(1 + exp(x_i' beta)) + sum ln(n_i C y_i). So we have the log likelihood function in terms of beta now. How do we estimate beta? You maximise the log likelihood with respect to beta. Here beta is a vector with p components, beta_1, beta_2, ..., beta_p, so you differentiate the log likelihood with respect to beta_1, beta_2, up to beta_p.
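The rewriting of the log likelihood from p_i to beta can be checked numerically. This is only a sketch under stated assumptions: the counts, trial sizes, regressor rows and coefficients are hypothetical, and the response is entered as the number of successes `r_i` so that the binomial coefficient is well defined.

```python
import math

def log_lik_in_p(r, n, p):
    """Log likelihood written with p_i, as read off the joint pdf."""
    return sum(
        ri * math.log(pi / (1.0 - pi))
        + ni * math.log(1.0 - pi)
        + math.log(math.comb(ni, ri))
        for ri, ni, pi in zip(r, n, p)
    )

def log_lik_in_beta(r, n, X, beta):
    """Same log likelihood after substituting ln(p/(1-p)) = x' beta."""
    total = 0.0
    for ri, ni, xi in zip(r, n, X):
        eta = sum(b * x for b, x in zip(beta, xi))  # x_i' beta
        total += ri * eta - ni * math.log(1.0 + math.exp(eta))
        total += math.log(math.comb(ni, ri))
    return total

# Hypothetical data and coefficients, for illustration only.
r = [1, 4, 7]                                # successes
n = [10, 10, 10]                             # trials
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]     # first column is the dummy variable
beta = [-2.0, 0.8]

etas = [sum(b * x for b, x in zip(beta, xi)) for xi in X]
p = [math.exp(e) / (1.0 + math.exp(e)) for e in etas]

value_in_p = log_lik_in_p(r, n, p)
value_in_beta = log_lik_in_beta(r, n, X, beta)
```

Both expressions return the same number, which confirms the minus sign in front of the n_i ln(1 + exp(x_i' beta)) term.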
So, then you will get p equations and you have p unknowns, and then you can solve for beta_1, beta_2, up to beta_p. This is how you find the estimates of the regression coefficients beta, and it is not so easy to do this by hand for a given problem. So a numerical search method, or something called iteratively reweighted least squares, IRLS, could be used to compute the maximum likelihood estimates of beta.
Now, to explain the binomial case we just considered, I will give a numerical example. Here we have data called the pneumoconiosis data. Pneumoconiosis is a lung disease that results from breathing in dust in coal mines. You have the number of years of exposure, and the data can be read in this way: the number of years of exposure is, say, 5.8 years, the total number of miners is 98, and among these workers the number of severe cases is 0. So y is the proportion of severe cases, and that is 0 by 98, which is 0. So if the number of years of exposure is 5.8, or say 6 years, then the probability that somebody will be severely affected by pneumoconiosis is like 0. Similarly, you can see that if the number of years of exposure is more, then there is a higher chance of being severely affected by this disease, and if somebody is exposed for almost 50 years, then the probability that the person will be affected by this disease is almost half. I am sure that you have understood the problem here. So I have the data; now let me write it in terms of my requirement. I have the response variable y_i, and y_i is the proportion of miners who have severe symptoms.
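The iteratively reweighted least squares idea mentioned above can be sketched for the case of one regressor plus the dummy variable. This is a minimal illustration under stated assumptions, not the lecturer's computation: the exposure values and counts below are made up (they are not the pneumoconiosis data), the function name `irls_logistic` is hypothetical, and a fixed number of iterations is used instead of a convergence test.

```python
import math

def irls_logistic(x, r, n, iterations=25):
    """Fit ln(p/(1-p)) = b1 + b2*x by iteratively reweighted least squares."""
    b1, b2 = 0.0, 0.0
    for _ in range(iterations):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, ri, ni in zip(x, r, n):
            eta = b1 + b2 * xi
            p = 1.0 / (1.0 + math.exp(-eta))
            w = ni * p * (1.0 - p)          # weight of observation i
            z = eta + (ri - ni * p) / w     # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations.
        det = s_w * s_wxx - s_wx * s_wx
        b1 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b2 = (s_w * s_wxz - s_wx * s_wz) / det
    return b1, b2

# Hypothetical data, for illustration only.
x = [10.0, 20.0, 30.0, 40.0, 50.0]   # regressor values
n = [50, 50, 50, 50, 50]             # number of trials
r = [2, 6, 14, 24, 36]               # number of successes

b1_hat, b2_hat = irls_logistic(x, r, n)
```

At the returned values the derivatives of the log likelihood with respect to the two coefficients are numerically zero, which is the set of p equations in p unknowns described above, here with p equal to 2.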
So these are the proportions, and I want to see whether the variation in this proportion can be explained in terms of the number of years of exposure, and that is my x_i. I talked about the x_i vector; here this vector consists of only one component, so it is a simple regression type of model. I have this data for i equal to 1, 2, ..., 8. So my y_8 is equal to 0.45 and my x_8 is 51.5 years. What I want is to see whether the variation in y_i, the proportion of severe cases, can be explained in terms of the number of years of exposure. But the problem here is that y_i is not from the normal distribution. The number of severe cases is of course binomial, so the probability distribution for the number of severe cases is binomial. So we will fit a logistic regression model to the data, and there is only one regressor. My model is expectation of y_i, which is basically p_i, equal to exp(x_i' beta) / (1 + exp(x_i' beta)). And here you must have observed that x_i' beta is equal to beta_1 + beta_2 x, because there is only one regressor, that is, the number of years of exposure. So you have the model, and you know how to fit this model using maximum likelihood estimation. Finally, you can check that the fitted model is y_i hat = exp(-4.79 + 0.0935 x) / (1 + exp(-4.79 + 0.0935 x)), which is the same as 1 / (1 + exp(4.79 - 0.0935 x)), where x is the number of years of exposure. That means the estimate of beta_1 is -4.79 and the estimate of beta_2 is 0.0935. So I gave an example to illustrate the binomial case. Let me go for the Poisson distribution now.
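Before moving on, the fitted logistic curve can be evaluated at the two exposure values quoted in the example. This sketch assumes the intercept is about -4.79 and the slope about 0.0935, which is the sign arrangement under which the fitted proportion rises with exposure as the data description says; the function name `fitted_proportion` is hypothetical.

```python
import math

# Assumed fitted coefficients (rounded as quoted; signs are an assumption).
B1_HAT = -4.79    # intercept
B2_HAT = 0.0935   # coefficient of years of exposure

def fitted_proportion(years_of_exposure):
    """Fitted logistic model: exp(eta) / (1 + exp(eta)) with eta = b1 + b2 * x."""
    eta = B1_HAT + B2_HAT * years_of_exposure
    return math.exp(eta) / (1.0 + math.exp(eta))

p_short_exposure = fitted_proportion(5.8)    # first exposure value in the data
p_long_exposure = fitted_proportion(51.5)    # last exposure value in the data
```

With these assumed coefficients the fitted proportion is close to 0 at 5.8 years and close to one half at 51.5 years, in line with the reading of the data given earlier.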
So, suppose we have data (y_i, x_i') from a Poisson distribution with parameter, say, mu_i, which means the expectation of y_i is equal to mu_i. Generally we write lambda_i, but that is ok. You know that this distribution is also in the exponential family, and the probability mass function f(y; mu) can be written as exp[ y ln mu - mu - ln(y factorial) ]. Here ln mu is the natural parameter. I am specific about this because from the natural parameter we get the link function. So g(mu) is ln mu. Again, suppose we are given y_i and x_i, and we want to explain the variability in the response variable y_i in terms of the x_i values. The model we fit is g(mu_i) = x_i' beta, and we know that this link function is equal to ln mu. So the model we fit is ln mu_i = x_i' beta, which is equal to beta_1 x_i1 + beta_2 x_i2 + ... + beta_p x_ip. And finally you can write this as mu_i = exp(x_i' beta). So this is the final model you have to fit, and it is nothing but expectation of y_i equal to e to the power of x_i' beta. This is the same as saying that you have to fit the model y_i equal to e to the power of x_i' beta, plus epsilon, whereas for the normal case it is y_i equal to x_i' beta plus epsilon, because for the normal case g(mu) is equal to mu.
Now, what I will do is talk about some reasonable choices of link function, because the model depends on this choice of link function. Suppose the distribution is the normal distribution. For the normal case, you see, I am just bringing one slide from the previous lecture.
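For the Poisson model described above, both the exponential form of the probability mass function and the log link can be checked with a few lines. This is a small sketch with made-up numbers; the function names and the coefficient values are illustrative assumptions.

```python
import math

def poisson_pmf(y, mu):
    """Direct form: mu^y exp(-mu) / y!."""
    return mu**y * math.exp(-mu) / math.factorial(y)

def poisson_pmf_exponential_form(y, mu):
    """Exponential form: exp[y ln(mu) - mu - ln(y!)]."""
    return math.exp(y * math.log(mu) - mu - math.log(math.factorial(y)))

# Hypothetical coefficients and regressors, for illustration only.
beta = [0.2, 0.3]
x_i = [1.0, 2.0]      # first entry is the dummy variable equal to 1

eta_i = sum(b * x for b, x in zip(beta, x_i))   # x_i' beta
mu_i = math.exp(eta_i)                          # model: mu_i = exp(x_i' beta)

direct = poisson_pmf(4, mu_i)
via_exponential_form = poisson_pmf_exponential_form(4, mu_i)
```

Taking the log of `mu_i` returns `eta_i`, which is the log link ln mu_i = x_i' beta, and the two forms of the probability mass function agree.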
Here you can see that the natural parameter is mu; you can forget about sigma square because it is a nuisance parameter. So you can write it simply as mu, and then the link function will be g(mu) = mu. This is called the identity link. In the case of binomial, we have just established the model: g(p), let me write p, where p is the probability of success in one trial, is ln(p / (1 - p)), and this is called the logistic link. These are the names. Then for Poisson, g(mu) is equal to ln mu, and this is called the log link. For the exponential distribution, g(mu) is equal to 1 by mu, and this is called the reciprocal link. And of course for the gamma distribution it is the same, because the exponential is a particular case of the gamma distribution: g(mu) is equal to 1 by mu, which is also called the reciprocal link. So, in this module we have learnt how to deal with the situation, and how to fit a model, when the distribution of the error or of the response variable is not normal but is from some exponential family. So, we have to stop now. That is all, thank you.
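As a recap of the four links named above, each link can be paired with its inverse and checked by a round trip. This is only an illustrative sketch; the dictionary `LINKS`, the helper `round_trip` and the test values are assumptions introduced here, not part of the lecture.

```python
import math

# family: (link name, link g, inverse link), as named in the lecture.
LINKS = {
    "normal": ("identity", lambda mu: mu, lambda eta: eta),
    "binomial": (
        "logistic",
        lambda p: math.log(p / (1.0 - p)),
        lambda eta: math.exp(eta) / (1.0 + math.exp(eta)),
    ),
    "poisson": ("log", lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "gamma": ("reciprocal", lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
}

def round_trip(family, mean):
    """Apply the link and then its inverse; the mean should be recovered."""
    _name, g, g_inverse = LINKS[family]
    return g_inverse(g(mean))

# Hypothetical mean values, for illustration only.
recovered = {
    "normal": round_trip("normal", -1.5),
    "binomial": round_trip("binomial", 0.3),
    "poisson": round_trip("poisson", 2.5),
    "gamma": round_trip("gamma", 4.0),
}
```

The exponential distribution uses the same reciprocal entry as the gamma, since it is a particular case of the gamma distribution.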