So, today we will start a new module called generalized linear models. Here is the content of this module: first we will talk about the exponential family of distributions, and then fitting generalized linear models and logistic regression models. Before I start this module, I want to recall the module called transformation and weighting to correct model inadequacy. There we studied something called generalized least squares, of which weighted least squares is a particular case. Generalized least squares is concerned with the application of the ordinary least squares technique in the situation where the model is y = X beta + epsilon, the expectation of epsilon is 0, but the variance of epsilon is sigma squared V. This V is the variance-covariance matrix of the error term, which cannot be written in the form sigma squared I. As we have studied, this happens when the observations y have unequal variances and/or the observations are correlated; when they are correlated, the off-diagonal elements of the variance-covariance matrix V are not equal to 0. In either case, the conditions of the Gauss-Markov theorem are violated, so the least squares estimate beta hat = (X'X)^(-1) X'y is not the best linear unbiased estimator.
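The weighted least squares case just mentioned can be sketched numerically. This is a minimal illustration, not the full matrix GLS derivation: it assumes a simple no-intercept model y_i = beta x_i + epsilon_i with a diagonal V (unequal variances only), in which case the GLS criterion reduces to minimizing the weighted sum of squares, and the data and weights below are hypothetical.

```python
# Weighted least squares for a simple model y_i = beta * x_i + eps_i with
# Var(eps_i) = sigma^2 / w_i (non-constant variance, diagonal V).
# The WLS estimate minimizes sum_i w_i * (y_i - beta * x_i)^2, giving
# beta_hat = sum(w x y) / sum(w x^2).
def wls_slope(x, y, w):
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den

# Hypothetical data lying exactly on y = 2x, so any positive weighting
# recovers beta = 2; the weights mimic a variance that grows with x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
w = [4.0, 1.0, 0.25, 0.0625]
print(wls_slope(x, y, w))  # -> 2.0
```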
In generalized least squares we studied a transformation of this model to get the best linear unbiased estimator, and in that same module, transformations and weighting to correct model inadequacy, we also talked about the variance stabilizing transformation, which deals with the situation when the response variable has non-constant variance. The generalized linear model, that is, GLM analysis, comes into play when the error distribution is not normal, which is equivalent to saying that the distribution of the response variable is not normal. In that case, we need to use a generalized linear model. You should understand the difference between the generalized linear model and generalized least squares: generalized least squares is used to deal with non-constant variance in the response variable, or of course when the observations are correlated, whereas the generalized linear model is used when the error distribution is non-normal, and/or when a vector of non-linear functions of the responses, that is eta(y) = (eta(y_1), eta(y_2), ..., eta(y_n)), and not y itself, has expectation X beta. I am not in a position to explain this fully at this moment, but what I can say here is that in the case of the multiple linear regression model, we consider the model y = X beta + epsilon, which is the same as considering the model E(y) = X beta, because of the fact that E(epsilon) = 0. So, in the usual linear model, E(y) = X beta can be written as a linear combination of the regression coefficients, but here this is not true: instead of E(y) = X beta, there exists a non-linear function eta such that E(eta(y)) = X beta.
Anyway, we will come back to this point at the end of this topic or module. Let me mention one more thing about when we use the generalized linear model: in a generalized linear model, GLM, the response variable distribution is not normal, but it must be a member of the exponential family. So, next we will learn what we mean by this exponential family. The exponential family of distributions: a random variable u (here we denote the random variable by u; usually we use the notation x or y) belongs to the exponential family with a single parameter theta if it has a probability density function f(u, theta) of the form s(u) t(theta) e^(a(u) b(theta)). If the pdf of the random variable u is of this form, then we say that the distribution is in the exponential family, where s, t, a, b are all known functions. Let me rewrite this pdf: I can write f(u, theta) as exp(a(u) b(theta) + c(theta) + d(u)), where d(u) = ln s(u) and c(theta) = ln t(theta); the logarithm here is to the base e. So, the pdf of a random variable u which is in the exponential family can be written in this form, and when a(u) = u, that is, when a(u) is the identity function, the distribution is said to be in canonical form, and this is important. This b(theta) is called the natural parameter. Also, there could be several parameters in a distribution; for example, if you consider the binomial distribution, it has two parameters.
One is n, the total number of trials, and the other is p, the probability of success in one trial. So, here we need to decide which one is the parameter of interest. If p is the parameter of interest, then the other parameter n is called a nuisance parameter. Let me write it formally: parameters other than the parameter of interest theta are called nuisance parameters. So, let me talk about some members of the exponential family. The first one is the normal distribution with parameters mu and sigma squared. Let us see whether the pdf of the normal distribution can be written in that form. I am writing it as f(u, mu) because u is the random variable and mu is the parameter of interest here, while sigma squared is a nuisance parameter. The pdf is 1/sqrt(2 pi sigma^2) times e^(-(1/2)((u - mu)/sigma)^2), where u goes from minus infinity to plus infinity. Now, you can check that this can be written as exp(u mu/sigma^2 + (-mu^2/(2 sigma^2) - (1/2) ln(2 pi sigma^2)) - u^2/(2 sigma^2)): the first term comes from expanding the square in the exponent, and the remaining terms collect the rest. Now I need to identify a(u), b(theta), c(theta) and d(u). Here I can see that a(u) = u, and b(theta) = mu/sigma^2. What is c(theta)? It is in fact c(mu), but theta stands for the parameter of interest, and here the parameter of interest is mu, which this function involves. So, c(theta) = -mu^2/(2 sigma^2) - (1/2) ln(2 pi sigma^2), and d(u) = -u^2/(2 sigma^2).
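The decomposition above can be checked numerically. This is a small sketch: it evaluates the ordinary normal pdf and the exponential-family form exp(a(u)b(theta) + c(theta) + d(u)) with the a, b, c, d just identified, at arbitrary hypothetical values of u, mu and sigma squared, and confirms they agree.

```python
import math

def normal_pdf(u, mu, sigma2):
    # Standard N(mu, sigma^2) density.
    return math.exp(-(u - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def normal_expfam(u, mu, sigma2):
    # exp(a(u) b(theta) + c(theta) + d(u)) with a(u) = u,
    # b(theta) = mu/sigma^2, c(theta) = -mu^2/(2 sigma^2) - (1/2) ln(2 pi sigma^2),
    # d(u) = -u^2/(2 sigma^2).
    b = mu / sigma2
    c = -mu ** 2 / (2 * sigma2) - 0.5 * math.log(2 * math.pi * sigma2)
    d = -u ** 2 / (2 * sigma2)
    return math.exp(u * b + c + d)

# The two forms agree at every point (hypothetical mu = 1, sigma^2 = 4).
for u in (-1.3, 0.0, 2.7):
    assert abs(normal_pdf(u, 1.0, 4.0) - normal_expfam(u, 1.0, 4.0)) < 1e-12
```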
Since a(u) = u, the normal distribution is in canonical form, and the natural parameter is b(theta) = mu/sigma^2; since sigma^2 is a nuisance parameter, this is essentially the parameter of interest mu, up to scaling. Next, we will try to write down several more distributions which are in the exponential family. The next one is the binomial distribution: u follows a binomial distribution with parameters n and p, where p is the parameter of interest and n is the nuisance parameter. Here we call it a probability mass function. The variable u stands for the number of successes in n trials when the probability of success in each trial is p, so the probability mass function is f(u, p) = nCu p^u (1 - p)^(n - u), and the range of u, the number of successes out of n trials, is 0, 1, up to n. Now let us see whether this can be written in the exponential form we talked about. This is equal to nCu (p/(1 - p))^u (1 - p)^n, which I can write as exp(u ln(p/(1 - p)) + n ln(1 - p) + ln nCu). Let me identify the functions now: a(u) = u, and b(theta), the natural parameter, is ln(p/(1 - p)). I repeat, this natural parameter is an important thing; we need to know the natural parameter for each particular distribution. Then c(theta) = n ln(1 - p) and d(u) = ln nCu. This d(u) does not involve the parameter of interest, whereas c(theta) must involve the parameter of interest, which here is p.
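The same kind of check works for the binomial decomposition just written, again a minimal sketch with hypothetical n = 10 and p = 0.3: the ordinary pmf and the form exp(u b + c + d), with b the logit natural parameter, agree for every u.

```python
import math

def binom_pmf(u, n, p):
    # Standard binomial pmf nCu p^u (1-p)^(n-u).
    return math.comb(n, u) * p ** u * (1 - p) ** (n - u)

def binom_expfam(u, n, p):
    # exp(u * ln(p/(1-p)) + n * ln(1-p) + ln nCu):
    b = math.log(p / (1 - p))       # natural parameter
    c = n * math.log(1 - p)
    d = math.log(math.comb(n, u))
    return math.exp(u * b + c + d)

# Agreement over the whole support u = 0, 1, ..., n.
for u in range(11):
    assert abs(binom_pmf(u, 10, 0.3) - binom_expfam(u, 10, 0.3)) < 1e-12
```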
So, c(theta) is this quantity, and this binomial distribution is also in canonical form. Next we will talk about the Poisson distribution with parameter lambda. You know that the probability mass function for the Poisson is f(u, lambda) = e^(-lambda) lambda^u / u!, where u is 0, 1, up to infinity, and this can be written as exp(u ln lambda - lambda - ln u!). So, clearly here a(u) = u, so it is in canonical form; b(theta) = ln lambda is the natural parameter; c(theta) = -lambda; and d(u) = -ln u!. So, the Poisson distribution is in the exponential family. Next, let me talk about number 4, the gamma distribution with theta as the parameter of interest and alpha as the nuisance parameter. Here is the pdf: f(u, theta) = theta^alpha u^(alpha - 1) e^(-theta u) / Gamma(alpha), where alpha and theta are greater than 0 and u is greater than or equal to 0. Now write it in the exponential form: exp(-theta u + (alpha ln theta - ln Gamma(alpha)) + (alpha - 1) ln u), where alpha ln theta - ln Gamma(alpha) is put in one bracket because it forms c(theta), and (alpha - 1) ln u comes from the u^(alpha - 1) factor. So, now you can identify that a(u) = u, b(theta) = -theta, c(theta) = alpha ln theta - ln Gamma(alpha), and d(u) = (alpha - 1) ln u. The next one is the exponential distribution. The pdf of this exponential distribution, f(u, theta), is obtained by just putting alpha = 1 here: it is theta e^(-theta u), where u is greater than or equal to 0 and theta is greater than 0.
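The Poisson and gamma decompositions can be verified the same way. A small sketch with hypothetical parameter values (lambda = 2.5 for the Poisson; theta = 2, alpha = 3 for the gamma), using ln Gamma via the standard library's `lgamma` so that ln u! = lgamma(u + 1):

```python
import math

def poisson_pmf(u, lam):
    return math.exp(-lam) * lam ** u / math.factorial(u)

def poisson_expfam(u, lam):
    # exp(u ln lambda - lambda - ln u!): a(u)=u, b = ln lambda,
    # c = -lambda, d = -ln u!  (ln u! = lgamma(u+1)).
    return math.exp(u * math.log(lam) - lam - math.lgamma(u + 1))

def gamma_pdf(u, theta, alpha):
    return theta ** alpha * u ** (alpha - 1) * math.exp(-theta * u) / math.gamma(alpha)

def gamma_expfam(u, theta, alpha):
    # exp(-theta u + alpha ln theta - ln Gamma(alpha) + (alpha - 1) ln u).
    return math.exp(-theta * u + alpha * math.log(theta)
                    - math.lgamma(alpha) + (alpha - 1) * math.log(u))

for u in range(5):
    assert abs(poisson_pmf(u, 2.5) - poisson_expfam(u, 2.5)) < 1e-12
assert abs(gamma_pdf(1.7, 2.0, 3.0) - gamma_expfam(1.7, 2.0, 3.0)) < 1e-12
```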
This can be written as exp(-u theta + ln theta). So, you understand that a(u) = u, b(theta) = -theta, and c(theta) = ln theta; here b(theta) = -theta is the natural parameter. We are almost done; next we will talk about one more distribution, which is called the negative binomial distribution. I hope you understand the experiment here: the variable u is the number of failures observed before attaining r successes in binomial trials with probability of success theta. The probability mass function for this one can be written as f(u, theta) = (r + u - 1)C(r - 1) theta^r (1 - theta)^u; I hope you understand why this is the probability mass function. So, u, the number of failures before obtaining r successes, can start from 0, 1, and so on. My concern is to check whether this negative binomial distribution is in the exponential family or not. This is exp(u ln(1 - theta) + r ln theta + ln (r + u - 1)C(r - 1)). So, you must have understood that here a(u) = u and b(theta) = ln(1 - theta); this is the natural parameter, which I am pointing out every time because it is important for the generalized linear model. And c(theta) is of course r ln theta, and d(u) = ln (r + u - 1)C(r - 1). This shows that the negative binomial distribution is in the exponential family. So, next we will talk about the expected value and variance of this a(u). I am not going to derive them; I will just state them in terms of c(theta) and b(theta).
You can check that the expected value is E(a(u)) = -c'(theta)/b'(theta), and the variance is Var(a(u)) = (b''(theta) c'(theta) - c''(theta) b'(theta)) / (b'(theta))^3, where the double prime stands for the second derivative with respect to theta. Just believe me for now that this is correct; I will give one example, the binomial distribution. In the case of the binomial, we had a(u) = u, the natural parameter b(theta) = ln(p/(1 - p)), and c(theta) = n ln(1 - p). Now, if you differentiate b(theta), you get b'(theta) = 1/(p(1 - p)), and the second derivative is b''(theta) = (2p - 1)/(p(1 - p))^2. Similarly for c(theta): c'(theta) = -n/(1 - p) and c''(theta) = -n/(1 - p)^2. Now you can check that E(a(u)), which for the binomial is nothing but E(u), equals -c'(theta)/b'(theta) = (n/(1 - p)) p(1 - p) = np, and we know that for the binomial distribution the expected value is np. And you can check by plugging all of these in that Var(a(u)) = Var(u) = np(1 - p). This is preparation for the generalized linear model, because we said that the generalized linear model is used when the distribution of the response variable is not normal but is from the exponential family.
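The binomial check just described can be carried out in a few lines. This sketch hard-codes the four derivatives of b and c stated above (with hypothetical n = 10, p = 0.3) and confirms the general mean and variance formulas reproduce np and np(1 - p).

```python
# For a canonical exponential-family density exp(u b(theta) + c(theta) + d(u)):
#   E(u)   = -c'(theta) / b'(theta)
#   Var(u) = (b''(theta) c'(theta) - c''(theta) b'(theta)) / b'(theta)**3
# Checked for the binomial, where b = ln(p/(1-p)) and c = n ln(1-p).
def binom_moments(n, p):
    b1 = 1.0 / (p * (1 - p))              # b'(p)
    b2 = (2 * p - 1) / (p * (1 - p)) ** 2  # b''(p)
    c1 = -n / (1 - p)                      # c'(p)
    c2 = -n / (1 - p) ** 2                 # c''(p)
    mean = -c1 / b1
    var = (b2 * c1 - c2 * b1) / b1 ** 3
    return mean, var

mean, var = binom_moments(10, 0.3)
print(mean, var)  # approximately 3.0 (= n p) and 2.1 (= n p (1-p))
```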
So, now we have an idea about which distributions are in the exponential family; maybe this is not an exhaustive list, but we know some distributions which are in the exponential family, like the binomial, Poisson, normal, gamma, exponential, and negative binomial. These are the examples we just proved to be in the exponential family. Now, suppose you have a set of observations (x_i, y_i) and the response variable y_i is not normal; say it is from the binomial distribution, y_i follows binomial with parameters n_i and p_i, where p_i is the parameter of interest and n_i is the nuisance parameter. Then how do we deal with such a situation? Until now we knew only that if y follows a normal distribution, with independent and identically distributed errors, then by the Gauss-Markov theorem the least squares estimate provides the best linear unbiased estimator for the regression coefficients. So, now we will talk about how to fit a generalized linear model in the situation when the response variable is not normal, but its distribution is from the exponential family. So, here is the fitting of generalized linear models. As I said, suppose we have a set of independent observations (y_i, x_i). If you consider one regressor, then it is a simple regression, but I can generalize it by making x_i a vector. So, this is my i-th observation, and this vector consists of, say, p regressors, for i = 1 to n. So, I have n observations on the response variable y_i, and there are several regressors. Let me write what x_i' is: it is (x_i1, x_i2, ..., x_ip), that means it is a p-component vector, and we have a set of independent observations from some exponential-type distribution in canonical form.
That means a(y) = y. Then, since I have n observations and they are independent, the joint probability density function f(y_1, y_2, ..., y_n; theta, phi) is nothing but the product of the marginals. Since each observation is from the exponential-type distribution in canonical form, I can write the i-th marginal as exp(y_i b(theta_i) + c(theta_i) + d(y_i)); I can write y_i b(theta_i) because a(y) = y. But this is the pdf only for the i-th observation, and once you multiply the marginals, you just have to put a summation over i = 1 to n in each term, so the joint pdf is exp(sum y_i b(theta_i) + sum c(theta_i) + sum d(y_i)). Here phi is a vector of nuisance parameters that occur within b, c and d, and theta = (theta_1, theta_2, ..., theta_n) is the vector of parameters of interest. So, we have the joint pdf, and the variation in the response variable y_i can be explained in terms of the x_i values. Give me some time; I will try to explain the difference between the generalized linear model and the linear model. Here x_i = (x_i1, x_i2, ..., x_ip) are the regressor variables, and I want to explain the variability in y using the regressor variables; that is, we want to find the relation between the response variable and the regressor variables, which is the whole purpose of this course. Also consider the set of parameters, that is, the regression coefficients, beta = (beta_1, beta_2, ..., beta_p)'. Now, what we do is find some suitable link function, and this is important: a link function g such that g(mu_i) = x_i' beta. Let me explain what happens in the usual case when we consider the model.
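The product-of-marginals step above can be made concrete. A small sketch for the Poisson case (hypothetical counts and rates), showing that the log of the joint density is the sum over i of y_i b(theta_i) + c(theta_i) + d(y_i):

```python
import math

def poisson_loglik(ys, lams):
    # sum_i [ y_i b(theta_i) + c(theta_i) + d(y_i) ] for the Poisson,
    # where b = ln(lambda_i), c = -lambda_i, d = -ln(y_i!).
    return sum(y * math.log(l) - l - math.lgamma(y + 1)
               for y, l in zip(ys, lams))

# Because the observations are independent, this equals the sum of the
# marginal log-densities (i.e. the log of the product of the marginals).
ys, lams = [1, 0, 3], [2.0, 0.5, 4.0]
total = poisson_loglik(ys, lams)
marginals = sum(math.log(math.exp(-l) * l ** y / math.factorial(y))
                for y, l in zip(ys, lams))
assert abs(total - marginals) < 1e-12
```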
What is this mu_i? mu_i is E(y_i). In the usual case, we consider the model y_i = x_i' beta + epsilon_i, and then of course E(y_i) in the ordinary case equals x_i' beta; that is all. But here, if the response variable is not normal, say it is from some exponential family, then the regressor variables explain the variability in g(mu_i), which is nothing but g(E(y_i)). So, instead of writing E(y_i) = x_i' beta, we write g(E(y_i)) = x_i' beta, and a link function that is often regarded as a sensible one is the natural parameter. I will talk about this again in the next class, but let me just say that if the response variable follows a binomial distribution, then the link function will be the natural parameter, that is, ln(p/(1 - p)) = x_i' beta. In the case of the normal distribution, we know the natural parameter is essentially mu, so g(mu) is nothing but mu itself. So, when we assume that y follows a normal distribution, we can just write E(y_i) = x_i' beta and go with that model, but for the other distributions we need to choose this g function, this link function, which is nothing but the natural parameter. We will be talking about this again in the next class. So, today we have to stop now. Thank you.
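As a recap of the link-function idea from this lecture, here is a minimal sketch of the canonical link for the binomial case: the logit g(mu) = ln(p/(1 - p)) maps the mean to the linear predictor x_i' beta, and its inverse recovers the mean. The value of the linear predictor used below is hypothetical.

```python
import math

def logit(mu):
    # Canonical (natural-parameter) link for the binomial: g(mu) = ln(p/(1-p)).
    return math.log(mu / (1 - mu))

def inv_logit(x_beta):
    # Inverse link: recovers mu = E(y) from the linear predictor x'beta.
    return 1.0 / (1.0 + math.exp(-x_beta))

x_beta = 0.75                           # hypothetical value of x_i' beta
mu = inv_logit(x_beta)
assert abs(logit(mu) - x_beta) < 1e-12  # g(mu) = x'beta round-trips
# For the normal case the canonical link is the identity: E(y_i) = x_i' beta.
```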