Hello everyone, welcome to this session on Logistic Regression. In the previous session, we discussed the introduction to machine learning. Today, as a part of predictive analytics or machine learning, we will elaborate the details of logistic regression. We will understand the basic concept, then the steps of the logistic regression method and how it is implemented, and with an Excel illustration we will conclude the session. So, this is the overall plan of today's session on logistic regression. Remember, in business forecasting we have discussed many models so far, including regression analysis, whether simple linear regression or multiple regression. In that case, the basic concept of regression was that you have independent variables and a dependent variable, and you try to find the causal or linear relationship between them, that is, how the dependent variable is explained by the independent variables. In basic regression, we calculate a direct linear relationship. Today I will also show you the fundamental difference between linear regression and logistic regression. When it comes to the basic definition, logistic regression is also a statistical method, or say a machine learning approach, but here your dependent variable will not take continuous values. Your dependent variable y will be called the predicted variable, and it will depend on the predictor variables, which are nothing but the independent variables. But the outcome of your logistic regression model will be dichotomous, like 0 and 1, yes or no. It will not consider the actual value of y. 
This is the fundamental difference between logistic regression and linear regression. Here, we actually call the entire logistic regression method a classification method. So, let us understand how this model works. Logistic regression is a statistical method used for predicting the probability of an outcome based on one or more predictor variables, that is, independent variables. So, here we do not calculate the value of y directly; we calculate the probability of the outcome. That is why we call it a classification method. The outcome variable will be categorical, and that categorical variable is what we will need to calculate through this method of logistic regression. For example, think about the stock market: whether to buy a stock, sell a stock, or hold a stock, the decision outcome could be a yes or no kind of thing. You might say, sir, then how do the probabilities come in? I will explain that. There are many examples of logistic regression. In the banking sector, a bank has to decide whether to approve a loan for a client or not. What are the independent or predictor variables? They could be, say, the income of the person, the assets, the debt, the credit score. Based on that, the bank will decide whether to approve the loan or not. How much loan amount they are approving is not the issue here. The question is whether to approve the loan or not, yes or no, 1 or 0, and that is what we need to find through logistic regression. But how will we calculate it? 
Through the probability of the outcome, using a function called the sigmoid function. Through that, you will calculate the probability first, and then that will be converted into a categorical outcome, 0 or 1. There are many applications, like customer churn prediction: predicting whether a customer is likely to leave the store or leave the service. That can also be calculated through logistic regression. You might have the buying pattern, the different types of products they are buying; from all this past data you can study the consumer behavior, and based on that you can predict whether the customer will leave the service or not. So, this kind of churn prediction has become a very popular application of logistic regression. Insurance claims: many fraudulent claims come in the insurance industry. In that case, you can use logistic regression to predict, with good accuracy, whether a claim from an insurance client is a genuine claim or a fraud claim. This is a very popular example of logistic regression in the insurance industry also. Similarly, gender profiling: based on the different patterns of activities or characteristics of a candidate, you can classify whether the candidate is a man or a woman. There are many such applications of logistic regression, such as whether a candidate will pass or fail an exam. Based on how many hours the candidate is studying, the past data, and the logistic regression model, you can take a call on whether the candidate will pass or fail. So, how this model is fit and how it works, that we will understand today. 
Here, we have written a few more aspects of logistic regression as an introductory part. You can see we will deal only with a binary dependent variable having two possible outcomes. There are many possible pairs: success or failure, yes or no, buy or don't buy, default or no default on a loan, survive or die. I will give you examples from a couple of these today, and you will get to know them. They are all dichotomous applications of logistic regression, meaning the outcome is binary, with two possible cases. But there can be more than that, like multinomial logistic regression, with more than two outcomes: buy, hold, sell. That type of classification can also be done on the outcome variable. These are called multinomial, and ordinal logistic regression is also there. So, there are different types of classification. Today, we will consider only the binary dependent variable, yes or no, zero or one, with two possible outcomes. Therefore, we will code the values of the binary response variable as zero or one. Now, the observations should be independent of each other, and there should not be any multicollinearity among the independent variables, the same assumption we make in regression analysis; we will be considering that here also. Based on these assumptions and this basic background, let us understand in depth the difference between linear regression and logistic regression. Our objective is to understand logistic regression, but it is very similar to linear regression, because it is also a regression. 
So, here you need to understand what additional aspects we are covering in logistic regression that are not there in linear regression, what the advantage of logistic regression is, and how we model it. First point: both are supervised learning, because a dependent variable is there; here also, one outcome variable will be there. You are supervising the machine, asking the system or the algorithm to predict; therefore, both are supervised models. Linear regression is a supervised regression model, while logistic regression is a supervised classification model. Why is it not a simple regression model? It is a classification model because your outcomes are not the actual values of y. Your outcomes are classified into categories, like yes, no, buy, hold, as in the examples I have shown you, or binary 0 and 1. Therefore, we call it a classification model. At the same time, basic regression predicts a continuous outcome: y can take any real value, like money, cost, et cetera. But here, the predicted outcome variable is categorical, as I told you, a binary kind of thing, 0 or 1, yes or no, die or survive, default or no default. Also, linear regression models the relationship between the independent variables and the dependent variable through a continuous outcome, which can take any real value. Whereas logistic regression models the relationship between the independent variables and the probability of the binary outcome. 
So, remember the basic difference. In linear regression, it is just a simple relationship between the dependent variable and the independent variables. But here, it models the relationship between the independent variables and the probability of the outcome as a binary decision; look at that. The probability is calculated with the help of the independent variables. So, this is the difference. Here, y will be replaced with a different function, called the sigmoid function or logistic curve; how that works, we will understand. This is the overall difference between linear regression and logistic regression. Here, you can see simple regression analysis: if you have data, say x and y, you can fit a line. That is your simple regression. But when your data is plotted such that the outcome values fall into two groups, how will we fit the line? That we will discuss, and we will convert the concept into a logistic regression model. When it comes to logistic regression, as I told you, this y will be replaced with the logit, an ln function. That we will convert eventually into a formula called the sigmoid function formula, and that will be nothing but the probability of your outcome. Look at the probability of your outcome: it is not binary yet, it is a pure probability p in the range 0 ≤ p ≤ 1. Remember, we are talking about a binary, yes or no, categorical outcome. Effectively, the sigmoid function formula will give an S-shaped graph like this, but at the end we will classify the values into binary 0 or 1. We will classify them as, say, hold the stock or sell the stock, the person has died or the person has survived. 
So, yes or no kind of things: approve the loan or do not approve the loan. But there will be a probability in between, say 0.45 or 0.75, so you might be confused whether to approve the loan or not. We will understand that; you will put a cutoff point in the middle and then classify your data, or the outcome, into two categories, yes or no. Therefore, this is called basic logistic regression with a binary outcome. Now, let us illustrate how this calculation is done and how the method of logistic regression has been developed. Here we have come to the normal screen, and you can see this is basic regression. I am trying to illustrate how the transition happens from linear regression to logistic regression when the observations of the outcome are required in a categorical format. As I told you, we will calculate the probability, and the probability will be converted into a categorical outcome at the end. But first understand: this is your regression. What does regression mean? Suppose this is x and this is y, let us start from scratch, and this is your data. You can fit a line like y = β0 + β1x, say with one variable. If you have more than one variable, you can add more terms, but in two dimensions we can understand the concept, so we are focusing only on the two-dimensional case now. This is your basic regression. Now, suppose your data is such that most of the y values fall on one side and the rest fall on the other side. In that case, how will you fit your regression for this data? 
If you fit this line, it goes up; it is a linear regression like y = α + βx, a straight line, so it can go up to an infinite range, and it can go down also. How will you restrict that? Because your outcome should be a probability value; as I told you, the occurrence of an observation should be measured through a probability, not through the actual value of y. So, you cannot use this type of line; you need to cut this line somewhere, so that it does not go beyond 1 and does not go below 0. You need a ceiling on both sides, so you need a function which will capture this nature of the graph. Let me open a pen, and you will get to know: the function should capture the nature of the outcome in this manner, so that most of the data is covered and the curve does not go above 1 or below the 0 point. This is your x, but look at this y: we are replacing it with, say, a probability; we will discuss that. The observations are here for the time being. Since there could be a couple of observations in between, but most of the observations are lying here and here, we need a function which will replace this y = α + βx, or say y = β0 + β1x, the way we have written it in today's session; let us use that notation. Therefore, if this type of linear regression is there, you cannot fit the actual data effectively; wrong predictions come, because the line goes above 1 and below 0 as well. 
So, you have to put a ceiling below as well as on top, and this type of ceiling can be represented through the sigmoid function. Whatever the argument of the function, say x or z, the sigmoid function is generally defined like this: f(x) = 1 / (1 + e^(-x)). That is the sigmoid function formula, or people define f(z) = 1 / (1 + e^(-z)). Sometimes people define it the other way also, as e^z / (1 + e^z), but we will define our sigmoid function through this particular relationship. So, how does it work? Since you cannot take the straight-line graph, because it goes up and goes down, you have to cut it somewhere, so you need a function like this. That function is developed through the sigmoid function, or logit function, and it will be used in place of the basic regression line. The relationship between the independent variable data and the probability can then be defined with the logistic regression formula. This is the final formula of your logistic regression; how it is developed, I will explain in the next slide. Now, let us understand how this logistic function is developed from the sigmoid function. This is the general formula of the sigmoid function which I have mentioned, 1 / (1 + e^(-z)). You can also write it as a function f(y), and this f(y) is nothing but what we will develop at a later stage in terms of p. 
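To see this ceiling behaviour concretely, here is a minimal Python sketch of the sigmoid function just defined (the function name and the sample inputs are my own, for illustration only):

```python
import math

def sigmoid(z):
    """Sigmoid (logistic) function: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# The output is squeezed between 0 and 1, however large or negative z is
print(sigmoid(0))    # exactly 0.5 at z = 0
print(sigmoid(5))    # close to 1 for large positive z
print(sigmoid(-5))   # close to 0 for large negative z
```

However extreme the input, the output never crosses 0 or 1, which is exactly the ceiling we need in place of the straight regression line.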
So, how this f(y) function is calculated and replaced, we will discuss; it is nothing but the probability, which I have mentioned as p. But how this works and how the logistic function is developed, let us see here. This is the overall function: since you cannot use the straight line, you have to put a ceiling on both sides, and you need a function like this. Therefore, we are using this S-type graph, the sigmoid graph. Now, this is the function of the probability. Since p is nothing but your probability, and earlier you had the line y = β0 + β1x, we are going to replace that, because you cannot use the straight regression line; you need the sigmoid function. So, for the time being, we use this formula: the probability p is nothing but the sigmoid function, p = 1 / (1 + e^(-(β0 + β1x))). This is the basic sigmoid function formula. Now, let us see how we can derive the logistic regression model from it. So, p is here. If you calculate 1 - p, I can write this in a simpler manner: it would be 1 minus this value, that is, 1 - 1 / (1 + e^(-(β0 + β1x))). After rearranging, so let me write here, p / (1 - p) will come out to be e^(β0 + β1x) after the calculation. 
If you take the ratio p / (1 - p), this value by this value, note that after calculation 1 - p becomes e^(-(β0 + β1x)) / (1 + e^(-(β0 + β1x))). So, when you take the ratio, the denominators cancel out, and effectively, after this adjustment, the e^(-(β0 + β1x)) goes up and it becomes e^(β0 + β1x). So, effectively you found this formula. Now, what is this p / (1 - p)? It is nothing but the odds: p is the probability of success, the chance of occurrence, and 1 - p is the chance of failure, so this ratio is called the odds. Now, this is the formula for your intermediate understanding, but the logistic regression formula comes when you take the ln on both sides. If you take the ln on both sides, the left-hand side becomes ln(p / (1 - p)), and on the right-hand side the ln and the exponential cancel out, and the simple formula of a linear relationship comes out. If it is one variable, it will be ln(p / (1 - p)) = β0 + β1x; if there is more than one independent variable, it will look like plus β2x2 and so on, but overall we are focusing on one independent variable. So, your formula will be like this. This is the logistic regression formula, and it is derived from the probability function; look here. In this case, your p lies between 0 and 1, but when you take p / (1 - p), its range becomes 0 to infinity. 
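The derivation just described can be written out compactly, starting from the sigmoid form of p:

```latex
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}, \qquad
1 - p = \frac{e^{-(\beta_0 + \beta_1 x)}}{1 + e^{-(\beta_0 + \beta_1 x)}}

\frac{p}{1 - p} = e^{\beta_0 + \beta_1 x}, \qquad
\ln\!\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x
```

Dividing the two fractions cancels the common denominator, leaving the pure exponential, and taking ln of both sides recovers the linear combination on the right-hand side.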
So, in order to match the ranges, you take the ln of both sides, and the range becomes minus infinity to plus infinity. What does it mean? The right-hand side is your actual real-valued linear combination of x. x can be anything: the independent variable can take negative values and large positive real values also, so it has an infinite range. Therefore, your left-hand side should also look like this. When you apply the ln, it effectively takes care of the range from minus infinity to plus infinity. So, the relationship is well established, and this formula is called the logistic regression formula. But in calculation, we will use the basic sigmoid function formula, p = 1 / (1 + e^(-(β0 + β1x))); this is the formula we will use in our calculation of the logistic function. And look here how we have replaced the function: this was your data set; the data were scattered, with the outcome variable plotted in two groups, as you will get to know when I draw a couple of examples. 
Therefore, you need the replacement of the linear regression line: you need the sigmoid function, or logistic function, and this type of formula, which will capture the pattern of the data as an S-curve. That is it: this straight line is your linear regression model, which we are now replacing with the sigmoid function, and the corresponding logistic regression relationship has been developed. That is only for understanding the logistic regression formula; in calculation, we will use the sigmoid formula to calculate the probability value. Now, let us summarize the process. The point here is that we estimate the probability of an observation belonging to a class, say class 1, success, in the binary case. An observation belonging to class 1 could mean, say, approve the loan, and the other case, do not approve the loan, would be class 0. So, there are two classes, yes or no: the 0 cases will fall on one side and the 1 cases, approve the loan, on the other side. So, you need a middle point, a cutoff point. Sometimes people use 0.5 as the cutoff point. Whatever probability comes for any new candidate, from the observations of the independent variables, you will get a corresponding probability, and that probability will decide whether the categorical outcome is yes or no, approved or not approved. You might ask how we will find the values in the formula: this β0 and β1 are nothing but the coefficients of logistic regression, which can be calculated through software, using the maximum likelihood method. 
We are not focusing on that, but once you understand the formula and the corresponding β0 and β1, I will show you in Excel how this entire formula is calculated and how the probability function can be developed. Once that is done, your entire logistic regression structure is ready. And remember, p / (1 - p) is nothing but your odds, success over failure; going the other way, the probability is nothing but odds / (1 + odds). What we call the odds ratio is sometimes used for different calculations in the application domain, for different data sets. So, this is the overall summary of your logistic regression: how to develop the cutoff, what the probability p from the sigmoid function is, and how you convert that p value, through the sigmoid function, into a logistic regression model. This is your final model of logistic regression. Now, let us understand the cutoff point and how to calculate it. We have developed your logistic regression model, say ln(p / (1 - p)) = β0 + β1x. This is the basic one, with one independent or predictor variable. Suppose this is your formula, and this is your calculation through the sigmoid function, which you have replaced here. Now, suppose the function is ready. 
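As a quick numeric sketch of the odds relationships mentioned above (the probability 0.75 is an arbitrary example value):

```python
p = 0.75                     # probability of success (arbitrary example value)
odds = p / (1 - p)           # odds = chance of success / chance of failure
p_back = odds / (1 + odds)   # probability recovered back from the odds

print(odds)    # 3.0 -> success is three times as likely as failure
print(p_back)  # 0.75 -> round trip back to the original probability
```

The two formulas are inverses of each other, which is why we can move freely between probabilities and odds in the derivation.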
Now, suppose you have a new observation. What do you do in regression? You fit the regression line y = β0 + β1x, and for a new x, say x = 50, you put x = 50 into the line and you get the corresponding value of y. That is your basic regression. The same logic you can implement here also, because you have fit the model graph now. For a new candidate, say, whether the candidate will pass or fail: how many hours every day does he study or concentrate on the research? Based on that, you can think about pass or fail, whether he will get a degree or not; or take, say, a credit score. All of these you can take as examples of a logistic regression application. Now, suppose you found that 50 is the score of the candidate, say the credit score, or the hours he studied. Once you get the coefficients β0 and β1 using the software, through the maximum likelihood method, you will have the formula, and you can put 50 into it. Now, you calculate the probability from the sigmoid function: you put the new x value, say 50, and you get the corresponding probability, just like in regression. Look at my mouse, look at this part: the way you do regression, you put any new x and you get the y value. Similarly, here also, you put any new x and, through this function, because the whole function has been developed, you will get the corresponding probability. Look here, 0.85; and for 50, say, 99 percent. 
So, that is one clear case: the candidate will pass, because he is studying more, or say his credit score is high, or whatever the variable is. On the other hand, if the score is low, or he is not studying effectively, the probability may be only, say, 0.07, 7 percent, very close to 0, or say 23 percent. In that case, you can put a cutoff line, so that you can classify the data of a new candidate into two categories: yes or no, approve or don't approve, pass or fail, survive or die. This is what the cutoff point decides. Generally, people take the cutoff point as 0.5, but it is not fixed; you can change it. Suppose you think the majority of candidates are falling around 0.5 or 0.6; then you can increase or reduce the cutoff point, but generally 50 percent is the benchmark cutoff. Suppose somebody comes with 0.49: the probability of the outcome for that candidate, based on the data set, is, say, 0.49. Now, 0.49 is very close to 50 percent, so in which category will you put that candidate, approve the loan or do not approve the loan? It is your decision. Because you have put 0.5 as the cutoff point, you can say no, we will not approve the loan, because that is how you decided the classification. But if the candidate is at 0.51, a 51 percent chance of success, that the person will be able to repay the loan, then you put this fellow into the approve-loan category. 
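A minimal sketch of scoring a new candidate and applying the 0.5 cutoff; the coefficients β0 = -4 and β1 = 0.1 here are invented numbers for illustration, not fitted from any real data:

```python
import math

def predict_prob(x, b0=-4.0, b1=0.1):
    # Fitted sigmoid: p = 1 / (1 + e^-(b0 + b1*x)); b0, b1 are made-up coefficients
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def classify(x, cutoff=0.5):
    # Convert the probability into the binary outcome: 1 = approve, 0 = do not approve
    return 1 if predict_prob(x) >= cutoff else 0

p = predict_prob(50)   # new candidate with score x = 50
print(round(p, 2))     # probability of success, about 0.73 with these coefficients
print(classify(50))    # falls above the 0.5 cutoff, so class 1
print(classify(20))    # a low score falls below the cutoff, so class 0
```

Changing the `cutoff` argument is exactly the business decision discussed above: raising it makes the model approve fewer borderline cases, lowering it approves more.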
In that case, if somebody comes up with an 85 percent chance that he will repay the loan, even the fellow with only a 51 percent chance of repaying falls into the same category, because you have considered only two categories. This is the classification, and you decide the cutoff point: you can reduce it or increase it as you wish, but you consider the outcome in two categories, as a binary case of final 0 or 1. All the cases above 0.5 fall into one category, and all the cases below 0.5 into the no category, the 0 category. Binary, 0 or 1, that is the final outcome. But effectively, we are calculating the probability: for any new candidate, we are calculating the p value from the formula. And this logistic regression formula is developed from the actual relationship: the linear combination of the data and its relationship with p is defined through the logistic regression. Now look at the ranges. The probability p ranges from 0 to 1; that is fine for the sigmoid function, which we have placed here. But we define the model through the logit formula because you need to match the ranges: here, the range of p is 0 to 1, but the right-hand side is a linear combination which can go from minus infinity to plus infinity, with no limit, so the left-hand side should also have no limit. And p / (1 - p) also has an issue, because p / (1 - p) ranges only from 0 to infinity. 
So, the odds have no upper ceiling, but the lower side and the upper side still have to match the right-hand side; therefore, we apply the ln function, and how the ln function is derived, I have shown in the previous slide. This is the overall understanding of logistic regression. If you would like to see the summary: the logistic regression formula is defined through this relationship. p ranges from 0 to 1, and ln(p / (1 - p)) ranges from minus infinity to plus infinity, negative infinity to positive infinity, so the relationship is now well defined. The odds, success over failure, range from 0 to positive infinity, so you cannot equate them directly to the linear combination; rather, you convert them by taking the ln, which is very easy to understand, and that formula is called the representative logistic regression formula. But effectively, we will use the sigmoid formula for our calculation purposes. Now, the final point: the output of the logistic regression lies between 0 and 1, look at that, 0 to 1, and we can easily predict the class to which an observation belongs, the lower class or the upper class. And β0, β1, or if you have more variables, say β1x1 + β2x2 et cetera (I will show you an example with more independent variables), all these coefficients can be calculated through maximum likelihood estimation, which plays the role here that least squares plays in linear regression. Because the logistic function is complicated, you have to take partial derivatives with respect to the parameters β0, β1, et cetera. 
So, you need the ln function and the maximum likelihood formula, and there you can calculate the coefficient values using software, say Excel, Stata, R, or Python; the derivation and calculation of the coefficients are done there easily. We are not discussing that part in detail; assume that we get the β0 and β1 coefficients and we fit the model, and then see how we can apply it to real data. So, let us go to the application process to understand the entire logistic regression model with a couple of illustrations.
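We leave the maximum likelihood derivation to the software, but as a hedged sketch of the idea, the coefficients can be found by gradient ascent on the log-likelihood; the toy data (hours studied versus pass/fail) and the learning rate below are invented purely for illustration:

```python
import math

# Toy data: hours studied -> pass (1) / fail (0); invented numbers
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

b0, b1 = 0.0, 0.0
lr = 0.1
for _ in range(2000):                     # gradient ascent on the log-likelihood
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        g0 += y - p                       # partial derivative w.r.t. b0
        g1 += (y - p) * x                 # partial derivative w.r.t. b1
    b0 += lr * g0
    b1 += lr * g1

# The fitted sigmoid should separate low study hours from high ones
p_low = 1.0 / (1.0 + math.exp(-(b0 + b1 * 2)))   # few hours -> low probability
p_high = 1.0 / (1.0 + math.exp(-(b0 + b1 * 7)))  # many hours -> high probability
print(p_low < 0.5 < p_high)   # True
```

Note that with perfectly separable toy data like this the coefficients keep growing slowly rather than settling; real fitting routines add a stopping rule or regularization, which is one more reason we lean on the software packages mentioned above rather than hand-rolled loops.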