In this video, we will see an example of where linear regression and logistic regression are used, but in the paper we will focus on the linear regression. So, let us look at this paper: "How learning analytics can early predict under-achieving students in a blended medical education course". It is hard to find a recent paper, from 2017 or so, that explains linear regression in much detail, because linear regression has been in use in the education field for a long time. But since the authors might be using it for the first time, and it is a medical education course, they explained it in detail. So, this is a good paper to look at to understand how linear regression is applied. But I do not recommend writing a paper in such detail, because everyone now knows what linear regression is; readers are looking for the results or the metrics, not the intercept values, each of the weights and everything. But this paper explains it, so let us look at it.
So, in this paper, they want to predict the students who are at risk, that is, students whose final score is going to be below 65 percent. As an illustration, if we say students are fine when their final score is more than 60 percent, then a final score of 55 will be considered at risk. The question is how to predict the students who are going to get less than 65 marks in the final score; those are the at-risk students. There were 145 students; however, some were excluded from the data, so they have only 133 students, over a period of six weeks.
So, what data are they collecting? The data comes from a blended learning approach, that is, they use a MOOC and the classroom. The data they collected includes logins: weekly, mid-course and total course logins, that is, how many times a student logs in per week, how many times he logged in before the mid-sem, and how many times in total.
And logins before and after the end-course exam, that is, did the student really want to understand the material even after the end-sem. Also views: the number of views daily, weekly and till mid-course, and the total course views. Also the number of unique resources accessed, which means how many resources the student looked at: how many papers he read, how many videos he watched, everything. And the type of resources he accessed. Also the forums: the number of posts created, reads, replies, the number of edits made in the course, and also how influential a particular post was, based on the number of people who replied to that post. And the time he spent in sessions: weekly, till mid-sem, and the overall time in this learning environment. Also the grades at each formative assessment, and participation in assessments regardless of submission, that is, whether the student submitted the answers or not. If he participated in a quiz, or answered the pop-up questions in a video, this is also considered formative assessment; it is not the end-sem score.
So they split all of this into multiple features. For example, logins can be classified into the number of logins in the first half and in the second half, and total logins if you combine both. Total posts, total posts by student: for all these features that were created, they computed the correlation with the final score. You can compute this using the correlation matrix; you know how to compute it now. They computed each one individually, and they also identified whether the value is significant or not. Because there are 133 students, we can know whether a correlation is significant or not. Significance does not tell you the strength of the correlation. Significance tells you whether the correlation is reliable or not, whether this 0.29 can be trusted. Significant does not mean the correlation is high; that is a very important thing we have to understand.
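To make the difference between strength and significance concrete, here is a minimal sketch in plain Python. It is not the paper's procedure: the p-value uses the Fisher z approximation, which is my assumption for illustration (the paper's SPSS output would use its own test), and the function names `pearson_r` and `approx_p_value` are hypothetical. The only numbers taken from the lecture are r = 0.29 and n = 133.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Approximate two-sided p-value for 'no correlation' (Fisher z, an assumption)."""
    z = math.atanh(r) * math.sqrt(n - 3)   # roughly standard normal if true r is 0
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

# Same strength (r = 0.29), different sample sizes.
p_133 = approx_p_value(0.29, 133)  # sample size from the lecture
p_20 = approx_p_value(0.29, 20)    # hypothetical small class
print("n=133: significant at 0.01?", p_133 < 0.01)
print("n=20:  significant at 0.05?", p_20 < 0.05)
```

With 133 students the weak correlation of 0.29 still passes the 0.01 level, while the same 0.29 from only 20 students would not even pass 0.05. The strength is identical in both cases; only the reliability changes.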
So what do they do? See, this 0.29 with two stars is significant at the 0.01 level. Most of them are average correlations, not really good. Maybe this one is a good correlation: the number of times you log in. And the assessment grade has a very good correlation with the final score. Yes, as expected, assessment is highly correlated with the final score, based on the performance in the quizzes or mid-term, that is, the assessment grade and the formative grades.
So what they did is combine these feature values, and that gives us six variables. Look at it: they made them into six parameters of engagement. See: the login indicators, formative assessment, posting, time, course views and total unique days. They made these into six variables, x1 to x6, and they used SPSS software for automated linear modeling, which they call ALM here. SPSS is proprietary software, so I do not recommend using it, but if you have access to SPSS 19, please go ahead and use it. They used SPSS and they are reporting the values here. Let us look at the values.
So the final grade is here: the actual final grades are shown by these dots, and the regression line is like this. The gap between them is the difference between the actual final grade and the predicted value. They are also giving the weight of each x. They could simply give the equation, y equal to the weighted sum of x1, x2 and so on, but as I mentioned, this is a paper that explains the linear equation in great detail, which is not needed in the current setting because everyone knows what it is. But yes, see the weights: 0.16, 0.1, and formative assessment is 0.10. So there is a weight for each of these variables. The intercept is not given, and that is a problem, you know: we do not know what its value is. But we do not want to interpret the intercept; we want to interpret only these weights. So login is the strongest indicator of performance in the end-sem.
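The idea of weights, an intercept, and the gap between actual and predicted grades can be sketched in plain Python as below. This is not the paper's SPSS automated linear modeling, only ordinary least squares solved through the normal equations. The data is made up for illustration, with two hypothetical predictors (logins and formative score) instead of the paper's six, and `fit_ols` and `predict` are hypothetical helper names.

```python
def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by solving the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # leading 1.0 is the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]  # [intercept, weight1, weight2, ...]

def predict(coef, x):
    """Predicted grade for one student with feature values x."""
    return coef[0] + sum(w * v for w, v in zip(coef[1:], x))

# Hypothetical students: (number of logins, formative assessment score)
X = [(10, 50), (20, 60), (15, 80), (30, 40), (25, 90), (5, 70)]
final_grade = [58, 69, 72, 66, 81, 62]  # made-up final grades

coef = fit_ols(X, final_grade)
residuals = [g - predict(coef, x) for x, g in zip(X, final_grade)]
print("intercept and weights:", [round(c, 3) for c in coef])
print("residuals (actual - predicted):", [round(e, 2) for e in residuals])
```

The residuals are the vertical gaps between the dots and the fitted line. Note that raw weights are only comparable across variables when the variables are on the same scale, so a comparison like "login is the strongest indicator" needs standardized predictors or standardized weights.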
So they predicted the at-risk students, and they also used logistic regression to do that. They also plotted ROC curves: predicted using logistic regression, predicted using the linear model, and predicted using logistic regression at mid-course. Mid-course is about early prediction: you do not need to wait for all the students' activities and posts till the end-sem; if you use only the interactions up to the mid-sem, can you predict it? I hope you know what this line means. This line indicates an area under the curve of 0.5; this is the average, chance-level performance, and anything below it is not good. This particular curve indicates how early you can predict, and this is the best one: predicting using logistic regression gives better classification accuracy, and better precision and recall, compared to linear regression. So that is important. This paper covers what we learned in general about the ROC curve, the area under the curve, logistic regression and linear regression. Please go and read it; this paper explains how to collect data, how to create features and how to use those features to predict the final score.
So you saw what linear regression is, and also logistic regression. Can you list one or two applications of linear regression using data collected from a classroom environment? The paper we just discussed applies linear regression to blended learning, where a student interacts with a MOOC kind of environment: they have to log in, watch videos, read things, and post in discussion forums, in Moodle or something like that. Can you think of an application of linear regression in a classroom environment, and which data you can collect? So pause the video, write down your answers, and resume to continue.
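For the area under the ROC curve mentioned earlier in this part, a minimal plain-Python sketch follows. It uses the rank interpretation of AUC: the probability that a randomly chosen at-risk student gets a higher predicted risk than a randomly chosen student who is not at risk. The scores and predicted risks are made up, the 65 cut-off is the one quoted in the lecture, and `roc_auc` is a hypothetical helper name, not the paper's code.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a positive outranks a negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical final scores; at risk means final score below 65.
final_scores = [50, 58, 62, 70, 75, 80, 90, 64]
at_risk = [1 if s < 65 else 0 for s in final_scores]

# Hypothetical risk predicted by some model from mid-course data.
predicted_risk = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7]

auc = roc_auc(at_risk, predicted_risk)
baseline = roc_auc(at_risk, [0.5] * len(at_risk))  # a model that cannot rank students
print("model AUC:", auc)
print("uninformative model AUC:", baseline)
```

A model that gives every student the same risk lands exactly on the 0.5 diagonal, which is why that line is the reference in the ROC plot.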
So I am not giving any answers here, because there are a lot of things that can be predicted. I also discussed this at the start of this week's course. You can predict students' performance or students' engagement; you can also use engagement in the class, like attendance up to the mid-sem, to predict performance; a lot of things. If you have listed something else, that is good. If you have access to such data, please go ahead and collect it and see which one works. This week we discussed linear and logistic regression only, and not in detail. This week is a bit light on new learning, but I request you to explore the tools demonstrated to you and use linear regression and logistic regression: collect data or use existing data, check the internet for data, and try to apply the methods and understand them. So that is the application of linear regression. Thank you.