Welcome back. In this video, we will talk about performance metrics for machine learning algorithms. Consider a binary classification problem, not a linear regression problem. Say we have an ID column and, to keep the example small, only 6 data points. For each point you have Yi, the real value that comes with your test data set, and Y predicted, the output of the ML model. Here Yi is 0 if the student will not pass the course and 1 if the student will pass the course. So the binary classification problem is to classify whether a student will pass the course or not, and our main interest is in identifying the students who will pass, that is, the 1 class rather than the 0 class. For the example described in the previous slide, you have both Yi and Y predicted. Now, how do you measure the performance of the classification algorithm you used? Think about it and list down at least 2 metrics; if you have more metrics, please list down everything and we will discuss that in the class. One metric most people will think of is accuracy, because accuracy tells exactly how accurately the classifier can classify the data. There are also precision and recall, and you may have heard about kappa and the F score; we will talk about all of these in this class. Now, suppose I have the results in a table of Yi and Y predicted and I want to convert it into a different kind of table. Remember, the true value Yi is in your test set: you had pairs (Xi, Yi), you gave only Xi to the machine learning algorithm to get Y predicted, and Yi stayed with you. So Yi is the true value and Y predicted is the predicted value.
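The six-row table just described can be sketched as two Python lists. The exact six label values are not stated in the lecture, so the values below are assumed; they are chosen to be consistent with the counts the lecture arrives at later (2 true positives, 1 false positive, 2 false negatives, 1 true negative):

```python
# Hypothetical test-set labels for the six students (1 = will pass, 0 = will not pass).
# These exact values are assumed, chosen to match the counts used later in the lecture.
y_true = [1, 1, 1, 1, 0, 0]  # Yi: the real values held back in the test set
y_pred = [1, 1, 0, 0, 1, 0]  # Y predicted: the ML model's outputs for each Xi

for i, (t, p) in enumerate(zip(y_true, y_pred), start=1):
    print(f"ID {i}: true = {t}, predicted = {p}")
```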
So, let us build a table of predicted value versus true value, with a row and a column each for 1 and 0. This is called a confusion matrix; in some other domains it is called a contingency table. Now, how many of the true 1s have been classified as predicted 1? Counting them in our example, there are 2. How many true 0s have been classified as 1? Only 1. How many true 1s have been misclassified as 0? There are 2. And the rest, where the true value is 0 and the predicted value is also 0, there is only 1 left. So now you have the confusion matrix, and you know how to compute it from the given table. It is easy to compute using an Excel sheet, and computing it manually is also fine: in Excel you can write some macros or use the filters to get these counts. It depends on how much data you have; if you have too much data, I recommend writing a small script in Python. This is a good exercise for you to start writing Python scripts to create such a table. Now, the four cells we saw in the previous slide have names: true positive, false positive, false negative, and true negative. If the value is 1 and it is correctly classified as 1 by our prediction model, it is called a true positive. If the value is 0 but we wrongly classify it as 1, it is a false positive: the student will not pass the exam, but you say he will pass the exam. It is like a false alarm; you wrongly predicted it.
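As the lecture suggests, the confusion matrix counts can be produced with a small Python script. This is a minimal sketch; the `y_true` and `y_pred` values are assumed example data consistent with the counts in the lecture (2 true positives, 1 false positive, 2 false negatives, 1 true negative):

```python
def confusion_matrix(y_true, y_pred):
    """Count TP, FP, FN, TN for a binary 0/1 problem (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true 1 -> predicted 1
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # true 0 -> predicted 1
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # true 1 -> predicted 0
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true 0 -> predicted 0
    return tp, fp, fn, tn

# Assumed six-student data matching the lecture's counts
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0]
print(confusion_matrix(y_true, y_pred))  # (2, 1, 2, 1)
```

For large data sets, libraries such as scikit-learn provide an equivalent `confusion_matrix` function, but writing it by hand once is a good exercise.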
And if the student will definitely pass the exam but you classify him as not passing, it is a false negative. If the student will not pass the exam and you also predict he will not pass, it is a true negative. So we have true positive, true negative, false positive, and false negative. If you know what type 1 and type 2 errors are, you might already see how these cells map to type 1 and type 2 errors. We will talk about that later, but this is the table you have to understand: predicted value against true value. By using that table we can compute precision, recall, and accuracy. Precision is true positive divided by true positive plus false positive; in words, among all the students predicted to pass the exam, how many will actually pass? Let us see that in our example. You said that three students will pass the exam, but actually only two will pass. That is precision. How do you compute it? True positive divided by (true positive + false positive), that is 2 / (2 + 1), which is about 0.67. The recall value is defined as the ratio of true positives to the total number of actual positives, that is, true positive divided by (true positive + false negative): 2 / (2 + 2) = 0.5. That is the recall value. If you have both good recall and good precision, you can call the algorithm successful. But in this case it is only about 0.67 and 0.5, which is not so great. What is accuracy? Accuracy is the number of correctly classified examples, true positives plus true negatives, divided by all the samples you have. The number of samples we have in this example is 6.
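The precision and recall computations above can be written out directly. This is a minimal sketch using the counts from the lecture's example (TP=2, FP=1, FN=2, TN=1):

```python
# Counts from the lecture's confusion matrix
tp, fp, fn, tn = 2, 1, 2, 1

precision = tp / (tp + fp)  # among predicted passes, fraction that actually pass
recall = tp / (tp + fn)     # among actual passes, fraction we predicted correctly

print(f"precision = {precision:.2f}")  # 0.67
print(f"recall    = {recall:.2f}")     # 0.50
```

Note that recall uses false negatives in the denominator, while precision uses false positives; mixing the two up is a common mistake.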
So, accuracy tells the overall effectiveness of the classifier: out of all the samples, how many times does the classifier get the correct result? The maximum value for accuracy is of course 1, because then everything is classified correctly, n/n = 1, and the classifier is perfect. Most of the research we do will not have accuracy equal to 1. If we did get accuracy 1, and got it on multiple examples, multiple observations, multiple settings, that kind of algorithm would be moved toward commercialization, since it makes no mistakes. So most research reports accuracy of no more than, say, 0.8 or 0.9, because we are still trying to improve it, and once it is near perfect that particular product gets commercialized. The machine learning algorithms you see in commercial products should have close to perfect accuracy, at least that is what should be happening. In our example, if you want to compute the accuracy, true positives plus true negatives is 2 + 1 = 3, and the total number of samples is 6, so the accuracy is 3/6 = 0.5. In the next class we will talk about the kappa score, the F score, and the other metrics we have in machine learning. Thank you.
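The accuracy calculation for the example can be sketched in the same way, again using the counts from the lecture (TP=2, FP=1, FN=2, TN=1, so 6 samples in total):

```python
# Counts from the lecture's confusion matrix
tp, fp, fn, tn = 2, 1, 2, 1

# Correctly classified samples (TP + TN) over all samples
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(accuracy)  # 0.5
```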