In this video, we will discuss the Naive Bayes classifier. So, what is the Naive Bayes classifier? It is based on Bayes' theorem, which we saw in the last class, plus a naive assumption: the features are independent of each other, but every feature is related to the dependent value. So x1 depends on y and x2 depends on y, but x1 and x2 are treated as independent events (more precisely, independent once y is known). That is the idea.

To picture it, imagine a Venn diagram: here is y, and here is x1, which overlaps y because x1 and y are dependent. Then there is x2, and there is x3, and there can be many more feature dimensions. It is hard to draw once there are many features, but that is the picture the naive assumption gives you: features that are independent of each other, but all related to one particular dependent value. Bayes' theorem itself is simple, P(y | x) = P(x | y) P(y) / P(x); you remember that from the last video.

Let us see how the Naive Bayes classifier can be used. This is a simple example I created. One column is the student's attendance in percent, and the other column says whether the student passed the midterm: yes or no, passed the exam or did not pass. From this table we can construct a frequency table. What is a frequency table? It is simple: we put the values into bins and count. I am going to use 3 bins here: 40 to 60, 61 to 70, and 71 and above. The last bin does not stop at 80; it covers everything above as well. Now, for each bin, I need to count how many students passed and how many did not. For 40 to 60, counting the yes rows first, there are 2 yes. Counting the no rows, there are 3 no. That is all a frequency table is; you can construct it directly from the given values.
Here there is only one x and one y, that is, a single independent variable. Similarly, for 61 to 70, as the table stands there are 3 yes. Let us change one value, the row with attendance 65, to no; then this bin has 2 yes and 1 no. For 71 and above there is only one row, attendance 75, and that is a yes. So, if you count these numbers, total yes is 5, total no is 4, and the total number of rows is 9. So the probability of yes is 5 by 9, and the probability of not passing is 4 by 9. That is simple to compute. You can also compute the probability of each bin just as easily: 5 out of 9, 3 out of 9, and 1 out of 9.

So this frequency table tells you, for each bin, the counts and the corresponding yes and no probabilities. Let us also relabel this column as no, just to be clear. The probability of yes is 5 by 9, computed from this column total, and the probability of no is 4 by 9. The probability of attendance 40 to 60 is the row total, 3 plus 2, that is 5, so 5 by 9. The probability of 61 to 70 is 3 by 9, and the probability of 71 and above is 1 by 9. Those are the probabilities you can compute from this table, column-wise and also row-wise. These are the values the Naive Bayes classifier works from.

Let us use this to answer some questions. Consider that you have this table, built from the given input data, and we ask: what is the probability of passing the exam if a student's attendance is 64? An attendance of 64 falls in the 61 to 70 bin, so you can simply read it off the table: 2 out of 3. So 2 by 3 is the probability. If you apply Bayes' theorem, you get the same answer. Let us see how it goes. Write down your answer, work through the steps of Bayes' theorem, identify the events and write them down, and then resume the video to continue.
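As a rough sketch of the frequency table described above, the counts can be kept in a small dictionary and the probabilities read off from the row and column totals. The variable names are my own, and the counts are the ones read out in the lecture after the 65 row is changed to no.

```python
from fractions import Fraction

# Counts per attendance bin as read out in the lecture (names are illustrative).
freq = {
    "40-60": {"yes": 2, "no": 3},
    "61-70": {"yes": 2, "no": 1},
    "71+":   {"yes": 1, "no": 0},
}

# Total number of students in the table.
total = sum(sum(row.values()) for row in freq.values())

# Column totals give the class probabilities P(yes) and P(no).
p_class = {
    c: Fraction(sum(row[c] for row in freq.values()), total)
    for c in ("yes", "no")
}

# Row totals give the probability of each attendance bin.
p_bin = {b: Fraction(sum(row.values()), total) for b, row in freq.items()}

print(total)
print(p_class)
print(p_bin)
```

Fractions are used instead of floats only so that the values match the "5 by 9" style of the lecture; note that Python reduces 3 by 9 to 1/3.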
So, if you apply Bayes' theorem, you want to know the probability of passing the exam given attendance 61 to 70, that is, P(pass = yes | attendance 61 to 70). By Bayes' theorem this is P(attendance 61 to 70 | pass = yes) times P(pass = yes), divided by P(attendance 61 to 70).

Take the first term, the probability that attendance is 61 to 70 given that the student passed the exam. Five students passed the exam, and 2 of them are in the 61 to 70 bracket, so that is 2 by 5: this 2 divided by the column total 5. The probability of passing we have computed already, 5 divided by 9. And the probability of attendance 61 to 70 is the row total: 3 students are in this range out of 9, so 3 by 9.

This table is computed from the training data. Now a new student comes in and asks: if my attendance is 61 to 70, what is the probability that I will pass the midterm? If you put the numbers in, (2/5) times (5/9) divided by (3/9), the 5s cancel, the 9s cancel, and 2 by 3 is the answer. It is simple; you can read it directly from the table. But imagine there is not 1 variable but 2 or 3 variables; then reading it off the table becomes tricky. That is where Bayes' theorem helps. So this is a very simple example of applying Bayes' theorem: training on the table and predicting whether a student will pass the midterm based on attendance.

Let us see how we can apply Bayes' theorem to multiple classes. In a multi-class problem there are classes C1, C2, C3, and so on. So, instead of a single yes or no class, there are multiple classes.
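The arithmetic of this worked example can be checked with a few lines of Python. This is only a sketch: the helper name `bayes` and the variable names are mine, and the three input values are the ones taken from the lecture's frequency table.

```python
from fractions import Fraction


def bayes(likelihood, prior, evidence):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Values read from the frequency table in the lecture.
p_att_given_pass = Fraction(2, 5)  # 2 of the 5 passing students are in 61-70
p_pass = Fraction(5, 9)            # 5 of the 9 students passed
p_att = Fraction(3, 9)             # 3 of the 9 students are in 61-70

posterior = bayes(p_att_given_pass, p_pass, p_att)
print(posterior)  # 2/3
```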
It is no longer just pass or fail. The classes could be marks below 30, 30 to 50, 50 to 80, and 80 and above, so 4 classes; or the student drops out in the first week, the student drops out in the second week, and so on; or whatever different classes you want to classify. And you may have multiple features: the student's interest, deadlines for other work, whether the student is sleeping at that time or working on other activities, the student's engagement in class, attendance, midterm marks; a lot of things can come in here. Having multiple features like this is what is called multivariate in regression. When you have multiple features and multiple classes to predict, simple Bayes can be extended to solve the problem.

Take a class Ci, say class 1. The probability of Ci given all the features a1, a2, a3 up to an is, by plain Bayes' theorem, the probability of Ci times the probability of a1, a2 up to an given that class, divided by the probability of all the features occurring together: P(Ci | a1, ..., an) = P(Ci) P(a1, ..., an | Ci) / P(a1, ..., an). That is exactly the application of Bayes' theorem, and this is exactly where you want to use it.

Now we bring in the naive assumption. The assumption is that a1, a2 and so on are independent, which makes this probability easy to compute. The probability of a1, a2 up to an all occurring given a particular class Ci can be split into the probability of a1 given Ci, times the probability of a2 given Ci, and similarly a3 given Ci, up to an given Ci. With this product rule, P(a1, ..., an | Ci) is simply the product of the individual probabilities P(aj | Ci) over all the features, and that takes care of this term in the equation.
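Written out, the two steps described here are Bayes' theorem for class C_i followed by the naive independence assumption. The index j over features is my own notation; the lecture only names the features a1 to an.

```latex
\begin{align}
P(C_i \mid a_1, \dots, a_n)
  &= \frac{P(C_i)\, P(a_1, \dots, a_n \mid C_i)}{P(a_1, \dots, a_n)} \\
P(a_1, \dots, a_n \mid C_i)
  &= \prod_{j=1}^{n} P(a_j \mid C_i)
\end{align}
```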
So you have that product, you have the probability of Ci already, and the remaining term we will deal with in a moment. With this equation, everything is easy to compute. It gives the probability for one class, say Ci with i equal to 1. Then we have to compute the same probability for i equal to 2 and i equal to 3; if we have 4 classes, we compute 4 different probabilities, find which one is the highest, and pick that class. Since we are only comparing the values for i equal to 1, 2, 3, 4, we can ignore the denominator, because it is common to all of them. So the Naive Bayes classifier simplifies a great deal: the score for Ci is the probability of all the features given Ci, which is exactly the product rule, times the probability of Ci. That is it.

So, what you have to do is this. Take class 1, and suppose there are three features a1, a2 and a3. The score for C1 given a1, a2, a3 is the product over i equal to 1, 2, 3 of the probability of ai given C1, times the probability of C1. It is very simple to compute. Similarly, you compute the score for C2 given a1, a2, a3, and for C3 given a1, a2, a3. Once you have all 3 values, you pick the highest one; that is the class for the given features.

So, this is how you train the Naive Bayes classifier. From the given input table you compute all these values: the probability of a1 given C1, the probability of a2 given C1, the probability of Ci, and so on. In other words, you compute the frequency table. Once you have the frequency table, it is very easy to classify a new set of features: will the student pass or not pass the exam, get more than 30 marks or less than 30 marks; all these different classes can be predicted.
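A minimal sketch of this train-then-predict procedure for categorical features is given below. The function names (`train`, `scores`, `predict`) are my own, not from the lecture. The training rows are rebuilt from the attendance counts in the earlier example, so there is only one feature here, although the code accepts any number of features per row. No smoothing is applied, so a feature value never seen with a class gives that class a score of zero.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def train(rows, labels):
    """Build the frequency tables: class counts and per-feature value counts."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, label)][value] = number of matching rows
    feature_counts = defaultdict(Counter)
    for features, label in zip(rows, labels):
        for j, value in enumerate(features):
            feature_counts[(j, label)][value] += 1
    return class_counts, feature_counts


def scores(model, features):
    """Score each class as P(C) * prod_j P(a_j | C); the denominator is dropped."""
    class_counts, feature_counts = model
    n = sum(class_counts.values())
    result = {}
    for label, count in class_counts.items():
        score = Fraction(count, n)  # P(C)
        for j, value in enumerate(features):
            # P(a_j | C) estimated from the frequency table
            score *= Fraction(feature_counts[(j, label)][value], count)
        result[label] = score
    return result


def predict(model, features):
    """Pick the class with the highest score."""
    s = scores(model, features)
    return max(s, key=s.get)


# Rows rebuilt from the lecture's attendance counts (one feature: the bin).
rows = [("40-60",)] * 5 + [("61-70",)] * 3 + [("71+",)]
labels = ["yes"] * 2 + ["no"] * 3 + ["yes"] * 2 + ["no"] + ["yes"]

model = train(rows, labels)
print(scores(model, ("61-70",)))
print(predict(model, ("61-70",)))  # yes
```

The scores are not probabilities because the common denominator was dropped. Dividing each score by the sum of all scores recovers them; for the 61 to 70 bin that gives 2 by 3 for yes, matching the earlier result.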
So, that is what the Naive Bayes classifier is all about. Here is one more simple example. Consider a student who is taking a course in the current semester in one particular subject, your subject. The first feature is whether the student passed all other exams in previous semesters: if he passed, it is yes; if not, it is no. The second feature is the student's attendance, low, medium or high; we split it into 3, maybe less than 30, 30 to 60, and 60 to 100, something like that. You split it into 3 bins, and the bin is the value, so it is a categorical variable. Then the midterm marks are also categorized: less than 35, 35 to 60, or more than 60. That is it, 3 bins. Given these values X1, X2, X3, you want to predict Y, that is, whether the student passes the final exam or not. If he passes, it is yes; if not, it is no.

So, the question is: what is the probability of passing for a student whose midterm mark is 55, whose attendance is low, and who has not passed all other subjects? He has not passed all the subjects; he might have passed 2 or 3 and missed 1 or 2. Given these values, what is the probability? What we are asking for is the probability that final exam equals yes, that is the class Ci (here there are just the two classes, yes and no), given that passed all other subjects is no, that is A1; attendance is low, that is A2; and midterm mark is 35 to 60, that is A3. If you compute the frequency table from this data table, you can compute this very simply. Start with the probability of passing: count the yes rows, 1, 2, 3, 4 and so on, and you get 7 out of 10, with 3 no. So the probability of yes is 7 by 10.
Then you compute the probabilities of the features given the class, like A1 given Ci: what is the probability that A1 is no, given this class value? If you compute those 3 conditional probabilities for A1, A2 and A3, you can simply substitute them and get the probability of whether the student will pass the exam or not. The Naive Bayes classifier is that simple. It is based on Bayes' theorem together with the naive assumption: it computes the frequency tables and uses those tables to predict future events. Thank you.