Hello everyone. Today we are going to discuss the impact of machine learning algorithms on applications. Let us look at the learning outcome of this topic: at the end of the session, students will be able to explain the characteristics and working of machine learning algorithms.

As all of you know, machine learning algorithms play different roles in different types of applications. We have different types of machine learning algorithms, which are used to build the right model from the right data set. When we use a particular data set, we have to decide which algorithm is best suited to it. That is why we have selected three top families of algorithms in machine learning: first, classification; second, association rule mining; and third, clustering.

When we have a large data set or a large database, the first thing we have to do is classify the data. Classification means putting a label on each example in the data set. Why is the label required? Because classification is a predictive modeling task in which a class label is predicted for a given example of input data, and the model learns this from labeled examples. That is why labeling the data is very important. Among the most widely used algorithms for classification are the decision tree algorithm and the support vector machine algorithm.

Second, association rule mining. We use association rule mining when we have a large data set and want to find the itemsets that occur frequently in the database. For this, we use the Apriori algorithm, which is a well-known algorithm for generating rules from the frequent itemsets found in the database.

The third algorithm is the clustering algorithm.
It is an unsupervised problem of finding natural groups in the feature space of the input data. For this, we collect data from various sources, apply a clustering algorithm to the data set, and let the input data form the clusters. Among the most widely used algorithms for clustering are the K-means algorithm and the mini-batch K-means algorithm.

So first we go for the K-means clustering algorithm. In K-means clustering, we first collect the data set from various sources, and then we group the objects in it, based on their attributes or features, into K groups. The developer or user has to decide K, the number of groups or clusters. Here K is a positive integer, and the grouping is done by minimizing the sum of squared distances between the data and the corresponding cluster centroid. Once the data are grouped, each object carries the label of its group, and we can use the grouped data to build the right model. That is why K-means is a very important algorithm for clustering in machine learning.

When we go for K-means clustering, we have to follow some steps to cluster the data. First, we determine the number of clusters K for the data set we have received. Then we assume a centroid, or center, for each of these clusters. For this, we can take any random objects as the initial centroids, or the first K objects in the sequence can also serve as the initial centroids. Once we have the centroids, we calculate the distance of each object from each centroid.
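To make the idea of the initial centroids and the sum of squared distances concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the four two-dimensional points, the grouping in `labels`, and the function names `initial_centroids` and `sum_of_squares` are hypothetical and are not part of the lecture material.

```python
# Illustrative sketch only: the data, labels, and function names are hypothetical.

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def initial_centroids(data, k):
    """Use the first k objects in the sequence as the initial centroids."""
    return [tuple(point) for point in data[:k]]

def sum_of_squares(data, centroids, labels):
    """Sum of squared distances between each object and its cluster centroid."""
    return sum(squared_distance(p, centroids[c]) for p, c in zip(data, labels))

data = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
centroids = initial_centroids(data, k=2)  # first two objects: (1, 1) and (2, 1)
labels = [0, 1, 1, 1]                     # one possible grouping of the four objects

print(sum_of_squares(data, centroids, labels))  # prints 26.0
```

K-means tries to make this number as small as possible, so a better grouping and better centroids would give a value lower than 26.0 for the same data.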
Here the K-means algorithm follows three important steps, which are repeated while calculating the centroids. These steps are shown in the form of a flow chart. First is the start, that is, opening the data set. Then we have to decide the number of clusters for the data set, like K equal to 2 or K equal to 4, and so on. Once we have the number of clusters, we calculate the centroid of each cluster. Once the centroids are calculated, the distance from each object to each centroid is evaluated. This is very important. Once we have the distances, we group the objects based on the minimum distance. Then we are ready to move the objects as per their groupings, that is, as per their similar properties: if an object is closest to the first centroid, we send it to group one; if it is closest to the other centroid, we send it to group two. If any object has moved to a different group, we go back and recalculate the centroids. If no object moves, the procedure stops there. That is what the flow chart shows.

When we follow the steps for K-means clustering, we also have to consider the case where the number of data is less than the number of clusters. In that case, we assign each data as the centroid of a cluster, and each centroid will have a cluster number. This is important because the cluster number identifies the centroid when we measure distances and assign groups. If the number of data is bigger than the number of clusters, then for each data we calculate the distance to all centroids and take the minimum distance. Based on the minimum distance, we decide whether the data belongs to group one or group two: the data is said to belong to the cluster that has the minimum distance from it.
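The flow chart steps can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function names, the choice of the first K objects as initial centroids, the rule of keeping the old centroid when a cluster becomes empty, and the four sample points are all assumptions made for the example.

```python
# Illustrative sketch only: names, sample data, and the empty-cluster rule are assumptions.

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(data, centroids):
    """Group each object with the centroid at minimum distance."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in data
    ]

def update(data, labels, centroids):
    """Recompute each centroid as the mean of the objects in its group."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(data, labels) if label == c]
        if members:
            new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new_centroids.append(old)  # empty group: keep the previous centroid
    return new_centroids

def k_means(data, k, max_iter=100):
    """Repeat assign and update until no object moves to another group."""
    centroids = [tuple(p) for p in data[:k]]  # first k objects as initial centroids
    labels = None
    for _ in range(max_iter):
        new_labels = assign(data, centroids)
        if new_labels == labels:  # no object moved: the grouping is stable
            break
        labels = new_labels
        centroids = update(data, labels, centroids)
    return labels, centroids

data = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
labels, centroids = k_means(data, k=2)
print(labels)     # prints [0, 0, 1, 1]
print(centroids)  # prints [(1.5, 1.0), (4.5, 3.5)]
```

On this small data set the second object starts in group two, moves to group one in the second pass, and the third pass finds that nothing moves, so the procedure stops.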
Once we have the distances and the grouping of the objects, we iterate until the data are stable. Stable means that no object moves to a different group. So the steps we follow are: determine the centroid coordinates, determine the distance of each object to the centroids, and group the objects based on minimum distance. We iterate these steps until the data objects remain stable. If the data objects remain stable, we conclude that no object has moved to any other group, and we consider those groups to be the final groups.

After studying the K-means algorithm, here is a question for you: which of the following algorithms is most sensitive to outliers? The options are: (a) K-means clustering algorithm, (b) K-medians clustering algorithm, (c) K-modes clustering algorithm, (d) K-medoids clustering algorithm. Think about this question and give the answer.

The answer is the K-means clustering algorithm. Why? Because it uses the mean of the data points in a cluster to find the cluster center, and a mean is pulled strongly by extreme values. K-medians uses the median of each cluster as its center, K-modes uses the mode and is meant for categorical data, and K-medoids uses an actual data point, the medoid, as the center, so these are less affected by outliers. That is why the algorithm most sensitive to outliers is K-means. So the answer is (a), the K-means clustering algorithm. These are the references for this topic. Thank you.