In the previous video we talked about K-Means clustering and how we can find interesting groups in our data. But how does K-Means actually work? That's easy to find out. We can use Interactive K-Means from our Educational add-on. This add-on is designed for teaching machine learning, and it includes some wonderful instructive widgets. To install it, go to Options, choose Add-ons and select Educational. You will need to restart Orange for the widgets from this add-on to become active.

We will first plot a data set with three groups of data instances to see if we can retrieve them with K-Means. Now connect Paint Data to Interactive K-Means. Besides the data, the widget also plots the centroids, marked with squares, each one in its own color. Centroids are the assumed centers of clusters. Interactive K-Means places them randomly. Notice that each data point is associated with the closest centroid. In this way, the centroids define the clusters. But while we have three clusters, they are not the ones we would wish for.

First, it would be better to move each centroid to the center of its data points. We do this by pressing the Recompute Centroids button. The centroids moved. Look at the red centroid: some green and blue points are now closer to it than to the other two centroids. We need to reassign centroid membership so that each data instance is labeled with the closest centroid. We do this by pressing Reassign Membership. We can again move each centroid to the center of its data instances using Recompute Centroids, and again reassign the membership. We repeat these two steps until convergence, that is, until the positions of the centroids no longer change. For our data, convergence took only a few steps, and the algorithm found the appropriate clusters.

Now let us try to place the centroids so that the algorithm fails. Here, I will put the red centroid between two groups of data points, and the green and the blue centroids so that they share the remaining group.
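The two buttons in the widget correspond exactly to the two steps of the K-Means algorithm. As a minimal sketch (plain Python, no Orange involved; the function name `kmeans` and its signature are illustrative, not part of any library), alternating the steps until the centroids stop moving looks like this:

```python
from math import dist  # Euclidean distance, Python 3.8+

def kmeans(points, centroids, max_iter=100):
    """Alternate the widget's two steps, Reassign Membership and
    Recompute Centroids, until the centroids no longer move."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Reassign Membership: label each point with the index of the closest centroid.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Recompute Centroids: move each centroid to the mean of its points.
        new_centroids = []
        for k, c in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # a centroid with no members stays where it is
                c = tuple(sum(coord) / len(members) for coord in zip(*members))
            new_centroids.append(c)
        if new_centroids == centroids:  # converged: nothing moved
            return labels, centroids
        centroids = new_centroids
    return labels, centroids
```

Just as in the widget, the result depends on where the initial centroids are placed; the loop only guarantees convergence, not the clustering we would wish for.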
Now press Recompute Centroids and Reassign Membership a couple of times. The algorithm converged, but the clustering it found is not the one we had wished for.

Today we have learned how K-Means finds clusters, with the help of an interactive visualization. We also learned that the initial placement of centroids matters. This is why K-Means normally uses some heuristic for smart placement, and why we run the algorithm several times and report only the best clustering.
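The "run several times, keep the best" idea can be sketched as follows. Runs are compared by their inertia, the total squared distance of points to their centroids (lower is better). This is a self-contained illustration under my own assumptions; the names `kmeans_once` and `best_of` are hypothetical, not Orange's API:

```python
import random
from math import dist  # Euclidean distance, Python 3.8+

def kmeans_once(points, k, rng):
    """One K-Means run from a random initialization (centroids drawn from the data)."""
    centroids = [tuple(p) for p in rng.sample(points, k)]
    while True:
        # Reassign Membership, then Recompute Centroids, until nothing moves.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new.append(tuple(sum(c) / len(members) for c in zip(*members))
                       if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    # Inertia: total squared distance of points to their centroids.
    inertia = sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, inertia

def best_of(points, k, runs=10, seed=0):
    """Restart K-Means several times; keep the clustering with the lowest inertia."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(runs)),
               key=lambda result: result[1])
```

A single run can converge to a poor clustering, exactly as we saw with the deliberately bad centroid placement; with enough restarts, at least one run is very likely to start well and reach the low-inertia clustering.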