This video is going to be divided into three parts. First, a high-level intuition on what K-means is, what it does, and the algorithm. Next, we're going to represent it in math notation. And finally, we'll use this notation to code an image compressor. I'm AJ Hathor, and let's just jump into the big picture.

K-means is a non-parametric method of clustering. Non-parametric? That means the model's complexity depends on the number of samples rather than on a fixed set of parameters. Clustering? Grouping similar data points to form clusters. How many clusters? Well, however many you want. This is defined by setting K in K-means. It's a tunable hyperparameter. So what's a hyperparameter? A parameter that is set by the programmer manually and is not learned by the algorithm. In summary, K-means is a non-parametric method of clustering where we pre-define the number of clusters. You got that? I know you got that.

Now let's move to the algorithm. Step one, determine the number of clusters K you want to divide the data into. Step two, set K random non-overlapping points as the cluster centers. You can just take these from your data points. Step three, assign each of the n data points to the closest cluster center. Step four, for every cluster, compute the centroid. Step five, the centroid is now the new cluster center. And then for step six, we just go back to step three, and we repeat until convergence. So how do we define convergence? Well, when the cluster centers don't change much between successive iterations of the algorithm, you know that it has converged. We end up with the n data points separated into K distinct clusters.

So here's a question. When we're computing the cluster center, why exactly are we considering the centroid? Well, in order to understand that, we'll have to go through the math, so let's do that right now. As with any type of statistical learning method that we've discussed before, the idea is to optimize some objective or minimize a loss.
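The six steps above can be sketched in a few lines of numpy. This is an illustrative sketch, not the code from the video; the function and variable names are my own, and convergence is checked by comparing successive cluster centers:

```python
import numpy as np

def kmeans(X, K, n_iters=100):
    """Cluster the rows of X into K groups (illustrative sketch)."""
    n = X.shape[0]
    # Step 2: pick K distinct data points as the initial cluster centers.
    centers = X[np.random.choice(n, K, replace=False)]
    for _ in range(n_iters):
        # Step 3: assign each point to its closest cluster center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Steps 4-5: the centroid of each cluster becomes the new center.
        # (A production version would also guard against empty clusters.)
        new_centers = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        # Step 6 / convergence: stop once the centers barely change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

On a toy dataset of two well-separated blobs, this assigns each blob its own cluster within a couple of iterations.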
For K-means, this loss is the distortion function. In plain English, the distortion is the sum of squared distances of every sample from its cluster center. In this form, J is the distortion measure, x_n is the n-th sample, and mu_k is the k-th cluster center. R represents the cluster membership. It's an n cross K matrix with values either 0 or 1: r_nk is equal to 1 only for that value of k where the k-th cluster center is the closest, and otherwise it's 0. So you can see that every row is actually a one-hot encoded vector.

In K-means clustering, the only set of parameters we need to learn are the cluster centers. The idea is to find the optimal cluster centers mu_k that minimize the distortion. We determine them by taking the derivative of the distortion with respect to mu_k and equating it to 0. Let's distribute the sigma over the terms, remove the common factor of 2, take mu_k out of the sigma as it is independent of n, and bring all the terms to the right-hand side. In English, what does this represent? Well, the numerator is the sum of all samples x_n belonging to the k-th cluster, while the denominator is just the number of samples in cluster k. This sum of all samples in a cluster divided by the number of samples in the cluster is the definition of the centroid. I hope you now understand why we compute centroids in order to define the cluster centers: it actually gives us meaningful results.

Now let's revisit the K-means algorithm, but we'll math it up a bit. Step 1, initialize some values for the cluster centers mu_k. These are typically K distinct data points. Step 2, set the initial distortion to infinity. Step 3, determine the cluster membership r_nk for all samples. Step 4, determine the new distortion J. Step 5, recompute the cluster centers mu_k with the centroid update rule. And for step 6, we go back to step 3, where we once again determine the cluster membership. We repeat these steps until convergence.
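Written out in the notation above, the distortion and the centroid derivation described verbally are:

```latex
J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk}\,\lVert x_n - \mu_k \rVert^2

\frac{\partial J}{\partial \mu_k}
  = -2 \sum_{n=1}^{N} r_{nk}\,(x_n - \mu_k) = 0
\quad\Longrightarrow\quad
\mu_k = \frac{\sum_{n} r_{nk}\, x_n}{\sum_{n} r_{nk}}
```

Since r_nk is 1 only for samples in cluster k, the numerator is the sum of those samples and the denominator is their count, which is exactly the centroid.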
That is, either the cluster centers don't change between successive iterations, or the distortion J decreases only ever so slightly between iterations.

Let's now use K-means to build an image compressor. Take a look at these baboon images. They don't look too different, right? Well, they kind of are. The one on the left has hundreds of colors, but the one on the right has just 16. We can see this difference if we zoom in: on the left we see a nice gradient, while the image on the right is kind of choppy. The idea behind compressing an image is to replace every pixel in the image with one of K colors. These K colors should be chosen such that the distortion of every pixel is minimal, and determining these K colors is done with the K-means algorithm. Effectively, the size of the image decreases significantly: given just 16 colors, each pixel can now be represented in 4 bits as opposed to 24-bit or 32-bit color.

First, import numpy, Python's library for math and matrix operations. Create a class KMeans with a constructor that takes 3 arguments. The first is the number of clusters, that is, the K in K-means. The second is the maximum number of iterations to run before returning. And the third is the tolerance: the difference between successive distortion values below which the algorithm is considered converged. Next, define the fit function, which does the heavy lifting. It takes the n cross d input matrix as an argument. Now, just follow the exact steps we described in the K-means algorithm. Initialize mu_k as random samples. Initialize the distortion J to infinity. Compute the cluster membership r. Compute the difference between the current and previous values of the distortion J. If it is less than the tolerance, then the algorithm has converged. Otherwise, update the cluster centers and continue. And we're done here. Now we can build the image compressor around it.
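A minimal sketch of the class just described might look like the following. This is my own reconstruction from the spoken description, not the repository code; the argument names and the distortion-tolerance check are assumptions:

```python
import numpy as np

class KMeans:
    def __init__(self, k, max_iterations=100, tolerance=1e-4):
        self.k = k                        # the K in K-means
        self.max_iterations = max_iterations
        self.tolerance = tolerance        # convergence threshold on distortion

    def fit(self, X):
        """X is the n x d input matrix; learns self.centers, returns labels."""
        n = X.shape[0]
        # Initialize mu_k as K distinct random samples.
        self.centers = X[np.random.choice(n, self.k, replace=False)]
        prev_j = np.inf                   # initialize distortion J to infinity
        for _ in range(self.max_iterations):
            # Cluster membership r: index of the closest center per sample.
            dists = np.linalg.norm(
                X[:, None, :] - self.centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Distortion J: sum of squared distances to assigned centers.
            j = np.sum(dists[np.arange(n), labels] ** 2)
            # Converged if J barely changed since the previous iteration.
            if prev_j - j < self.tolerance:
                break
            prev_j = j
            # Otherwise, recompute centers with the centroid update rule.
            self.centers = np.array(
                [X[labels == k].mean(axis=0) for k in range(self.k)])
        return labels
```

Calling fit on an n x d matrix of samples learns the cluster centers and returns each sample's cluster index.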
We define a function transformImage that takes the original image as input and outputs a compressed image built from code vectors, the K colors we determined. Iterate over every pixel and replace it with its closest code vector. In the K-means image compression method, we read the image, call K-means to learn the best code vectors, compress the image using these code vectors, and save the new compressed image. And we're done. Just run the k-means test script, and it will generate the compressed image in the plots folder. And so we have successfully implemented an image compressor using the simple K-means algorithm. You can get all this code on GitHub, so check it out.

And that's all I have for you now. If you liked the video, hit that like button. If you like videos like this, well, hit that subscribe button for videos on AI, machine learning, data science, and deep learning. For notifications when I upload, ring that little bell icon. Links to the code and other resources used in this video are down in the description below, so check that out. Still haven't gotten your daily dose of AI? Well, click or tap one of the videos right there for another awesome video. And I will see you in the next one. Bye.
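For completeness, here's a sketch of the transformImage step described above. This is my own vectorized version, not the video's code; it assumes the image is an (H, W, 3) numpy array and the code vectors are the K learned colors in a (K, 3) array:

```python
import numpy as np

def transform_image(image, code_vectors):
    """Replace every pixel with its closest code vector (illustrative sketch)."""
    pixels = image.reshape(-1, 3).astype(float)   # flatten to n x 3
    # Distance from every pixel to every code vector, then pick the closest.
    dists = np.linalg.norm(
        pixels[:, None, :] - code_vectors[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    # Look up each pixel's K-color and restore the original image shape.
    return code_vectors[nearest].reshape(image.shape)
```

Feeding it the centers learned by fit gives the compressed image, where every pixel is one of the K colors.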