As I said in the last video, we now know how to code. We've covered all of the basics, so now we're looking at applications, the subtler, newer things we can do now that we know how to code; we're applying that ability. To demonstrate that, we'll talk about something like principal component analysis, which is a statistical technique, and we're going to run through it to get the values that we're looking for.

So what is principal component analysis? Big fancy five-dollar term. Well, it's meant to find the optimal hyperplanes, the axes that capture the most variance, when we're dealing with a large dataset. I know, that's not even a five-dollar word, that's a ten-dollar word. But the entire idea, just to work off of, say, this research paper: the idea is to use principal component analysis on food patterns to identify obesity in Nepalese adults. We happen to have a number of variables; you can see we've got different food groups going on here. And if I wanted to look at this data and ask, well, what are the eating habits of Nepalese adults? That's a very big, loaded question. How do I answer it? Principal component analysis is a way to look at these variables: if we have all of the food patterns from all of the people in this population, we can use it to identify patterns based on some set of coordinates. And "optimal hyperplanes" means that instead of just one X and one Y, we're dealing with, in this case, one, two, three, all the way up to 23 possible dimensions. That's what "optimal hyperplane" is referring to. But we can break this down, and you can see that common patterns can be observed: the mixed pattern, where people are eating a mix of healthy foods; the unhealthy pattern, where it's mostly fast food and sweets; then meat, proteins, and alcohol; and finally solid fats and dairy. The big idea is that they were able to identify these patterns by utilizing this model.

Let's boil it down even smaller. Say we were dealing with one dimension: I have a vector of just 100 values. Looking at them, is there a way they could be divided such that I'm properly separating them out? Well, yeah, obviously; I built this graphic specifically for that reason. I could split it down the middle somewhere here, and yes, this shows me the very easiest way to divide these two groups and see their variance. Ta-da. But what happens when I start getting into more dimensions? In this case, you can see the cloud is slightly tilted; it's not a perfect circle. Let me make my color a little darker so it stands out. I could work off of this angle, and it won't be perfect because I'm drawing freehand, but we can see that this angle covers a large amount of the data and the variance going on there.
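Just to make that "variance along an angle" idea concrete in code, here's a minimal sketch. The data and the angles are made up for illustration, not taken from the slides: it projects every 2D point onto a direction and measures the variance of those projections.

```python
import numpy as np

# Illustrative only: a tilted 2D point cloud (made-up data, not the video's).
rng = np.random.default_rng(0)
x = rng.normal(0, 10, 200)
y = 0.8 * x + rng.normal(0, 3, 200)  # y is correlated with x, so the cloud tilts
points = np.column_stack([x, y])

def variance_along(points, angle_degrees):
    """Project the points onto a unit vector at the given angle; return the variance."""
    theta = np.radians(angle_degrees)
    direction = np.array([np.cos(theta), np.sin(theta)])
    projections = points @ direction
    return projections.var()

# The direction along the cloud's tilt captures far more variance
# than the direction perpendicular to it.
print(variance_along(points, 38.7))   # roughly along the long axis: large
print(variance_along(points, 128.7))  # perpendicular to it: much smaller
```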
So if we were to square this out, try to turn it almost into a shape, well, that covers a large amount of the data. I won't call it an X or a Y exactly; the point is that this angle covers a very large amount of our data. And as you can already guess from what I was drawing out, this second angle covers the second direction of variance in our X, Y plane. Now, visually, yes, that was taken care of quite easily, but when we're dealing with 23 variables, you can't find optimal hyperplanes in 23 dimensions visually. And if you can, you're not watching this video; you're a genius beyond human comprehension.

Anyway, my point is that these lines are where we start to represent something known as eigenvectors: these axes, these angles. Because this angle has the largest amount of variance, it covers and shows the largest amount of variance in our values, we would classify this angle, this eigenvector, as our first principal component. And then the same kind of thing: since we're dealing with two dimensions in this 2D world, I would need to look at the next dimension. In this case there are two; if we were dealing with the eating habits, there would be 23 variables, so 23 dimensions or principal components. Not all of them are going to carry much variance, but they exist.

Either way, to start principal component analysis, you have to calculate covariances, because what you're trying to do is see which variables have positive covariances, negative covariances, or no covariance at all. If you were dealing with this formally, you would be shown the covariance formula, and that's it; you now have to implement it. So what does that mean? Well, if we break it down: given some array or list of numbers, each number is represented by its respective location; we also need the means; and we need to know how many data points we're dealing with. Now, I'm not going to work off of this many numbers, because that's too many, or this, because it's very complex; I'll work off of just a very simple X, Y plot. Here it is. You can already visualize that this is going to be our principal component, our optimal hyperplane or axis: here's our largest, and here's our smallest and slimmest. It could go a little further, but effectively you can see it. Good. The same kind of concept comes into play here: we would need to calculate the means, here's my n, and then I would need to start going through those calculations as necessary.

Now, in a nutshell, the slides can do that, or we happen to have the ability, once again, of applying this thing called Python to these calculations. So I've just taken those values and converted them into NumPy arrays. I do have the approximation Unicode symbol here; that's mostly for visualization purposes. But you can see that NumPy already happens to have .mean(), a built-in way to calculate the mean for you, because it's such a common statistic.
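Assuming the formula on the slide is the standard sample covariance, cov(X, Y) = Σ(xᵢ − x̄)(yᵢ − ȳ) / (n − 1), here's a minimal sketch of exactly those pieces: the array, the means, and n. The x and y values are made up for illustration, not the slide's numbers.

```python
import numpy as np

# Made-up sample data for illustration (the video uses its own slide values).
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 7.0, 9.0])

n = x.size              # number of data points
x_prime = x - x.mean()  # each x minus the mean of x
y_prime = y - y.mean()  # each y minus the mean of y

# Sum the element-wise products, then divide by n - 1
# (the sample covariance, which is also np.cov's default).
cov_xy = (x_prime * y_prime).sum() / (n - 1)
print(cov_xy)           # matches np.cov(x, y)[0, 1]
```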
Since it's already built in, you don't have to write it yourself, and NumPy would also let you calculate the median, mode, or other statistics. Either way, you can see that now I have roughly 1392, 62, 42. We double-check, and yes, that does in fact equal what my slides say. I would hope so, because I ran them together when I filled them out. Anyway, my point is: congratulations, I've got the means. Same kind of concept next: I need to get that X prime, where each individual x is subtracted by the mean. Well, remember that this is NumPy, and so NumPy says, I'm just going to go element by element in X and subtract this scalar value from each one. It automatically does it all for us. Isn't that beautiful? Great, wonderful, magical, fantastic. Same thing with my Y. And then, as you can imagine, I can do the exact same thing off of these calculated X and Y primes; I just called them primes as a term. I've got the calculations, and doing a comparison off of those, you can see we get 115 and then negative 6.9. They're in scientific notation, but there's that same value. And then we're almost done: we take that, we sum it together, and then we divide it. Congratulations, that's how you calculate the covariance between two arrays.

Now, that's not principal component analysis at all; that just told me the covariance. And that is correct: when we're dealing with a large dataset, I just found one covariance out of all of them. If I happen to have 23 variables, like the eating habits of Nepalese adults, that's a lot to process. We're only dealing with an X and a Y because it's easy, but you can imagine you'd have to do this over and over and over again. Well, luckily, there happens to be a way to do covariance matrices with NumPy: np.cov. Ta-da, there it is. And if you notice, it's not just that 2.94 that we were getting; it also produces the covariance of X compared to X and of Y compared to Y, the full covariance matrix. Because again, if we're thinking about principal components, we're looking for where those variables have correlations, positive covariance. So in that case, it works. And if you're feeling froggy, you can do a third variable as well; in that case you do need to pass the arrays as a list, that's more of a NumPy thing, but it will do it for you. So you can see: here's my covariance matrix off of just X and Y, and here's my covariance matrix off of X, Y, and this magical random Z that I've produced. I'm cheating ahead slightly here, but as you can see, in a nutshell, we can do this very quickly in one step. Fantastic, beautiful.

Okay, the next step is taking that covariance matrix of all those different values, and, guess what, by hand we'd have to build an identity matrix, find some lambdas, and solve a determinant, and here's a great YouTube video that'll walk you through that. Or, no, we can just use NumPy. numpy.linalg, the linear algebra sub-library inside of NumPy, happens to have its own function, eig, which will magically produce the eigenvalues and the eigenvectors for your principal components. And so here we go. Let me jump down here; I'm using the 2D case, so just X and Y, and you can see we're going to get a few different calculations.
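Pulling those two calls together, here's a minimal sketch of that one-step pipeline, again with the same made-up x and y values rather than the slide's data:

```python
import numpy as np

# Made-up 2D data for illustration.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 7.0, 9.0])

# np.cov on a list of arrays returns the full covariance matrix:
# [[cov(x, x), cov(x, y)],
#  [cov(y, x), cov(y, y)]]
cov_matrix = np.cov([x, y])
print(cov_matrix)

# np.linalg.eig hands back the eigenvalues and eigenvectors directly;
# no identity matrix, lambdas, or determinants by hand.
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)
print(eigenvalues)   # how much variance each direction carries
print(eigenvectors)  # the directions themselves, one per column
```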
The first output is the eigenvalues. Eigenvalues represent how important a particular principal component is; those were the words I was looking for. So in a nutshell, you see 6.18 and then 411. Again, that's representing how important a particular angle, remember, we're dealing with these as angles, a particular principal component is. So 411, compared to all of our other eigenvalues, means that this principal component, this vector, this angle, is our most important: our first principal component. And here is the angle, the vector, the eigenvector, of that value. Just like we saw when we were drawing it out, we've identified that this is in fact our first principal component.

Here are some more details going on there. You can see that I can break that out even further, where I grab the max: it produces the max eigenvalue, the index where it sits, and in particular what the eigenvector at that index is. If we do some jumping back and forth, those values are what we're seeing going on here. In the case of the principal components for the Nepalese diet habits, you're seeing those same kinds of values; you can see where the important values were, and they bold the ones that are just larger. Our values are a little different, because we're dealing with random data on an X, Y plane, but you can see that the Y value carries more weight than the X value. And yes, I know the plot doesn't look terribly like that, but that's because Y goes from, in this case, 30 to 90, a range of 60, versus X's zero to 25, which is not as pronounced. So Y is going to be the heavy component here. But as you can see, we can produce them; this is a terrible drawing of those vectors. There are methods for drawing arrows onto these graphics, but that is something we'll talk about a little later.
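If you want to see that max-grabbing step on its own, here's a hedged sketch; the variable names are mine, and the data is the same made-up sample from the earlier sketches:

```python
import numpy as np

# Same made-up data as the earlier sketches.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 7.0, 9.0])

eigenvalues, eigenvectors = np.linalg.eig(np.cov([x, y]))

first_index = np.argmax(eigenvalues)            # index of the largest eigenvalue
first_value = eigenvalues[first_index]          # the variance that direction carries
first_component = eigenvectors[:, first_index]  # note: eigenvectors are the COLUMNS

print(first_value, first_component)
```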