The introduction to principal component analysis in my previous video was intentionally brief. We wanted to dive into using PCA on a dataset and enjoy the benefit of visualizing multidimensional data by reducing it to two dimensions. And so we did: we took a peek into a dataset with 16 features describing around 100 animals. Now it's time to slow down a bit and delve further into the details of this exciting reduction of dimensionality.

I'll use Orange's Paint Data widget to create a two-dimensional dataset and try to make it as linear as I can. This way, to determine the position of each data point, it is sufficient to know its projection onto this line. These distances, from the center of all the points to the individual projections, are the values of the first principal component. We can now use these distances, that is, these components, to represent our data with a single number.

The prevalence of the first component in our data is also evident in the scree plot of the PCA widget. The first principal component carries 98% of the information about this data. Everything that remains is the projection onto the second principal component, which is orthogonal to the first one. We can see here that the projections onto the second principal component cluster together, and the differences between the projection values are fairly small. Notice also that the scree plot reports only two components, as they are sufficient to describe all of our two-feature data. For more complex data with a higher number of features, the scree plot would report a higher number of components.

Now we can observe the data in the projection space. To do so, let's use a scatter plot and set the axes to PC1 and PC2. It looks a bit messy, but notice the scale: PC1 spans from negative 2 to positive 2, while PC2 has a much smaller value span. We can squeeze this graph to scale it properly, like this. PC1 is the major axis, and the data spans its entire range.

I can also check the location of the original data in the projected space. I'll first plot the original data in a scatter plot, select, say, a few data points in the upper corner of the plot, and send that subset to the plot that shows the projected space. Here they are, where they should be. And here are my data points from the bottom and from the center.

Now, what would happen if I added a few more data points? This time I might place them a bit to the side, like this. See the change in the scree plot: the first component now explains only a portion of the variance, and is nowhere near the full 100%. If I add more offset data, this gets even worse. If my data were evenly distributed across the entire data space, the explained variance of the first principal component would approach 50%, so we would not be able to effectively reduce such data to a single component.

I hope this tells us something about PCA: we can only successfully reduce the number of dimensions if the original features are correlated. In our two-dimensional space, that means the data lies along a line. In the absence of any correlation, there is no opportunity to reduce the dimensionality. In real data sets, however, features most often do correlate to some degree, and there are groups of correlated features as well. For such data sets, the application of principal component analysis makes a lot of sense. It is then useful to find out more about the directions of the principal components and the features they most depend on.
I'll explore this in my next video.
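If you'd like to reproduce the experiment from this video in a script rather than with the Orange widgets, here is a minimal sketch using NumPy and scikit-learn. The generated data, noise levels, and variable names are illustrative assumptions, not the exact points painted in the video; the point is simply that nearly linear data gives a first component explaining close to 98% of the variance, while adding off-line points pushes the split toward 50/50.

```python
# Minimal sketch (not the Orange workflow shown in the video): generate nearly
# linear 2-D data, fit PCA, and inspect how much variance PC1 explains.
# Sample counts and noise scales are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# ~100 points lying close to a line: the two features are strongly correlated.
x = rng.uniform(-2, 2, size=100)
y = 0.5 * x + rng.normal(scale=0.05, size=100)   # small scatter off the line
linear_data = np.column_stack([x, y])

pca = PCA(n_components=2)
scores = pca.fit_transform(linear_data)          # projections onto PC1 and PC2

# PC1 should carry almost all of the variance, so a single number per point
# (its PC1 score) describes the data well; PC2 scores stay small.
print("explained variance ratio:", pca.explained_variance_ratio_)
print("PC1 span:", scores[:, 0].min(), scores[:, 0].max())
print("PC2 span:", scores[:, 1].min(), scores[:, 1].max())

# Add points scattered off the line: the correlation weakens and the first
# component explains a smaller share of the variance. With data spread evenly
# over the plane, the two ratios approach 50% each.
off_line = rng.uniform(-2, 2, size=(100, 2))
mixed_data = np.vstack([linear_data, off_line])

pca_mixed = PCA(n_components=2).fit(mixed_data)
print("explained variance ratio with offset points:",
      pca_mixed.explained_variance_ratio_)
```

Running the sketch prints the explained-variance ratios before and after adding the offset points, which mirrors what the scree plot in the PCA widget shows as the data becomes less correlated.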