When it comes to finding clusters of variables in your data, the two most common approaches, by far, are principal component analysis, which we covered in a previous video, and exploratory factor analysis, which I'm going to talk about right here. For a lot of people, the differences between these two don't really amount to much. It's a distinction without a difference, or it's sort of like the two procedures are identical twins from different parents. That's an interesting thing, because these two methods make profoundly different assumptions about the relationship between the observed variables and the underlying factors, really about which one drives the other: in principal component analysis the components are built up from the observed variables, while in factor analysis the latent factors are assumed to give rise to the observed variables. But I want to show you how you can do both of these in jamovi, so you can make your choice depending on what seems to work best with your particular data.
Now, principal component analysis and exploratory factor analysis are sophisticated topics. My goal here is not to give you a thorough demonstration of the techniques and the theory, but really to show you how to set it up in jamovi and where to find the output, so you can match that up with your understanding of how these things work in real life.
To do this, I'm going to use the Big Five data set, where we have information from people about age and gender, and then we have 50 variables, 10 each for the five major personality characteristics: extraversion, conscientiousness, agreeableness, openness, and neuroticism. What we're going to do is come up here to Factor and choose Exploratory Factor Analysis. The first thing we need to do there is tell it what variables we're going to use. We're going to use the scale items that are supposed to be measuring these five characteristics. There are 10 each, so there are 50 total. I'm going to do a shift-click down to the bottom and then move those over to Variables. You can see the table starts populating immediately, but we do get to make some choices.
Now, first off, it's choosing to extract seven factors, and we saw this when we did principal components as well. But I want to do it a little bit differently. So I'm going to come over here and look at the different extraction methods. Minimum residuals is a lot like the least squares criterion that we have in regression. And we have some other choices. Principal axis is going to give you something very similar to principal component analysis. Let's use maximum likelihood, because that's also a really productive approach in a lot of situations.
We can also choose different kinds of rotation. Remember, a rotation is a way of re-expressing the same solution along different axes so that the results are easier to interpret. Think about it this way: if you have height as one variable and weight as another, those are going to be highly correlated, so you can rotate the axes a little bit and talk about big versus small instead, and that makes it more interpretable. So we can come down here, and I like promax, so I'm going to choose that one. It allows for correlated factors, and it's going to shift the table a little bit over here, but we do still have seven factors.
Now, because I know this is supposed to be measuring five factors, I'm going to come right here and choose a fixed number, and I'll put in five right there so we can see how well it lines up with that. When I do that, and when I let things sort in the order that they appear in the data, so we have e1 through e10 and so on, you can see they fall into this nice little step-step-step pattern. In the off-diagonal cells there are numbers, but you don't see them because they've been suppressed. Anything below 0.3 doesn't show. That's why we're missing one here, because it would be less than 0.3, and these ones are a little bit bigger, so they show. But you still see this really clear pattern going all the way down. I can sort them by size, but that makes it harder to interpret right here.
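To make the "anything below 0.3 doesn't show" step concrete, here is a minimal Python sketch of how hiding small loadings produces that staircase look. It is only an illustration, not how jamovi does it internally: the loadings are made-up numbers rather than output from the Big Five data, `format_loadings` is a hypothetical helper name, and comparing the absolute value of each loading against the cutoff is an assumption.

```python
# Hypothetical loadings for six items on two factors (made-up numbers,
# NOT output from the Big Five data set used in the video).
loadings = {
    "E1": [0.71, 0.12],
    "E2": [0.65, -0.08],
    "E3": [0.58, 0.31],
    "N1": [0.05, 0.74],
    "N2": [-0.22, 0.69],
    "N3": [0.10, 0.25],
}


def format_loadings(table, threshold=0.3):
    """Return one text row per item, blanking loadings with |value| < threshold.

    Using the absolute value is an assumption made for this sketch.
    """
    rows = []
    for item, values in table.items():
        cells = [
            f"{value:6.2f}" if abs(value) >= threshold else " " * 6
            for value in values
        ]
        rows.append(f"{item:<4}" + " ".join(cells))
    return rows


rows = format_loadings(loadings)
for row in rows:
    print(row)
```

In this made-up table, E3 shows a second, smaller loading because it clears the cutoff, and N3 shows nothing at all because both of its loadings fall under it, which is the kind of gap pointed out in the table above.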
I do want to mention a couple of other things that we can do. Number one is Bartlett's test of sphericity. Sphericity plays a role here a bit like the one normality plays in simpler analyses, where you check whether your data follow a bell curve. When you have a multidimensional data set like the one you're using with exploratory factor analysis, the thing you check instead is sphericity: Bartlett's test asks whether the correlation matrix could be an identity matrix, that is, whether the variables are all uncorrelated with each other. If we come down here, this is going to be similar to what we saw in our previous example with principal component analysis. Our data do differ significantly from sphericity, which for factor analysis is what you hope to see, because it means there are correlations among the items for the factors to pick up. Separately, it's still worth screening the distributions of the 50 variables going into the analysis, although it's a little tedious, to see, for instance, are they symmetrical? Do we have outliers? You can check that quickest by doing a whole series of box plots, where it's very easy to check symmetry and whether things roughly match the expectations of a normal distribution.
Another thing we can do is come down here and look at the factor summary, which is going to tell us the sum of squared loadings, the percent of variance that each of the factors accounts for, and the cumulative variance. This is basically identical to what we had with the principal component analysis. We can also look at the factor correlations, because I used a rotation that allowed the factors to be correlated with each other, and that's what we have right there. There is also a collection of model fit measures available. That includes things like RMSEA, the root mean square error of approximation. And we have several others, including the chi-squared there at the end, that you can use to see how well the data match the five-factor structure that we're putting in here.
And then finally, you can also do a graphical scree plot.
And this is a way of looking at how the factors that are determined by the factor analysis account for the variability that the data began with. If each variable is standardized, then you have one unit of variance for each of the variables. We have 50 variables, so we have 50 units of variance. We see that the first one accounts for about seven, then four, and so on. And these are the five that we expect from the Big Five. Now, empirically, we could go down to six or seven, but those don't have the theoretical tightness that we expect from a measure that's supposed to be looking at the Big Five personality factors.
And so, the analysis that we have here is nearly identical to the one that we had with principal component analysis. We certainly are getting the same conclusions out of it. Which one you choose really is a preference about the nature of the relationship between the individual items and the factors or components that you're getting from the analysis. Either one is going to give you insight into how, for instance, you could group variables, and either one is going to help you find the stability in your data by averaging across these items and getting more reliable conclusions that will work in new situations and help you put your insights into practice.
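The variance accounting behind the scree plot can be sketched in a few lines of Python. This is an illustration under stated assumptions, not jamovi's implementation: the 4-variable correlation matrix is hypothetical, `jacobi_eigenvalues` is a helper written just for this example (a plain cyclic Jacobi rotation routine using only the standard library), and it works with the eigenvalues of the correlation matrix. The point it shows is that with standardized variables the eigenvalues add up to the number of variables, so each one can be read as "units of variance" out of that total.

```python
import math


def jacobi_eigenvalues(matrix, max_sweeps=100, tol=1e-20):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    for _ in range(max_sweeps):
        # Stop once the off-diagonal part has been rotated away.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Angle (kept within +/- 45 degrees) that zeroes a[p][q].
                d = a[q][q] - a[p][p]
                sign = 1.0 if d >= 0 else -1.0
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], abs(d))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix: items 1-2 hang together, items 3-4 hang together.
corr = [
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]

eigenvalues = jacobi_eigenvalues(corr)
total = sum(eigenvalues)  # equals the number of variables for a correlation matrix
proportions = [ev / total for ev in eigenvalues]

running = 0.0
for index, (ev, share) in enumerate(zip(eigenvalues, proportions), start=1):
    running += share
    print(f"factor {index}: eigenvalue {ev:.2f}, "
          f"{share:.1%} of variance, {running:.1%} cumulative")
```

Plotting `eigenvalues` against their rank gives the scree plot shape. With real data you would more likely hand this off to a library routine such as `numpy.linalg.eigvalsh`; the hand-rolled helper is only there to keep the sketch free of dependencies.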