Let's take a look at the matrix presentation of a latent variable structural equation model. When you study books on SEM, you'll often see tables like this. This one is from Todd Little's book, and it is full of Greek: lambda, psi, phi, and so on. These symbols refer to matrices. And if you look at software manuals, for example Stata's manual, you'll find things like this: matrices with their explanations.

Why is understanding all this stuff useful? Why should you know some Greek if you do SEM? There are a couple of reasons. First, your software might report matrices. Even if it doesn't report matrices by default, all SEM software that I know of can report the estimates as matrices, and that's actually a very useful format for presenting things, because it makes it easy to check what is being estimated and what is not estimated or is constrained to be zero. Second, some books and articles use matrices. The more advanced books on SEM in particular use matrix equations to show features of models, and to follow them you need to understand a bit of the logic behind the matrix presentation of an SEM. Third, if you want to calculate model-implied covariances from a model, which you might want to do for diagnostic purposes, doing that by hand one covariance at a time, tracing all the paths, is very tedious, but with matrices you can calculate them all in one go. And finally, understanding the matrices, or rather the different matrix presentations, because there is more than one, allows you to understand why your software will not allow, for example, certain kinds of covariances or certain paths: they may simply not be available in the matrices that the software uses.

Let's take a look at Stata. Stata's user manual contains this example of the estat framework command, which prints out the estimated structural equation model in the Bentler-Weeks matrix system. There are different systems for constructing the matrices, and they vary in how many matrices are used and how large those matrices are, but they all work on the same basic principle. Here we have the matrices beta, gamma, psi, alpha, phi, and kappa, and Stata fortunately provides an explanation, for example that kappa contains the means of the exogenous variables, but not all software does that in its output. So remembering what the different matrices are in your software is useful.

Why would we want to look at these matrices instead of just the parameter list that Stata, like most SEM software, prints out by default? Because here we can easily see what is being estimated and what is being constrained. For example, we can see at a glance that all the error covariances are constrained to be zero: they are shown as zeros. We can also see the pattern of regression coefficients, which ones are estimated and which are not. And we can see whether there are any constraints on, say, the exogenous variables; we probably don't want those, and if we have them, they will show up here. So in addition to the estimates, this presentation shows you what is not being estimated, and that is a very useful thing to check every time you estimate a model.
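To give a rough idea of what such a system looks like, here is a sketch of the Bentler-Weeks form. This is my summary of the general Bentler-Weeks conventions, not a verbatim reproduction of Stata's output: every variable, whether observed, latent, or an error term, is classified as either dependent or independent, and the whole model reduces to a few matrices.

```latex
\begin{align*}
\eta &= \beta\eta + \gamma\xi
  && \text{dependent variables regressed on each other and on independent variables} \\
\Phi &= \operatorname{Cov}(\xi)
  && \text{covariances among the independent variables}
\end{align*}
% The mean structure is carried by \alpha (intercepts of the dependent
% variables) and \kappa (means of the exogenous variables).
```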
Let's take a look at these matrices in more detail. This is from Raykov and Marcoulides' book, and they present the LISREL model, which is one of the better-known models and one of the older ones. In the LISREL model there are basically two sets of equations. First there are the y equations, which say that the observed variables y depend on the latent variables eta through the factor loadings, plus intercepts and error terms. And then there is a model for the latent variables: the latent variables are linked to other latent variables by regression coefficients stored in the beta matrix. If you want to calculate the model-implied covariance matrix, you get it from this kind of equation.

So what's the logic of this equation? It has basically two parts. The thing in the middle is the latent variable covariances, multiplied on both sides by the factor loading matrices. To see the logic, let's take an exploratory factor analysis model, which is simpler because it does not involve the beta matrix; we just have the psi matrix here. The model-implied covariance matrix of this model is given by this matrix equation: we take the factor covariances, we multiply on both sides by the factor loadings, and we add the covariance matrix of the indicator error terms.

Here are the matrices. This is the factor loading matrix, with the indicators on the rows and the latent variables on the columns. A zero means the factor loading does not exist, a one means it is a scaling loading (the first loading is fixed to one), and a lambda means it is an estimated loading. And then we have the factor covariances and variances here.

So what's the logic behind this construction? Let's look at the product of the first two matrices. Multiplying them gives us this six-by-two matrix, again with the indicators on the rows and the latent variables on the columns. What this matrix contains is all the paths from every indicator to every latent variable. We can go from A1 to A, and onward from A to B through the factor covariance; likewise we can go from B1 to B, and from B1 to A through the covariance. This matrix simply records all of that. It is calculated using the normal matrix multiplication rules: we take the first element of the row of the first matrix times the first element of the column of the second matrix, which is one times psi-A, then the second element of the same row times the second element of the same column, which is zero times psi-AB, and we sum the products. The zero tells us that there is no path from the indicator A1 to A that would go through the covariance between B and A. So this is simply an application of the tracing rules: you take every possible path, from A1 to A for example, multiply everything along each path, and then sum over the paths.

If we add a cross-loading, a lambda here, we can see that there are now two paths from A1 to A: the direct path, and the new path that goes through the cross-loading and the covariance. The same goes from A1 to B: there is the path through the loading on A and the covariance between A and B, and then there is the cross-loading itself. All possible paths, multiply everything along each path, take the sum: that is exactly what the matrices do. And when we look at all three matrices together, the first two contain all the paths from every indicator to every latent variable, and multiplying by the final matrix simply takes us back from every latent variable to every indicator.
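To make the mechanics concrete, here is a small sketch in Python. The numbers are made up for illustration (they are not the values from the slides), but the structure matches the example: two factors, three indicators each, and the first loading of each factor fixed to one.

```python
import numpy as np

# Factor loading matrix Lambda: indicators A1-A3, B1-B3 on the rows,
# latent variables A and B on the columns. A zero means the loading is
# fixed to zero, a one is the scaling loading, and the other values
# stand in for estimated loadings (illustrative numbers only).
Lambda = np.array([
    [1.0, 0.0],   # A1, scaling indicator of A
    [0.8, 0.0],   # A2
    [0.7, 0.0],   # A3
    [0.0, 1.0],   # B1, scaling indicator of B
    [0.0, 0.9],   # B2
    [0.0, 0.6],   # B3
])

# Factor covariance matrix Psi: variances of A and B on the diagonal,
# their covariance off the diagonal.
Psi = np.array([
    [1.0, 0.3],
    [0.3, 1.0],
])

# Error covariance matrix Theta: diagonal, so all error covariances
# are constrained to zero.
Theta = np.diag([0.4, 0.5, 0.6, 0.4, 0.5, 0.6])

# Lambda @ Psi is the six-by-two matrix discussed above: every path from
# each indicator to each latent variable, e.g. A1 -> A directly (1 * psi_A)
# and A1 -> B through the factor covariance (1 * psi_AB).
print(Lambda @ Psi)

# Multiplying by Lambda' traces back from the latent variables to every
# indicator; adding Theta gives the model-implied covariance matrix in one go.
Sigma = Lambda @ Psi @ Lambda.T + Theta
print(Sigma)
```

If you change one of the zeros in Lambda to a free value, you add a cross-loading, and rerunning the multiplication shows the extra paths appearing in Lambda @ Psi and in Sigma.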
That gives us the model-implied covariances. If we add the cross-loading here, we get more paths from A to all the A indicators and from A to all the B indicators. Note that the result is a symmetric matrix. So this is simply an application of the tracing rules, which I normally teach in scalar form; the matrices just automate the multiplying and summing. Remember that when you multiply two matrices, you first do the element-by-element multiplication, then take the sum of those products, and then move on to the next element of the result. That's how matrices are multiplied.

So where does this expression come from, this (I minus beta) inverse times psi times (I minus beta) prime inverse? I don't think going through the derivation in detail is very useful here, but if you want to understand where it comes from, there is a sketch at the end of this section. I got it from Susanne Jak, who is working on a book on SEM. I really look forward to seeing the book, but this derivation is as much of it as I have for now.

The final point is that the matrices define what models can be specified. This is Stata's SEM Builder, and we have a simple confirmatory factor analysis model with one factor. Let's say we try to add a covariance between an error term and the latent variable. Can we do that? No: Stata tells us that we cannot correlate an endogenous variable with an exogenous variable. Why would that be the case? The reason is that the covariances of the endogenous variables and the covariances of the exogenous variables live in two separate matrices, and there is no matrix in the model that would contain covariances between the error terms of endogenous variables and the exogenous variables. The LISREL model has constraints of its own: for example, the indicators are allowed to depend on the latent variables but not the other way around, so the latent variables affect the indicators and not vice versa. Also, you cannot regress an indicator on another indicator, because only regressions between latent variables are allowed in that framework.

So understanding this Greek is useful for several reasons. First, your software will report matrices, and they are referred to by their Greek names. Second, books and articles use matrices, and to understand those books you need to understand the matrices. Third, calculating implied covariances is a lot easier with matrices than by tracing individual paths one at a time. And finally, if you understand the matrix presentation of an SEM, you can understand what kinds of models are and are not possible in your SEM software.
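For reference, here is the derivation sketch mentioned above, in the all-y LISREL notation used earlier. This is my rendering, with zeta denoting the latent disturbances whose covariance matrix is Psi.

```latex
\begin{align*}
\eta &= B\eta + \zeta
  \quad\Longrightarrow\quad (I - B)\eta = \zeta
  \quad\Longrightarrow\quad \eta = (I - B)^{-1}\zeta \\
\operatorname{Cov}(\eta) &= (I - B)^{-1}\,\Psi\,\bigl((I - B)^{-1}\bigr)' \\
\Sigma &= \Lambda\,\operatorname{Cov}(\eta)\,\Lambda' + \Theta
        = \Lambda (I - B)^{-1}\,\Psi\,\bigl((I - B)^{-1}\bigr)'\,\Lambda' + \Theta
\end{align*}
```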