Hello, everyone. My name is Mike Cheung. I'm going to give a very brief introduction to the metaSEM package in this video. The metaSEM package conducts meta-analyses using a structural equation modeling (SEM) framework. It uses the OpenMx, lavaan, and semPlot packages at the back end. The R code and the reference files of this presentation are available on GitHub; please refer to them. In this video, I'm going to focus on how to use the metaSEM package to conduct analyses. Please refer to the references for the theory behind it.

In the first part, we're going to present a meta-analytic structural equation model to combine correlation matrices and to fit structural equation models on the average correlation matrix. We'll illustrate the procedures with an example from the theory of planned behavior. There are five variables: attitude (ATT), subjective norm (SN), perceived behavioral control (PBC), behavioral intention (BI), and behavior (BEH). So this is the theory; this is the model we are going to fit.

First of all, we load the package and the data. The primary data are a list of correlation matrices and their sample sizes; three of the correlation matrices are listed here. Please note that there are also missing data, represented by NA. Here are the sample sizes. We also extract the variable names for future use.

The first thing for us is to check the missing-data pattern. We can use this command. We can see that for some variables there are 23 studies, but for behavior there are only six. We can also check the total sample size in each cell.

The first part is to conduct a two-stage structural equation modeling (TSSEM) analysis. We are going to use a random-effects model. The syntax is like this: this is the object we are going to save for our analysis. We include the data and the sample sizes, and in the method argument we specify the random-effects model.
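The steps just described can be sketched in R as follows. `my.cor` (a list of correlation matrices with NA in the missing cells) and `my.n` (a vector of sample sizes) are hypothetical names standing in for the data loaded in the video:

```r
library(metaSEM)

## Check the missing-data pattern: number of studies per correlation cell
pattern.na(my.cor, show.na = FALSE)

## Total sample size contributing to each cell
pattern.n(my.cor, my.n)

## Stage 1 of TSSEM: pool the correlation matrices with a random-effects
## model; RE.type = "Diag" assumes independent random effects
random1 <- tssem1(my.cor, my.n, method = "REM", RE.type = "Diag")
summary(random1)
```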
We assume the random effects are independent because we may not have enough data to estimate the full variance-covariance matrix. We can then get a summary. Here is how to read the output. The first part, labeled Intercept1 to Intercept10, is the average correlation matrix: there are five variables, so in total we have 10 correlation coefficients. The elements labeled Tau2 are the variance-covariance matrix of the random effects. We can also get the heterogeneity, the I² values, for each correlation coefficient.

For ease of reference, we can extract the fixed-effects component and arrange it so that we can see the pooled correlation matrix: with five variables, the average correlation matrix is five by five. We can also extract the variance component by extracting the random part, and moreover we can use the VarCorr() function to extract the full variance-covariance matrix of the random effects. Please note that the off-diagonal elements are all zeros because we assume the random effects are independent.

Now we can fit the structural equation model. To specify the model, we can use lavaan syntax like this. We save it as model1 and specify the path coefficients. Since we are analyzing a correlation matrix, the variances of the independent variables are fixed at 1, and we also allow the predictors to be correlated. Here is the graphical representation of our model.

Then we can use the lavaan2RAM() function to convert it into the RAM representation, because the metaSEM package only understands the RAM representation. Three matrices represent a structural equation model in RAM. The first one is the A matrix. It represents the directed paths, including regression coefficients and factor loadings. The columns represent the independent variables and the rows represent the dependent variables. For example, this element, BI on ATT, represents the path from ATT to BI.
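As a sketch, the extraction and model specification might look like this; the variable names follow the ATT/SN/PBC/BI/BEH labels used in the talk, and `random1` is the hypothetical stage-1 object from above:

```r
## Pooled correlation matrix: rebuild the 5x5 matrix from the 10 fixed effects
pooled.r <- vec2symMat(coef(random1, select = "fixed"), diag = FALSE)

## Variance components of the random effects
coef(random1, select = "random")
VarCorr(random1)   # off-diagonals are 0 under RE.type = "Diag"

## Structural model in lavaan syntax; predictor variances are fixed at 1
## because correlation matrices are analyzed
model1 <- "BI ~ ATT + SN + PBC
           BEH ~ BI + PBC
           ATT ~~ SN + PBC
           SN ~~ PBC
           ATT ~~ 1*ATT
           SN ~~ 1*SN
           PBC ~~ 1*PBC"

## Convert to the RAM representation (A, S, and F matrices)
RAM1 <- lavaan2RAM(model1, obs.variables = c("ATT", "SN", "PBC", "BI", "BEH"))
```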
There is also the S matrix, which represents the symmetric paths, for example the variances and covariances. We can see that for the independent variables the variances are fixed at 1. Finally, there is the F matrix, which indicates which variables are observed. In this example, all five variables are observed.

Now we can use these matrices to fit our stage-2 model and get the summary. For this part, the output is quite similar to that of standard structural equation models: we have all the labeled elements. Moreover, we also have various goodness-of-fit indices, for example the chi-square of the target model, its df and p-value, the RMSEA and its 95% CI, and other fit indices. Moreover, we can also plot the model to get a sense of how it looks.

When we are using structural equation models, we can impose constraints to test various research questions. For example, in this case we may want to test whether the path coefficients from the independent variables to BI, the mediator, are the same. We can do this by using the same labels. This is our model: "BI on X" means the paths from the predictors to BI, and we assume these three paths are identical. Now we convert it into the RAM representation and fit it. When we look at the output, the three estimates are the same because we forced them to be identical, and we can get the usual chi-square statistic and p-value. Moreover, we can also check the output graphically: these three paths are all the same, 0.29.

Finally, we can also ask: is the constrained model statistically different from the model without the constraints? Since these two models are nested, we can use the anova() function to compare them. In this example, the chi-square difference is about 35 with 2 df, and the p-value is very small. This suggests that we should not impose the equality constraint on these three paths.
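A sketch of the stage-2 fit and the equality-constraint comparison, continuing with the hypothetical `random1` and `RAM1` objects from above:

```r
## Stage 2: fit the structural model on the pooled correlation matrix
fit2 <- tssem2(random1, Amatrix = RAM1$A, Smatrix = RAM1$S, Fmatrix = RAM1$F)
summary(fit2)
plot(fit2)   # graphical display via semPlot

## Constrained model: the shared label "b" forces the three paths to BI
## to be equal
model2 <- "BI ~ b*ATT + b*SN + b*PBC
           BEH ~ BI + PBC
           ATT ~~ SN + PBC
           SN ~~ PBC
           ATT ~~ 1*ATT
           SN ~~ 1*SN
           PBC ~~ 1*PBC"
RAM2 <- lavaan2RAM(model2, obs.variables = c("ATT", "SN", "PBC", "BI", "BEH"))
fit2b <- tssem2(random1, Amatrix = RAM2$A, Smatrix = RAM2$S, Fmatrix = RAM2$F)

## Chi-square difference test of the nested models (df = 2)
anova(fit2, fit2b)
```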
Another advantage of using the metaSEM package is that we can use it to calculate various functions of parameters. For example, we may be interested in the R² of the mediator and the dependent variable, and we may also want to calculate the direct effect and indirect effect, which are functions of the parameter estimates. We can also use likelihood-based confidence intervals to get the 95% confidence intervals of these functions. So let's see how to do it.

First of all, we label each path coefficient, which makes future calculations easier. So we label all the paths. Secondly, we also label the error variances, because when all variables are standardized, R² equals one minus the error variance. So we can calculate the R² of BI and the R² of BEH as one minus the corresponding error variance. Moreover, we can also define the various indirect effects, the sum of the indirect effects (there are three paths), the direct effect, and the total effect. This graph shows all the paths with their labels, and we can use them to calculate the indirect effects, the direct effect, the total effect, and the R² values.

Finally, we convert the model into the RAM representation. In the output, we can also see the actual computations, which will be used in the analysis later. Then we refit the model, but here we need to make two minor changes. First, we set diag.constraints=TRUE because we want to estimate the error variances. Second, we set the intervals type to the likelihood-based confidence interval. Then finally we run the analysis. In the output, we can see the message that the 95% confidence intervals are now based on the likelihood-based statistic, so these lower and upper bounds are all likelihood-based. More importantly, we get some extra output: for example, the estimated R² of BI is 0.46.
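A sketch of defining functions of parameters; the ":=" definitions are carried into the RAM conversion as mxAlgebras, and the labels (a1, b1, e1, ...) are illustrative choices, not fixed by the package:

```r
model3 <- "BI ~ a1*ATT + a2*SN + a3*PBC
           BEH ~ b1*BI + b2*PBC
           ATT ~~ SN + PBC
           SN ~~ PBC
           ATT ~~ 1*ATT
           SN ~~ 1*SN
           PBC ~~ 1*PBC
           BI ~~ e1*BI
           BEH ~~ e2*BEH
           ## standardized variables: R2 = 1 - error variance
           R2.BI := 1 - e1
           R2.BEH := 1 - e2
           ## indirect, direct, and total effects
           ind.ATT := a1*b1
           ind.SN := a2*b1
           ind.PBC := a3*b1
           ind.sum := a1*b1 + a2*b1 + a3*b1
           dir.PBC := b2
           total.PBC := a3*b1 + b2"

RAM3 <- lavaan2RAM(model3, obs.variables = c("ATT", "SN", "PBC", "BI", "BEH"))

## diag.constraints = TRUE estimates the error variances;
## intervals.type = "LB" requests likelihood-based confidence intervals
fit3 <- tssem2(random1, Amatrix = RAM3$A, Smatrix = RAM3$S, Fmatrix = RAM3$F,
               diag.constraints = TRUE, intervals.type = "LB",
               mx.algebras = RAM3$mxalgebras)
summary(fit3)
```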
The 95% confidence interval runs from 0.39 to 0.53. Similarly, we can also get the R² of BEH, the various indirect effects, the direct effect, and the sum of the indirect effects.

The two-stage SEM approach is less convenient when there are categorical or continuous moderators; we may use the one-stage MASEM (OSMASEM) approach to conduct the moderator analysis. Let's illustrate the procedure with the percentage of female participants as a moderator, which is represented in the dataset by Female.

First of all, we need to do some data transformation, because we need to add the moderator to the data as well. We can use the Cor2DataFrame() function to convert the correlation matrices into a data frame. Secondly, for the percentage of female participants, we usually center the data to get more stable results, so we center it here. Here is the Female variable, and we can see there are some missing data.

Then we create an indicator for which rows to include in the analysis. When we run the analysis with a moderator, we want to compare it against a model without the moderator, so we need to make sure that the same dataset is used. When there are missing data, for example in these few rows, we have to remove them; by using this indicator, we can remove them from both the model with and the model without the moderator.

In the first step, let's run a model without any moderators. We fit a structural equation model using the one-stage approach and label it as "No moderator", with the same RAM1, the data frame, and the subset of rows to include. The output is almost identical to the two-stage approach, so we are not going to discuss it any further. We can also check the graphical output, which should be similar. Now we need to specify our model with a moderator. Let's take a look at the A matrix: these are the paths that we would like to moderate by the percentage of female participants.
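A sketch of the one-stage setup; `female` (the vector of per-study percentages) and the other object names are hypothetical:

```r
## Convert the correlation matrices and sample sizes into a data frame
my.df <- Cor2DataFrame(my.cor, my.n)

## Add the centered moderator; check.names = FALSE keeps the labels intact
my.df$data <- data.frame(my.df$data,
                         Female = scale(female, scale = FALSE),
                         check.names = FALSE)

## Indicator of rows without missing moderator values, so the models with
## and without the moderator are fitted on the same studies
index <- !is.na(my.df$data$Female)

## One-stage MASEM without any moderator
fit0 <- osmasem(model.name = "No moderator", RAM = RAM1,
                data = my.df, subset.rows = index)
summary(fit0)
```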
To create a matrix indicating which parameters to moderate, we can use the create.modMatrix() function on the A matrix, and we can see in the output that these five paths will be moderated. Sometimes we may not want to moderate all five paths; in that case, we can set the others to zero. For example, if we used this matrix in our analysis, only the path from PBC to BEH would be moderated.

Now we can fit the model by specifying the RAM and this moderator matrix, and remember to include the subset of rows. In the output, these few lines come from the A0 matrix and represent the intercepts; the other ones, from A1, represent the slopes. So let's see how it looks.

Before interpreting the results, we usually want to compare the models with and without the moderator. We can use the anova() function to compare these two models. The df is 5, meaning that we are comparing five coefficients; the chi-square difference is 6.6 and the p-value is 0.25. In other words, we don't have enough evidence to reject the null hypothesis: the proportion of female participants does not moderate these paths.

As an illustration, we can also extract the A matrices. The A0 matrix gives the estimated paths when Female is zero (its mean, after centering), and A1 gives the slopes: when the female percentage increases by one unit, what is the expected change in the regression coefficients? As the earlier chi-square test showed, this moderation is not significant.

The metaSEM package also includes other functions; one important one is to calculate multivariate effect sizes. Sometimes this is quite straightforward with standard formulas, but for other models it is less easy.
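A sketch of the moderator model, continuing with the hypothetical objects from above:

```r
## Moderate the paths in the A matrix by the centered Female variable;
## individual cells can be reset to "0" to exclude them from moderation
Ax <- create.modMatrix(RAM1, output = "A", mod = "Female")

fit1 <- osmasem(model.name = "Female as moderator", RAM = RAM1, Ax = Ax,
                data = my.df, subset.rows = index)
summary(fit1)

## Compare the models with and without the moderator (df = 5 here)
anova(fit1, fit0)
```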
The metaSEM package provides a useful framework, based on SEM, to compute various effect sizes and their sampling variances and covariances with the delta method. Let's look at two examples.

The first one is what we call multiple-treatment studies (MTS), used when there is more than one treatment group and they are compared to one control group. Gleser and Olkin derived formulas to calculate the standardized mean differences and their sampling covariance matrix for multiple-treatment studies under the assumption of homogeneity of variances. The metaSEM package makes this more flexible because we don't need to assume homogeneity of variances; users can control their assumptions.

So let's look at how we can do it in the SEM framework. In this figure there are three groups. The first one is the control group: we have one observed variable y, the mean (represented by a triangle), and its variance. Then we have two treatment groups, each with its observed variable, variance, and mean. We can calculate the standardized mean difference according to its definition: the mean difference divided by the pooled standard deviation, sigma. With two treatment groups, we have two of them. The result depends on the assumption. If we assume homogeneity of variances, we can impose a constraint that all three groups share the same common variance (or standard deviation), and the calculated standardized mean differences are then based on homogeneity of variances. On the other hand, if there are reasons to believe that the variances may not be the same, for example with clinical groups versus normal participants, we can use the control group as the standardizer and calculate the standardized mean differences without assuming homogeneity of variances.
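In notation, with control mean $\mu_0$, treatment means $\mu_1$ and $\mu_2$, common standard deviation $\sigma$, and control-group standard deviation $\sigma_0$, the two definitions just described are:

```latex
% Under homogeneity of variances (common sigma):
d_i = \frac{\mu_i - \mu_0}{\sigma}, \quad i = 1, 2
% Without homogeneity (control group as standardizer):
d_i = \frac{\mu_i - \mu_0}{\sigma_0}, \quad i = 1, 2
```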
So let's see how to do it in the metaSEM package. For ease of reference, we label which groups are the control, treatment 1, and treatment 2. So we have the three sample means, the variances, and the sample sizes. There is a function called smdMTS(): we give it the means, variances, and sample sizes, and specify the homogeneity argument. If we assume homogeneity of variances, the two standardized mean differences are these two values, together with their sampling variance-covariance matrix. What if we don't assume homogeneity of variances? Then we set the homogeneity argument to "none", and we get another version of the effect sizes and their sampling variances and covariances, calculated without assuming homogeneity of variances. Once we have calculated the effect sizes and their sampling variance-covariance matrices, we can use them in a multivariate meta-analysis.

Let's consider another case: multiple-endpoint studies (MES). In multiple-endpoint studies there are two outcome variables, for example mathematics and language, and two groups, a control group and a treatment group. Again, Gleser and Olkin derived formulas to calculate the standardized mean differences under the assumption of homogeneity of the covariance matrices, and the SEM approach relaxes this assumption.

Let's look at this figure. Here we have the control group with two outcomes: the first one, say, is mathematics, and the other one is language. We have the observed means, and we decompose the covariance matrix into standard deviations: the variances are all fixed at one, and we have the correlations. Similarly, we can apply the same model to the treatment group. Now, if we want to calculate the standardized mean difference for the first outcome, we can use this definition, and we can do the same for the second effect size.
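A sketch with hypothetical summary statistics (the actual numbers in the video differ):

```r
m <- c(5, 6, 7)       # means: control, treatment 1, treatment 2
v <- c(10, 11, 12)    # variances
n <- c(50, 52, 53)    # sample sizes

## Assuming homogeneity of variances
smdMTS(m = m, v = v, n = n, homogeneity = "variance")

## Without the assumption: the control-group SD is the standardizer
smdMTS(m = m, v = v, n = n, homogeneity = "none")
```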
Since we are using a structural equation modeling approach, we can impose constraints to assume that the two groups are homogeneous in their variances and covariances; or we can impose homogeneity on only the correlations while allowing the variances to differ; or we don't need to impose any constraint at all, and we use the control group as the reference group to calculate the standardized mean differences.

So let's see how to do it in R. We have two variables, mathematics and language. Here are the means of the first group and the means of the second group, the covariance matrices of the two groups, and the sample sizes. Then we can use the smdMES() function to calculate the standardized mean differences between these two groups: one for mathematics, the other for language. We can assume homogeneity of the covariance matrices and get the effect sizes and their sampling variance-covariance matrix. We can also assume homogeneity of the correlation matrices only, without assuming the variances are the same, and get a different version. And finally, we can assume nothing and use the first group, the control group, as the reference group.

Having three different versions allows researchers to check the sensitivity of the assumptions. Usually we assume the data are homogeneous in their covariance matrices; if the variances are different, we can assume homogeneity of correlations only; and in the worst-case scenario, when we make no assumptions at all, we can use the last version. Researchers can then check how robust the results are.

In these two examples, we limited ourselves to specific models. But sometimes researchers may want to create their own model, calculate effect sizes based on that model, and use them in a meta-analysis.
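A sketch with hypothetical summary statistics for the two groups and the two outcomes (mathematics and language):

```r
m1 <- c(10, 11)                          # control-group means
m2 <- c(12, 14)                          # treatment-group means
V1 <- matrix(c(10, 5, 5, 10), nrow = 2)  # control covariance matrix
V2 <- matrix(c(12, 6, 6, 11), nrow = 2)  # treatment covariance matrix
n1 <- 50
n2 <- 60

## Three versions, from the strongest to the weakest assumption
smdMES(m1, m2, V1, V2, n1, n2, homogeneity = "covariance")
smdMES(m1, m2, V1, V2, n1, n2, homogeneity = "correlation")
smdMES(m1, m2, V1, V2, n1, n2, homogeneity = "none")
```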
Let's illustrate this with our previous example, the theory of planned behavior. We take just one sample correlation matrix, with five variables and its sample size, and fit the same model. Now we are going to extract the sum of the indirect effects and also the direct effect, and we want to use them as effect sizes in our meta-analysis. Although we illustrate this example, it doesn't mean that we should do it this way: a better approach is to summarize the correlation matrices first and fit the structural equation model in a later step. But here we use it to illustrate the power of this approach for calculating various effect sizes.

So we calculate the indirect effect and the direct effect and use them as two effect sizes. We can show the model. To get the indirect effect and direct effect as effect sizes, we can use the calEffSizes() function: we provide the model, the correlation matrix, and the sample size. The estimated indirect effect is 0.22 and the direct effect is 0.08, together with their sampling variances. If we have a list of these effect sizes, we can conduct a meta-analysis based on them.

In this brief tutorial, we have only illustrated a few functions. There are several others in the metaSEM package: for example, the meta() function for univariate and multivariate meta-analysis, the meta3() function for three-level meta-analysis, and the metaFIML() function for univariate and multivariate meta-analysis, which allows missing covariates to be handled with full information maximum likelihood. Moreover, there is also an indirectEffect() function to calculate standardized and unstandardized indirect effects in mediation analysis. That's all for today.
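A sketch of computing model-based effect sizes; `R1` (a single 5x5 correlation matrix) and `n1` are hypothetical names, and the labels follow the model used earlier:

```r
## Model-implied effect sizes defined with ":=" in lavaan syntax
model.es <- "BI ~ a1*ATT + a2*SN + a3*PBC
             BEH ~ b1*BI + b2*PBC
             ## sum of the three indirect effects and the direct effect
             ind := a1*b1 + a2*b1 + a3*b1
             dir := b2"

## Effect sizes and their sampling covariance matrix via the delta method
calEffSizes(model = model.es, n = n1, Cov = R1)
```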