Maximum Likelihood Estimation in Meta-Analytic Structural Equation Modeling
Frans J. Oort University of Amsterdam
Suzanne Jak Utrecht University
This paper is currently under review in Research Synthesis Methods.
Correspondence should be addressed to: Prof. dr. F.J. Oort, Research Institute of Child Development and Education, Faculty of Social and Behavioural Sciences, University of Amsterdam, P.O. Box 15780, 1001 NG Amsterdam, The Netherlands. Tel. +31 (0)20 525 1291. E-mail: [email protected]
Abstract
Meta-analytic structural equation modeling (MASEM) involves fitting models to a common population correlation matrix that is estimated on the basis of correlation coefficients reported by a number of independent studies. MASEM typically consists of two stages: (1) estimating the common correlation matrix and (2) fitting models. The method that has been found to perform best in terms of statistical properties is two-stage structural equation modeling (TSSEM), in which maximum likelihood (ML) analysis is used in the first stage and weighted least squares (WLS) analysis in the second stage. In the present paper we propose to use ML estimation in both stages of TSSEM. In a simulation study we use both methods and compare chi-square distributions, bias in parameter estimates, false positive rates, and true positive rates. Both methods appear to yield unbiased parameter estimates, and false and true positive rates that are close to the expected values. The differences between the methods seem negligible, so the choice between the two methods may be based on other fundamental or practical reasons, such as consistency in the estimation method or the availability of software to fit the models.
Introduction
The combination of meta-analysis (MA) and structural equation modeling (SEM) for the purpose of testing hypothesized models is called meta-analytic structural equation modeling or MASEM (Cheung & Chan, 2005a). Using MASEM, information from multiple studies can be used to test a single model that explains the relationships between a set of variables or to compare several models that are supported by different studies or theories. MASEM typically consists of two stages (Viswesvaran & Ones, 1995). In the first stage, correlation matrices are tested for homogeneity across studies and combined to form a pooled correlation matrix. Several methods to synthesize correlation matrices have been proposed and evaluated (e.g., Becker, 1992, 1995, 2000; Cheung & Chan, 2005a; Hafdahl, 2007, 2008). In the second stage, a structural equation model is fitted to the pooled correlation matrix. Several methods have also been proposed for this stage (see Zhang, 2011, for an overview). The method that is generally found to perform best in terms of statistical properties and flexibility is the two-stage SEM (TSSEM) approach of Cheung and Chan (2005a). In this method, maximum likelihood estimation is used to estimate a pooled correlation matrix (Stage 1), and weighted least squares estimation is used to fit structural equation models to this correlation matrix (Stage 2). The purpose of the present paper is to propose and evaluate an alternative way of conducting TSSEM, using maximum likelihood (ML) estimation in both stages of the procedure.
Two-stage SEM with weighted least squares estimation in the second stage
In Stage 1 of the TSSEM approach, correlation matrices are tested for homogeneity across studies using multigroup modeling. If homogeneity of correlation matrices is not rejected, they are combined to form a pooled correlation matrix. In Stage 2, the pooled correlation matrix is taken as the observed matrix in a SEM analysis. Cheung and Chan use maximum
likelihood to estimate the pooled correlation matrix in the first stage, and weighted least squares (WLS) to estimate model parameters in the second stage. In Stage 2 WLS estimation, the inverse of the matrix of asymptotic variances and covariances of the Stage 1 correlation estimates is used as the weight matrix in the WLS optimization function. This ensures that correlation coefficients that are estimated with more precision in Stage 1 get more weight in the estimation of model parameters in Stage 2. The precision of a Stage 1 estimate depends on the number and the size of the studies that included the specific correlation. Cheung and Chan (2005a) conducted a large simulation study in which the performance of the TSSEM approach was compared with two univariate approaches and one multivariate approach. In the univariate approaches, the correlation elements are pooled separately across studies based on bivariate information only. Cheung and Chan use the term univariate-r method to refer to the approach from Hunter and Schmidt (1990), which involves pooling the untransformed correlation coefficients. The univariate-z method refers to the method of Hedges and Olkin (1985), who proposed using Fisher’s z-transformed correlation coefficients. The multivariate approach with which they compared the two-stage SEM method is the GLS method (Becker, 1992, 1995, 2000). The simulation showed that the GLS method rejects homogeneity of correlation matrices too often and leads to biased parameter estimates at Stage 2. The univariate methods lead to inflated Type I error rates, while the TSSEM method leads to unbiased parameter estimates and false positive rates close to the expected rates. The statistical power to reject an underspecified factor model was extremely high for all four methods. The TSSEM method thus came out as the best of these methods.
Software to apply TSSEM is readily available in the R package metaSEM (Cheung, 2013), which relies on the OpenMx package (Boker et al., 2011) to fit the Stage 1 and Stage 2 models. In this way, metaSEM provides parameter estimates with standard errors, a chi-square measure of fit, and likelihood-based confidence intervals for parameters at both stages of the analysis.
Two-stage SEM with maximum likelihood estimation in both stages
Instead of treating the estimated pooled correlation matrix as fixed and using WLS estimation to fit a structural equation model in Stage 2, the structural equation model can be fitted to the observed correlations directly, by imposing restrictions on the Stage 1 model. In this way, the Stage 2 model is nested in the Stage 1 model, and both models can be fitted through maximum likelihood estimation. This is an attractive approach because it provides a maximum likelihood chi-square to evaluate the overall goodness-of-fit of the structural equation model, and a chi-square difference test to evaluate the significance of the difference in fit between the Stage 1 and Stage 2 models.
Method
Below we describe the Stage 1 and Stage 2 models that are used in MASEM, and the design of a simulation study that evaluates the ML analysis in Stage 2 and compares it with Cheung and Chan’s WLS analysis in Stage 2. The Stage 1 analysis is identical in the two approaches.
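Stage 1: Testing homogeneity of correlation matrices
Let Rg be the pg x pg sample correlation matrix and pg be the number of observed variables in the gth study. Not all studies include all variables. For example, in a meta-analysis of three variables, the correlation matrices for the first three studies may look like this:
$$ R_1 = \begin{pmatrix} 1 & & \\ r_{21}^{(1)} & 1 & \\ r_{31}^{(1)} & r_{32}^{(1)} & 1 \end{pmatrix}, \quad R_2 = \begin{pmatrix} 1 & \\ r_{32}^{(2)} & 1 \end{pmatrix}, \quad \text{and} \quad R_3 = \begin{pmatrix} 1 & \\ r_{21}^{(3)} & 1 \end{pmatrix}. $$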
Here, Study 1 contains all variables, Study 2 misses Variable 1, and Study 3 misses Variable 3. An estimate of the population correlation matrix R of all q variables is obtained by fitting a multigroup SEM model, where the model for each group (study) is:
$$ \Sigma_g = D_g \left( M_g R M_g^{T} \right) D_g \qquad (1) $$
In this model, R is the q x q population correlation matrix with diag(R) = I, matrix Mg is a pg x q selection matrix that accommodates smaller correlation matrices from studies with missing variables (pg < q), and Dg is a pg x pg diagonal matrix that accounts for differences in variances across the G studies. Correct parameter estimates can be obtained using maximum likelihood estimation, optimizing
$$ F_{ML} = \sum_{g=1}^{G} F_{ML(g)} , \qquad (2) $$
with
$$ F_{ML(g)} = \log|\Sigma_g| - \log|S_g| + \mathrm{trace}\left( S_g \Sigma_g^{-1} \right) - p_g , \qquad (3) $$
where Sg is the pg x pg sample variance-covariance matrix in the gth study. A chi-square measure of fit for the model in Equation 1 is available by comparing its FML value with the
FML value of a saturated model that is obtained by leaving out the selection matrix Mg and replacing R with Rg,
$$ \Sigma_g = D_g R_g D_g , \qquad (4) $$
where diag(Rg) = I for all g. The difference between the resulting FML values of the models in Equations 1 and 4, multiplied by the total sample size minus the number of studies, has a chi-square distribution with degrees of freedom equal to the difference in numbers of free parameters. If the chi-square value of this likelihood ratio test is significant, then the hypothesis of homogeneity must be rejected. If the correlation matrices are not homogeneous, it does not make sense to create a single pooled correlation matrix to test the structural model in Stage 2. Possible solutions are to create subgroups of studies with homogeneous correlation matrices and to fit different Stage 2 models to each subgroup (Cheung & Chan, 2005b), or to allow study-level variation in the correlation coefficients by using a random effects approach (Cheung, 2014). In the present study we assume that homogeneity holds, and we do not consider heterogeneous situations.
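As an illustration of Equation 1, the following minimal Python sketch builds a selection matrix for a study with a missing variable and computes the model-implied covariance matrix. The correlation values, the standard deviations and all function names are hypothetical and chosen only for this example; they do not come from the simulation study.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def selection_matrix(present, q):
    """p_g x q matrix M_g: one row of the q x q identity per observed variable."""
    return [[1.0 if j == i else 0.0 for j in range(q)] for i in present]


def implied_covariance(R, present, sds):
    """Sigma_g = D_g (M_g R M_g^T) D_g for a single study (Equation 1)."""
    M = selection_matrix(present, len(R))
    p = len(sds)
    D = [[sds[i] if i == j else 0.0 for j in range(p)] for i in range(p)]
    selected = matmul(matmul(M, R), transpose(M))
    return matmul(matmul(D, selected), D)


# Hypothetical population correlation matrix for q = 3 variables.
R = [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.4],
     [0.3, 0.4, 1.0]]

# Study 2 misses Variable 1, so it observes Variables 2 and 3 (0-based indices 1 and 2).
# The standard deviations 2.0 and 3.0 are hypothetical diagonal elements of D_2.
sigma_2 = implied_covariance(R, present=[1, 2], sds=[2.0, 3.0])
```

Here `sigma_2` is a 2 x 2 matrix whose off-diagonal element is the population correlation between Variables 2 and 3 rescaled by the two standard deviations, which is how the selection matrix lets studies of different size share one R.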
Stage 2: Fitting structural equation models
Cheung and Chan (2005a) proposed to use WLS estimation to fit structural equation models to the pooled correlation matrix R that is estimated in Stage 1. Fitting the Stage 1 model provides estimates of the population correlation coefficients in R as well as the asymptotic variances and covariances of these estimates, V. In Stage 2, hypothesized structural equation models can be fitted to R by minimizing the weighted least squares fit function (also known as the asymptotically distribution free fit function; Browne, 1984):
$$ F_{WLS} = (r - r_{MODEL})^{T} \, V^{-1} \, (r - r_{MODEL}) , \qquad (5) $$
where r is a column vector with the unique elements in R, rMODEL is a column vector with the unique elements in the model-implied correlation matrix (RMODEL), and V-1 is the inverse of the matrix of asymptotic variances and covariances, which is used as the weight matrix. For example, in order to fit a factor model with k factors, one would specify RMODEL as
$$ R_{MODEL} = \Lambda \Phi \Lambda^{T} + \Theta , \qquad (6) $$
where Φ is a k by k covariance matrix of common factors, Θ is a q by q (diagonal) matrix with residual variances, and Λ is a q by k matrix with factor loadings. Minimizing the WLS function leads to correct parameter estimates with appropriate standard errors and a WLS-based chi-square test statistic TWLS (Cheung & Chan, 2005a, 2010). As an alternative to using WLS for fitting an RMODEL to the estimated R in a single-group analysis, we propose to retain the multigroup analysis and to use ML for fitting a common RMODEL to the observed Rg matrices, which can be done by substituting RMODEL for R in Equation 1,
$$ \Sigma_g = D_g \left( M_g R_{MODEL} M_g^{T} \right) D_g . \qquad (7) $$
Multigroup ML analysis of Equation 7 yields an ML chi-square of overall goodness-of-fit (TML) and correct standard errors, just as multigroup ML analysis of Equation 1 does. Moreover, as RMODEL in Equation 7 is a restriction of R in Equation 1, the difference between the
associated chi-square values has a chi-square distribution itself with degrees of freedom equal to the difference in the numbers of free parameters in R and RMODEL. For example, if we choose the factor model of Equation 6 as RMODEL, then the multigroup structural equation model is given by
$$ \Sigma_g = D_g \left( M_g (\Lambda \Phi \Lambda^{T} + \Theta) M_g^{T} \right) D_g . \qquad (8) $$
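To make Equation 6 concrete, the sketch below computes the model-implied correlation matrix of a two-factor model in plain Python, using the population values reported for Figure 1 (loadings .80, .70, .60 on each factor, factor covariance .30, residual variances .36, .51, .64). The function name is hypothetical, and the degrees-of-freedom count at the end is one way to arrive at df = 8, assuming that the residual variances are not counted as free parameters because the diagonal of the implied correlation matrix is fixed at 1.

```python
# Population values as reported for Figure 1.
loadings = [0.80, 0.70, 0.60]
phi21 = 0.30
theta = [0.36, 0.51, 0.64, 0.36, 0.51, 0.64]

# Lambda is 6 x 2: y1..y3 load on factor 1, y4..y6 on factor 2.
Lam = [[l, 0.0] for l in loadings] + [[0.0, l] for l in loadings]
Phi = [[1.0, phi21],
       [phi21, 1.0]]


def implied_correlations(Lam, Phi, theta):
    """R_MODEL = Lambda Phi Lambda^T + Theta (Equation 6), with Theta diagonal."""
    q, k = len(Lam), len(Phi)
    R = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(q):
            R[i][j] = sum(Lam[i][a] * Phi[a][b] * Lam[j][b]
                          for a in range(k) for b in range(k))
        R[i][i] += theta[i]
    return R


R_model = implied_correlations(Lam, Phi, theta)

# Degrees of freedom under the stated assumption:
q = len(Lam)
n_correlations = q * (q - 1) // 2   # unique off-diagonal correlations
n_free = 6 + 1                      # six loadings and one factor covariance
df = n_correlations - n_free
```

Within a factor the implied correlation is the product of two loadings (for example .80 x .70 = .56 for y1 and y2), and across factors it is additionally multiplied by the factor covariance (for example .80 x .30 x .80 = .192 for y1 and y4).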
In summary, the ML procedure for MASEM consists of a series of three nested models: the saturated model (Equation 4), the Stage 1 model (Equation 1) and the Stage 2 model (Equation 7). Table 1 gives an overview of the three models and the associated tests. The difference between the first two models provides a test of the homogeneity of correlation coefficients, and the difference between the Stage 1 and Stage 2 models provides a test of the structural model.
Simulation study
We will use simulated data to investigate the performance of ML TSSEM and compare it with the TSSEM procedure as proposed by Cheung and Chan (2005a). As the two procedures involve identical Stage 1 analyses, we will focus on the Stage 2 analyses, and calculate estimation bias, false positive rates, and power in both the multi-group ML Stage 2 analysis and the single-group WLS Stage 2 analysis. The design and population model of our simulation study are chosen to be similar to the design and population model in the Cheung and Chan (2005a) study. As a result we have numbers of studies equal to 5, 10, and 15, with 50, 100, 200, 500 and 1000 respondents each. The population model is a two-factor model with six variables, with parameter values for Λ, Φ, and Θ as shown in Figure 1. We generated continuous multivariate normal data using the MASS package in the R program (R Development Core Team, 2011), after which we computed the correlation matrices. As it is not realistic that all studies reported correlations between all research variables, some variables were removed from some studies. Table 2 shows the patterns of missing variables in conditions with 5, 10 and 15 studies (these patterns are also chosen equal to the missing data patterns in Cheung and Chan (2005a)). As the pattern is fixed a priori, the missingness may be considered missing completely at random (MCAR; Graham, Hofer & MacKinnon, 1996), and will not bias parameter estimates. We generated 500 data sets for each of the 15 conditions.
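The data generation step can be sketched as follows. This is not the code used in the study, which relied on the MASS package in R; it is a standard-library Python approximation that draws factor scores and residuals directly from the two-factor population model and then drops the variables that a study does not report. All function names and the random seed are hypothetical.

```python
import math
import random

LOADINGS = [0.80, 0.70, 0.60, 0.80, 0.70, 0.60]  # population loadings (Figure 1)
FACTOR = [0, 0, 0, 1, 1, 1]                      # factor on which each variable loads
PHI21 = 0.30                                     # covariance between the two factors


def simulate_study(n, rng):
    """Draw n observations of y1..y6 from the two-factor population model."""
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Two standard normal factors with correlation PHI21.
        eta = (z1, PHI21 * z1 + math.sqrt(1 - PHI21 ** 2) * z2)
        rows.append([lam * eta[f] + math.sqrt(1 - lam ** 2) * rng.gauss(0, 1)
                     for lam, f in zip(LOADINGS, FACTOR)])
    return rows


def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def study_correlations(rows, pattern):
    """Correlation matrix of the variables flagged 1 in a missing-data pattern."""
    keep = [j for j, present in enumerate(pattern) if present]
    cols = [[row[j] for row in rows] for j in keep]
    return [[correlation(a, b) for b in cols] for a in cols]


rng = random.Random(1)            # hypothetical seed, for reproducibility only
rows = simulate_study(1000, rng)  # one study with 1000 respondents
# Pattern of Study 2 in Table 2: x1 absent, x2..x6 present.
R2 = study_correlations(rows, [0, 1, 1, 1, 1, 1])
```

The resulting `R2` is a 5 x 5 sample correlation matrix; its first off-diagonal element estimates the population correlation between x2 and x3, which is .70 x .60 = .42 under the model.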
Estimation bias. We investigated estimation bias by fitting the model of Figure 1 to the data with both the ML method and the WLS method. The percentage of estimation bias is calculated as 100 x (mean estimated value - population value) / population value. According to Muthén, Kaplan, and Hollis (1987), estimation bias less than 10% can be considered negligible.
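The bias formula can be written as a small Python helper. The function name and the four estimates below are hypothetical and serve only to show the arithmetic.

```python
def percentage_bias(estimates, population_value):
    """100 x (mean estimated value - population value) / population value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - population_value) / population_value


# Hypothetical estimates of a loading whose population value is 0.80.
estimates = [0.79, 0.81, 0.78, 0.80]
bias = percentage_bias(estimates, 0.80)
```

With these made-up values the mean estimate is 0.795, so the bias is -0.625%, well inside the 10% bound.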
False positive rates. The distribution of TML and TWLS and false positive rates for the tests of overall goodness-of-fit were inspected by fitting the correct model (Figure 1) to the data sets (df = 8, α = 0.05). We also calculated false positive rates for a single parameter, by fitting a model in which Variable 4 has a redundant cross loading on the first factor. False positives for this single parameter estimate were determined in two ways. Firstly, on the basis of the likelihood ratio test, by checking the chi-square difference between models with and without the redundant parameter (df = 1, α = 0.05). Secondly, on the basis of the 95% confidence interval of the estimate for the redundant parameter, by counting the number of times that the interval did not include zero.
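For reference, the benchmark values for the overall test with df = 8 can be checked in a few lines of Python. The sketch uses the closed-form upper tail probability of a chi-square distribution with an even number of degrees of freedom, so it does not apply to the df = 1 tests; the function name is hypothetical, and 15.51 is the critical value given in the notes to Table 4.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1, built up term by term.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


df = 8
expected_mean = df               # mean of a chi-square variable equals df
expected_sd = math.sqrt(2 * df)  # variance equals 2 df
p_at_critical = chi2_sf_even_df(15.51, df)
```

Under a correctly specified model the test statistic should therefore have a mean near 8 and a standard deviation near 4, and about 5% of the replications should exceed 15.51.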
True positive rates (power). To investigate the power to detect non-zero parameters we generated data according to the population model given by Figure 1 but with an additional 0.10 loading of Variable 4 on the first factor. Power was determined for the overall goodness-of-fit (by counting true positives of the chi-square difference test with df = 8, α = 0.05) and for the test of the single additional 0.10 loading (chi-square difference test with df = 1, α = 0.05). In addition, we counted the number of times that the 95% confidence interval of the additional parameter estimate did not include zero. We compare the power of the ML and WLS procedures to each other and to the power that is expected on the basis of the non-centrality parameter that was estimated by fitting the covariance matrix implied by the Stage 2 model (Model 2 as in Figure 1) to the covariance matrix implied by the Stage 2 model with the cross loading (Model 2 with the additional loading); see Saris and Satorra (1993).
Results
Table 3 shows the percentages of estimation bias in a selection of parameters: the first three factor loadings and the covariance between the two common factors. For both the ML and the WLS method, the bias is very small, ranging from 0.00% to (minus) 3.50%, which is well below the 10% threshold suggested by Muthén et al. (1987). In general, the bias seems to be somewhat smaller with the ML method than with the WLS method, but the differences are negligible. Table 4 gives the means and standard deviations of the test statistics obtained with ML and WLS estimation. In all conditions, both test statistics approximate the chi-square distribution well, as the means and standard deviations are very close to their expected values (mean = 8, sd = 4), with the TML values generally slightly closer than the TWLS values. Table 4 also shows the false positive rates, that is, the rates of rejecting the correct model. The false positive rates appear to converge to the expected value (0.05) when the sample size increases, which is in accordance with the findings of Cheung and Chan (2005a). The results are very similar for the ML and WLS methods. The false positive rates of the redundant cross loading are shown in Table 5. These were very close to the expected 0.05 in all conditions and equal for the ML and WLS methods, except for two conditions (with ML being 0.01 off in one condition and WLS being 0.01 off in another condition). Table 6 gives the true positive rates or power to reject the model without a cross loading for the overall goodness-of-fit test (df = 8), for the single parameter test (df = 1), and for the 95% confidence interval. As expected, the power increases with sample size. The number of studies does not affect power that much, probably because the numbers of studies do not vary much (5, 10 and 15). Both methods reach acceptable power levels with sample sizes of 500 or larger. The ML method generally shows power results that are somewhat closer to the expected power than the WLS method, but the WLS method often yields slightly more true positives.
Discussion
Our simulation study showed that using maximum likelihood estimation in both stages of MASEM leads to almost identical results as using WLS estimation in Stage 2 of the analysis. ML estimation may have an edge on WLS estimation, but the differences are not consistent and hardly noticeable. There are some fundamental and practical differences though, which may guide a researcher’s choice between the two methods. Advantages of the ML procedure are that the same estimation method is used at both stages, and that the Stage 1 and Stage 2 models are nested. The WLS procedure has practical advantages. In the WLS procedure, the Stage 2
model is not a multi-group model, so that estimation convergence is much faster than in the ML approach. The necessity to calculate a weight matrix (the inverse of the matrix of asymptotic variances and covariances of the pooled correlation coefficients) may count as a disadvantage of the WLS method, but fortunately the readily available R package metaSEM takes this burden off the user. As a result, the WLS approach may actually be easier to apply than the ML approach. Another possible disadvantage of the WLS approach is that the pooled correlation matrix that is estimated in the multi-group SEM in Stage 1 is taken as fixed in the single-group SEM in Stage 2. This could lead to larger rejection rates, and although in our study the WLS method did not show more false positives than the ML method, it did yield slightly more true positives. Notably, true positive rates for both methods were generally somewhat higher than expected.
In our study we considered situations in which some studies did not include one or more of the variables of interest. In addition to missing variables, there may also be missing correlations. When authors do not report the full correlation matrix of all research variables, some correlation coefficients may be derived from other statistics that are reported, such as regression coefficients. This may not be possible for all correlations. For example, authors may not report the correlations between the outcome variables in their study. One way to handle missing correlation coefficients is presented by Jak, Oort, Roorda, and Koomen (2013), but their method has not been thoroughly evaluated yet. A crude way to handle missing correlations is to remove one of the two variables that are associated with the missing correlation (Cheung & Chan, 2005a), but this method leads to a substantial additional loss of information.
In applications of MASEM, correlation matrices will often not be homogeneous, rendering the fixed effects Stage 2 models inappropriate. One of the options to handle heterogeneity is to create subgroups of studies that can be considered homogeneous with
respect to the correlation coefficients. These subgroups can be created based on study level variables such as type of respondents in the study, experimental vs non-experimental studies, type of measurement instruments used, and so on. If no interesting study level variables are available, it may be an option to use a cluster analytic approach (Cheung & Chan, 2005b). Another alternative is to use random effects modeling, which does not assume homogeneity of effect sizes across studies. Cheung (2013, 2014) shows how the Stage 1 model can be estimated using the random effects approach, after which the study level variance is accounted for, and the Stage 2 model can be fitted to the pooled correlation matrix using the asymptotic variances and covariances from Stage 1 as the weight matrix in WLS estimation. The ML approach presented here has not yet been extended to include random effects. It would be useful to be able to estimate study level variance for the Stage 2 parameters directly. One way in which the ML procedure can handle heterogeneity is to freely estimate one or more of the parameters across studies. A possible problem with this approach is that the model should still be identified in each of the studies, which may be problematic if some studies include too few variables. The simulation study we performed was small, and only served to investigate whether the ML approach is a viable alternative to the WLS approach. Future studies may examine the performance of the method with larger numbers of studies, with varying sample sizes, and with more variation in numbers and patterns of missing variables, and missing correlations.
References
Becker, B. J. (1992). Using results from replicated studies to estimate linear models. Journal of Educational Statistics, 17, 341-362.
Becker, B. J. (1995). Corrections to "Using results from replicated studies to estimate linear models". Journal of Educational and Behavioral Statistics, 20, 100-102.
Becker, B. J., & Schram, C. M. (2009). Model-based meta-analysis. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 377-395). New York, NY: Sage.
Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., Spies, J., Estabrook, R., Kenny, S., Bates, T., Mehta, P., & Fox, J. (2011). OpenMx: An open source extended structural equation modeling framework. Psychometrika, 76, 306-317.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62-83.
Cheung, M. W.-L. (2010). Fixed-effects meta-analyses as multiple-group structural equation models. Structural Equation Modeling, 17, 481-509.
Cheung, M. W.-L. (2013a). metaSEM: Meta-analysis using structural equation modeling. R package version 0.8-4. Retrieved from http://courses.nus.edu.sg/course/psycwlm/Internet/metaSEM/ (Accessed August 26, 2014.)
Cheung, M. W.-L. (2013b). Multivariate meta-analysis as structural equation models. Structural Equation Modeling, 20, 429-454.
Cheung, M. W.-L. (2014). Fixed- and random-effects meta-analytic structural equation modeling: Examples and analyses in R. Behavior Research Methods, 46(1), 29-40.
Cheung, M. W.-L., & Chan, W. (2005a). Meta-analytic structural equation modeling: A two-stage approach. Psychological Methods, 10, 40-64.
Cheung, M. W.-L., & Chan, W. (2005b). Classifying correlation matrices into relatively homogeneous subgroups: A cluster analytic approach. Educational and Psychological Measurement, 65, 954-979.
Graham, J. W., Hofer, S. M., & MacKinnon, D. P. (1996). Maximizing the usefulness of data obtained with planned missing value patterns: An application of maximum likelihood procedures. Multivariate Behavioral Research, 31(2), 197-218.
Hafdahl, A. R. (2007). Combining correlation matrices: Simulation analysis of improved fixed-effects methods. Journal of Educational and Behavioral Statistics, 32, 180-205.
Hafdahl, A. R. (2008). Combining heterogeneous correlation matrices: Simulation analysis of fixed-effects methods. Journal of Educational and Behavioral Statistics, 33, 507-533.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Hunter, J. E., & Schmidt, F. L. (Eds.). (2004). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage.
Jak, S., Oort, F. J., Roorda, D. L., & Koomen, H. M. (2013). Meta-analytic structural equation modelling with missing correlations. Netherlands Journal of Psychology, 67, 132-139.
Muthén, B., Kaplan, D., & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52, 431-462.
Neale, M. C., & Miller, M. B. (1997). The use of likelihood-based confidence intervals in genetic models. Behavior Genetics, 27, 113-120.
R Development Core Team (2011). R: A language and environment for statistical computing [Computer software]. Vienna, Austria.
Viswesvaran, C., & Ones, D. (1995). Theory testing: Combining psychometric meta-analysis and structural equations modeling. Personnel Psychology, 48, 865-885.
Zhang, Y. (2011). Meta-analytic structural equation modeling (MASEM): Comparison of methods. Electronic Theses, Treatises and Dissertations. Paper 534.
Table 1. An overview of models and associated tests in the ML approach to MASEM

| Model | Equation | Chi-square difference test |
|---|---|---|
| Model 0 (saturated) | Σg = Dg Rg Dg | |
| Model 1 (Stage 1) | Σg = Dg ( Mg R MgT ) Dg | Homogeneity of correlations (Model 1 vs Model 0) |
| Model 2 (Stage 2) | Σg = Dg ( Mg RMODEL MgT ) Dg | Fit of structural equation model (Model 2 vs Model 1) |
Table 2. The patterns of missing data in conditions with 5, 10, and 15 studies. Each cell lists the observed variables x1 x2 x3 x4 x5 x6 in that order.

| Study | Conditions with 5 studies | Conditions with 10 studies | Conditions with 15 studies |
|---|---|---|---|
| Study 1 | 1 1 1 1 1 1 | 1 1 1 1 1 1 | 1 1 1 1 1 1 |
| Study 2 | 0 1 1 1 1 1 | 0 1 1 1 1 1 | 0 1 1 1 1 1 |
| Study 3 | 1 0 1 1 1 0 | 1 0 1 1 1 0 | 1 0 1 1 1 0 |
| Study 4 | 1 1 0 1 0 1 | 1 1 0 1 0 1 | 1 1 0 1 0 1 |
| Study 5 | 1 1 1 0 1 1 | 1 1 1 0 1 1 | 1 1 1 0 1 1 |
| Study 6 | | 0 0 0 1 1 1 | 0 0 0 1 1 1 |
| Study 7 | | 1 0 0 0 1 1 | 1 0 0 0 1 1 |
| Study 8 | | 1 1 0 0 0 1 | 1 1 0 0 0 1 |
| Study 9 | | 1 1 1 0 0 0 | 1 1 1 0 0 0 |
| Study 10 | | 0 1 1 1 1 0 | 0 0 1 1 1 0 |
| Study 11 | | | 0 0 0 0 1 1 |
| Study 12 | | | 1 0 0 0 0 0 |
| Study 13 | | | 1 1 0 0 0 0 |
| Study 14 | | | 0 1 1 0 0 0 |
| Study 15 | | | 0 0 0 1 1 0 |

Note: 1 = variable present in study; 0 = variable absent in study.
Table 3. Estimation bias for selected parameters, in percentages.

| Studies | N | λ11 ML | λ11 WLS | λ21 ML | λ21 WLS | λ31 ML | λ31 WLS | Φ21 ML | Φ21 WLS | Convergence* ML | Convergence* WLS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 50 | -0.45 | -1.23 | -0.33 | -1.79 | 0.16 | -1.64 | 1.21 | -1.67 | 498 | 497 |
| 5 | 100 | -0.77 | -1.22 | 0.19 | -0.49 | 0.28 | -0.59 | 0.57 | -1.00 | 499 | 499 |
| 5 | 200 | -0.05 | -0.25 | -0.04 | -0.29 | -0.21 | -0.63 | 0.67 | -0.17 | 500 | 500 |
| 5 | 500 | -0.01 | -0.09 | -0.01 | -0.14 | -0.04 | -0.22 | -0.18 | -0.51 | 500 | 500 |
| 5 | 1000 | 0.11 | 0.06 | 0.06 | 0.00 | -0.26 | -0.33 | 0.13 | -0.05 | 499 | 500 |
| 10 | 50 | 0.06 | -0.56 | -0.99 | -1.76 | -0.47 | -1.81 | -1.43 | -3.50 | 500 | 500 |
| 10 | 100 | 0.06 | -0.22 | -0.32 | -0.66 | -0.27 | -0.86 | -0.13 | -1.12 | 500 | 500 |
| 10 | 200 | 0.06 | -0.09 | -0.14 | -0.30 | -0.17 | -0.45 | -0.64 | -1.15 | 500 | 500 |
| 10 | 500 | -0.16 | -0.22 | 0.16 | 0.08 | 0.06 | -0.05 | -0.51 | -0.71 | 500 | 500 |
| 10 | 1000 | -0.13 | -0.16 | 0.04 | 0.01 | 0.02 | -0.02 | 0.22 | 0.11 | 500 | 497 |
| 15 | 50 | -0.19 | -0.69 | -0.45 | -1.07 | -0.68 | -2.03 | -0.39 | -2.44 | 500 | 500 |
| 15 | 100 | -0.13 | -0.33 | -0.02 | -0.30 | 0.42 | -0.16 | -0.66 | -1.64 | 500 | 499 |
| 15 | 200 | -0.38 | -0.48 | 0.00 | -0.14 | -0.02 | -0.27 | -0.85 | -1.30 | 499 | 500 |
| 15 | 500 | -0.08 | -0.13 | -0.09 | -0.17 | -0.21 | -0.34 | 0.00 | -0.13 | 493 | 500 |
| 15 | 1000 | -0.03 | -0.07 | 0.10 | 0.09 | -0.15 | -0.18 | 0.16 | 0.05 | 493 | 494 |

Note: Bold figures indicate cases in which WLS estimation bias is smaller than ML estimation bias. *Number of data sets with which the analysis converged to a solution.
Table 4. Means, standard deviations, and rejection rates of TML and TWLS (df = 8, α = 0.05).

| Studies | N | TML mean | TML sd | TML rejection rate | TWLS mean | TWLS sd | TWLS rejection rate |
|---|---|---|---|---|---|---|---|
| 5 | 50 | 8.28 | 3.96 | 0.07 | 8.38 | 4.05 | 0.07 |
| 5 | 100 | 8.25 | 4.23 | 0.07 | 8.30 | 4.26 | 0.07 |
| 5 | 200 | 7.81 | 3.79 | 0.05 | 7.83 | 3.81 | 0.05 |
| 5 | 500 | 8.20 | 4.26 | 0.05 | 8.22 | 4.27 | 0.05 |
| 5 | 1000 | 8.18 | 4.16 | 0.05 | 8.18 | 4.16 | 0.05 |
| 10 | 50 | 8.46 | 4.35 | 0.06 | 8.64 | 4.52 | 0.07 |
| 10 | 100 | 8.01 | 4.10 | 0.05 | 8.10 | 4.20 | 0.06 |
| 10 | 200 | 8.01 | 3.99 | 0.05 | 8.05 | 4.03 | 0.05 |
| 10 | 500 | 8.18 | 4.09 | 0.06 | 8.20 | 4.11 | 0.06 |
| 10 | 1000 | 8.01 | 3.94 | 0.04 | 8.05 | 3.93 | 0.04 |
| 15 | 50 | 8.52 | 4.13 | 0.08 | 8.77 | 4.37 | 0.09 |
| 15 | 100 | 8.08 | 4.20 | 0.06 | 8.15 | 4.28 | 0.07 |
| 15 | 200 | 7.92 | 3.86 | 0.04 | 7.97 | 3.92 | 0.05 |
| 15 | 500 | 7.78 | 3.80 | 0.04 | 7.86 | 3.84 | 0.04 |
| 15 | 1000 | 8.15 | 4.01 | 0.05 | 8.23 | 4.04 | 0.06 |

Notes: Bold figures indicate cases in which WLS results were closer to the expected mean (8.00), standard deviation (4.00), or rejection rate (0.05). With df = 8, α = 0.05, the critical chi-square value is 15.51.
Table 5. False positive rates for the redundant model parameter.

| Studies | N | Chi-square difference test (df = 1, α = .05) ML | Chi-square difference test (df = 1, α = .05) WLS | 95% CI ML | 95% CI WLS | Convergence ML | Convergence WLS |
|---|---|---|---|---|---|---|---|
| 5 | 50 | 0.06 | 0.06 | 0.06 | 0.06 | 493 | 492 |
| 5 | 100 | 0.05 | 0.04 | 0.05 | 0.04 | 498 | 498 |
| 5 | 200 | 0.05 | 0.05 | 0.05 | 0.05 | 500 | 500 |
| 5 | 500 | 0.05 | 0.05 | 0.05 | 0.05 | 500 | 500 |
| 5 | 1000 | 0.06 | 0.06 | 0.05 | 0.06 | 499 | 499 |
| 10 | 50 | 0.04 | 0.04 | 0.04 | 0.04 | 497 | 496 |
| 10 | 100 | 0.04 | 0.05 | 0.04 | 0.05 | 499 | 500 |
| 10 | 200 | 0.05 | 0.05 | 0.05 | 0.05 | 499 | 499 |
| 10 | 500 | 0.07 | 0.07 | 0.07 | 0.07 | 500 | 499 |
| 10 | 1000 | 0.05 | 0.05 | 0.05 | 0.05 | 500 | 496 |
| 15 | 50 | 0.05 | 0.05 | 0.05 | 0.05 | 498 | 499 |
| 15 | 100 | 0.06 | 0.06 | 0.06 | 0.06 | 497 | 497 |
| 15 | 200 | 0.07 | 0.07 | 0.07 | 0.07 | 499 | 499 |
| 15 | 500 | 0.06 | 0.05 | 0.06 | 0.05 | 493 | 500 |
| 15 | 1000 | 0.05 | 0.05 | 0.05 | 0.05 | 493 | 496 |

Note: Bold figures indicate cases in which WLS has better (lower) false positive rates than ML. Proportions are calculated based on converged solutions only.
Table 6. Power of the overall test, the likelihood ratio test and the likelihood based confidence intervals to detect a cross loading of 0.10.

| Studies | N | Chi-square difference test (df = 8, α = 0.05) Exp. | ML | WLS | Chi-square difference test (df = 1, α = 0.05) Exp. | ML | WLS | 95% CI ML | 95% CI WLS | Convergence ML | Convergence WLS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 50 | 0.10 | 0.11 | 0.13 | 0.22 | 0.24 | 0.25 | 0.24 | 0.25 | 496 | 495 |
| 5 | 100 | 0.17 | 0.17 | 0.16 | 0.38 | 0.38 | 0.39 | 0.38 | 0.39 | 499 | 499 |
| 5 | 200 | 0.33 | 0.32 | 0.31 | 0.66 | 0.64 | 0.65 | 0.64 | 0.65 | 499 | 499 |
| 5 | 500 | 0.76 | 0.74 | 0.72 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 497 | 500 |
| 5 | 1000 | 0.98 | 0.98 | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 499 | 497 |
| 10 | 50 | 0.12 | 0.11 | 0.13 | 0.26 | 0.27 | 0.28 | 0.26 | 0.28 | 498 | 498 |
| 10 | 100 | 0.21 | 0.21 | 0.22 | 0.46 | 0.45 | 0.47 | 0.45 | 0.47 | 500 | 499 |
| 10 | 200 | 0.42 | 0.40 | 0.40 | 0.75 | 0.72 | 0.73 | 0.72 | 0.73 | 500 | 500 |
| 10 | 500 | 0.87 | 0.83 | 0.83 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 500 | 500 |
| 10 | 1000 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 500 | 493 |

15 studies 50 0.11 0.11 0.13 0.22 0.23 495 500 0.25 0.22 0.23 100 0.20 0.24 0.25 0.44 0.44 0.43 0.43 0.43 500 498 200 0.39 0.40 0.40 0.73 0.73 0.73 0.73 0.73 500 497 500 0.84 0.85 0.85 0.98 0.97 0.97 0.97 0.97 494 500 1000 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 499 493 Note: Bold figures indicate cases in which WLS power is closer to the expected power that is based on the non-centrality parameter.
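Figure 1. Population model used in data generation.
[Path diagram: two common factors, ζ1 and ζ2, each with variance 1 and with covariance .30. Variables y1, y2, and y3 load on ζ1 and variables y4, y5, and y6 load on ζ2, with loadings .80, .70, and .60 respectively; the corresponding residual variances are .36, .51, and .64.]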