Chapter 14 Latent variable models (SEM 2)

The previous chapter concerned a multivariate model to describe relations between observed variables. In this chapter, we will extend this idea by adding latent variables. Latent variables are variables which cannot be directly observed. Instead, they can be measured or inferred via their relation with observed variables. A classic example in psychology is intelligence. Other examples are personality traits such as extraversion. People’s tendency to be outward-facing is not directly observable. However, there are indicators (e.g., someone actively seeking out busy social situations, liking to be the centre of attention, etc.) which together may allow one to determine a person’s level of extraversion.

The measurement of psychological constructs is the focus of psychometrics, a field with a long history and its own societies and journals.

14.1 Measurement of latent variables

Let’s start with a simple example: measuring temperature. Temperature is a physical quantity that reflects the kinetic energy of atoms in a substance or the air. This kinetic energy is not directly observable, but can be measured with a thermometer. A classic mercury thermometer has a bulb filled with mercury attached to a narrow tube of glass. An increase in heat expands the volume of the mercury so that the level of mercury in the narrow tube rises. In other words, the temperature is the cause of the state of the measurement device. Markers along the narrow tube can be placed such that the level of mercury in the tube corresponds to a standard scale, such as the Fahrenheit or Celsius scale. In the latter, the scale is calibrated such that \(0^\circ \textrm{C}\) represents the freezing point of water, and \(100^\circ \textrm{C}\) represents the boiling point of water. In the Fahrenheit scale, the freezing point of water corresponds to \(32^\circ \textrm{F}\), and the boiling point to \(212^\circ \textrm{F}\).

Thermometers are relatively accurate measurement devices, but not devoid of measurement error. That means that the measurement device can provide different readings for exactly the same true temperature. Such measurement error may be due to factors affecting the measurement device (e.g. the volume of mercury not being deterministically related to the temperature) as well as factors affecting the observer (e.g. the person reading the value of the thermometer not being able to see very small changes on the thermometer).

As a statistical model, we might propose something like the following linear model: \[\texttt{measurement}_i = \beta_0 + \beta_1 \times \textrm{temperature}_i + \epsilon_i \quad \quad \epsilon_i \sim \mathbf{Normal}(0, \sigma_\epsilon)\] This model is depicted graphically in Figure 14.1. If temperature were a directly observed variable, this is just a simple regression model and we would be able to estimate all parameters (\(\beta_0\), \(\beta_1\), \(\sigma_\epsilon\)) using e.g. maximum likelihood. However, temperature is a latent variable and we can only observe the measurement.


Figure 14.1: Measurement model for temperature.

14.1.1 Scaling and identification

As we start working towards general SEM models with latent variables, let’s write down our model for temperature measurements in more abstract terms as: \[\begin{aligned} Y_{i} &= \alpha + \lambda \times \eta_{i} + \epsilon_{i} \\ \eta_i &\sim \mathbf{Normal}(\mu_\eta, \sigma_\eta)\\ \epsilon_i &\sim \mathbf{Normal}(0, \sigma_\epsilon) \end{aligned}\] Note that in addition to relabelling the intercept as \(\alpha\), the slope as \(\lambda\), and the latent variable as \(\eta\), we have also added an assumption about the values of the latent variable, namely that these are Normal-distributed. We need to make such an assumption about the distribution of the latent variable because without it, we cannot infer properties of the latent variable from measurements \(Y\).

The model above implies that \(Y\) is Normal-distributed with mean and variance \[\begin{aligned} \mu_Y &= \alpha + \lambda \times \mu_\eta \\ \sigma^2_Y &= \lambda^2 \times \sigma^2_\eta + \sigma^2_\epsilon \end{aligned} \] From observations of \(Y\), we can estimate the mean and variance of \(Y\) in the usual manner as \(\hat{\mu}_Y = \overline{Y}\) and \(\hat{\sigma}^2_Y = \frac{\sum_{i=1}^n (Y_i - \overline{Y})^2}{n-1}\). We then have two “observed” values, whilst our latent variable model has 5 parameters. We cannot estimate 5 parameters from what are effectively two properties of the data. For example, consider \[\hat{\mu}_Y = \alpha + \lambda \times \mu_\eta\]
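To make these implied moments concrete, here is a minimal simulation sketch in R. The parameter values are made up for illustration; with a large enough sample, the sample mean and variance of \(Y\) should approach the implied values \(\alpha + \lambda \times \mu_\eta\) and \(\lambda^2 \times \sigma^2_\eta + \sigma^2_\epsilon\).

```r
# Simulate the measurement model with assumed parameter values
set.seed(1234)
n <- 1e5
alpha <- 10; lambda <- 2; mu_eta <- 5; sigma_eta <- 1.5; sigma_eps <- 0.5

eta <- rnorm(n, mean = mu_eta, sd = sigma_eta)        # latent variable
Y   <- alpha + lambda * eta + rnorm(n, 0, sigma_eps)  # observed measurement

mean(Y)   # close to alpha + lambda * mu_eta = 20
var(Y)    # close to lambda^2 * sigma_eta^2 + sigma_eps^2 = 9.25
```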

We could set \(\alpha = \hat{\mu}_Y\) and \(\lambda = 0\). This would provide a perfect fit, but \(\mu_\eta\) could then be anything. We could also set \(\alpha = 0\), \(\lambda = 1\), and \(\mu_\eta = \hat{\mu}_Y\). This would again provide a perfect fit, but so would setting \(\alpha = 0\), \(\lambda = 2\), and \(\mu_\eta = \tfrac{1}{2} \hat{\mu}_Y\).

There are two issues here. The first is the number of parameters relative to the number of observed properties of the data. The second is that latent variables don’t have an inherent numerical scale. Temperature can be measured in degrees Celsius or Fahrenheit. Both are valid ways to provide a number to temperature. But temperature itself does not have an inherent number beyond such measurements. The numeric scale of a latent variable is therefore arbitrary. What matters is the relation between a latent variable and its measurements: an increase in the latent variable results in an increase in the measurements. Beyond that, we can choose the scale of the latent variable as we like. This is similar to regression, where centering or changing the scale of the predictor affects the intercept and slopes, but does not change the model fit itself.

As the scale of the latent variable is arbitrary, we can choose how the numerical values of the latent variable correspond to numerical values of the measurements. There are two common choices to scale latent variables. In both, we set \(\mu_\eta = 0\). The first is to then set \(\lambda = 1\). This implies that a one-unit increase in the latent variable is equal to a one-unit increase in the measurement. The second way is to set \(\sigma^2_\eta = 1\). This implies that the latent variable follows a standard Normal distribution. A one-unit increase in the latent variable then corresponds to a one standard deviation increase in the latent variable, and results in an increase of \(\lambda\) in the measurement. As the correlation between the latent variable and the measurement is the same for both these choices, they are both valid ways to scale the latent variable. The first is more common nowadays, and has the benefit that the scale of the latent variable may be more easily interpretable, as it relates directly to the scale of the measurement.

Let’s focus on the first choice of scaling. Setting \(\mu_\eta=0\) and \(\lambda = 1\) reduces the number of free parameters to 3. This is still more than the two observed properties of the data. Hence, there is still no way to uniquely determine the remaining parameters.

Now, suppose we have readings from two thermometers, \(Y_1\) and \(Y_2\). Suppose also that we don’t know the scale of each (they could measure degrees Celsius, Fahrenheit, or something else, and each could have a different scale), and that each thermometer can have a different measurement error. All we know is that readings on both are caused by the true temperature \(\eta\). Setting \(\mu_\eta = 0\), we can express the model for simultaneous readings of both thermometers as: \[\begin{aligned} Y_{1,i} &= \alpha_1 + \lambda_1 \times \eta_{i} + \epsilon_{1,i} \\ Y_{2,i} &= \alpha_2 + \lambda_2 \times \eta_{i} + \epsilon_{2,i} \\ \eta_i &\sim \mathbf{Normal}(0, \sigma_\eta)\\ \epsilon_{1,i} &\sim \mathbf{Normal}(0, \sigma_{\epsilon_1}) \\ \epsilon_{2,i} &\sim \mathbf{Normal}(0, \sigma_{\epsilon_2}) \end{aligned}\] Using variance-covariance algebra (Section 13.4.1.1), this implies that \[\begin{equation} \begin{aligned} \left( \begin{matrix} Y_{1,i} \\ Y_{2,i} \end{matrix} \right) &\sim \mathbf{Normal}\left(\boldsymbol{\mu}_Y, \boldsymbol{\Sigma}_Y \right) \\ &\sim \mathbf{Normal}\left( \left[ \begin{matrix} \alpha_1 \\ \alpha_2 \end{matrix} \right] , \left[ \begin{matrix} \lambda_1^2 \times \sigma_{\eta}^2 + \sigma_{\epsilon_1}^2 & \lambda_1 \times \lambda_2 \times \sigma^2_\eta \\ \lambda_1 \times \lambda_2 \times \sigma^2_\eta & \lambda_2^2 \times \sigma_{\eta}^2 + \sigma_{\epsilon_2}^2 \end{matrix} \right] \right) \end{aligned} \end{equation}\] We have two parameters (\(\alpha_1\) and \(\alpha_2\)) for two means (\(\mu_{Y_1}\) and \(\mu_{Y_2}\)), which is fine. However, the implied covariance matrix \(\boldsymbol{\Sigma}_Y\) contains five parameters: \(\lambda_1\), \(\lambda_2\), \(\sigma_\eta\), \(\sigma_{\epsilon_1}\), \(\sigma_{\epsilon_2}\). But the observed covariance matrix has three unique values: \(\hat{\sigma}^2_{Y_1}\), \(\hat{\sigma}^2_{Y_2}\), and \(\hat{\sigma}_{Y_1,Y_2}\). To scale the latent variable, we can set one of the \(\lambda\)’s to 1, e.g. \(\lambda_1 = 1\). But we cannot set both \(\lambda\)’s to 1, as the thermometers may have a different scale. We could alternatively set \(\sigma_\eta = 1\). Both choices would reduce the number of parameters to four. We therefore still have too many parameters relative to the properties of the data.

When we have readings from three thermometers, we finally get to an estimable model. The model for simultaneous readings of thermometers \(Y_{j}\) can be expressed as: \[\begin{aligned} Y_{j,i} &= \alpha_j + \lambda_j \times \eta_{i} + \epsilon_{j,i} && j = 1, \ldots, 3\\ \eta_i &\sim \mathbf{Normal}(0, \sigma_\eta)\\ \epsilon_{j,i} &\sim \mathbf{Normal}(0, \sigma_{\epsilon_j}) && j = 1, \ldots, 3 \end{aligned}\] which implies \[\begin{equation} \begin{aligned} \left( \begin{matrix} Y_{1,i} \\ Y_{2,i} \\ Y_{3,i} \end{matrix} \right) &\sim \mathbf{Normal}\left( \left[ \begin{matrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{matrix} \right] , \left[ \begin{matrix} \lambda_1^2 \times \sigma_{\eta}^2 + \sigma_{\epsilon_1}^2 & \lambda_1 \times \lambda_2 \times \sigma^2_\eta & \lambda_1 \times \lambda_3 \times \sigma^2_\eta \\ \lambda_1 \times \lambda_2 \times \sigma^2_\eta & \lambda_2^2 \times \sigma_{\eta}^2 + \sigma_{\epsilon_2}^2 & \lambda_2 \times \lambda_3 \times \sigma^2_\eta \\ \lambda_1 \times \lambda_3 \times \sigma^2_\eta & \lambda_2 \times \lambda_3 \times \sigma^2_\eta & \lambda_3^2 \times \sigma_{\eta}^2 + \sigma_{\epsilon_3}^2 \end{matrix} \right] \right) \end{aligned} \end{equation}\] The implied covariance matrix has seven parameters, whilst the observed covariance matrix has six unique values. Setting e.g. \(\lambda_1 = 1\) to scale the latent variable, we are left with six parameters for six observed (co-)variances. Hence, although saturated, the model is estimable.
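To see what estimating such a model looks like in practice, the sketch below simulates readings from three hypothetical thermometers and fits the model with the lavaan package in R. The data and parameter values are invented for illustration. By default, lavaan scales the latent variable by fixing the loading of the first indicator to 1 (setting std.lv = TRUE would instead fix the latent variance to 1, the alternative scaling discussed above).

```r
library(lavaan)

# Simulate simultaneous readings from three thermometers (assumed values)
set.seed(42)
n <- 500
temp <- rnorm(n, 0, 2)  # latent true temperature
dat <- data.frame(
  Y1 = 10 + 1.0 * temp + rnorm(n, 0, 0.5),
  Y2 = 50 + 1.8 * temp + rnorm(n, 0, 1.0),
  Y3 = -5 + 0.7 * temp + rnorm(n, 0, 0.3)
)

# One latent variable measured by three indicators;
# the loading of Y1 is fixed to 1 by default to scale eta
model <- ' eta =~ Y1 + Y2 + Y3 '
fit <- cfa(model, data = dat, meanstructure = TRUE)
summary(fit)  # saturated: 0 degrees of freedom with three indicators
```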

To measure a latent variable, we thus need at least three measured variables. Such measured variables are also called indicators.

14.2 Confirmatory factor analysis

Factor analysis is a technique for the measurement of latent variables when the indicators or measured variables are assumed to follow a (multivariate) Normal distribution. A distinction can be made between confirmatory and exploratory factor analysis. The latter, as the name suggests, is part of exploratory data analysis. Exploratory factor analysis aims to describe relations between observed variables via a smaller set of latent variables or factors, and was first proposed by Spearman (1904). Exploratory factor analysis is essentially based on saturated models. Confirmatory factor analysis, on the other hand, poses more restrictions. A main goal of confirmatory factor analysis is to assess the viability of an assumed measurement model.

In the following, we will introduce confirmatory factor analysis with an example of personality measurement. “Big Five” personality traits such as extraversion are generally assessed with self-report questionnaires. One commonly used questionnaire is the Big Five Inventory [BFI; John, Donahue, & Kentle (1991)]. In the BFI-2 (Soto & John, 2017), participants are asked to rate their agreement with 60 statements on a 5-point scale ranging from “disagree strongly” to “agree strongly”. There are 12 statements relating to each of the following main personality dimensions: Extraversion, Agreeableness, Conscientiousness, Negative Emotionality (traditionally called “neuroticism”), and Open-Mindedness (traditionally called “openness to experience”). Example statements related to extraversion are “I am someone who is outgoing, sociable” and “I am someone who is sometimes shy, introverted”. Note that agreement with the first statement should be positively related to extraversion, whilst agreement with the second statement should be negatively related to extraversion. Mixing such positively and negatively stated items is common practice in the design of psychological measurement questionnaires. Here, we will consider data collected with a Czech translation of the BFI-2 (Hřebíčková et al., 2020), with a total of \(n=1733\) respondents.

14.2.1 A one-factor model

The BFI-2 contains \(k=12\) items which relate to extraversion. As these items pertain to the same psychological dimension, we would expect answers to them to be correlated. A confirmatory one-factor model aims to account for these correlations by assuming that all answers are “caused” by a single underlying latent variable (factor). The one-factor model can be expressed as \[\begin{aligned} Y_{j,i} &= \alpha_j + \lambda_j \times \eta_{i} + \epsilon_{j,i} && j=1,\ldots,k\\ \eta_i &\sim \mathbf{Normal}(0, \sigma_\eta)\\ \epsilon_{j,i} &\sim \mathbf{Normal}(0, \sigma_{\epsilon_j}) && j=1,\ldots,k \end{aligned}\] In the context of factor analysis, the latent variable \(\eta\) is referred to as a latent factor, and the parameters \(\lambda_j\) which relate the factor to the indicators (measured variables) \(Y_j\) as factor loadings. Each indicator is allowed to have measurement error, such that none provides a perfect measurement of the latent factor. But a key assumption is that the measurement errors of different indicators are independent and therefore uncorrelated. Any correlation between scores on the indicators is then fully accounted for by the underlying latent factor. This independence of measurement errors allows indicators to be more informative about the latent factor. Going back to our temperature example, if one thermometer provides a too high measurement, that would not imply that other thermometers also read too high, allowing the measurement errors to cancel each other out. Correlated measurement errors, on the other hand, make this more difficult. Positively correlated measurement errors would imply that if one thermometer provides a too high measurement, so would others. And if one thermometer provides a too low measurement, so would others. When measurement errors are highly correlated, there is then relatively little to gain from adding more measurement instruments.
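In lavaan, such a one-factor model could be specified as in the sketch below. The item names (i1, i16, etc.) follow Figure 14.2; bfi2 is an assumed name for a data frame holding the item responses.

```r
library(lavaan)

# One-factor model: all 12 extraversion items load on a single factor E;
# the loading of the first item (i1) is fixed to 1 by default to scale E
one_factor <- '
  E =~ i1 + i16 + i31 + i46 + i6 + i21 + i36 + i51 + i11 + i26 + i41 + i56
'
fit1 <- cfa(one_factor, data = bfi2)  # bfi2: assumed data frame of responses
fitMeasures(fit1, c("chisq", "df", "pvalue", "cfi", "srmr", "rmsea"))
```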


Figure 14.2: Graphical representation of a single factor model for Extraversion. Items i1, i16, i31, and i46 relate to the Sociability facet of Extraversion, with items i16 and i31 stated in the negative direction. Items i6, i21, i36, and i51 relate to the Assertiveness facet of Extraversion, with items i36 and i51 stated in the negative direction. Items 11, 26, 41, 56 relate to the Energy Level facet of Extraversion, with items i11 and i26 stated in the negative direction.

Figure 14.2 shows the estimated parameters of a single factor model for the 12 items related to extraversion in the BFI-2. The factor loading of the first item was fixed to 1 in order to scale the latent factor. Note that all estimated factor loadings are positive, whilst 6 of the 12 items were stated in a negative direction (i.e. higher scores should indicate less extraversion). If all items relate to extraversion in the intended manner, we should expect a mix of positive and negative factor loadings. Table 14.1 shows the parameter estimates and test results, as well as the model fit indices. As can be seen there, the single factor model is rejected by the overall model \(\chi^2\) test, and the CFI, SRMR and RMSEA indicate relatively poor model fit.

Table 14.1: Results of a single factor model for extraversion.
Model
Estimate Std. Err. z p
Factor Loadings
E
i1 1.00+
i16 1.43 0.05 27.13 .000
i31 1.25 0.05 25.06 .000
i46 1.14 0.04 25.30 .000
i6 0.97 0.04 23.55 .000
i21 1.15 0.05 24.69 .000
i36 0.55 0.04 14.14 .000
i51 1.05 0.04 23.57 .000
i11 0.47 0.04 11.36 .000
i26 0.51 0.03 14.72 .000
i41 0.79 0.04 20.15 .000
i56 0.67 0.04 18.45 .000
Residual Variances
i1 0.62 0.02 26.29 .000
i16 0.72 0.03 23.96 .000
i31 0.84 0.03 25.80 .000
i46 0.66 0.03 25.63 .000
i6 0.65 0.02 26.64 .000
i21 0.76 0.03 26.03 .000
i36 0.93 0.03 28.81 .000
i51 0.77 0.03 26.63 .000
i11 1.12 0.04 29.06 .000
i26 0.72 0.03 28.75 .000
i41 0.75 0.03 27.81 .000
i56 0.69 0.02 28.18 .000
Latent Variances
E 0.49 0.03 14.86 .000
Fit Indices
χ2 1594.65(54) .000
CFI 0.79
SRMR 0.08
RMSEA 0.13
RMSEA (lower bound) 0.12
RMSEA (upper bound) 0.13
AIC 57031.05
BIC 57162.03
+Fixed parameter


The poor model fit of the single factor model indicates that the correlations between answers on the 12 items are not completely accounted for by a single factor. Soto & John (2017) suggest that participants’ responses to the BFI-2 might be affected by their “acquiescent response style”. This refers to the tendency of an individual to consistently agree or consistently disagree with questionnaire items, regardless of their content. Such general (dis)agreement would distort the factor loadings. They suggest accounting for acquiescence by including a second factor in the model, with fixed factor loadings on all items. By fixing the factor loadings to be identical for all items, this factor can account for a general tendency to provide low or high scores on all items, as such a tendency would be expected to affect answers on all items in the same way.
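A sketch of how this model could be specified in lavaan is given below: all loadings of the Acquiescence factor are fixed to 1, and its covariance with the extraversion factor is fixed to 0, matching the model in Figure 14.3 (bfi2 again an assumed data frame).

```r
# Extraversion factor plus an Acquiescence factor with all loadings fixed to 1
two_factor <- '
  E   =~ i1 + i16 + i31 + i46 + i6 + i21 + i36 + i51 + i11 + i26 + i41 + i56
  Acq =~ 1*i1 + 1*i16 + 1*i31 + 1*i46 + 1*i6 + 1*i21 +
         1*i36 + 1*i51 + 1*i11 + 1*i26 + 1*i41 + 1*i56
  E ~~ 0*Acq  # fix the covariance between E and Acq to 0
'
fit2 <- cfa(two_factor, data = bfi2)
```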


Figure 14.3: Graphical representation of a single factor model for Extraversion with an additional Acquiescence factor.

By fixing all the factor loadings, this additional factor only requires one additional parameter for the variance of the latent Acquiescence factor. The resulting model is depicted in Figure 14.3 and the resulting estimates and fit measures are provided in Table 14.2. Whilst there are now a few negative factor loadings and the fit is somewhat improved, the model is still not adequate.

Table 14.2: Results of a single factor model for Extraversion with an additional Acquiescence factor.
Model
Estimate Std. Err. z p
Factor Loadings
E
i1 1.00+
i16 1.89 0.10 18.37 .000
i31 1.59 0.09 17.97 .000
i46 1.17 0.07 16.97 .000
i6 0.70 0.06 12.43 .000
i21 0.93 0.07 14.23 .000
i36 0.20 0.06 3.36 .001
i51 0.73 0.06 12.11 .000
i11 -0.16 0.07 -2.40 .016
i26 -0.07 0.06 -1.25 .210
i41 0.44 0.05 8.19 .000
i56 0.20 0.05 3.90 .000
Acq
i1 1.00+
i16 1.00+
i31 1.00+
i46 1.00+
i6 1.00+
i21 1.00+
i36 1.00+
i51 1.00+
i11 1.00+
i26 1.00+
i41 1.00+
i56 1.00+
Residual Variances
i1 0.60 0.02 26.34 .000
i16 0.49 0.03 15.48 .000
i31 0.72 0.03 22.73 .000
i46 0.64 0.03 25.64 .000
i6 0.67 0.02 27.33 .000
i21 0.82 0.03 27.38 .000
i36 0.90 0.03 27.16 .000
i51 0.80 0.03 27.66 .000
i11 0.94 0.04 25.56 .000
i26 0.58 0.02 23.92 .000
i41 0.71 0.03 27.27 .000
i56 0.60 0.02 25.98 .000
Latent Variances
E 0.26 0.03 8.79 .000
Acq 0.28 0.01 19.44 .000
Latent Covariances
E w/Acq 0.00+
Fit Indices
χ2 1214.16(53) .000
CFI 0.84
SRMR 0.07
RMSEA 0.11
RMSEA (lower bound) 0.11
RMSEA (upper bound) 0.12
AIC 56652.56
BIC 56789.00
+Fixed parameter


The BFI-2 was designed to cover different aspects of each personality trait. For example, extraversion was assumed to consist of the “facets” sociability, assertiveness, and energy level. The BFI-2 contains four items for each of these facets. Insofar as these facets reflect different aspects of extraversion, we might expect them to be related to general extraversion, but to also be somewhat independent. To allow for this possibility, we could consider a model where general extraversion “causes” values on each latent facet, which in turn cause scores on the different items. Including the Acquiescence factor, the resulting model is depicted in Figure 14.4. Note that the three facets have additional residual variance, but any correlation between the facets is entirely accounted for by the higher-order factor. As there are just three facets for the higher-order extraversion factor, that factor is “just-identified” (similar to needing at least three indicators to measure one factor). The part of the model consisting of one higher-order factor and three lower-order factors is saturated, meaning that the model can perfectly account for the variances and covariances of the facets. Because each facet loads onto four indicators, the part of the model linking facets to indicators is not saturated. As there are 12 observed variables, there are 12 means, 12 variances, and \(\frac{12 \times 11}{2} = 66\) covariances. A fully saturated model would thus have \(\text{npar}(S) = 90\) parameters. The present model is more constrained, using \(\text{npar}(M) = 49\) parameters.
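In lavaan syntax, such a hierarchical (second-order) model could be sketched as below: each facet is measured by its four items, and the facets in turn serve as indicators of the higher-order extraversion factor (names as in Figure 14.4; bfi2 assumed as before; note that the fitted model reported in Table 14.3 additionally fixes the loading of i41 on Ee to 1).

```r
# Hierarchical model: facets measured by items, extraversion by the facets
hier_model <- '
  Es =~ i1 + i16 + i31 + i46   # Sociability facet
  Ea =~ i6 + i21 + i36 + i51   # Assertiveness facet
  Ee =~ i11 + i26 + i41 + i56  # Energy Level facet
  E  =~ Es + Ea + Ee           # higher-order extraversion factor
  Acq =~ 1*i1 + 1*i16 + 1*i31 + 1*i46 + 1*i6 + 1*i21 +
         1*i36 + 1*i51 + 1*i11 + 1*i26 + 1*i41 + 1*i56
  E ~~ 0*Acq
'
fit3 <- cfa(hier_model, data = bfi2)
```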


Figure 14.4: Graphical representation of a hierarchical factor model for Extraversion with an additional Acquiescence factor. Es represents extraversion-sociability, Ea extraversion-assertiveness, and Ee extraversion-energy-level.

Table 14.3 provides more detailed results. Although the model is rejected by the overall fit chi-square test, the fit indices indicate a mostly reasonable fit. Inspecting the estimated variance of the latent variables shows something peculiar though: the variance of the \(\texttt{Es}\) factor is estimated to be negative. A variance can never be smaller than 0, so this estimate is by definition not a good representation of the true variance in the DGP. Negative variance estimates are common in structural equation modelling (Kolenikov & Bollen, 2012).

Table 14.3: Results of a hierarchical factor model for Extraversion with an additional Acquiescence factor.
Model
Estimate Std. Err. z p
Factor Loadings
E
Es 1.00+
Ea 0.42 0.12 3.63 .000
Ee 0.22 0.07 3.08 .002
Es
i1 1.00+
i16 1.95 0.10 19.53 .000
i31 1.53 0.08 19.29 .000
i46 1.19 0.06 18.45 .000
Ea
i6 1.00+
i21 1.68 0.09 18.66 .000
i36 0.49 0.06 8.90 .000
i51 1.30 0.07 19.13 .000
Ee
i11 1.00+
i26 0.23 0.09 2.52 .012
i41 1.00+
i56 1.47 0.18 8.37 .000
Acq
i1 1.00+
i16 1.00+
i31 1.00+
i46 1.00+
i6 1.00+
i21 1.00+
i36 1.00+
i51 1.00+
i11 1.00+
i26 1.00+
i41 1.00+
i56 1.00+
Residual Variances
i1 0.59 0.02 26.08 .000
i16 0.41 0.03 12.01 .000
i31 0.73 0.03 23.19 .000
i46 0.63 0.02 25.30 .000
i6 0.56 0.02 25.04 .000
i21 0.35 0.03 10.99 .000
i36 0.85 0.03 27.30 .000
i51 0.55 0.03 21.43 .000
i11 0.92 0.04 25.88 .000
i26 0.60 0.02 24.62 .000
i41 0.65 0.03 23.85 .000
i56 0.39 0.04 10.35 .000
Latent Variances
E 0.38 0.09 4.02 .000
Es -0.09 0.09 -0.96 .335
Ea 0.23 0.02 9.30 .000
Ee 0.13 0.02 6.25 .000
Acq 0.22 0.02 13.41 .000
Latent Covariances
E w/Acq 0.00+
Fit Indices
χ2 401.63(51) .000
CFI 0.95
SRMR 0.05
RMSEA 0.06
RMSEA (lower bound) 0.06
RMSEA (upper bound) 0.07
AIC 55844.03
BIC 55991.38
+Fixed parameter


14.2.2 Heywood cases

Negative variance estimates (and estimated correlations with an absolute value larger than 1) are also called “Heywood cases” (after the statistician H. B. Heywood). Such illogical estimates can be problematic, as they may indicate model misspecification. Other reasons for negative variance estimates include sampling fluctuations and true variances close to 0, outliers, missing data, nonconvergence of the numerical estimation procedure, and empirical underidentification (e.g. the parameters cannot be estimated adequately due to extremely low or high correlations).

For negative variances, Kolenikov & Bollen (2012) suggest performing a hypothesis test of whether the “true variance” is smaller than 0. If the best possible value of a variance with respect to the DGP is negative, that would indicate a structural misspecification of the model (i.e. the structural model does not correspond to the structure of the DGP). Confidence intervals can also be used for the same purpose. In the example above, the Wald test for the variance of Es is not significant. As such, there is no strong evidence for a misspecification, and we could conclude that the negative variance estimate might be due to estimation error.
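In lavaan, such checks can be based on the Wald \(z\) statistics and confidence intervals reported for each parameter, as sketched below (fit3 being the hierarchical model assumed fitted earlier).

```r
# Wald-type 95% confidence intervals for all free parameters;
# check whether the interval for a negative variance estimate includes 0
pe <- parameterEstimates(fit3, level = 0.95)
subset(pe, op == "~~" & lhs == rhs)  # variances only (lhs == rhs)
```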

14.2.3 Full hierarchical factor model for the BFI-2

The BFI-2 was designed with a hierarchical structure in mind for each of the “big five” factors. Each factor was assumed to be further broken down into three facets: Extraversion (E) into sociability (Es), assertiveness (Ea), and energy level (Ee); Agreeableness into compassion (Ac), respectfulness (Ar), and trust (At); Conscientiousness into organization (Co), productiveness (Cp), and responsibility (Cr); Negative Emotionality into anxiety (Na), depression (Nd), and emotional volatility (Ne); and Open-Mindedness into intellectual curiosity (Oi), aesthetic sensitivity (Oa), and creative imagination (Oc). Earlier, we estimated a hierarchical factor model for extraversion. Here, we estimate a hierarchical factor model for all “big five” factors, also including the additional Acquiescence factor.

Figure 14.5 shows the estimated model. Note that the higher-order latent factors were assumed to be independent (i.e. their covariances were fixed to 0). Independence of the factors implies that each represents a separable aspect of personality, in the sense that knowing someone’s relative extraversion would not help you infer their relative agreeableness. Each higher-order factor loads on three lower-order factors (the “facets”). These facets have additional residual variance, but any correlation between the facets is entirely accounted for by the higher-order factors. As in the earlier hierarchical model for extraversion, these higher-order factors are “just-identified”.

The BFI-2 consists of \(P=60\) items. Hence, there are 60 means, 60 variances, and \(\frac{60 \times 59}{2} = 1770\) covariances. A saturated model for this data would thus have a total of \(\text{npar}(S) = \frac{P \times (P-1)}{2} + 2 \times P = 1890\) parameters. The estimated model has a total of \(\text{npar}(M) = 196\) parameters. Whilst complex, the model is rather restricted as compared to a saturated model. Detailed results are provided in Table 14.6 at the end of this chapter.


Figure 14.5: Graphical representation of a hierarchical factor model for all personality domains and an additional Acquiescence factor.

Table 14.4 shows selected fit measures of this model, as well as of an alternative Dependent model which allows the “big five” factors to be correlated whilst all are independent from the Acquiescence factor. Fit measures for the Independent model are less than convincing, with a significant overall model fit test, a too low value for the CFI, and a too high one for the SRMR. The Dependent model fits better, and is acceptable according to the RMSEA and SRMR, but the CFI indicates relatively poor fit, and this model is also rejected by the overall model fit test. As the Independent model is nested in the Dependent model, the models can be compared with a likelihood ratio test. The result is significant, indicating that the Dependent model provides a better fit and hence the factors are unlikely to be independent.
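Such a comparison of nested models can be carried out in lavaan with a likelihood ratio (chi-square difference) test, as sketched below; fit_indep and fit_dep are assumed names for the two fitted models.

```r
# Likelihood ratio test comparing the nested Independent and Dependent models
lavTestLRT(fit_indep, fit_dep)  # equivalently: anova(fit_indep, fit_dep)
```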

Table 14.4: Fit measures for hierarchical factor models for the BFI-2 with an additional Acquiescence factor. The Independent model assumes all factors are independent. The Dependent model allows dependence between the “big five” factors, but assumes these are independent from the Acquiescence factor
\(\chi^2\) df \(p\) CFI RMSEA SRMR \(\Delta \chi^2\) df \(p\)
Dependent 10324 1684 0 0.79 0.05 0.08 NA - -
Independent 11043 1694 0 0.78 0.06 0.12 719 10 0

Whilst the Dependent model fits better, there are still signs of misfit. Table 14.5 shows the fixed-to-zero parameters with the ten highest modification indices. These indicate several routes for improvement in model fit. The highest modification index is for the residual covariance between the Negative Emotionality and Acquiescence factors. The expected parameter change indicates that this parameter would be estimated to be negative (i.e. there would be a negative correlation between these factors). This could indicate that those with higher Negative Emotionality are less likely to agree with statements in general. Whilst that could be plausible in one way or another, it is not something that was a priori expected. More importantly, the idea of the latent Acquiescence factor was to account for a tendency to agree or disagree with statements irrespective of their content, and by controlling for such a tendency to allow for better measurement of the remaining latent factors. Introducing a correlation between the Acquiescence factor and the factors of substantive interest would counteract this goal. When considering changing a model to improve empirical fit, such theoretical concerns are more important than high modification indices. The second-highest modification index is for the residual covariance between items 54 and 39, which appear to have a positive correlation which is not fully accounted for by the model. Items 54 and 39 both load directly onto the \(\texttt{Nd}\) facet.

Table 14.5: Modification indices and expected change in parameter values for fixed-to-zero parameters in the hierarchical factor model for the BFI-2.
Parameter Modification index Expected change
\(\texttt{Acq} \leftrightarrow \texttt{N}\) 280 -0.188
\(\texttt{i54} \leftrightarrow \texttt{i39}\) 179 0.336
\(\texttt{A} \leftrightarrow \texttt{Ea}\) 172 -0.080
\(\texttt{Nd} \leftrightarrow \texttt{Ee}\) 169 -0.064
\(\texttt{i53} \leftrightarrow \texttt{i38}\) 165 0.251
\(\texttt{N} \leftrightarrow \texttt{Ee}\) 148 -0.105
\(\texttt{i41} \rightarrow \texttt{Nd}\) 141 -0.471
\(\texttt{i37} \leftrightarrow \texttt{i47}\) 140 0.317
\(\texttt{i45} \rightarrow \texttt{Oi}\) 139 1.361
\(\texttt{i52} \leftrightarrow \texttt{i7}\) 135 0.129
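Modification indices such as those in Table 14.5 can be obtained in lavaan as sketched below, again assuming fit_dep holds the fitted Dependent model.

```r
# Ten largest modification indices with the expected parameter change (epc)
mi <- modindices(fit_dep)
head(mi[order(-mi$mi), c("lhs", "op", "rhs", "mi", "epc")], 10)
```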

14.2.4 Predicting factor scores

Once a SEM with latent variables is estimated, we may be interested in inferring the values of the latent variables underlying each observation. For example, given their answers to the items in the BFI-2, is participant A relatively extraverted? And how much more than participant B? In a confirmatory factor analysis model, the latent factors are assumed to cause the values of the observed variables. Factor loadings then represent the effect of the latent variable upon the observed variables. When predicting latent factor values, we need to go in the opposite direction, from observed to latent variables. Such reverse inference is not necessarily straightforward, and there are many possible methods which are all consistent with the underlying model, yet give different results. This is called “factor indeterminacy” (Maraun, 1996).

Factor score predictions are weighted sums of the values of the observed variables (indicators). The weights given to each variable are not identical to the factor loadings, but related to these. The factor prediction weights take other aspects into consideration as well, such as residual variances and covariances between factors. There are two widely-used methods to predict the values of the latent variables: the regression predictor and the Bartlett predictor (Devlieger & Rosseel, 2023). It is beyond the scope of this book to go into the details of these methods. Predicting factor scores is a controversial issue. Bartholomew, Deary, & Lawn (2009) argue that the regression method has a more relevant justification than the Bartlett method.
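In lavaan, factor scores can be predicted with lavPredict, as sketched below; the method argument selects the regression or the Bartlett predictor (fit3 again the assumed fitted model).

```r
# Predicted factor scores per respondent (one column per latent variable)
scores_reg  <- lavPredict(fit3, method = "regression")  # regression predictor
scores_bart <- lavPredict(fit3, method = "Bartlett")    # Bartlett predictor
head(scores_reg)
```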

14.3 Exploratory factor and principal components analysis

Although not really a part of Structural Equation Models, it is instructive to also briefly discuss data reduction techniques such as principal components analysis (PCA) and exploratory factor analysis (EFA). These methods aim to describe a \(P \times P\) covariance (or correlation) matrix via a relatively small set of latent variables. Essentially, both methods rely on saturated models which can replicate the sample covariance matrix perfectly. Neither method concerns the means of the observed variables \(Y_j\); both assume that the variables are centered beforehand such that all means are equal to 0.

The differences between principal component analysis and exploratory factor analysis are somewhat subtle. Principal components analysis (PCA) can be seen as applying a formative model, where observed variables “cause” the latent variables (called principal components in this context). Principal components (which we will denote as \(\text{PC}_j\)) are simply linear functions of the variables: \[\begin{aligned} \text{PC}_{1,i} &= w_{1,1} \times Y_{1,i} + w_{1,2} \times Y_{2,i} + \ldots + w_{1,P} \times Y_{P,i} \\ \text{PC}_{2,i} &= w_{2,1} \times Y_{1,i} + w_{2,2} \times Y_{2,i} + \ldots + w_{2,P} \times Y_{P,i} \\ & \vdots \\ \text{PC}_{P,i} &= w_{P,1} \times Y_{1,i} + w_{P,2} \times Y_{2,i} + \ldots + w_{P,P} \times Y_{P,i} \end{aligned}\] Note that the principal components \(\text{PC}_j\) have no error terms. They don’t need to be estimated, but rather are computed from the variables \(Y_j\). Also note that we use a total of \(P\) principal components to model a total of \(P\) variables. In a PCA, the weights are chosen such that the principal components are uncorrelated. In addition, they are chosen such that the variances of the principal components are ordered, such that the first principal component \(\text{PC}_1\) has the largest variance, then the second component \(\text{PC}_2\), etc. Whilst there are an infinite number of other saturated models which also match the observed covariances in the data perfectly, these restrictions provide a unique solution for computing the weights.
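In R, a principal components analysis can be computed with the base function prcomp, as sketched below for the assumed bfi2 data frame; scale. = TRUE standardizes the variables, so the analysis is effectively performed on the correlation matrix.

```r
# PCA on standardized (centered and scaled) variables
pca <- prcomp(bfi2, center = TRUE, scale. = TRUE)
summary(pca)               # variance accounted for by each component
head(pca$rotation[, 1:3])  # weights defining the first three components
pc_scores <- pca$x         # component values for each observation
```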

By contrast, exploratory factor analysis (EFA) concerns a reflective model, where the latent variables “cause” the observed variables. An EFA model for \(P\) variables can at most contain \(P-1\) factors. The EFA model with \(P-1\) factors can be written as: \[\begin{aligned} Y_{1,i} &= \lambda_{1,1} \times \eta_{1,i} + \lambda_{1,2} \times \eta_{2,i} + \ldots + \lambda_{1,P-1} \times \eta_{P-1,i} + \epsilon_{1,i} && \epsilon_{1,i} \sim \mathbf{Normal}(0, \sigma_{\epsilon_1}) \\ Y_{2,i} &= \lambda_{2,1} \times \eta_{1,i} + \lambda_{2,2} \times \eta_{2,i} + \ldots + \lambda_{2,P-1} \times \eta_{P-1,i} + \epsilon_{2,i} && \epsilon_{2,i} \sim \mathbf{Normal}(0, \sigma_{\epsilon_2}) \\ \vdots \\ Y_{P,i} &= \lambda_{P,1} \times \eta_{1,i} + \lambda_{P,2} \times \eta_{2,i} + \ldots + \lambda_{P,P-1} \times \eta_{P-1,i} + \epsilon_{P,i} && \epsilon_{P,i} \sim \mathbf{Normal}(0, \sigma_{\epsilon_P}) \\ \eta_j &\sim \mathbf{Normal}(0,1) \end{aligned}\] Whilst scaling the latent factors to be standard Normal variables, this model still has more parameters than unique values in the variance-covariance matrix. Compared to the PCA model, the EFA model contains additional residual variances \(\sigma^2_{\epsilon_j}\). To allow some form of identification, EFA therefore first removes the (estimated) residual variances from the variance-covariance matrix. After this, the reduced variance-covariance matrix only contains variation in the observed variables which is shared with the latent factors, and there is no need to estimate the residual variances. After this step, a saturated model is possible, although factor loadings cannot be uniquely estimated without further constraints.
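A maximum likelihood EFA can be run with the base R function factanal, as sketched below; the number of factors is an assumed choice and has to be specified in advance (again using the assumed bfi2 data frame).

```r
# Maximum likelihood exploratory factor analysis with 5 factors
efa <- factanal(bfi2, factors = 5, rotation = "none")
print(efa$loadings, cutoff = 0.3)  # suppress small loadings for readability
efa$uniquenesses                   # estimated residual (unique) variances
```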

Both PCA and EFA use saturated models. The key difference is that PCA targets the full variance-covariance matrix of the observed variables, whilst EFA targets a reduced variance-covariance matrix with the unique residual variance of each variable removed. PCA imposes further restrictions (independent components and maximising the variance of successive components) in order to arrive at a unique solution. These restrictions can also be applied in EFA, but are often relaxed to provide a potentially more meaningful solution.

14.3.1 Determining the number of factors

A main aim in both PCA and EFA is to determine the number of factors that could sufficiently explain the inter-relations between the variables in the data. For both PCA and EFA, a model with the same number of latent factors as observed variables will always fit the data perfectly (as each component or factor is allowed to be connected to all variables, unlike in CFA, where particular loadings tend to be fixed to zero).

A key consideration is how much variability in the dataset each component or factor accounts for (or “explains”). Underlying both PCA and EFA is a mathematical technique called “eigendecomposition”, which represents symmetric matrices such as covariance or correlation matrices in a unified manner by decomposing them into “eigenvalues” and “eigenvectors”. The specifics of this go beyond the scope of this book. Roughly, the elements in eigenvectors are related to component loadings: the values in the eigenvector multiplied by the square root of the eigenvalue give the component loadings, which can be interpreted as the correlation of each item with the principal component. The eigenvalues represent the variance accounted for by each component on a standardized scale where each variable has a variance of 1. You can think of this as applying the analysis to the variables after \(Z\)-transforming each variable so they have a mean of 0 and a variance (and standard deviation) of 1. The sum of the eigenvalues is equal to the number of variables, so each variable adds a value of 1 to the total amount of variability to be accounted for. An eigenvalue larger than 1 then tells you that a component accounts for more variance than a single variable. In other words, it accounts for shared variability in at least two variables. As we are looking for latent variables that account for relations between measured variables, it is reasonable to search for components that can account for more than the variance of a single variable. So an eigenvalue of at least 1 is a reasonable lower bound.

One method for determining the number of components or factors is then to determine how many have eigenvalues larger than 1. Ideally, one would consider the sampling error in this evaluation. This is not straightforward, as the sampling distribution of eigenvalues is complex. The current “gold standard” is to conduct a parallel analysis, which is effectively a bootstrap-type method to compare estimated eigenvalues to those of randomly simulated data. Figure 14.6 shows the results of a parallel analysis for both PCA and EFA. Note that as EFA targets the shared, and not the unique, variance of the variables in the dataset, the eigenvalues are expected to be less than 1. Hence, the comparison line is lower for EFA compared to PCA. For both, the main idea is to consider components or factors which account for more variation than those based on randomly generated data. Using this criterion, PCA would select 9 components, and EFA 12 factors.
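A parallel analysis like the one in Figure 14.6 can be run with the psych package, as sketched below (assuming the bfi2 data frame).

```r
library(psych)

# Parallel analysis for both principal components ("pc") and factors ("fa");
# observed eigenvalues are compared to those from randomly simulated data
fa.parallel(bfi2, fa = "both")
```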


Figure 14.6: Results of a parallel PCA and EFA to determine the number of components/factors for the BFI-2 data.

## Parallel analysis suggests that the number of factors =  12  and the number of components =  9

14.3.2 Factor rotation

The initial solution for an EFA is obtained by eigendecomposition. This constrains the solution to have independent factors which are ordered in the amount of variance they account for. Whilst these restrictions provide a unique solution, there are an infinite number of equivalent models which fit the data equally well. In EFA, it is common practice to search for equivalent solutions which might be more easily interpretable. The main aim is to search for a simple structure, such that each item loads highly on only a single factor.

There are two main types of factor rotation: in orthogonal rotation, the factors are constrained to be independent of each other (i.e. there is no residual covariation between the factors). In oblique rotation, factors are allowed to have residual covariation.
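The sketch below shows one way to obtain rotated solutions with the psych package: varimax as an orthogonal rotation and oblimin as an oblique rotation (oblique rotations require the GPArotation package; the number of factors is again an assumed choice).

```r
library(psych)

# Orthogonal rotation: factors remain uncorrelated
efa_varimax <- fa(bfi2, nfactors = 5, rotate = "varimax", fm = "ml")

# Oblique rotation: factors are allowed to correlate
efa_oblimin <- fa(bfi2, nfactors = 5, rotate = "oblimin", fm = "ml")
efa_oblimin$Phi  # estimated factor correlations under the oblique rotation
```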

14.4 In practice

  1. Structural equation modelling is a large-sample technique. To give meaningful results, the means and (co-)variances in the data should be close to the true values in the DGP. So, the more observations, the better. But how many observations is enough? There is no definite answer to that question, but you could consider the \(n/\text{npar}(M)\) heuristic (Kline, 2015), which refers to the ratio of observations to model parameters. The recommendation is to have at least 20 observations for each parameter in the model. For example, if the model has \(\text{npar}(M) = 30\) parameters, you would want to have at least \(n=600\) observations. This is not a hard limit, as the required sample size depends on various aspects of the data. If the data is truly continuous and multivariate-Normal distributed, with linear relations between the variables, fewer observations may be needed. But when the data does not respect the assumptions of SEM, substantially more data may be needed to allow inferences to be meaningful. The \(n/\text{npar}(M) = 20\) heuristic aims to provide a reasonable guideline for most cases.
  2. SEM is commonly applied to data which is not truly continuous. The BFI-2 data considered here are responses to statements on a Likert scale with 5 options. Such data can be considered ordinal, but is not metric or continuous. Whilst the assumption of multivariate-Normal distributed variables is therefore violated by definition, SEM may still be a reasonable technique to model the inter-relations between the variables. When there are at least 5 response options scored in a numerical way (e.g. as \(1, 2, \ldots, 5\)), covariances assuming continuity may not differ too much from those computed when respecting the ordinal nature of the data. A more principled way of dealing with ordinal data is to use polychoric correlations, which assume the ordinal responses are generated by an underlying continuous variable (based on the assumption that each ordinal response corresponds to a region or “bin” of the continuous variable) and aim to capture the correlation between these underlying continuous variables from the ordinal responses. Most SEM software allows you to estimate models based on such correlation matrices (see the sketch after this list). This could provide a reasonable alternative to the assumption of multivariate-Normal distributed variables. That said, the results will often not differ too much when the ordinal responses have at least 5 values and the sample is sufficiently large.
  3. The flexibility of SEM, especially when introducing latent variables, opens up a world of theorizing. Always remember though that a SEM is just a model for the means and (co-)variances of variables. Although SEMs may contain causal links between variables, correlation is not causation. Keep in mind that there will be other, equivalent models, with other causal paths, which imply exactly the same means and covariances.
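As referenced in point 2 above, the sketch below shows how indicators could be declared ordinal in lavaan (supported in recent versions via ordered = TRUE), in which case estimation is based on polychoric correlations with a robust weighted least squares estimator chosen automatically; one_factor and bfi2 are the assumed model string and data frame from earlier.

```r
# Treat all indicators as ordinal: estimation then uses polychoric
# correlations (with a robust WLS estimator by default)
fit_ord <- cfa(one_factor, data = bfi2, ordered = TRUE)
fitMeasures(fit_ord, c("chisq.scaled", "df", "cfi.scaled", "rmsea.scaled"))
```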
Table 14.6: Results of a hierarchical factor model for the BFI-2 with an additional Acquiescence factor.
Model
Estimate Std. Err. z p
Factor Loadings
E
Es 1.00+
Ea 0.78 0.06 13.56 .000
Ee 0.41 0.04 9.42 .000
Es
i1 1.00+
i16 1.66 0.06 27.52 .000
i31 1.41 0.06 25.38 .000
i46 1.17 0.05 24.41 .000
Ea
i6 1.00+
i21 1.43 0.05 26.94 .000
i36 0.61 0.04 14.94 .000
i51 1.22 0.05 25.69 .000
Ee
i11 1.00+
i26 0.79 0.09 8.67 .000
i41 1.74 0.16 10.90 .000
i56 1.48 0.14 10.85 .000
A
Ac 1.00+
Ar 0.94 0.09 10.51 .000
At 1.46 0.13 11.65 .000
Ac
i2 1.00+
i17 1.34 0.07 18.13 .000
i32 0.78 0.05 15.31 .000
i47 1.48 0.09 16.00 .000
Ar
i7 1.00+
i22 1.66 0.12 14.12 .000
i37 2.14 0.15 14.41 .000
i52 0.92 0.07 12.67 .000
At
i12 1.00+
i27 0.59 0.06 10.27 .000
i42 0.65 0.06 11.39 .000
i57 0.89 0.07 13.47 .000
C
Co 1.00+
Cp 1.13 0.09 12.91 .000
Cr 1.05 0.08 13.02 .000
Co
i3 1.00+
i18 1.77 0.09 19.77 .000
i33 1.80 0.09 19.79 .000
i48 1.53 0.08 18.33 .000
Cp
i8 1.00+
i23 1.08 0.06 16.75 .000
i38 1.07 0.06 18.89 .000
i53 1.17 0.06 19.06 .000
Cr
i13 1.00+
i28 0.99 0.07 14.77 .000
i43 0.89 0.06 15.67 .000
i58 1.26 0.08 15.88 .000
N
Na 1.00+
Nd 0.77 0.04 20.71 .000
Ne 0.98 0.04 23.30 .000
Na
i4 1.00+
i19 0.84 0.03 24.76 .000
i34 0.79 0.03 23.67 .000
i49 0.81 0.04 23.22 .000
Nd
i9 1.00+
i24 0.99 0.05 20.26 .000
i39 1.46 0.06 25.55 .000
i54 1.50 0.06 25.58 .000
Ne
i14 1.00+
i29 1.09 0.04 30.65 .000
i44 0.79 0.03 24.28 .000
i59 0.97 0.03 28.23 .000
O
Oi 1.00+
Oa 2.07 0.31 6.71 .000
Oc 1.30 0.19 6.66 .000
Oi
i10 1.00+
i25 2.79 0.30 9.23 .000
i40 1.38 0.16 8.45 .000
i55 2.88 0.31 9.32 .000
Oa
i5 1.00+
i20 1.01 0.03 29.90 .000
i35 0.78 0.03 28.19 .000
i50 0.70 0.03 21.64 .000
Oc
i15 1.00+
i30 1.13 0.05 20.84 .000
i45 0.66 0.04 14.67 .000
i60 1.11 0.05 21.84 .000
Acq
i1 1.00+
i16 1.00+
i31 1.00+
i46 1.00+
i6 1.00+
i21 1.00+
i36 1.00+
i51 1.00+
i11 1.00+
i26 1.00+
i41 1.00+
i56 1.00+
i2 1.00+
i17 1.00+
i32 1.00+
i47 1.00+
i7 1.00+
i22 1.00+
i37 1.00+
i52 1.00+
i12 1.00+
i27 1.00+
i42 1.00+
i57 1.00+
i3 1.00+
i18 1.00+
i33 1.00+
i48 1.00+
i8 1.00+
i23 1.00+
i38 1.00+
i53 1.00+
i13 1.00+
i28 1.00+
i43 1.00+
i58 1.00+
i4 1.00+
i19 1.00+
i34 1.00+
i49 1.00+
i9 1.00+
i24 1.00+
i39 1.00+
i54 1.00+
i14 1.00+
i29 1.00+
i44 1.00+
i59 1.00+
i10 1.00+
i25 1.00+
i40 1.00+
i55 1.00+
i5 1.00+
i20 1.00+
i35 1.00+
i50 1.00+
i15 1.00+
i30 1.00+
i45 1.00+
i60 1.00+
Residual Variances
i1 0.57 0.02 25.72 .000
i16 0.48 0.03 16.83 .000
i31 0.73 0.03 23.43 .000
i46 0.62 0.02 24.72 .000
i6 0.56 0.02 24.50 .000
i21 0.40 0.03 15.22 .000
i36 0.88 0.03 28.10 .000
i51 0.54 0.03 21.51 .000
i11 0.96 0.04 27.41 .000
i26 0.63 0.02 27.28 .000
i41 0.55 0.03 18.66 .000
i56 0.49 0.02 20.66 .000
i2 0.37 0.02 22.54 .000
i17 0.43 0.02 18.78 .000
i32 0.40 0.02 25.58 .000
i47 1.17 0.05 24.59 .000
i7 0.44 0.02 25.99 .000
i22 0.61 0.03 22.58 .000
i37 0.72 0.04 19.51 .000
i52 0.37 0.01 25.85 .000
i12 0.81 0.04 21.00 .000
i27 0.94 0.03 27.05 .000
i42 0.84 0.03 26.14 .000
i57 0.71 0.03 21.84 .000
i3 0.96 0.03 28.11 .000
i18 0.34 0.02 17.55 .000
i33 0.34 0.02 17.13 .000
i48 0.73 0.03 25.56 .000
i8 0.84 0.03 26.12 .000
i23 0.86 0.03 25.66 .000
i38 0.41 0.02 21.49 .000
i53 0.44 0.02 20.59 .000
i13 0.59 0.02 25.17 .000
i28 0.69 0.03 25.86 .000
i43 0.44 0.02 24.73 .000
i58 0.79 0.03 24.25 .000
i4 0.64 0.03 23.13 .000
i19 0.69 0.03 25.46 .000
i34 0.72 0.03 26.05 .000
i49 0.81 0.03 26.28 .000
i9 0.88 0.03 27.06 .000
i24 0.92 0.03 27.21 .000
i39 0.48 0.02 19.83 .000
i54 0.50 0.03 19.62 .000
i14 0.83 0.03 24.93 .000
i29 0.46 0.02 19.43 .000
i44 0.76 0.03 26.38 .000
i59 0.62 0.03 23.67 .000
i10 0.74 0.03 27.96 .000
i25 0.89 0.04 21.34 .000
i40 0.62 0.02 26.63 .000
i55 0.52 0.03 15.31 .000
i5 0.68 0.03 21.24 .000
i20 0.47 0.03 17.37 .000
i35 0.46 0.02 22.18 .000
i50 0.95 0.04 26.71 .000
i15 0.46 0.02 21.54 .000
i30 0.62 0.03 22.05 .000
i45 0.74 0.03 27.30 .000
i60 0.36 0.02 17.02 .000
Latent Variances
E 0.39 0.04 10.84 .000
Es 0.06 0.02 2.62 .009
Ea 0.24 0.02 11.06 .000
Ee 0.07 0.01 5.61 .000
A 0.12 0.01 7.85 .000
Ac 0.12 0.01 8.75 .000
Ar 0.03 0.01 3.42 .001
At 0.17 0.03 6.13 .000
C 0.19 0.02 8.51 .000
Co 0.11 0.01 8.16 .000
Cp 0.12 0.02 7.77 .000
Cr 0.04 0.01 3.92 .000
N 0.69 0.05 15.19 .000
Na 0.06 0.02 3.06 .002
Nd 0.11 0.01 8.49 .000
Ne 0.20 0.02 9.72 .000
O 0.06 0.01 4.33 .000
Oi 0.01 0.01 1.05 .293
Oa 0.68 0.05 13.79 .000
Oc 0.29 0.02 11.88 .000
Acq 0.07 0.00 16.26 .000
Latent Covariances
E w/A 0.00+
E w/C 0.00+
E w/N 0.00+
E w/O 0.00+
E w/Acq 0.00+
A w/C 0.00+
A w/N 0.00+
A w/O 0.00+
A w/Acq 0.00+
C w/N 0.00+
C w/O 0.00+
C w/Acq 0.00+
N w/O 0.00+
N w/Acq 0.00+
O w/Acq 0.00+
Fit Indices
χ2 11042.55(1694) .000
CFI 0.77
SRMR 0.12
RMSEA 0.06
RMSEA (lower bound) 0.06
RMSEA (upper bound) 0.06
AIC 277382.98
BIC 278125.22
+Fixed parameter


References

Bartholomew, D. J., Deary, I. J., & Lawn, M. (2009). The origin of factor scores: Spearman, Thomson and Bartlett. British Journal of Mathematical and Statistical Psychology, 62, 569–582.
Devlieger, I., & Rosseel, Y. (2023). Using factor scores in structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 316–328). Guilford Press.
Hřebíčková, M., Jelínek, M., Květon, P., Benkovič, A., Botek, M., Sudzina, F., … John, O. P. (2020). Big Five Inventory 2 (BFI-2): Hierarchický model s 15 subškálami [Hierarchical model with 15 subscales]. Ceskoslovenska Psychologie, 64.
John, O. P., Donahue, E. M., & Kentle, R. L. (1991). Big five inventory. Journal of Personality and Social Psychology.
Kline, R. B. (2015). Principles and practice of structural equation modeling (4th edition). Guilford Press.
Kolenikov, S., & Bollen, K. A. (2012). Testing negative error variances: Is a Heywood case a symptom of misspecification? Sociological Methods & Research, 41, 124–167.
Maraun, M. D. (1996). Metaphor taken as math: Indeterminacy in the factor analysis model. Multivariate Behavioral Research, 31, 517–538.
Soto, C. J., & John, O. P. (2017). The next big five inventory (BFI-2): Developing and assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity, and predictive power. Journal of Personality and Social Psychology, 113, 117.
Spearman, C. (1904). “General intelligence,” objectively determined and measured. The American Journal of Psychology, 15, 201–292.