Chapter 5 Moderation and mediation
5.1 Moderation in linear models
Including an interaction in a linear model in R is straightforward. If you have two predictors, x1 and x2, and want to include both the “simple slopes” as well as the slope for the “product predictor” (i.e. x1 \(\times\) x2), then the model with y as dependent variable can be specified in formula form as
y ~ x1 * x2
which evaluates to
y ~ 1 + x1 + x2 + x1:x2
As discussed previously, 1 represents the intercept, which is automatically included in a model specification, unless you remove it explicitly by adding -1 to the formula. x1 represents the “simple effect” of x1, x2 the corresponding “simple effect” of x2, whilst x1:x2 represents the product-predictor for the interaction between x1 and x2 (perhaps a little confusing for those of you who know : as a division operator). Because you would generally want to include the simple effects for the predictors as well as the interaction, the authors of R have chosen to save you typing in the full model by expanding x1 * x2 in this way. This can be used to specify more complicated models, with three-way interactions. For instance,
y ~ x1 * x2 * x3
evaluates to
y ~ 1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3
which is a model with all “simple effects”, all pairwise product-predictors, as well as a three-way product predictor which is x1 \(\times\) x2 \(\times\) x3. We won’t discuss such higher-order interactions until later, but it is good to be aware of this in advance.
As a linear model with a product-predictors is just another linear model, that is really all there is to say about specifying linear models with interactions/moderation in R. Through the formula interface, R will create the necessary additional product predictor(s), and then estimate the parameters of the resulting linear model.
Let’s have a look at how this works with the speeddate data which was also analysed in the SDAM book. The data is included in the sdamr package, and can be loaded and (partially) inspected as usual:
## iid pid gender age date_like other_like date_want other_want match self_attr
## 1 132 137 female 27 8 7 0 1 0 8
## 2 132 138 female 27 8 8 0 1 0 8
## 3 132 139 female 27 5 8 0 1 0 8
## 4 132 140 female 27 9 7 1 1 1 8
## 5 132 141 female 27 7 7 0 1 0 8
## 6 133 137 female 24 7 7 0 0 0 6
## self_sinc self_intel self_fun self_amb other_attr other_sinc other_intel
## 1 10 9 10 10 8 8 9
## 2 10 9 10 10 8 7 4
## 3 10 9 10 10 8 NA 8
## 4 10 9 10 10 7 7 7
## 5 10 9 10 10 8 7 7
## 6 8 8 7 7 6 10 10
## other_fun other_amb other_shar date_attr date_sinc date_intel date_fun
## 1 8 9 7 7 9 7 9
## 2 6 6 4 8 8 8 8
## 3 8 7 NA 5 7 9 5
## 4 7 7 7 8 9 9 9
## 5 7 7 8 5 8 8 8
## 6 6 7 5 7 7 7 8
## date_amb date_shar self_imp_attr self_imp_sinc self_imp_intel self_imp_fun
## 1 6 9 16.67 16.67 16.67 16.67
## 2 8 8 16.67 16.67 16.67 16.67
## 3 9 5 16.67 16.67 16.67 16.67
## 4 9 8 16.67 16.67 16.67 16.67
## 5 7 7 16.67 16.67 16.67 16.67
## 6 6 8 12.77 19.15 17.02 17.02
## self_imp_amb self_imp_shar other_imp_attr other_imp_sinc other_imp_intel
## 1 16.67 16.67 17.39 17.39 15.22
## 2 16.67 16.67 20.00 20.00 20.00
## 3 16.67 16.67 18.75 16.67 18.75
## 4 16.67 16.67 18.60 16.28 18.60
## 5 16.67 16.67 20.83 20.83 16.67
## 6 14.89 19.15 17.39 17.39 15.22
## other_imp_fun other_imp_amb other_imp_shar
## 1 17.39 13.04 19.57
## 2 20.00 6.67 13.33
## 3 20.83 12.50 12.50
## 4 18.60 11.63 16.28
## 5 16.67 6.25 18.75
## 6 17.39 13.04 19.57
There are rather a large number of variables in the dataset. You can obtain more information about each variable in the documentation of the dataset, by calling ?speeddate. In my humble opinion, the (generally quite) good documentation of R packages is a real benefit of R over some other systems, and I strongly recommend you to check out and read the documentation of functions and datasets before you use them. Functions in R are generally quite flexible and it is infeasible to discuss all the nuances and possibilities in introductory notes like these. In the book, we mainly focused on the variables starting with other_, which are the perceptions of the participant by their dating partner. For example, the model
\[\begin{align}
\texttt{like}_i =& \beta_0 + \beta_{\texttt{attr}} \times \texttt{attr}_i + \beta_{\texttt{intel}} \times \texttt{intel}_i + \beta_{\texttt{fun}} \times \texttt{fun}_i \\
&+ \beta_{\texttt{attr} \times \texttt{intel}} \times (\texttt{attr} \times \texttt{intel})_i + \beta_{\texttt{fun} \times \texttt{intel}} \times (\texttt{fun} \times \texttt{intel})_i + \epsilon_i
\end{align}\]
referred to in the SDAM book can be estimated by calling:
Hypothesis tests with the \(t\) statistic are obtained as usual though
##
## Call:
## lm(formula = other_like ~ other_attr * other_intel + other_fun *
## other_intel, data = speeddate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2380 -0.6632 0.0239 0.6583 4.6484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.91118 0.44269 -2.058 0.039740 *
## other_attr 0.67123 0.09027 7.436 1.76e-13 ***
## other_intel 0.32393 0.05994 5.405 7.57e-08 ***
## other_fun 0.14331 0.08734 1.641 0.101045
## other_attr:other_intel -0.04333 0.01171 -3.700 0.000224 ***
## other_intel:other_fun 0.03186 0.01139 2.798 0.005209 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.133 on 1472 degrees of freedom
## (84 observations deleted due to missingness)
## Multiple R-squared: 0.6227, Adjusted R-squared: 0.6214
## F-statistic: 485.8 on 5 and 1472 DF, p-value: < 2.2e-16
The equivalent tests with the \(F\) statistic are easily obtained through
## Anova Table (Type III tests)
##
## Response: other_like
## Sum Sq Df F value Pr(>F)
## (Intercept) 5.43 1 4.2365 0.0397398 *
## other_attr 70.93 1 55.2931 1.755e-13 ***
## other_intel 37.47 1 29.2094 7.568e-08 ***
## other_fun 3.45 1 2.6923 0.1010448
## other_attr:other_intel 17.56 1 13.6898 0.0002236 ***
## other_intel:other_fun 10.04 1 7.8287 0.0052093 **
## Residuals 1888.24 1472
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
5.1.1 A reminder about namespaces
In the code above, I’m using car:: to refer to a function in the car package. Technically, a statement like package_name:: denotes the namespace of the package with the name package_name (i.e., car in this case). This allows you to use functions from a package without loading the package completely (without loading all the functions of the package in memory). This can be (and often is!) better than first loading the package through e.g. library(car) and then calling Anova. The issue is that different packages can use the same name for a function, and when you call a function, it will be the one of the package that was last loaded. When packages use the same name for functions, the function with the same name from a package that was loaded earlier will be “masked” and R will print this as a warning in the R console. For example, if you load the dplyr package (a rather useful package for data manipulation, that is a little too much to discuss here in detail), you will see the following warnings:
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
The line The following objects are masked from 'package:stats' indicates that the functions filter and lag are “masked from the stats package. Whenever you call these functions, they will be the corresponding functions from the dplyr package, and not those from the stats package. This does not break the functionality of the packages themselves, as a properly written package that needs the filter function from the stats package will still use the function from that namespace (package), and not the one from the dplyr one. But when you call the a function, R will try find the definition of that function in the global namespace, and the global namespace contains the version of the last provided definition of any R object. Just like you can overwrite an R object by giving it the same name as an already existing r object (as I often do deliberately through e.g. mod <- lm()), a function in R is just another object. So if I specify a function like
Then next time I call that function, I will get as a result
## [1] "Where's your head? It's almost Halloween!"
and not the result from
5.2 Centering
As discussed in the SDAM book, sometimes you might want to center variables by subtracting their (sample) mean from each value. This can be done in a number of ways. You can either create new variables in a dataframe by subtracting the single value obtained through mean() from a vector, as in
and then using the new variable other_like_c in your lm model. Alternatively, you can do by calling the scale function. By default, the scale function creates \(Z\) transformed variables by subtracting the mean and then dividing this mean-deviation by the standard deviation:
\[\text{scale}(Y_i) = \frac{Y_i - \overline{Y}}{S_Y}\]
The scale function has three arguments:
x: the variable (or matrix of variables) which you want to scalecenter: a logical value indicating whether you want to subtract the meanscale: a logical value indicating whether you want to divide by the standard deviation
Centering (subtracting the mean, but not dividing by the standard deviation) is thus obtained by calling scale(x, scale=FALSE). Personally, I find calling a function scale with argument scale = FALSE a little confusing. The sdamr package therefore provides the function center which is basically just a version of scale which by default sets the argument scale = FALSE:
Because R is a functional programming language, you can call the scale or center function directly within the call to the lm function. This saves you having to create centered variables in a dataframe first. For instance, if you have loaded the sdamr package (or if you ran the code above defining the center function), you can obtain the results of the model with centered predictors by calling
modg_c <- lm(other_like ~ center(other_attr)*center(other_intel) + center(other_fun)*center(other_intel), data=speeddate)
summary(modg_c)##
## Call:
## lm(formula = other_like ~ center(other_attr) * center(other_intel) +
## center(other_fun) * center(other_intel), data = speeddate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2380 -0.6632 0.0239 0.6583 4.6484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.19613 0.03151 196.609 < 2e-16 ***
## center(other_attr) 0.34539 0.01922 17.967 < 2e-16 ***
## center(other_intel) 0.25791 0.02351 10.970 < 2e-16 ***
## center(other_fun) 0.38292 0.02094 18.287 < 2e-16 ***
## center(other_attr):center(other_intel) -0.04333 0.01171 -3.700 0.000224 ***
## center(other_intel):center(other_fun) 0.03186 0.01139 2.798 0.005209 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.133 on 1472 degrees of freedom
## (84 observations deleted due to missingness)
## Multiple R-squared: 0.6227, Adjusted R-squared: 0.6214
## F-statistic: 485.8 on 5 and 1472 DF, p-value: < 2.2e-16
5.3 Mediation analysis
In the SDAM book, we looked at mediation with a different dataset called legacy2015. This is also part of the sdamr package, and we can make it available and look at the initial cases as usual:
## [1] "Where's your head? It's almost Halloween!"
Oops! We had overwritten (“masked”) the head function before. We can get rid of our new definition of the head function by removing it from the global namespace with the rm() function (the name refers to remove):
After this, we can try again:
## id sex age legacy belief intention education income donation
## 2 2 male 23 4.38 5.4 3.57 5 5 0
## 3 3 female 59 4.75 5.2 2.43 5 6 0
## 4 4 male 30 5.00 5.2 2.43 5 3 1
## 5 5 female 25 2.88 5.0 2.00 4 3 0
## 6 6 female 32 4.88 4.4 2.57 2 1 0
## 7 7 male 38 5.12 4.6 3.71 3 2 7
5.3.1 Causal steps
The causal steps approach to assessing mediation is done through testing significance of effects in three regression models. This can be done straightforwardly with the lm() function which we have used quite a bit already. For instance, to assess whether the relation between legacy and donation is mediated by intention, we would estimate and test the parameters of the following models:
5.3.2 Investigating the mediated (indirect) effect with a bootstrap test
There are quite a few packages in R which will allow you to test the mediated effect of a predictor on the dependent variable “via” the mediator. I have chosen here for the mediate function from the mediation package (Tingley et al. 2019), as it is a versatile option, and doesn’t require too much additional explanation. Another very good option to conduct mediation analysis is to specify the mediation model through a generalization of linear models generally called Structural Equation Models. These are multivariate models and something we will cover later on. For present purposes, this can be seen as a way to link different regression models (i.e. the dependent variable of one regression model becomes a predictor in another regression model) into what are conventionally called path models. The current “go-to” and most comprehensive SEM package in R is called lavaan and we will consider that package in a later chapter.
Let’s focus on the mediation package for now. If you haven’t done so already, you will need to install it with
If you check the documentation of mediate in this package (i.e type ?mediation::mediate), you will see there are lots of arguments to specify. We will only focus one the ones important for present purposes:
model.m: the name of the R object which contains the linear model predicting the mediator from the predictor, e.g.mod2abovemodel.y: the name of the R object which contains the linear model predicting the dependent variable from both the mediator and predictor, e.g.mod3abovesims: the number of simulations to use for the bootstrap testboot: (logical) whether to use a bootstrap test. You should set this toTRUEboot.ci.type: the type of bootstrap confidence interval to be computed. This can either beperc, which stands for percentile, and is a simple way where the empirical 2.5th and 97.5th percentiles are calculated from the ordered outcomes. This is the option chosen by default. The other option isbcawhich stands for bias-corrected and accelerated. This includes a correction of the percentile method to try and reduce bias. It is generally recommended to use this option in mediation analysis. Hence, you should setboot.ci.type = "bca"treat: the name of the predictor in the linear models specified undermodel.mandmodel.y. This is would be e.g.legacyin the models abovemediator: the name of the mediator in the linear models specified undermodel.mandmodel.y. This is would be e.g.intentionin the models above
To run a bootstrap mediation test with 2000 simulations, we can run the following command:
set.seed(20201027)
med <- mediation::mediate(model.m = mod2, model.y = mod3, sims = 2000, boot = TRUE, boot.ci.type = "bca", treat = "legacy", mediator = "intention")## Running nonparametric bootstrap
##
## Causal Mediation Analysis
##
## Nonparametric Bootstrap Confidence Intervals with the BCa Method
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME 0.245 0.104 0.45 <2e-16 ***
## ADE 0.488 0.186 0.82 <2e-16 ***
## Total Effect 0.733 0.425 1.07 <2e-16 ***
## Prop. Mediated 0.334 0.155 0.67 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Sample Size Used: 237
##
##
## Simulations: 2000
Note that rather than loading the mediation package with e.g. library(mediation) I’m calling the mediate function through mediation::mediate. Loading the mediation package results in loading quite a few other R packages, which are not all necessary to perform a mediation analysis with linear models. Calling the mediate function directly through the appropriate namespace avoids loading these other add-on packages (which then will mask some functions that I don’t want masked).
The output from calling the summary function on the results of the bootstrap procedure (the R object I named med) has four rows:
ACME: this is the average causal mediation effect, i.e. the average of \(\hat{a} \times \hat{b}\) in the simulations. The value underestimateis the average, and you will also see the lower and upper bound of the 95% confidence interval under95% CI Lowerand95% CI Upperrespectively. Finally, thep-valueis the probability of the foundACMEor more extreme given that in reality, theACMEequals 0. I.e., this is the p-value of the hypothesis test that the true mediated effect equals 0.ADE: this is the average direct effect, and reflects the effect of the predictor which is not mediated. It is the average of \(\hat{c}'\) in the simulations.
Total Effectis the total effect of the predictor on the dependent variable, which is the sum of theACMEandADE.Prop. Mediatedis the proportion of the effect of the predictor on the dependent variable which is mediated. This isACMEdivided byTotal Effect.
The results under ACME show that the bootstrap confidence interval of the mediated effect does not include 0. The p-value for this effect is also smaller than \(\alpha = .05\). As such, the null hypothesis that \(a \times b = 0\) is rejected, and we have found evidence that the effect of legacy on donation is mediated by intention. Because the confidence interval of the ADE also does not include 0, this analysis indicates that the mediation is partial. There is also a significant direct effect of legacy on donation. About 33.4% of the effect of legacy on donation is mediated by intention, so the residual direct effect is quite substantial.