Chapter 5 Moderation and mediation
5.1 Moderation in linear models
Including an interaction in a linear model in R is straightforward. If you have two predictors, x1
and x2
, and want to include both the “simple slopes” as well as the slope for the “product predictor” (i.e. x1
\(\times\) x2
), then the model with y
as dependent variable can be specified in formula form as
y ~ x1 * x2
which evaluates to
y ~ 1 + x1 + x2 + x1:x2
As discussed previously, 1
represents the intercept, which is automatically included in a model specification, unless you remove it explicitly by adding -1
to the formula. x1
represents the “simple effect” of x1
, x2
the corresponding “simple effect” of x2
, whilst x1:x2
represents the product-predictor for the interaction between x1
and x2
(perhaps a little confusing for those of you who know :
as a division operator). Because you would generally want to include the simple effects for the predictors as well as the interaction, the authors of R have chosen to save you typing in the full model by expanding x1 * x2
in this way. This can be used to specify more complicated models, with three-way interactions. For instance,
y ~ x1 * x2 * x3
evaluates to
y ~ 1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3
which is a model with all “simple effects”, all pairwise product-predictors, as well as a three-way product predictor which is x1
\(\times\) x2
\(\times\) x3
. We won’t discuss such higher-order interactions until later, but it is good to be aware of this in advance.
As a linear model with a product-predictors is just another linear model, that is really all there is to say about specifying linear models with interactions/moderation in R. Through the formula interface, R will create the necessary additional product predictor(s), and then estimate the parameters of the resulting linear model.
Let’s have a look at how this works with the speeddate
data which was also analysed in the SDAM book. The data is included in the sdamr
package, and can be loaded and (partially) inspected as usual:
## iid pid gender age date_like other_like date_want other_want match self_attr
## 1 132 137 female 27 8 7 0 1 0 8
## 2 132 138 female 27 8 8 0 1 0 8
## 3 132 139 female 27 5 8 0 1 0 8
## 4 132 140 female 27 9 7 1 1 1 8
## 5 132 141 female 27 7 7 0 1 0 8
## 6 133 137 female 24 7 7 0 0 0 6
## self_sinc self_intel self_fun self_amb other_attr other_sinc other_intel
## 1 10 9 10 10 8 8 9
## 2 10 9 10 10 8 7 4
## 3 10 9 10 10 8 NA 8
## 4 10 9 10 10 7 7 7
## 5 10 9 10 10 8 7 7
## 6 8 8 7 7 6 10 10
## other_fun other_amb other_shar date_attr date_sinc date_intel date_fun
## 1 8 9 7 7 9 7 9
## 2 6 6 4 8 8 8 8
## 3 8 7 NA 5 7 9 5
## 4 7 7 7 8 9 9 9
## 5 7 7 8 5 8 8 8
## 6 6 7 5 7 7 7 8
## date_amb date_shar self_imp_attr self_imp_sinc self_imp_intel self_imp_fun
## 1 6 9 16.67 16.67 16.67 16.67
## 2 8 8 16.67 16.67 16.67 16.67
## 3 9 5 16.67 16.67 16.67 16.67
## 4 9 8 16.67 16.67 16.67 16.67
## 5 7 7 16.67 16.67 16.67 16.67
## 6 6 8 12.77 19.15 17.02 17.02
## self_imp_amb self_imp_shar other_imp_attr other_imp_sinc other_imp_intel
## 1 16.67 16.67 17.39 17.39 15.22
## 2 16.67 16.67 20.00 20.00 20.00
## 3 16.67 16.67 18.75 16.67 18.75
## 4 16.67 16.67 18.60 16.28 18.60
## 5 16.67 16.67 20.83 20.83 16.67
## 6 14.89 19.15 17.39 17.39 15.22
## other_imp_fun other_imp_amb other_imp_shar
## 1 17.39 13.04 19.57
## 2 20.00 6.67 13.33
## 3 20.83 12.50 12.50
## 4 18.60 11.63 16.28
## 5 16.67 6.25 18.75
## 6 17.39 13.04 19.57
There are rather a large number of variables in the dataset. You can obtain more information about each variable in the documentation of the dataset, by calling ?speeddate
. In my humble opinion, the (generally quite) good documentation of R packages is a real benefit of R over some other systems, and I strongly recommend you to check out and read the documentation of functions and datasets before you use them. Functions in R are generally quite flexible and it is infeasible to discuss all the nuances and possibilities in introductory notes like these. In the book, we mainly focused on the variables starting with other_
, which are the perceptions of the participant by their dating partner. For example, the model
\[\begin{align}
\texttt{like}_i =& \beta_0 + \beta_{\texttt{attr}} \times \texttt{attr}_i + \beta_{\texttt{intel}} \times \texttt{intel}_i + \beta_{\texttt{fun}} \times \texttt{fun}_i \\
&+ \beta_{\texttt{attr} \times \texttt{intel}} \times (\texttt{attr} \times \texttt{intel})_i + \beta_{\texttt{fun} \times \texttt{intel}} \times (\texttt{fun} \times \texttt{intel})_i + \epsilon_i
\end{align}\]
referred to in the SDAM book can be estimated by calling:
Hypothesis tests with the \(t\) statistic are obtained as usual though
##
## Call:
## lm(formula = other_like ~ other_attr * other_intel + other_fun *
## other_intel, data = speeddate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2380 -0.6632 0.0239 0.6583 4.6484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.91118 0.44269 -2.058 0.039740 *
## other_attr 0.67123 0.09027 7.436 1.76e-13 ***
## other_intel 0.32393 0.05994 5.405 7.57e-08 ***
## other_fun 0.14331 0.08734 1.641 0.101045
## other_attr:other_intel -0.04333 0.01171 -3.700 0.000224 ***
## other_intel:other_fun 0.03186 0.01139 2.798 0.005209 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.133 on 1472 degrees of freedom
## (84 observations deleted due to missingness)
## Multiple R-squared: 0.6227, Adjusted R-squared: 0.6214
## F-statistic: 485.8 on 5 and 1472 DF, p-value: < 2.2e-16
The equivalent tests with the \(F\) statistic are easily obtained through
## Anova Table (Type III tests)
##
## Response: other_like
## Sum Sq Df F value Pr(>F)
## (Intercept) 5.43 1 4.2365 0.0397398 *
## other_attr 70.93 1 55.2931 1.755e-13 ***
## other_intel 37.47 1 29.2094 7.568e-08 ***
## other_fun 3.45 1 2.6923 0.1010448
## other_attr:other_intel 17.56 1 13.6898 0.0002236 ***
## other_intel:other_fun 10.04 1 7.8287 0.0052093 **
## Residuals 1888.24 1472
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
5.1.1 A reminder about namespaces
In the code above, I’m using car::
to refer to a function in the car
package. Technically, a statement like package_name::
denotes the namespace of the package with the name package_name
(i.e., car
in this case). This allows you to use functions from a package without loading the package completely (without loading all the functions of the package in memory). This can be (and often is!) better than first loading the package through e.g. library(car)
and then calling Anova
. The issue is that different packages can use the same name for a function, and when you call a function, it will be the one of the package that was last loaded. When packages use the same name for functions, the function with the same name from a package that was loaded earlier will be “masked” and R will print this as a warning in the R console. For example, if you load the dplyr
package (a rather useful package for data manipulation, that is a little too much to discuss here in detail), you will see the following warnings:
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
The line The following objects are masked from 'package:stats'
indicates that the functions filter
and lag
are “masked from the stats
package. Whenever you call these functions, they will be the corresponding functions from the dplyr
package, and not those from the stats
package. This does not break the functionality of the packages themselves, as a properly written package that needs the filter
function from the stats
package will still use the function from that namespace (package), and not the one from the dplyr
one. But when you call the a function, R will try find the definition of that function in the global namespace, and the global namespace contains the version of the last provided definition of any R object. Just like you can overwrite an R object by giving it the same name as an already existing r object (as I often do deliberately through e.g. mod <- lm()
), a function in R is just another object. So if I specify a function like
Then next time I call that function, I will get as a result
## [1] "Where's your head? It's almost Halloween!"
and not the result from
5.2 Centering
As discussed in the SDAM book, sometimes you might want to center variables by subtracting their (sample) mean from each value. This can be done in a number of ways. You can either create new variables in a dataframe by subtracting the single value obtained through mean()
from a vector, as in
and then using the new variable other_like_c
in your lm
model. Alternatively, you can do by calling the scale
function. By default, the scale
function creates \(Z\) transformed variables by subtracting the mean and then dividing this mean-deviation by the standard deviation:
\[\text{scale}(Y_i) = \frac{Y_i - \overline{Y}}{S_Y}\]
The scale
function has three arguments:
x
: the variable (or matrix of variables) which you want to scalecenter
: a logical value indicating whether you want to subtract the meanscale
: a logical value indicating whether you want to divide by the standard deviation
Centering (subtracting the mean, but not dividing by the standard deviation) is thus obtained by calling scale(x, scale=FALSE)
. Personally, I find calling a function scale
with argument scale = FALSE
a little confusing. The sdamr
package therefore provides the function center
which is basically just a version of scale
which by default sets the argument scale = FALSE
:
Because R is a functional programming language, you can call the scale
or center
function directly within the call to the lm
function. This saves you having to create centered variables in a dataframe first. For instance, if you have loaded the sdamr
package (or if you ran the code above defining the center
function), you can obtain the results of the model with centered predictors by calling
modg_c <- lm(other_like ~ center(other_attr)*center(other_intel) + center(other_fun)*center(other_intel), data=speeddate)
summary(modg_c)
##
## Call:
## lm(formula = other_like ~ center(other_attr) * center(other_intel) +
## center(other_fun) * center(other_intel), data = speeddate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2380 -0.6632 0.0239 0.6583 4.6484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.19613 0.03151 196.609 < 2e-16 ***
## center(other_attr) 0.34539 0.01922 17.967 < 2e-16 ***
## center(other_intel) 0.25791 0.02351 10.970 < 2e-16 ***
## center(other_fun) 0.38292 0.02094 18.287 < 2e-16 ***
## center(other_attr):center(other_intel) -0.04333 0.01171 -3.700 0.000224 ***
## center(other_intel):center(other_fun) 0.03186 0.01139 2.798 0.005209 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.133 on 1472 degrees of freedom
## (84 observations deleted due to missingness)
## Multiple R-squared: 0.6227, Adjusted R-squared: 0.6214
## F-statistic: 485.8 on 5 and 1472 DF, p-value: < 2.2e-16
5.3 Mediation analysis
In the SDAM book, we looked at mediation with a different dataset called legacy2015
. This is also part of the sdamr
package, and we can make it available and look at the initial cases as usual:
## [1] "Where's your head? It's almost Halloween!"
Oops! We had overwritten (“masked”) the head
function before. We can get rid of our new definition of the head
function by removing it from the global namespace with the rm()
function (the name refers to remove):
After this, we can try again:
## id sex age legacy belief intention education income donation
## 2 2 male 23 4.38 5.4 3.57 5 5 0
## 3 3 female 59 4.75 5.2 2.43 5 6 0
## 4 4 male 30 5.00 5.2 2.43 5 3 1
## 5 5 female 25 2.88 5.0 2.00 4 3 0
## 6 6 female 32 4.88 4.4 2.57 2 1 0
## 7 7 male 38 5.12 4.6 3.71 3 2 7
5.3.1 Causal steps
The causal steps approach to assessing mediation is done through testing significance of effects in three regression models. This can be done straightforwardly with the lm()
function which we have used quite a bit already. For instance, to assess whether the relation between legacy
and donation
is mediated by intention
, we would estimate and test the parameters of the following models:
5.3.2 Investigating the mediated (indirect) effect with a bootstrap test
There are quite a few packages in R which will allow you to test the mediated effect of a predictor on the dependent variable “via” the mediator. I have chosen here for the mediate
function from the mediation
package (Tingley et al. 2019), as it is a versatile option, and doesn’t require too much additional explanation. Another very good option to conduct mediation analysis is to specify the mediation model through a generalization of linear models generally called Structural Equation Models. These are multivariate models and something we will cover later on. For present purposes, this can be seen as a way to link different regression models (i.e. the dependent variable of one regression model becomes a predictor in another regression model) into what are conventionally called path models. The current “go-to” and most comprehensive SEM package in R is called lavaan
and we will consider that package in a later chapter.
Let’s focus on the mediation
package for now. If you haven’t done so already, you will need to install it with
If you check the documentation of mediate
in this package (i.e type ?mediation::mediate
), you will see there are lots of arguments to specify. We will only focus one the ones important for present purposes:
model.m
: the name of the R object which contains the linear model predicting the mediator from the predictor, e.g.mod2
abovemodel.y
: the name of the R object which contains the linear model predicting the dependent variable from both the mediator and predictor, e.g.mod3
abovesims
: the number of simulations to use for the bootstrap testboot
: (logical) whether to use a bootstrap test. You should set this toTRUE
boot.ci.type
: the type of bootstrap confidence interval to be computed. This can either beperc
, which stands for percentile, and is a simple way where the empirical 2.5th and 97.5th percentiles are calculated from the ordered outcomes. This is the option chosen by default. The other option isbca
which stands for bias-corrected and accelerated. This includes a correction of the percentile method to try and reduce bias. It is generally recommended to use this option in mediation analysis. Hence, you should setboot.ci.type = "bca"
treat
: the name of the predictor in the linear models specified undermodel.m
andmodel.y
. This is would be e.g.legacy
in the models abovemediator
: the name of the mediator in the linear models specified undermodel.m
andmodel.y
. This is would be e.g.intention
in the models above
To run a bootstrap mediation test with 2000 simulations, we can run the following command:
set.seed(20201027)
med <- mediation::mediate(model.m = mod2, model.y = mod3, sims = 2000, boot = TRUE, boot.ci.type = "bca", treat = "legacy", mediator = "intention")
## Running nonparametric bootstrap
##
## Causal Mediation Analysis
##
## Nonparametric Bootstrap Confidence Intervals with the BCa Method
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME 0.245 0.104 0.45 <2e-16 ***
## ADE 0.488 0.186 0.82 <2e-16 ***
## Total Effect 0.733 0.425 1.07 <2e-16 ***
## Prop. Mediated 0.334 0.155 0.67 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Sample Size Used: 237
##
##
## Simulations: 2000
Note that rather than loading the mediation
package with e.g. library(mediation)
I’m calling the mediate
function through mediation::mediate
. Loading the mediation
package results in loading quite a few other R packages, which are not all necessary to perform a mediation analysis with linear models. Calling the mediate function directly through the appropriate namespace avoids loading these other add-on packages (which then will mask some functions that I don’t want masked).
The output from calling the summary
function on the results of the bootstrap procedure (the R object I named med
) has four rows:
ACME
: this is the average causal mediation effect, i.e. the average of \(\hat{a} \times \hat{b}\) in the simulations. The value underestimate
is the average, and you will also see the lower and upper bound of the 95% confidence interval under95% CI Lower
and95% CI Upper
respectively. Finally, thep-value
is the probability of the foundACME
or more extreme given that in reality, theACME
equals 0. I.e., this is the p-value of the hypothesis test that the true mediated effect equals 0.ADE
: this is the average direct effect, and reflects the effect of the predictor which is not mediated. It is the average of \(\hat{c}'\) in the simulations.
Total Effect
is the total effect of the predictor on the dependent variable, which is the sum of theACME
andADE
.Prop. Mediated
is the proportion of the effect of the predictor on the dependent variable which is mediated. This isACME
divided byTotal Effect
.
The results under ACME
show that the bootstrap confidence interval of the mediated effect does not include 0. The p-value for this effect is also smaller than \(\alpha = .05\). As such, the null hypothesis that \(a \times b = 0\) is rejected, and we have found evidence that the effect of legacy
on donation
is mediated by intention
. Because the confidence interval of the ADE
also does not include 0, this analysis indicates that the mediation is partial. There is also a significant direct effect of legacy
on donation
. About 33.4% of the effect of legacy
on donation
is mediated by intention
, so the residual direct effect is quite substantial.