# Chapter 5 Moderation and mediation

## 5.1 Moderation in linear models

Including an interaction in a linear model in R is straightforward. If you have two predictors, `x1` and `x2`, and want to include both the “simple slopes” as well as the slope for the “product predictor” (i.e. `x1` \(\times\) `x2`), then the model with `y` as dependent variable can be specified in formula form as

`y ~ x1 * x2`

which evaluates to

`y ~ 1 + x1 + x2 + x1:x2`

As discussed previously, `1` represents the intercept, which is automatically included in a model specification unless you remove it explicitly by adding `-1` to the formula. `x1` represents the “simple effect” of `x1`, `x2` the corresponding “simple effect” of `x2`, whilst `x1:x2` represents the *product-predictor* for the interaction between `x1` and `x2` (perhaps a little confusing for those of you who know `:` as a division operator). Because you would generally want to include the simple effects of the predictors as well as the interaction, the authors of R have chosen to save you typing out the full model by expanding `x1 * x2` in this way. The same notation can be used to specify more complicated models with three-way interactions. For instance,

`y ~ x1 * x2 * x3`

evaluates to

`y ~ 1 + x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3`

which is a model with all “simple effects,” all pairwise product-predictors, as well as a three-way product-predictor which is `x1` \(\times\) `x2` \(\times\) `x3`. We won’t discuss such higher-order interactions until later, but it is good to be aware of this notation in advance.

As a linear model with product-predictors is just another linear model, that is really all there is to say about specifying linear models with interactions/moderation in R. Through the formula interface, R will create the necessary additional product-predictor(s), and then estimate the parameters of the resulting linear model.
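You can see this expansion at work by inspecting the design matrix R builds from such a formula. A minimal sketch with a made-up data frame (the names `d`, `x1`, and `x2` are just for illustration):

```r
# Tiny made-up data frame to inspect what ~ x1 * x2 expands to
d <- data.frame(x1 = c(1, 2, 3), x2 = c(4, 5, 6))

# model.matrix() builds the design matrix from the right-hand side of a formula
X <- model.matrix(~ x1 * x2, data = d)
colnames(X)  # "(Intercept)" "x1" "x2" "x1:x2"

# The x1:x2 column is literally the element-wise product of x1 and x2
all(X[, "x1:x2"] == d$x1 * d$x2)  # TRUE
```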

Let’s have a look at how this works with the `speeddate` data which was also analysed in the SDAM book. The data is included in the `sdamr` package, and can be loaded and (partially) inspected as usual:

```
library(sdamr)
data("speeddate")
head(speeddate)
```

```
## iid pid gender age date_like other_like date_want other_want match self_attr
## 1 132 137 female 27 8 7 0 1 0 8
## 2 132 138 female 27 8 8 0 1 0 8
## 3 132 139 female 27 5 8 0 1 0 8
## 4 132 140 female 27 9 7 1 1 1 8
## 5 132 141 female 27 7 7 0 1 0 8
## 6 133 137 female 24 7 7 0 0 0 6
## self_sinc self_intel self_fun self_amb other_attr other_sinc other_intel
## 1 10 9 10 10 8 8 9
## 2 10 9 10 10 8 7 4
## 3 10 9 10 10 8 NA 8
## 4 10 9 10 10 7 7 7
## 5 10 9 10 10 8 7 7
## 6 8 8 7 7 6 10 10
## other_fun other_amb other_shar date_attr date_sinc date_intel date_fun
## 1 8 9 7 7 9 7 9
## 2 6 6 4 8 8 8 8
## 3 8 7 NA 5 7 9 5
## 4 7 7 7 8 9 9 9
## 5 7 7 8 5 8 8 8
## 6 6 7 5 7 7 7 8
## date_amb date_shar self_imp_attr self_imp_sinc self_imp_intel self_imp_fun
## 1 6 9 16.67 16.67 16.67 16.67
## 2 8 8 16.67 16.67 16.67 16.67
## 3 9 5 16.67 16.67 16.67 16.67
## 4 9 8 16.67 16.67 16.67 16.67
## 5 7 7 16.67 16.67 16.67 16.67
## 6 6 8 12.77 19.15 17.02 17.02
## self_imp_amb self_imp_shar other_imp_attr other_imp_sinc other_imp_intel
## 1 16.67 16.67 17.39 17.39 15.22
## 2 16.67 16.67 20.00 20.00 20.00
## 3 16.67 16.67 18.75 16.67 18.75
## 4 16.67 16.67 18.60 16.28 18.60
## 5 16.67 16.67 20.83 20.83 16.67
## 6 14.89 19.15 17.39 17.39 15.22
## other_imp_fun other_imp_amb other_imp_shar
## 1 17.39 13.04 19.57
## 2 20.00 6.67 13.33
## 3 20.83 12.50 12.50
## 4 18.60 11.63 16.28
## 5 16.67 6.25 18.75
## 6 17.39 13.04 19.57
```

There are rather a large number of variables in the dataset. You can obtain more information about each variable in the documentation of the dataset, by calling `?speeddate`. In my humble opinion, the (generally quite) good documentation of R packages is a real benefit of R over some other systems, and I **strongly recommend** that you check out and read the documentation of functions and datasets before you use them. Functions in R are generally quite flexible, and it is infeasible to discuss all their nuances and possibilities in introductory notes like these. In the book, we mainly focused on the variables starting with `other_`, which are the perceptions of the participant by their dating partner. For example, the model
\[\begin{align}
\texttt{like}_i =& \beta_0 + \beta_{\texttt{attr}} \times \texttt{attr}_i + \beta_{\texttt{intel}} \times \texttt{intel}_i + \beta_{\texttt{fun}} \times \texttt{fun}_i \\
&+ \beta_{\texttt{attr} \times \texttt{intel}} \times (\texttt{attr} \times \texttt{intel})_i + \beta_{\texttt{fun} \times \texttt{intel}} \times (\texttt{fun} \times \texttt{intel})_i + \epsilon_i
\end{align}\]
referred to in the SDAM book can be estimated by calling:

`modg <- lm(other_like ~ other_attr*other_intel + other_fun*other_intel, data=speeddate)`

Hypothesis tests with the \(t\) statistic are obtained as usual through

`summary(modg)`

```
##
## Call:
## lm(formula = other_like ~ other_attr * other_intel + other_fun *
## other_intel, data = speeddate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2380 -0.6632 0.0239 0.6583 4.6484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.91118 0.44269 -2.058 0.039740 *
## other_attr 0.67123 0.09027 7.436 1.76e-13 ***
## other_intel 0.32393 0.05994 5.405 7.57e-08 ***
## other_fun 0.14331 0.08734 1.641 0.101045
## other_attr:other_intel -0.04333 0.01171 -3.700 0.000224 ***
## other_intel:other_fun 0.03186 0.01139 2.798 0.005209 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.133 on 1472 degrees of freedom
## (84 observations deleted due to missingness)
## Multiple R-squared: 0.6227, Adjusted R-squared: 0.6214
## F-statistic: 485.8 on 5 and 1472 DF, p-value: < 2.2e-16
```

The equivalent tests with the \(F\) statistic are easily obtained through

`car::Anova(modg, type=3)`

```
## Anova Table (Type III tests)
##
## Response: other_like
## Sum Sq Df F value Pr(>F)
## (Intercept) 5.43 1 4.2365 0.0397398 *
## other_attr 70.93 1 55.2931 1.755e-13 ***
## other_intel 37.47 1 29.2094 7.568e-08 ***
## other_fun 3.45 1 2.6923 0.1010448
## other_attr:other_intel 17.56 1 13.6898 0.0002236 ***
## other_intel:other_fun 10.04 1 7.8287 0.0052093 **
## Residuals 1888.24 1472
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

### 5.1.1 A bit about namespaces

In the code above, I’m using `car::` to refer to a function in the `car` package. Technically, a statement like `package_name::` denotes the **namespace** of the package with the name `package_name` (i.e., `car` in this case). This allows you to use functions from a package without loading the package completely (without loading all the functions of the package into memory). This can be (and often is!) better than first loading the package through e.g. `library(car)` and then calling `Anova`. The issue is that different packages can use the same name for a function, and when you call a function, it will be the one from the package that was loaded last. When packages use the same name for functions, the function with that name from a package that was loaded earlier will be “masked”, and R will print a warning about this in the R console. For example, if you load the `dplyr` package (a rather useful package for data manipulation, which is a little too much to discuss here in detail), you will see the following warnings:

`library(dplyr)`

```
##
## Attaching package: 'dplyr'
```

```
## The following objects are masked from 'package:stats':
##
## filter, lag
```

```
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
```

The line `The following objects are masked from 'package:stats'` indicates that the functions `filter` and `lag` are “masked” from the `stats` package. Whenever you call these functions, they will be the corresponding functions from the `dplyr` package, and *not* those from the `stats` package. This does not break the functionality of the packages themselves, as a properly written package that needs the `filter` function from the `stats` package will still use the function from that namespace (package), and not the one from `dplyr`. But when *you* call a function, R will try to find the definition of that function in the *global* namespace, and the global namespace contains the last provided definition of any R object. Just like you can overwrite an R object by giving it the same name as an already existing R object (as I often do deliberately through e.g. `mod <- lm()`), a function in R is just another object. So if I define a function like

```
head <- function(x, na.rm = FALSE) {
  return("Where's your head? It's almost Halloween!")
}
```

then the next time I call that function, I will get as a result

`head(speeddate)`

`## [1] "Where's your head? It's almost Halloween!"`

and **not** the result from

`utils::head(speeddate)`

## 5.2 Centering

As discussed in the SDAM book, sometimes you might want to center variables by subtracting their (sample) mean from each value. This can be done in a number of ways. You can either create new variables in a dataframe by subtracting the single value obtained through `mean()` from a vector, as in

`speeddate$other_like_c <- speeddate$other_like - mean(speeddate$other_like, na.rm = TRUE)`

and then using the new variable `other_like_c` in your `lm` model. Alternatively, you can do this by calling the `scale` function. By default, the `scale` function creates \(Z\)-transformed variables by subtracting the mean and then dividing this mean-deviation by the standard deviation:
\[\text{scale}(Y_i) = \frac{Y_i - \overline{Y}}{S_Y}\]
The `scale` function has three arguments:

- `x`: the variable (or matrix of variables) which you want to scale
- `center`: a logical value indicating whether you want to subtract the mean
- `scale`: a logical value indicating whether you want to divide by the standard deviation
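As a quick sanity check, the following sketch (with a made-up vector `y`) confirms that calling `scale` without dividing by the standard deviation is just mean-subtraction:

```r
# Made-up vector for illustration
y <- c(2, 4, 6, 8)

# scale() returns a one-column matrix; with scale = FALSE it only
# subtracts the mean from each value
centered <- scale(y, center = TRUE, scale = FALSE)

all(centered[, 1] == y - mean(y))  # TRUE: identical to manual centering
```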

Centering (subtracting the mean, but not dividing by the standard deviation) is thus obtained by calling `scale(x, scale=FALSE)`. Personally, I find calling a function `scale` with the argument `scale = FALSE` a little confusing. The `sdamr` package therefore provides the function `center`, which is basically just a version of `scale` which by default sets the argument `scale = FALSE`:

```
center <- function(x) {
  scale(x, center = TRUE, scale = FALSE)
}
```

Because R is a functional programming language, you can call the `scale` or `center` function directly within the call to the `lm` function. This saves you having to create centered variables in a dataframe first. For instance, if you have loaded the `sdamr` package (or if you ran the code above defining the `center` function), you can obtain the results of the model with centered predictors by calling

```
modg_c <- lm(other_like ~ center(other_attr)*center(other_intel) + center(other_fun)*center(other_intel), data=speeddate)
summary(modg_c)
```

```
##
## Call:
## lm(formula = other_like ~ center(other_attr) * center(other_intel) +
## center(other_fun) * center(other_intel), data = speeddate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2380 -0.6632 0.0239 0.6583 4.6484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.19613 0.03151 196.609 < 2e-16 ***
## center(other_attr) 0.34539 0.01922 17.967 < 2e-16 ***
## center(other_intel) 0.25791 0.02351 10.970 < 2e-16 ***
## center(other_fun) 0.38292 0.02094 18.287 < 2e-16 ***
## center(other_attr):center(other_intel) -0.04333 0.01171 -3.700 0.000224 ***
## center(other_intel):center(other_fun) 0.03186 0.01139 2.798 0.005209 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.133 on 1472 degrees of freedom
## (84 observations deleted due to missingness)
## Multiple R-squared: 0.6227, Adjusted R-squared: 0.6214
## F-statistic: 485.8 on 5 and 1472 DF, p-value: < 2.2e-16
```

## 5.3 Mediation analysis

In the SDAM book, we looked at mediation with a different dataset called `legacy2015`. This is also part of the `sdamr` package, and we can make it available and look at the initial cases as usual:

```
data("legacy2015")
head(legacy2015)
```

`## [1] "Where's your head? It's almost Halloween!"`

Oops! We had overwritten (“masked”) the `head` function before. We can get rid of our new definition of the `head` function by removing it from the global namespace with the `rm()` function (the name refers to *remove*):

`rm("head")`

After this, we can try again:

`head(legacy2015)`

```
## id sex age legacy belief intention education income donation
## 2 2 male 23 4.38 5.4 3.57 5 5 0
## 3 3 female 59 4.75 5.2 2.43 5 6 0
## 4 4 male 30 5.00 5.2 2.43 5 3 1
## 5 5 female 25 2.88 5.0 2.00 4 3 0
## 6 6 female 32 4.88 4.4 2.57 2 1 0
## 7 7 male 38 5.12 4.6 3.71 3 2 7
```

### 5.3.1 Causal steps

The causal steps approach to assessing mediation is done through testing significance in three regression models. This can be done straightforwardly with the `lm()` function, which we have used quite a bit already. For instance, to assess whether the relation between `legacy` and `donation` is mediated by `intention`, we would estimate and test the parameters of the following models:

```
mod1 <- lm(donation ~ legacy, data = legacy2015)
mod2 <- lm(intention ~ legacy, data = legacy2015)
mod3 <- lm(donation ~ intention + legacy, data = legacy2015)
```
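To see the logic of the causal steps in a self-contained way, here is a sketch with simulated data (the variables `x`, `m`, `y` and the effect sizes are made up for illustration; with real data you would inspect the coefficient tables from `summary()` in the same way):

```r
set.seed(1)
n <- 200
x <- rnorm(n)                        # predictor
m <- 0.5 * x + rnorm(n)              # mediator: path a built in
y <- 0.4 * m + 0.2 * x + rnorm(n)    # outcome: paths b and c' built in

step1 <- lm(y ~ x)       # total effect of x on y (path c)
step2 <- lm(m ~ x)       # effect of x on the mediator (path a)
step3 <- lm(y ~ m + x)   # effects of m (path b) and x (path c') on y

# The causal steps approach checks the significance of these paths:
coef(summary(step2))["x", "Pr(>|t|)"] < .05  # path a
coef(summary(step3))["m", "Pr(>|t|)"] < .05  # path b
```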

### 5.3.2 Investigating the mediated (indirect) effect with a bootstrap test

There are quite a few packages in R which will allow you to test the mediated effect of a predictor on the dependent variable “via” the mediator. I have chosen here the `mediate` function from the `mediation` package (Tingley et al. 2019), as it is a versatile option and doesn’t require too much additional explanation. Another very good option for conducting mediation analysis is to specify the mediation model through a generalization of linear models called *Structural Equation Models*. These are multivariate models and not something we will cover in this course. For present purposes, they can be seen as a way to link different regression models (i.e. the dependent variable of one regression model becomes a predictor in another regression model) into what are conventionally called *path models*. The current “go-to” and most comprehensive SEM package in R is called `lavaan`, and if you ever need this kind of analysis, that is my recommendation at the moment.

So let’s focus on the `mediation` package for now. If you haven’t done so already, you will need to install it with

`install.packages("mediation")`

If you check the documentation of `mediate` in this package (i.e. type `?mediation::mediate`), you will see there are lots of arguments to specify. We will only focus on the ones important for present purposes:


- `model.m`: the name of the R object which contains the linear model predicting the mediator from the predictor, e.g. `mod2` above
- `model.y`: the name of the R object which contains the linear model predicting the dependent variable from both the mediator and the predictor, e.g. `mod3` above
- `sims`: the number of simulations to use for the bootstrap test
- `boot`: (logical) whether to use a bootstrap test. You should set this to `TRUE`
- `boot.ci.type`: the type of bootstrap confidence interval to be computed. This can either be `perc`, which stands for *percentile*, and is a simple method where the empirical 2.5th and 97.5th percentiles are calculated from the ordered bootstrap outcomes. This is the default option. The other option is `bca`, which stands for *bias-corrected and accelerated*. This includes a correction of the percentile method to try and reduce bias. It is generally recommended to use this option in mediation analysis. Hence, you should set `boot.ci.type = "bca"`
- `treat`: the name of the predictor in the linear models specified under `model.m` and `model.y`. This would be e.g. `legacy` in the models above
- `mediator`: the name of the mediator in the linear models specified under `model.m` and `model.y`. This would be e.g. `intention` in the models above

To run a bootstrap mediation test with 2000 simulations, we can run the following command:

```
set.seed(20201027)
med <- mediation::mediate(model.m = mod2, model.y = mod3, sims = 2000, boot = TRUE, boot.ci.type = "bca", treat = "legacy", mediator = "intention")
```

`## Running nonparametric bootstrap`

`summary(med)`

```
##
## Causal Mediation Analysis
##
## Nonparametric Bootstrap Confidence Intervals with the BCa Method
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME 0.245 0.104 0.45 <2e-16 ***
## ADE 0.488 0.186 0.82 <2e-16 ***
## Total Effect 0.733 0.425 1.07 <2e-16 ***
## Prop. Mediated 0.334 0.155 0.67 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Sample Size Used: 237
##
##
## Simulations: 2000
```
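To make the bootstrap idea behind these numbers concrete, here is a minimal hand-rolled percentile bootstrap of \(\hat{a} \times \hat{b}\) on simulated data. This is only a sketch of the resampling logic, not the `mediation` package’s exact procedure (which, with `boot.ci.type = "bca"`, additionally applies the bias-corrected and accelerated adjustment); all variable names and effect sizes here are made up:

```r
set.seed(123)
n <- 200
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)              # true a-path = 0.5
y <- 0.4 * m + 0.2 * x + rnorm(n)    # true b-path = 0.4, c'-path = 0.2
dat <- data.frame(x, m, y)

# Resample cases with replacement and recompute a*b each time
ab <- replicate(2000, {
  idx <- sample(n, replace = TRUE)
  a_hat <- coef(lm(m ~ x, data = dat[idx, ]))["x"]
  b_hat <- coef(lm(y ~ m + x, data = dat[idx, ]))["m"]
  a_hat * b_hat
})

# Simple percentile 95% confidence interval for the indirect effect
quantile(ab, c(.025, .975))
```

If this interval excludes 0, the percentile bootstrap test rejects the null hypothesis of no mediated effect, which is the same logic as reading the `ACME` row above.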

Note that rather than loading the `mediation` package with e.g. `library(mediation)`, I’m calling the `mediate` function through `mediation::mediate`. Loading the `mediation` package results in loading quite a few other R packages, which are not all necessary to perform a mediation analysis with linear models. Calling the `mediate` function directly through the appropriate namespace avoids loading these other add-on packages (which would then mask some functions that I don’t want masked).

The output from calling the `summary` function on the results of the bootstrap procedure (the R object I named `med`) has four rows:


- `ACME`: the *average causal mediation effect*, i.e. the average of \(\hat{a} \times \hat{b}\) in the simulations. The value under `Estimate` is the average, and you will also see the lower and upper bound of the 95% confidence interval under `95% CI Lower` and `95% CI Upper` respectively. Finally, the `p-value` is the probability of obtaining the found `ACME` or a more extreme value, given that in reality the `ACME` equals 0. I.e., this is the p-value of the hypothesis test that the true mediated effect equals 0.
- `ADE`: the *average direct effect*, which reflects the effect of the predictor that is not mediated. It is the average of \(\hat{c}'\) in the simulations.
- `Total Effect`: the total effect of the predictor on the dependent variable, which is the sum of the `ACME` and the `ADE`.
- `Prop. Mediated`: the proportion of the effect of the predictor on the dependent variable which is mediated. This is the `ACME` divided by the `Total Effect`.

The results under `ACME` show that the bootstrap confidence interval of the mediated effect does not include 0. The p-value for this effect is also smaller than \(\alpha = .05\). As such, the null hypothesis that \(a \times b = 0\) is rejected, and we have found evidence that the effect of `legacy` on `donation` is mediated by `intention`. Because the confidence interval of the `ADE` also does not include 0, this analysis indicates that the mediation is partial: there is also a significant direct effect of `legacy` on `donation`. About 33.4% of the effect of `legacy` on `donation` is mediated by `intention`, so the residual direct effect is quite substantial.

### References

Tingley, Dustin, Teppei Yamamoto, Kentaro Hirose, Luke Keele, and Kosuke Imai. 2019. *Mediation: Causal Mediation Analysis*. https://imai.princeton.edu/projects/mechanisms.html.