I had a PhD candidate in systems engineering ask why their mediation analysis did not agree with their t-test. (wut?)
Let’s start with the t-test. To keep it simple, we call their dependent variable Y and the treatment grp, which is a 0/1 indicator variable. We see that the mean of Y for grp 1 is higher than the mean of Y for grp 0.
aggregate(Y ~ grp, data = d, mean)
## grp Y
## 1 0 2.649523
## 2 1 3.045414
Is that difference “significant”? They carried out a t-test:
t.test(Y ~ grp, data = d)
##
## Welch Two Sample t-test
##
## data: Y by grp
## t = -35.635, df = 397.93, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.4177318 -0.3740500
## sample estimates:
## mean in group 0 mean in group 1
## 2.649523 3.045414
Sure does appear so. The 95% CI is (-0.42, -0.37). We’re pretty sure the mean of grp 0 is at least 0.37 less than the mean of grp 1.
The difference in means can be thought of as the effect size. Below we take the difference between grp 1 and grp 0. I use the base R pipe to pass the estimate (ie, the two means) into the diff() function, which subtracts the first element of a vector from the next element. Then I use as.numeric() to strip off some metadata.
t.test(Y ~ grp, data = d)$estimate |> diff() |> as.numeric()
## [1] 0.3958909
The effect of the treatment (grp = 1) is about 0.396. In other words, if you’re in grp 1, we expect your Y value to be about 0.396 higher than it otherwise would have been if you were in grp 0.
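As a side note, this effect size is also what a simple linear regression returns: the coefficient on a 0/1 indicator is exactly the difference in group means. Here is a quick sketch on toy data (not the post's d, which is constructed later):

```r
# Toy data: 0/1 indicator with a built-in mean difference of about 0.4
set.seed(1)
g <- rep(0:1, each = 200)
y <- 2.6 + 0.4 * g + rnorm(400, sd = 0.1)

# Difference in group means, and the slope from a simple regression
mean_diff <- diff(tapply(y, g, mean)) |> as.numeric()
lm_slope  <- coef(lm(y ~ g))[["g"]]
all.equal(mean_diff, lm_slope)  # the two are identical
```

This is one reason the "difference in means" and "effect of treatment" framings are interchangeable here.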
The student’s advisor suggested there was another variable acting as a mediator and that they should do a mediation analysis. Let’s call the mediator variable M. The basic idea is that the treatment variable, grp, is actually affecting M, which is then affecting Y. This is usually visualized with a diagram. A former StatLab grad student associate wrote a good blog post on this topic and works through an example with some pictures.
Three quantities of interest usually come out of a mediation analysis: the average causal mediation effect (ACME), the average direct effect (ADE), and the Total Effect.
One way to perform a mediation analysis is with the lavaan package. (“lavaan” is short for “latent variable analysis”.) This is a whole different beast, but the basic idea is that you specify a structural equation model as lines of text using lavaan model syntax and then fit the model using the sem() function. In the formula below, “a”, “b”, “ADE”, and “ACME” are simply labels for the coefficients being estimated. The latter two are what we care about. The := notation defines ACME as a new parameter that is not directly estimated. Instead we compute it by multiplying the coefficients “a” and “b”.
Here is what the grad student did:
library(lavaan)
m <- '
M ~ a*grp
Y ~ b*M
Y ~ ADE*grp
ACME := a*b
'
fitm <- sem(m, data=d)
summary(fitm)
## lavaan 0.6-11 ended normally after 1 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 5
##
## Number of observations 400
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|)
## M ~
## grp (a) 0.601 0.010 61.223 0.000
## Y ~
## M (b) 0.476 0.051 9.285 0.000
## grp (ADE) 0.110 0.032 3.378 0.001
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .M 0.010 0.001 14.142 0.000
## .Y 0.010 0.001 14.142 0.000
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|)
## ACME 0.286 0.031 9.180 0.000
Now to their question: why is the estimated ADE of 0.110 so much smaller than the estimated effect size of 0.396 from the t-test? Isn’t the Direct Effect the same as the effect size returned from the t-test?
As it turns out, no, they are not the same thing. The average direct effect (ADE) is the effect of treatment controlling for mediation. (I find the terminology of “direct effect” very confusing, especially when the effect is being adjusted for another variable!) The difference in means from a t-test is the effect of treatment not controlling for anything else. It is called the “Total Effect” in mediation analysis.
If we want to use the lavaan model to estimate the Total Effect (ie, diff in means from t-test), we sum ADE and ACME. So I modified their R code as follows.
m <- '
M ~ a*grp
Y ~ b*M
Y ~ ADE*grp
ACME := a*b
Total := ADE + ACME ## t-test effect size, difference in means
'
fitm <- sem(m, data=d)
summary(fitm)
## lavaan 0.6-11 ended normally after 1 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 5
##
## Number of observations 400
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|)
## M ~
## grp (a) 0.601 0.010 61.223 0.000
## Y ~
## M (b) 0.476 0.051 9.285 0.000
## grp (ADE) 0.110 0.032 3.378 0.001
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .M 0.010 0.001 14.142 0.000
## .Y 0.010 0.001 14.142 0.000
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|)
## ACME 0.286 0.031 9.180 0.000
## Total 0.396 0.011 35.646 0.000
Notice the last line under “Defined Parameters” shows Total as 0.396, which is the same as what we got from the t-test:
t.test(Y ~ grp, data = d)$estimate |>
diff() |>
as.numeric() |>
round(3)
## [1] 0.396
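This additivity is not a lavaan quirk. For linear models with no treatment-mediator interaction, the total effect decomposes exactly as ADE + a*b. We can sketch that with plain lm() regressions on toy data (again, not the post's d; the path values here are assumptions chosen for illustration):

```r
# Toy mediation data with known paths: a = 0.6, b = 0.5, direct effect = 0.1
set.seed(2)
g <- rep(0:1, 200)
m <- 1.0 + 0.6 * g + rnorm(400, sd = 0.1)
y <- 2.0 + 0.1 * g + 0.5 * m + rnorm(400, sd = 0.1)

total <- coef(lm(y ~ g))[["g"]]  # difference in means (t-test effect size)
a     <- coef(lm(m ~ g))[["g"]]  # treatment -> mediator path
fit   <- lm(y ~ g + m)
ade   <- coef(fit)[["g"]]        # direct effect, controlling for the mediator
b     <- coef(fit)[["m"]]        # mediator -> outcome path
all.equal(total, ade + a * b)    # exact in-sample decomposition: Total = ADE + ACME
```

The identity holds exactly in the sample (it is the omitted-variable-bias formula rearranged), which is why lavaan's Total matched the t-test to the digit.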
That was basically the consultation: me figuring out the distinction between the Average Direct Effect and the Total Effect!
For what it’s worth, here’s how I simulated the data for this example. First I simulate M using grp and some noise from a Normal distribution. Then I use both M and grp to simulate Y, once again with some noise. So in the simulation we see that grp affected M, and then M and grp affected Y. We might call this “partial mediation” since grp affects both M and Y.
n <- 400
set.seed(999)
grp <- sample(0:1, size = n, replace = TRUE) |> factor()
M <- 1.5 + 0.6*(grp=="1") + rnorm(n, sd = 0.1)
Y <- 1.9 + 0.1*(grp=="1") + 0.5*M + rnorm(n, sd = 0.1)
d <- data.frame(grp = grp, Y = Y, M = M)
Notice the true direct effect in the simulation is 0.1, the coefficient on grp in the formula for Y. The lavaan model estimated it as 0.110. Also notice that if we divide the true ACME (0.6 × 0.5 = 0.3) by the true Total Effect (0.3 + 0.1 = 0.4) we get 0.3/0.4 = 0.75. We might say the proportion of the effect mediated is 0.75.
The mediate() function in the mediation package formally estimates this quantity in addition to the ACME, ADE, and Total Effect. Its estimates are slightly different from lavaan’s because it uses something called Quasi-Bayesian Approximation, which involves simulation. To use the mediate() function, we first have to fit a model for the mediator and then fit a model for the dependent variable. Below it estimates the proportion of the effect mediated to be about 0.73.
library(mediation)
b <- lm(M ~ grp, data = d)
c <- lm(Y ~ grp + M, data = d)
mout <- mediate(b, c, treat = "grp", mediator = "M")
summary(mout)
##
## Causal Mediation Analysis
##
## Quasi-Bayesian Confidence Intervals
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME 0.2882 0.2271 0.35 <2e-16 ***
## ADE 0.1077 0.0414 0.17 <2e-16 ***
## Total Effect 0.3959 0.3730 0.42 <2e-16 ***
## Prop. Mediated 0.7262 0.5712 0.89 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Sample Size Used: 400
##
##
## Simulations: 1000
I hope you found this helpful.