Paired T Test as a Model

Generate some data

Generate something resembling data collected at five stations:

set.seed(1234)
time <- seq.Date(from = as.Date('2024-01-01'), to = as.Date('2024-03-01'), 
                 by = "days")
station <- rep(LETTERS[1:5], each = length(time))
d <- data.frame(time = rep(time, 5), station = station)
# random time effect
z <- rnorm(length(time), sd = 0.4)
# generate some outcome measure
d$y <- (10 + z[rep(1:length(time), 5)]) +
  (station == "B")*0.5 + 
  (station == "C")*2.3 + 
  (station == "D")*-0.7 + 
  (station == "E")*0.4 + 
  rnorm(nrow(d), sd = 0.3)

Quick visual of data:

library(ggplot2)
ggplot(d) +
  aes(x = time, y = y, color = station) +
  geom_line()

Fit model

Instead of individual paired t-tests we can fit one model.

library(lme4)
m <- lmer(y ~ station + (1|time), data = d)
summary(m, corr = F)

## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ station + (1 | time)
##    Data: d
## 
## REML criterion at convergence: 284.1
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.71898 -0.61101 -0.00457  0.61235  2.61870 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  time     (Intercept) 0.14571  0.3817  
##  Residual             0.09079  0.3013  
## Number of obs: 305, groups:  time, 61
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  9.86637    0.06227 158.457
## stationB     0.50146    0.05456   9.191
## stationC     2.31270    0.05456  42.389
## stationD    -0.69194    0.05456 -12.682
## stationE     0.38924    0.05456   7.134

Now run all the paired t-tests at once. Notice p-value corrections are automatically made for the multiple tests.

library(emmeans)
emmeans(m, specs = pairwise ~ station)$contrasts

##  contrast estimate     SE  df t.ratio p.value
##  A - B      -0.501 0.0546 240  -9.191  <.0001
##  A - C      -2.313 0.0546 240 -42.389  <.0001
##  A - D       0.692 0.0546 240  12.682  <.0001
##  A - E      -0.389 0.0546 240  -7.134  <.0001
##  B - C      -1.811 0.0546 240 -33.198  <.0001
##  B - D       1.193 0.0546 240  21.874  <.0001
##  B - E       0.112 0.0546 240   2.057  0.2425
##  C - D       3.005 0.0546 240  55.071  <.0001
##  C - E       1.923 0.0546 240  35.255  <.0001
##  D - E      -1.081 0.0546 240 -19.817  <.0001
## 
## Degrees-of-freedom method: kenward-roger 
## P value adjustment: tukey method for comparing a family of 5 estimates

And here are the confidence intervals on the average differences.

emmeans(m, specs = pairwise ~ station)$contrast |> 
  confint()

##  contrast estimate     SE  df lower.CL upper.CL
##  A - B      -0.501 0.0546 240  -0.6514   -0.351
##  A - C      -2.313 0.0546 240  -2.4627   -2.163
##  A - D       0.692 0.0546 240   0.5420    0.842
##  A - E      -0.389 0.0546 240  -0.5392   -0.239
##  B - C      -1.811 0.0546 240  -1.9612   -1.661
##  B - D       1.193 0.0546 240   1.0434    1.343
##  B - E       0.112 0.0546 240  -0.0377    0.262
##  C - D       3.005 0.0546 240   2.8547    3.155
##  C - E       1.923 0.0546 240   1.7735    2.073
##  D - E      -1.081 0.0546 240  -1.2311   -0.931
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## Conf-level adjustment: tukey method for comparing a family of 5 estimates

Paired T Test as a Model

Clay Ford

2024-11-22

Generate some data

Fit model