Multilevel model for discrete ordered data

Author

Clay Ford

Read in data

library(ggplot2)
library(lmerTest)
library(emmeans)
library(ggeffects)
library(ordinal)

d <- readRDS("d.rds")

Explore data

64 people were observed: 32 in the Control group and 32 in the Treated group. Each subject was observed 20 times, and each observation was counted as a “round”. A distinction is made between what happened in the even rounds (2, 4, 6, …) and the odd rounds (1, 3, 5, …).

The dependent variable is Decision, which is discrete and ranges from 0 to 4. There is a two-level treatment factor (Treat) and a two-level factor (Odd) indicating whether an observation was in an odd-numbered round or not.

summary(d$Decision)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   1.000   2.000   1.773   3.000   4.000 
table(d$Decision)

  0   1   2   3   4 
265 320 335 160 200 
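
The design described above can be checked directly. A quick sketch, assuming the data frame has the columns ID, Treat, and Odd used in the models below:

length(unique(d$ID))            # expect 64 subjects
table(d$Treat) / 20             # expect 32 per group; each subject contributes 20 rows
xtabs(~ Treat + Odd, data = d)  # observations split evenly between odd and even rounds within group
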
ggplot(d) +
  aes(x = Decision) +
  geom_bar() +
  facet_grid(Odd ~ Treat, labeller = "label_both")

Option 1: linear mixed-effect model

Model mean Decision as a function of Treat, Odd, and their interaction, with a random intercept for subject ID. The interaction appears to be significant: the effect of Treat depends on whether the round was odd-numbered or not.

m <- lmer(Decision ~ Treat * Odd + (1|ID), data = d)
summary(m, corr = FALSE)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Decision ~ Treat * Odd + (1 | ID)
   Data: d

REML criterion at convergence: 3945.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-3.11981 -0.66200 -0.03065  0.67276  3.04016 

Random effects:
 Groups   Name        Variance Std.Dev.
 ID       (Intercept) 0.6037   0.777   
 Residual             1.1237   1.060   
Number of obs: 1280, groups:  ID, 64

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)    2.0469     0.1496   72.9796  13.683  < 2e-16 ***
Treat1        -0.2250     0.2116   72.9796  -1.064  0.29104    
Odd1          -0.1531     0.0838 1214.0000  -1.827  0.06791 .  
Treat1:Odd1   -0.3375     0.1185 1214.0000  -2.848  0.00448 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
car::Anova(m)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: Decision
            Chisq Df Pr(>Chisq)    
Treat      3.7591  1   0.052522 .  
Odd       29.5043  1  5.579e-08 ***
Treat:Odd  8.1096  1   0.004403 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comparing means, there appears to be a difference between groups in the odd rounds. But is that difference practical? Is 1.3 versus 1.9 interesting when the dependent variable is discrete and ranges from 0 to 4?

emmeans(m, pairwise ~ Treat | Odd)
$emmeans
Odd = 0:
 Treat emmean   SE df lower.CL upper.CL
 0       2.05 0.15 73     1.75     2.35
 1       1.82 0.15 73     1.52     2.12

Odd = 1:
 Treat emmean   SE df lower.CL upper.CL
 0       1.89 0.15 73     1.60     2.19
 1       1.33 0.15 73     1.03     1.63

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

$contrasts
Odd = 0:
 contrast        estimate    SE df t.ratio p.value
 Treat0 - Treat1    0.225 0.212 73   1.064  0.2910

Odd = 1:
 contrast        estimate    SE df t.ratio p.value
 Treat0 - Treat1    0.562 0.212 73   2.659  0.0096

Degrees-of-freedom method: kenward-roger 
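
As a quick arithmetic check, the 0.562 contrast in the odd rounds can be recovered from the fixed effects reported above: with treatment coding, the Treat0 - Treat1 difference when Odd = 1 is -(Treat1 + Treat1:Odd1).

b <- fixef(m)
-(b[["Treat1"]] + b[["Treat1:Odd1"]])  # about 0.56, matching the emmeans contrast
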

Model diagnostics. The first plot (constant variance assumption) looks suspect and weird because of the discreteness. The second (normality of residuals assumption) looks pretty good, all things considered.

plot(m) # weird

lattice::qqmath(m) # not bad actually

Visualize difference in expected means with an effect plot:

plot(ggpredict(m, terms = c("Odd", "Treat")))

Option 2: ordinal mixed-effect model

Treat the response as an ordered category and model the cumulative probabilities.

# format Decision as ordered factor
d$DecisionF <- factor(d$Decision, levels = 0:4, labels = 0:4, 
                      ordered = TRUE)
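
To see what “cumulative probability” means here, these are the empirical cumulative proportions P(Decision <= j), pooled over all observations (a quick sketch):

# empirical P(Decision <= j), j = 0,...,4
cumsum(prop.table(table(d$Decision)))

The cumulative link model puts each of these cumulative probabilities on the logit scale and expresses it as a threshold minus a linear predictor containing the fixed effects and the subject-level random intercept.
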

Same basic result as the linear mixed-effect model: a significant interaction between Odd and Treat.

m2 <- clmm(DecisionF ~ Treat * Odd + (1|ID), data = d)
summary(m2)
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: DecisionF ~ Treat * Odd + (1 | ID)
data:    d

 link  threshold nobs logLik   AIC     niter     max.grad cond.H 
 logit flexible  1280 -1778.01 3572.02 499(3386) 1.61e-03 2.0e+02

Random effects:
 Groups Name        Variance Std.Dev.
 ID     (Intercept) 2.135    1.461   
Number of groups:  ID 64 

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
Treat1       -0.3396     0.3948  -0.860  0.38959    
Odd1         -0.2336     0.1385  -1.687  0.09161 .  
Treat1:Odd1  -0.7769     0.2096  -3.707  0.00021 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
0|1  -2.4242     0.2881  -8.414
1|2  -0.7487     0.2804  -2.670
2|3   0.8347     0.2804   2.977
3|4   1.8224     0.2851   6.393
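
The threshold coefficients can be translated into category probabilities. For example, for a subject with a random intercept of 0 in the Treated group during odd rounds, P(Decision <= j) = plogis(theta_j - eta), where eta is the sum of the relevant fixed effects. A sketch using the estimates printed above:

theta <- c(-2.4242, -0.7487, 0.8347, 1.8224)  # thresholds 0|1, 1|2, 2|3, 3|4
eta <- -0.3396 - 0.2336 - 0.7769              # Treat1 + Odd1 + Treat1:Odd1
cum_p <- plogis(theta - eta)                  # P(Decision <= j) for j = 0,...,3
probs <- diff(c(0, cum_p, 1))                 # P(Decision = j) for j = 0,...,4
setNames(probs, 0:4)
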

How do they interact? Compare estimated probabilities for each Decision level. It appears the difference in Decision between the Treated and Control groups happens in the odd rounds. (Same result as the linear mixed-effect model.)

Instead of comparing estimated means, we compare the estimated probability that Decision = x between the Treated and Control groups, conditioned on odd versus even rounds. We see a higher probability that the Treated group chooses 0 or 1, but a higher probability that the Control group chooses 2 through 4. (emmeans relabels the levels 0:4 as 1:5 for some reason. Irritating…)

emm_m2 <- emmeans(m2, specs = revpairwise ~ Treat | DecisionF * Odd, 
                  mode = "prob") |> confint()
emm_m2$contrasts
DecisionF = 1, Odd = 0:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0   0.0293 0.0346 Inf   -0.0386    0.0971

DecisionF = 2, Odd = 0:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0   0.0488 0.0563 Inf   -0.0615    0.1590

DecisionF = 3, Odd = 0:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0  -0.0115 0.0169 Inf   -0.0446    0.0217

DecisionF = 4, Odd = 0:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0  -0.0306 0.0354 Inf   -0.1000    0.0387

DecisionF = 5, Odd = 0:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0  -0.0359 0.0422 Inf   -0.1186    0.0468

DecisionF = 1, Odd = 1:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0   0.1540 0.0588 Inf    0.0388    0.2692

DecisionF = 2, Odd = 1:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0   0.1180 0.0421 Inf    0.0354    0.2005

DecisionF = 3, Odd = 1:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0  -0.1174 0.0413 Inf   -0.1983   -0.0365

DecisionF = 4, Odd = 1:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0  -0.0814 0.0294 Inf   -0.1389   -0.0238

DecisionF = 5, Odd = 1:
 contrast        estimate     SE  df asymp.LCL asymp.UCL
 Treat1 - Treat0  -0.0732 0.0301 Inf   -0.1322   -0.0142

Confidence level used: 0.95 

A plot of the contrasts and their CIs makes this easier to see:

emm_df <- as.data.frame(emm_m2$contrasts)

ggplot(emm_df) +
  aes(x = DecisionF, y = estimate) +
  geom_point() +
  geom_errorbar(mapping = aes(ymin = asymp.LCL, ymax = asymp.UCL), 
                width = 0.1) +
  geom_hline(yintercept = 0, linetype = 2) +
  scale_x_discrete(labels = 0:4) +
  facet_wrap(~ Odd, labeller = "label_both") +
  ggtitle("Difference in estimated probabilities: Treat 1 - Treat 0")