Let’s fit a binary logistic regression model with an interaction. The predictors, x1 and x2, are both two-level categorical variables. Notice that the interaction, x1:x2, is highly significant.
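(A note on the data: the data frame d isn’t shown in this post. Purely as a hedged sketch, data with a similar structure could be simulated as follows; the coefficient values below are assumptions chosen only to roughly mimic the model output.)

# hypothetical simulation; not the original data
set.seed(1)
n <- 600
x1 <- factor(sample(0:1, n, replace = TRUE))
x2 <- factor(sample(0:1, n, replace = TRUE))
# assumed "true" effects on the log-odds scale
lp <- -0.37 + 0.60 * (x1 == "1") - 0.43 * (x2 == "1") +
  2.06 * (x1 == "1") * (x2 == "1")
y <- rbinom(n, size = 1, prob = plogis(lp))
d <- data.frame(y, x1, x2)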

m <- glm(y ~ x1 + x2 + x1:x2, data = d, family = binomial)
summary(m)
## 
## Call:
## glm(formula = y ~ x1 + x2 + x1:x2, family = binomial, data = d)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0041  -1.0268   0.5369   1.0778   1.5284  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -0.3651     0.1695  -2.155   0.0312 *  
## x11           0.6040     0.2389   2.528   0.0115 *  
## x21          -0.4304     0.2391  -1.800   0.0718 .  
## x11:x21       2.0556     0.3785   5.430 5.62e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 829.37  on 599  degrees of freedom
## Residual deviance: 711.96  on 596  degrees of freedom
## AIC: 719.96
## 
## Number of Fisher Scoring iterations: 4

How can we better understand this interaction?

One way is effect displays. These are essentially predicted probabilities computed at specific levels of the predictors. Notice in the effect display below that the effect of x1 changes depending on the level of x2. The effect of x1 is much greater when x2 = 1 (the blue line). The effect of x1 is not as pronounced when x2 = 0. This is a textbook example of an interaction. What’s the effect of x1? It depends on x2.

The predicted probability of y is much higher when x1 = 1 and x2 = 1. Also, the direction of the x2 effect reverses across the levels of x1. Notice how red is higher than blue when x1 = 0, but blue is higher than red when x1 = 1.
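The code that generated the effect display isn’t shown above. One way to produce a similar plot is with the emmip() function from the emmeans package (the same package used for the estimates below), though the original plot may well have been made differently.

library(emmeans)
# interaction plot of predicted probabilities:
# x1 on the x-axis, one line per level of x2
emmip(m, x2 ~ x1, type = "response", CIs = TRUE)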

Another way to understand interactions is to compute differences in estimated probabilities using the model. Below we estimate probabilities for y at the two levels of x2, conditional on x1. You’ll notice these are the probabilities plotted above. The contrasts section shows the difference in estimated probabilities for x2 at each level of x1. When x1 = 0, the estimated difference in probabilities between the two levels of x2 is not significant at the 0.05 level. However, when x1 = 1, the estimated difference of 0.3063 between the two levels of x2 is significant.
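The call that produced the output below isn’t included above. A call along these lines, using regrid = "response" so the contrasts are differences in probabilities rather than odds ratios, would generate it:

library(emmeans)
# estimated probabilities of y at each level of x2, within x1,
# plus differences in those probabilities (revpairwise gives x21 - x20)
emmeans(m, revpairwise ~ x2 | x1, regrid = "response")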

## $emmeans
## x1 = 0:
##  x2  prob     SE  df asymp.LCL asymp.UCL
##  0  0.410 0.0410 Inf     0.329     0.490
##  1  0.311 0.0361 Inf     0.240     0.382
## 
## x1 = 1:
##  x2  prob     SE  df asymp.LCL asymp.UCL
##  0  0.559 0.0415 Inf     0.478     0.641
##  1  0.866 0.0279 Inf     0.811     0.921
## 
## Confidence level used: 0.95 
## 
## $contrasts
## x1 = 0:
##  contrast  estimate     SE  df z.ratio p.value
##  x21 - x20  -0.0987 0.0546 Inf  -1.807  0.0708
## 
## x1 = 1:
##  contrast  estimate     SE  df z.ratio p.value
##  x21 - x20   0.3063 0.0500 Inf   6.122  <.0001

Confidence intervals give a more complete picture. When x1 = 1, the 95% confidence interval for the difference in probabilities between the levels of x2 is about (0.2, 0.4).
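Assuming the emmeans result from above is stored in an object, say em, one way to get intervals for the contrasts is to call confint() on it:

em <- emmeans(m, revpairwise ~ x2 | x1, regrid = "response")
confint(em)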

## $emmeans
## x1 = 0:
##  x2  prob     SE  df asymp.LCL asymp.UCL
##  0  0.410 0.0410 Inf     0.329     0.490
##  1  0.311 0.0361 Inf     0.240     0.382
## 
## x1 = 1:
##  x2  prob     SE  df asymp.LCL asymp.UCL
##  0  0.559 0.0415 Inf     0.478     0.641
##  1  0.866 0.0279 Inf     0.811     0.921
## 
## Confidence level used: 0.95 
## 
## $contrasts
## x1 = 0:
##  contrast  estimate     SE  df asymp.LCL asymp.UCL
##  x21 - x20  -0.0987 0.0546 Inf    -0.206   0.00835
## 
## x1 = 1:
##  contrast  estimate     SE  df asymp.LCL asymp.UCL
##  x21 - x20   0.3063 0.0500 Inf     0.208   0.40440
## 
## Confidence level used: 0.95

Finally, a more traditional way to understand interactions is to use odds ratios, though I find these more difficult to explain and understand.

When x1 = 0, the model simplifies as follows:

\[\text{log-odds}(y = 1) = -0.365 + 0.604(0) - 0.430\,\text{x2} + 2.056(0)\,\text{x2}\]

\[\text{log-odds}(y = 1) = -0.365 - 0.430\,\text{x2}\]

Exponentiating the coefficient for x2 gives an odds ratio.

exp(-0.4304)
## [1] 0.6502489

When x2 = 1 and x1 = 0, the odds that y = 1 are about 35% lower (1 - 0.65 = 0.35) than the odds that y = 1 when x2 = 0 and x1 = 0.
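We can sanity-check this against the estimated probabilities reported by emmeans above: converting each probability to odds and taking the ratio gives roughly the same value.

# odds = p/(1 - p); ratio of odds at x2 = 1 versus x2 = 0, holding x1 = 0
(0.311 / (1 - 0.311)) / (0.410 / (1 - 0.410))  # about 0.65, matching exp(-0.4304)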

When x1 = 1, the model simplifies as follows:

\[\text{log-odds}(y = 1) = -0.365 + 0.604(1) - 0.430\,\text{x2} + 2.056(1)\,\text{x2}\]

\[\text{log-odds}(y = 1) = 0.239 + 1.625\,\text{x2}\]

Exponentiating the coefficient for x2 gives an odds ratio.

exp(1.625)
## [1] 5.078419

When x2 = 1 and x1 = 1, the odds that y = 1 are about 5 times higher than the odds that y = 1 when x2 = 0 and x1 = 1.
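Both odds ratios can also be obtained directly from emmeans by leaving the contrasts on the log-odds scale and back-transforming (type = "response" rather than regrid = "response"). The x1 = 0 contrast should be about 0.65 and the x1 = 1 contrast about 5.08, matching the hand calculations above.

library(emmeans)
# contrasts computed on the log-odds scale, then exponentiated to odds ratios
emmeans(m, revpairwise ~ x2 | x1, type = "response")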