5 Logistic Regression

Logistic regression models the probability of a “success” in a binary outcome, where success is defined as the outcome of interest. We demonstrate methods presented in Crespi (2025), which uses the WebPower package (Zhang and Mai 2023).

library(WebPower)

Because power for logistic regression models depends on the distribution of the predictors, closed-form expressions can be derived only in limited cases. We present three examples: (1) a single binary predictor, (2) a single normally distributed predictor, and (3) a single predictor adjusted for other predictors. For more complex models, simulation can be used.

5.1 One Binary Predictor

With one binary predictor \(x\), we calculate power or sample size for the test of the null hypothesis that the coefficient for the predictor (\(\beta_x\)) equals 0.

For sample size estimation based on power, we need the following:

Desired power
Probability of success when x = 0 (p0)
Probability of success when x = 1 (p1)
Significance level
Distribution family of the predictor
The parameter value of the predictor’s distribution
Direction of test: one or two sided

For power estimation based on sample size, we need the following:

Sample size
Probability of success when x = 0 (p0)
Probability of success when x = 1 (p1)
Significance level
Distribution family of the predictor
The parameter value of the predictor’s distribution
Direction of test: one or two sided

5.1.1 Example: sample size

A hospital clinic wants to investigate whether follow-up phone calls with patients (versus no follow-up) increase compliance with a treatment regimen. Patients will be randomly assigned to two conditions: one where patients are given written instructions and asked to report back in six months, and one where patients are given written instructions and contacted three times by phone before the six-month follow-up. The binary outcome is compliance with the treatment regimen over the previous six months (yes or no). They want to sample enough subjects to detect a difference of 0.15 in the rate of compliance. Assume the following:

Desired power is 0.9
Compliance rate without follow-up is about 0.15 (i.e., probability of success when x = 0, p0)
Compliance rate with follow-up is 0.30 (i.e., probability of success when x = 1, p1)
Significance level of 0.05 (alpha)
Bernoulli distribution for the predictor since it’s binary
Bernoulli parameter of 0.5 (i.e., half of the subjects will be in one group, half in the other)
Two-sided test

Using the wp.logistic() function in the WebPower package:

wp.logistic(n = NULL, p0 = 0.15, p1 = 0.30, 
            alpha = 0.05, power = 0.9,
            family = "Bernoulli", parameter = 0.5,
            alternative ="two.sided")

Power for logistic regression

      p0  p1     beta0     beta1        n alpha power
    0.15 0.3 -1.734601 0.8873032 336.4544  0.05   0.9

URL: http://psychstat.org/logistic

Based on these assumptions, the clinic should plan on sampling at least 337 subjects to have a probability of 0.9 of correctly rejecting the null hypothesis of no difference.
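The beta0 and beta1 columns in the output are the assumed probabilities re-expressed on the log-odds scale. The short sketch below, using only base R, shows that relationship and the rounding step that turns the fractional n into 337. The variable names are ours and chosen for illustration.

```r
# Assumed compliance rates from the example
p0 <- 0.15  # probability of success when x = 0
p1 <- 0.30  # probability of success when x = 1

# Intercept: log-odds of success when x = 0
beta0 <- qlogis(p0)

# Slope: difference in log-odds between x = 1 and x = 0 (log odds ratio)
beta1 <- qlogis(p1) - qlogis(p0)

# The same effect expressed as an odds ratio
odds_ratio <- exp(beta1)

# Sample sizes are rounded up to the next whole subject
n_required <- ceiling(336.4544)

c(beta0 = beta0, beta1 = beta1, odds_ratio = odds_ratio, n = n_required)
```

Thinking of the effect as an odds ratio can make it easier to compare the assumed difference of 0.15 with effects reported in published studies.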

5.1.2 Example: power

A hospital clinic wants to investigate whether follow-up phone calls with patients (versus no follow-up) increase compliance with a treatment regimen. Patients will be randomly assigned to two conditions: one where patients are given written instructions and asked to report back in six months, and one where patients are given written instructions and contacted three times by phone before the six-month follow-up. The binary outcome is compliance with the treatment regimen over the previous six months (yes or no). They only have the budget to sample 200 subjects. How powerful is the experiment if they want to detect a difference of 0.15 in the rate of compliance? Assume the following:

Sample size is 200
Compliance rate without follow-up is about 0.15 (i.e., probability of success when x = 0, p0)
Compliance rate with follow-up is 0.30 (i.e., probability of success when x = 1, p1)
Significance level of 0.05 (alpha)
Bernoulli distribution for the predictor since it’s binary
Bernoulli parameter of 0.5 (i.e., half of the subjects will be in one group, half in the other)
Two-sided test

Using the wp.logistic() function in the WebPower package:

wp.logistic(power = NULL, p0 = 0.15, p1 = 0.30, 
            alpha = 0.05, n = 200,
            family = "Bernoulli", parameter = 0.5,
            alternative ="two.sided")

Power for logistic regression

      p0  p1     beta0     beta1   n alpha     power
    0.15 0.3 -1.734601 0.8873032 200  0.05 0.7051399

URL: http://psychstat.org/logistic

The experiment has a probability of about 0.7 of correctly rejecting the null hypothesis of no difference in treatments with a sample size of 200.
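As a rough cross-check, power can also be estimated by simulation: generate many data sets under the assumed compliance rates, fit the logistic regression to each, and record how often the test of the predictor's coefficient rejects at the 0.05 level. The sketch below uses only base R. The helper sim_power_binary() is hypothetical (it is not part of WebPower), and the seed and number of simulations are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical helper: simulation-based power for one binary predictor
sim_power_binary <- function(n, p0, p1, prob_x = 0.5,
                             alpha = 0.05, n_sims = 1000) {
  reject <- logical(n_sims)
  for (i in seq_len(n_sims)) {
    # Random assignment to the two conditions
    x <- rbinom(n, size = 1, prob = prob_x)
    # Binary outcome with success probability p1 when x = 1, p0 otherwise
    y <- rbinom(n, size = 1, prob = ifelse(x == 1, p1, p0))
    fit <- glm(y ~ x, family = binomial)
    # Wald test p-value for the coefficient of x
    p_value <- coef(summary(fit))["x", "Pr(>|z|)"]
    reject[i] <- p_value < alpha
  }
  # Proportion of simulated data sets in which the null is rejected
  mean(reject)
}

est_power <- sim_power_binary(n = 200, p0 = 0.15, p1 = 0.30)
est_power
```

The estimate carries Monte Carlo error, so it should land near the analytic value rather than match it exactly. Increasing n_sims tightens the estimate at the cost of run time.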

5.2 One Continuous Predictor

The process of calculating power and sample size for a model with one continuous predictor is the same as for a model with one binary predictor. We simply need to update the family and parameter arguments in the wp.logistic() function. In this case the p0 and p1 arguments are the probabilities of success at x = 0 and x = 1, so the predictor should be scaled so that a one-unit increase represents a meaningful change, such as one standard deviation.

5.2.1 Example: sample size

A hospital clinic wants to investigate whether the weight of male patients is associated with the probability of exercising regularly, a yes/no question they ask patients. They want to sample enough male subjects to detect a decrease of 0.1 in the probability of exercising regularly between average weight and one standard deviation above average. Assume the following:

Desired power is 0.9
Probability of exercising regularly at average weight is 0.4 (i.e., probability of success when x = 0, p0)
Probability of exercising regularly at one standard deviation over the average weight is 0.3 (i.e., probability of success when x = 1, p1)
Significance level of 0.05 (alpha)
Normal distribution for the predictor
Normal parameters of 0 and 1 (i.e., standardized mean and standard deviation of 0 and 1)
Two-sided test

Using the wp.logistic() function in the WebPower package:

wp.logistic(n = NULL, p0 = 0.4, p1 = 0.3, 
            alpha = 0.05, power = 0.9,
            family = "normal", parameter = c(0,1),
            alternative ="two.sided")

Power for logistic regression

     p0  p1      beta0      beta1        n alpha power
    0.4 0.3 -0.4054651 -0.4418328 254.7801  0.05   0.9

URL: http://psychstat.org/logistic

Based on these assumptions, the clinic should plan on sampling at least 255 male subjects to have a probability of 0.9 of correctly rejecting the null hypothesis of no effect of weight.
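With a continuous predictor, p0 and p1 fix two points on a logistic curve, and the rest of the curve follows from them. The base R sketch below shows the probabilities implied by this example at several standardized weights. The variable names and the grid of z values are ours, chosen for illustration.

```r
# Assumed probabilities at standardized weight 0 (average) and 1 (one SD above)
p0 <- 0.4
p1 <- 0.3

# Coefficients on the log-odds scale implied by p0 and p1
beta0 <- qlogis(p0)
beta1 <- qlogis(p1) - qlogis(p0)

# Implied probability of exercising regularly across standardized weights
z <- c(-2, -1, 0, 1, 2)
implied <- data.frame(z = z, prob = plogis(beta0 + beta1 * z))
implied
```

Looking at the implied probabilities away from 0 and 1 is a quick way to judge whether the assumed effect size is plausible over the range of weights the clinic expects to see.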

5.2.2 Example: power

A hospital clinic wants to investigate whether the weight of male patients is associated with the probability of exercising regularly, a yes/no question they ask patients. They only have the budget to sample 220 male subjects. How powerful is the study if they want to detect a difference of 0.1 in the probability of exercising regularly? Assume the following:

Sample size of 220
Probability of exercising regularly at average weight is 0.4 (i.e., probability of success when x = 0, p0)
Probability of exercising regularly at one standard deviation over the average weight is 0.3 (i.e., probability of success when x = 1, p1)
Significance level of 0.05 (alpha)
Normal distribution for the predictor
Normal parameters of 0 and 1 (i.e., standardized mean and standard deviation of 0 and 1)
Two-sided test

Using the wp.logistic() function in the WebPower package:

wp.logistic(power = NULL, p0 = 0.4, p1 = 0.3, 
            alpha = 0.05, n = 220,
            family = "normal", parameter = c(0,1),
            alternative ="two.sided")

Power for logistic regression

     p0  p1      beta0      beta1   n alpha     power
    0.4 0.3 -0.4054651 -0.4418328 220  0.05 0.8536433

URL: http://psychstat.org/logistic

With a sample size of 220, the study has a probability of about 0.85 of correctly rejecting the null hypothesis of no effect of weight on regular exercise.

5.3 Multiple Predictors

It is often of interest to assess the effect of one predictor after controlling or adjusting for other variables. Crespi (2025) presents a method of estimating sample sizes for these logistic regression models using Variance Inflation Factors (VIF). The general idea is to assume the predictors we adjust for are multivariate normal and that the predictor of interest is correlated with the other predictors. The higher the assumed correlation, the more the sample size needs to be increased.

The correlation can be conceptualized by imagining that we regress the predictor of interest on the other predictors and take the R-squared. We then take the reciprocal of 1 - R² to derive a VIF. For example, if the imagined R² is 0.3, then the VIF would be

1/(1 - 0.3)

[1] 1.428571

This implies we increase the sample size by about 43%.
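To get a feel for how quickly the inflation grows, the sketch below tabulates the VIF and the corresponding percentage increase in sample size for a few imagined R² values. The vif() helper is a hypothetical convenience function written for this illustration, and the R² values are arbitrary.

```r
# Hypothetical helper: variance inflation factor from an assumed R-squared
vif <- function(r2) {
  stopifnot(all(r2 >= 0 & r2 < 1))  # R-squared must lie in [0, 1)
  1 / (1 - r2)
}

# A few illustrative R-squared values
r2 <- c(0.1, 0.3, 0.5, 0.7)

vif_table <- data.frame(
  r2 = r2,
  vif = vif(r2),
  pct_increase = 100 * (vif(r2) - 1)  # extra sample size needed, in percent
)
vif_table
```

The inflation is modest for small R² values but grows steeply as R² approaches 1, so the assumed R² deserves some care when the predictor of interest is strongly related to the adjustment variables.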

Returning to our previous example, a hospital clinic wants to investigate whether the weight of male patients is associated with the probability of exercising regularly, a yes/no question they ask patients. They want to sample enough male subjects to detect a decrease of 0.1 in the probability of exercising regularly. In addition to weight, they will collect other numeric predictors to adjust for, including age, blood pressure, and years of education.

The sample size for the individual predictor was previously calculated as follows:

wp.logistic(n = NULL, p0 = 0.4, p1 = 0.3, 
            alpha = 0.05, power = 0.9,
            family = "normal", parameter = c(0,1),
            alternative ="two.sided")

Power for logistic regression

     p0  p1      beta0      beta1        n alpha power
    0.4 0.3 -0.4054651 -0.4418328 254.7801  0.05   0.9

URL: http://psychstat.org/logistic

Assuming that regressing the predictor of interest on the other predictors would yield an R² of 0.25, the VIF is

1/(1 - 0.25)

[1] 1.333333

This means we should increase the sample size by about 33%.

255 * 1.33

[1] 339.15

Rounding up, the updated sample size is 340.
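The calculation above multiplies two rounded quantities. A slightly more careful version, sketched below in base R, carries the unrounded sample size from the wp.logistic() output and the exact VIF through to the end and rounds up only once. The variable names are ours; the value 0.25 is the one used in the VIF calculation above.

```r
# Unrounded sample size reported by wp.logistic() for the single predictor
n_unadjusted <- 254.7801

# Value used in the VIF calculation above
r2 <- 0.25

# Inflate by the exact VIF, then round up to a whole subject
n_adjusted <- n_unadjusted / (1 - r2)
n_final <- ceiling(n_adjusted)
n_final
```

Here both approaches lead to 340 subjects, but rounding only at the final step avoids small discrepancies that can appear with other inputs.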