2  Proportion tests

2.1 Two-sample proportion test

For sample size estimation based on power, we need the following:

  • proportion in one group
  • proportion in other group
  • power of test
  • significance level of test
  • direction of test: one or two-sided

For power estimation based on sample size, we need the following:

  • proportion in one group
  • proportion in other group
  • sample size in each group
  • significance level of test
  • direction of test: one or two-sided

2.1.1 Example: sample size

We wish to plan an experiment to test if there is a difference in the proportion of male and female college undergraduate students who floss daily. Our null hypothesis is no difference in the proportion that answer yes. We want to sample enough students to detect a difference of at least 5 percentage points. Assume the following:

  • proportion of one group is 0.30
  • proportion of the other group is 0.25
  • power of 0.9
  • significance level of 0.05
  • two-sided test

Using base R (R Core Team 2025):

power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05, power = 0.9, 
                alternative = "two.sided")

     Two-sample comparison of proportions power calculation 

              n = 1673.856
             p1 = 0.3
             p2 = 0.25
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

We need to observe 1674 students in each group.
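
The direction of the test also affects the required sample size. As a minimal sketch, assuming we were willing to test in one direction only, the same call with alternative = "one.sided" can be compared with the two-sided result. The object names (two_sided, one_sided, n_two, n_one) are illustrative.

```r
# Same assumptions as above, once two-sided and once one-sided.
two_sided <- power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05,
                             power = 0.9, alternative = "two.sided")
one_sided <- power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05,
                             power = 0.9, alternative = "one.sided")

# Round up, since we cannot sample a fraction of a student.
n_two <- ceiling(two_sided$n)
n_one <- ceiling(one_sided$n)
c(two_sided = n_two, one_sided = n_one)
```

The one-sided test should need fewer students per group, at the cost of being unable to detect a difference in the other direction.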

Notice that the required sample size decreases if we assume proportions closer to 0 or 1. For example, assume 0.05 in one group and 0.10 in the other.

power.prop.test(p1 = 0.05, p2 = 0.10, sig.level = 0.05, power = 0.9, 
                alternative = "two.sided")

     Two-sample comparison of proportions power calculation 

              n = 581.0821
             p1 = 0.05
             p2 = 0.1
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

Now we need to observe 582 in each group.
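
To see this pattern across a range of assumptions, the sketch below repeats the calculation for several baseline proportions while holding the difference fixed at 0.05. The grid of proportions is hypothetical and chosen only for illustration.

```r
# Hypothetical grid of baseline proportions; the second group is always
# 0.05 higher, so the difference to detect stays fixed.
p1 <- c(0.05, 0.15, 0.25, 0.35, 0.45)
p2 <- p1 + 0.05

# Required sample size per group for each pair of proportions.
n_per_group <- mapply(function(a, b) {
  power.prop.test(p1 = a, p2 = b, sig.level = 0.05, power = 0.9,
                  alternative = "two.sided")$n
}, p1, p2)

data.frame(p1 = p1, p2 = p2, n = ceiling(n_per_group))
```

The required sample size grows as the proportions approach 0.5, where the variance of a proportion is largest.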

Using the pwr package (Champely 2020), we need to express the difference in proportions as an effect size using ES.h():

library(pwr)
pwr.2p.test(h = ES.h(p1 = 0.05, p2 = 0.10), sig.level = 0.05, power = 0.9,
            alternative = "two.sided")

     Difference of proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.1924743
              n = 567.2579
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: same sample sizes

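We need to observe 568 in each group. The result does not match power.prop.test() because the two functions measure the effect differently: pwr.2p.test() works with the arcsine-transformed effect size h, while power.prop.test() works with the untransformed proportions.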
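
The effect size h in the pwr output can be reproduced by hand. The sketch below assumes ES.h() implements the usual arcsine transformation named in the output header, and uses a normal-approximation formula for the sample size that ignores the negligible chance of rejecting in the wrong direction. It is an approximation for illustration, not the pwr package's own code.

```r
p1 <- 0.05
p2 <- 0.10

# Arcsine-transformed effect size, written out by hand.
h <- abs(2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2)))

# Normal-approximation sample size per group for a two-sided test
# with significance level 0.05 and power 0.9.
z_alpha <- qnorm(1 - 0.05 / 2)
z_power <- qnorm(0.9)
n_approx <- 2 * ((z_alpha + z_power) / h)^2

c(h = h, n = n_approx)
```

Both values should agree closely with the h and n reported by pwr.2p.test() above.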

2.1.2 Example: power

We wish to plan an experiment to test if there is a difference in the proportion of male and female college undergraduate students who floss daily. Our null hypothesis is no difference in the proportion that answer yes. We want to detect a difference of at least 5 percentage points. What is the power of our experiment if we know in advance we will be able to sample 300 males and 300 females? Assume the following:

  • proportion of one group is 0.30
  • proportion of the other group is 0.25
  • sample size per group of 300
  • significance level of 0.05
  • two-sided test

Using base R (R Core Team 2025):

power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05, n = 300, 
                alternative = "two.sided")

     Two-sample comparison of proportions power calculation 

              n = 300
             p1 = 0.3
             p2 = 0.25
      sig.level = 0.05
          power = 0.2777839
    alternative = two.sided

NOTE: n is number in *each* group

The power of this test will only be about 0.28 if our assumptions are true.

The pwr.2p.test() function from the pwr package returns nearly the same answer, about 0.28.

pwr.2p.test(h = ES.h(p1 = 0.25, p2 = 0.30), sig.level = 0.05, n = 300,
            alternative = "two.sided")

     Difference of proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.1120819
              n = 300
      sig.level = 0.05
          power = 0.2789492
    alternative = two.sided

NOTE: same sample sizes
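
To see how power changes with sample size under the same assumptions, the sketch below evaluates power.prop.test() over a few per-group sample sizes. The grid is hypothetical; 1674 is included because it is the sample size found in the earlier example.

```r
# Hypothetical per-group sample sizes to compare.
n_grid <- c(300, 600, 1200, 1674)

# Power of the two-sided test at each sample size.
power_by_n <- sapply(n_grid, function(n) {
  power.prop.test(p1 = 0.30, p2 = 0.25, sig.level = 0.05, n = n,
                  alternative = "two.sided")$power
})

data.frame(n = n_grid, power = round(power_by_n, 3))
```

Power rises with sample size and reaches roughly 0.9 at 1674 per group, consistent with the sample size calculation.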

2.2 Difference in proportions

For sample size estimation based on precision, we need the following:

  • proportion in one group
  • proportion in other group
  • desired width of confidence interval

2.2.1 Example

We wish to plan an experiment to estimate the difference in the proportion of male and female college undergraduate students who floss daily. If there truly is a difference of 0.05 in the population, we would like to estimate it within 0.025. That implies a confidence interval with a total width of 0.05. Assume the following:

  • proportion of one group is 0.30
  • proportion of the other group is 0.25
  • confidence interval width of 0.05

Using the prec_riskdiff() function from the presize package (Haynes et al. 2021), we can calculate this as follows. (The Newcombe method is the default method.)

library(presize)
prec_riskdiff(p1 = 0.30, p2 = 0.25, conf.width = 0.05, method = "newcombe")

     sample size for a risk difference with newcombe confidence interval 

   p1   p2       n1       n2     ntot r delta        lwr        upr conf.width
1 0.3 0.25 2441.299 2441.299 4882.598 1  0.05 0.02495861 0.07495861       0.05
  conf.level
1       0.95

To estimate a difference in proportions with this precision, assuming the group proportions are 0.30 and 0.25, we need to sample 2442 subjects in each group.
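
As a rough sanity check, the sample size can be approximated in base R using the simpler Wald interval for a difference in proportions. This is a sketch under a different interval method than the Newcombe method used above, so the result will be close to, but not equal to, the presize result. The object names are illustrative.

```r
p1 <- 0.30
p2 <- 0.25
conf_width <- 0.05
z <- qnorm(0.975)  # critical value for a 95% confidence level

# The Wald half-width is z * sqrt((p1*(1-p1) + p2*(1-p2)) / n).
# Set the full width to conf_width and solve for n per group.
n_wald <- (2 * z / conf_width)^2 * (p1 * (1 - p1) + p2 * (1 - p2))
n_wald
```

The formula also shows why precision is expensive: n is proportional to 1 / conf_width^2, so halving the interval width quadruples the required sample size.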

Once again, assumed proportions closer to 0 or 1 require smaller samples (here, 861 in each group):

prec_riskdiff(p1 = 0.05, p2 = 0.10, conf.width = 0.05, method = "newcombe")

     sample size for a risk difference with newcombe confidence interval 

    p1  p2       n1       n2     ntot r delta         lwr         upr
1 0.05 0.1 860.9835 860.9835 1721.967 1 -0.05 -0.07525428 -0.02525428
  conf.width conf.level
1       0.05       0.95

Wider confidence intervals also require smaller sample sizes. For example, to estimate a difference in proportions within 0.05 (i.e., a confidence interval width of 0.1):

prec_riskdiff(p1 = 0.05, p2 = 0.10, conf.width = 0.1, method = "newcombe")

     sample size for a risk difference with newcombe confidence interval 

    p1  p2       n1       n2     ntot r delta        lwr           upr
1 0.05 0.1 225.9885 225.9885 451.9769 1 -0.05 -0.1008756 -0.0008756156
  conf.width conf.level
1        0.1       0.95