1  t-tests

1.1 Two sample t-test

For sample size estimation based on power, we need the following:

  • hypothesized difference in means (effect size)
  • population standard deviation
  • power of test
  • significance level of test
  • direction of test: one or two-sided

For power estimation based on sample size, we need the following:

  • hypothesized difference in means (effect size)
  • population standard deviation
  • sample size for each group
  • significance level of test
  • direction of test: one or two-sided

1.1.1 Example: sample size

We wish to run an experiment to test whether the mean amount that male and female students pay at a library coffee shop differs. Our null hypothesis is no difference. We want to sample enough students to detect a difference of at least 75 cents. We’re not sure which group pays more, so we’ll do a two-sided test. Assume the following:

  • true difference is $0.75 (75 cents)
  • population standard deviation is $2.25
  • we desire 0.9 power
  • our significance level will be 0.05
  • this will be a two-sided test

Using base R (R Core Team 2025):

power.t.test(delta = 0.75, sd = 2.25, sig.level = 0.05, power = 0.9,
             alternative = "two.sided")

     Two-sample t test power calculation 

              n = 190.0991
          delta = 0.75
             sd = 2.25
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

Rounding up, we should plan to observe 191 students in each group.
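As a rough check on this result, the sample size can be approximated by hand with the usual normal-approximation formula, n ≈ 2((z_(1-α/2) + z_(power)) × sd / delta)^2 per group. The sketch below is only an approximation, not what power.t.test() computes internally. It uses the same assumed inputs as above, and the variable names are ours.

```r
# Normal-approximation sample size per group for a two-sided, two-sample test.
# This ignores the extra uncertainty from estimating the standard deviation,
# so it tends to come out slightly smaller than the t-based answer.
delta <- 0.75   # hypothesized difference in means (dollars)
sigma <- 2.25   # assumed population standard deviation (dollars)
alpha <- 0.05   # significance level
power <- 0.90   # desired power

z_alpha <- qnorm(1 - alpha / 2)  # critical value for a two-sided test
z_beta  <- qnorm(power)          # quantile corresponding to the desired power

n_approx <- 2 * ((z_alpha + z_beta) * sigma / delta)^2
n_approx           # approximate n per group, before rounding
ceiling(n_approx)  # round up to a whole number of students
```

The approximation lands about one student per group below the t-based calculation, which is the expected direction because the t distribution has heavier tails than the normal.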

Using the pwr package (Champely 2020), we must express effect size as Cohen’s d: difference between the means divided by the population (or pooled) standard deviation.

library(pwr)
pwr.t.test(d = 0.75/2.25, sig.level = 0.05, power = 0.9, 
           alternative = "two.sided")

     Two-sample t test power calculation 

              n = 190.0991
              d = 0.3333333
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

1.1.2 Example: power

We wish to run an experiment to test whether the mean amount that male and female students pay at a library coffee shop differs. Our null hypothesis is no difference. We want to detect a difference of at least 75 cents, and we know our sample size will be 60 students per group. We’re not sure which group pays more, so we’ll do a two-sided test. What is the power of our test? Assume the following:

  • true difference is $0.75 (75 cents)
  • population standard deviation is $2.25
  • 60 students per group
  • our significance level will be 0.05
  • this will be a two-sided test

Using base R (R Core Team 2025):

power.t.test(delta = 0.75, sd = 2.25, sig.level = 0.05, n = 60,
             alternative = "two.sided")

     Two-sample t test power calculation 

              n = 60
          delta = 0.75
             sd = 2.25
      sig.level = 0.05
          power = 0.4407429
    alternative = two.sided

NOTE: n is number in *each* group

The power of this test will only be about 0.44 if our assumptions are true.
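To see how quickly power improves as the groups grow, one option is to evaluate power.t.test() over several candidate sample sizes. The sketch below keeps the same assumed difference, standard deviation, and significance level. The candidate sizes in n_per_group are illustrative choices of ours.

```r
# Power of the two-sided, two-sample t-test at several per-group sample sizes.
n_per_group <- c(60, 100, 150, 191)  # illustrative candidate sizes

power_by_n <- sapply(n_per_group, function(n) {
  power.t.test(n = n, delta = 0.75, sd = 2.25, sig.level = 0.05,
               alternative = "two.sided")$power
})

# Tabulate the results; power rises with n and reaches 0.9 at 191 per group.
data.frame(n_per_group, power = round(power_by_n, 3))
```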

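Using the pwr package, we get essentially the same result. The small difference in the fourth decimal place arises because pwr.t.test() also counts the probability of rejecting in the opposite tail, which power.t.test() omits by default (strict = FALSE):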

pwr.t.test(d = 0.75/2.25, sig.level = 0.05, n = 60, 
           alternative = "two.sided")

     Two-sample t test power calculation 

              n = 60
              d = 0.3333333
      sig.level = 0.05
          power = 0.4408242
    alternative = two.sided

NOTE: n is number in *each* group
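The two power values above (0.4407429 and 0.4408242) can be reconstructed directly from the noncentral t distribution. The following is a minimal sketch under the same assumptions. The variable names are ours, and the calculation mirrors, rather than reproduces, what the two functions do internally.

```r
# Power of a two-sided, two-sample t-test from the noncentral t distribution.
n     <- 60            # students per group
d     <- 0.75 / 2.25   # Cohen's d
alpha <- 0.05          # significance level

df     <- 2 * (n - 1)           # degrees of freedom
ncp    <- d * sqrt(n / 2)       # noncentrality parameter
t_crit <- qt(1 - alpha / 2, df) # two-sided critical value

# Probability of rejecting in the direction of the true effect
upper_tail <- pt(t_crit, df, ncp = ncp, lower.tail = FALSE)
# Probability of rejecting in the opposite (wrong) direction
lower_tail <- pt(-t_crit, df, ncp = ncp)

upper_tail               # matches power.t.test() with its default strict = FALSE
upper_tail + lower_tail  # matches pwr.t.test()
```

Calling power.t.test() with strict = TRUE includes both tails as well, so it should agree with the pwr result.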

1.2 Paired t-test

For sample size estimation based on power, we need the following:

  • hypothesized difference between pairs (effect size)
  • standard deviation of differences
  • power of test
  • significance level of test
  • direction of test: one or two-sided

For power estimation based on sample size, we need the following:

  • hypothesized difference between pairs (effect size)
  • standard deviation of differences
  • sample size
  • significance level of test
  • direction of test: one or two-sided
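The standard deviation of the differences is often the hardest input to obtain. If only the standard deviations of the two measurements and a guess at their correlation are available, it can be derived from the variance of a difference: sd_diff = sqrt(sd1^2 + sd2^2 - 2 × rho × sd1 × sd2). The numbers below are hypothetical values chosen for illustration, not estimates from any study.

```r
# Standard deviation of paired differences from the two measurement SDs
# and their correlation. All three inputs are hypothetical.
sd_before <- 0.30  # SD of the first measurement (hypothetical)
sd_after  <- 0.30  # SD of the second measurement (hypothetical)
rho       <- 0.65  # correlation between the paired measurements (hypothetical)

sd_diff <- sqrt(sd_before^2 + sd_after^2 - 2 * rho * sd_before * sd_after)
sd_diff  # about 0.25
```

The more strongly the paired measurements are correlated, the smaller the standard deviation of the differences, and the fewer pairs are needed.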

1.2.1 Example: sample size

We wish to run an experiment to see if an ultra-heavy rope-jumping program reduces 40-yard dash times. We will recruit young men ages 18 to 25 and measure their 40-yard dash times in seconds before and after the program. We’ll use a paired t-test to see if the mean difference in times (before - after) is greater than 0. We want to sample enough subjects to detect a difference of 0.08 seconds. Assume the following:

  • hypothesized difference of 0.08
  • standard deviation of differences is 0.25
  • we desire 0.9 power
  • significance level of 0.05
  • one-sided test since we assume dash times will only get faster

Using base R:

power.t.test(delta = 0.08, sd = 0.25, sig.level = 0.05, power = 0.9, 
             type = "paired", alternative = "one.sided")

     Paired t test power calculation 

              n = 85.00257
          delta = 0.08
             sd = 0.25
      sig.level = 0.05
          power = 0.9
    alternative = one.sided

NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

We should plan to sample at least 86 subjects.
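Since the calculated n is fractional, it can help to round up in code and confirm the power that the rounded sample size achieves. A minimal sketch using the same assumptions follows. The object names are ours.

```r
# Solve for the number of pairs, round up, then check the achieved power.
res <- power.t.test(delta = 0.08, sd = 0.25, sig.level = 0.05, power = 0.9,
                    type = "paired", alternative = "one.sided")

n_pairs <- ceiling(res$n)  # always round up to a whole subject
n_pairs

# Power actually achieved with the rounded-up sample size
achieved <- power.t.test(n = n_pairs, delta = 0.08, sd = 0.25,
                         sig.level = 0.05, type = "paired",
                         alternative = "one.sided")$power
achieved  # slightly above the 0.9 target
```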

Using the pwr package, we need to express the effect size as Cohen’s d: the hypothesized mean difference between pairs divided by the standard deviation of the differences.

library(pwr)
pwr.t.test(power = 0.9, d = 0.08 / 0.25, sig.level = 0.05,
           type = "paired", alternative = "greater")

     Paired t test power calculation 

              n = 85.00256
              d = 0.32
      sig.level = 0.05
          power = 0.9
    alternative = greater

NOTE: n is number of *pairs*

1.2.2 Example: power

We wish to run an experiment to see if an ultra-heavy rope-jumping program reduces 40-yard dash times. We will recruit young men ages 18 to 25 and measure their 40-yard dash times in seconds before and after the program. We’ll use a paired t-test to see if the mean difference in times (before - after) is greater than 0. We want to be able to detect a difference of 0.08 seconds. We will have access to 50 subjects. What is the power of our experiment? Assume the following:

  • hypothesized difference of 0.08
  • standard deviation of differences is 0.25
  • 50 subjects
  • significance level of 0.05
  • one-sided test since we assume dash times will only get faster

Using base R:

power.t.test(delta = 0.08, sd = 0.25, sig.level = 0.05, n = 50, 
             type = "paired", alternative = "one.sided")

     Paired t test power calculation 

              n = 50
          delta = 0.08
             sd = 0.25
      sig.level = 0.05
          power = 0.7212189
    alternative = one.sided

NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

The power of this test is about 0.72 if our assumptions are true.
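One way to build intuition for this number is to simulate the experiment many times and count how often the test rejects the null hypothesis. The sketch below assumes the differences are normally distributed with the mean and standard deviation stated above. The seed and the number of simulations are arbitrary choices of ours.

```r
# Simulation check of power for the paired, one-sided t-test with 50 subjects.
# A paired t-test is a one-sample t-test on the within-pair differences.
set.seed(1)     # arbitrary seed, for reproducibility
n_sims <- 5000  # arbitrary number of simulated experiments

rejected <- replicate(n_sims, {
  diffs <- rnorm(50, mean = 0.08, sd = 0.25)  # simulated before - after differences
  t.test(diffs, mu = 0, alternative = "greater")$p.value < 0.05
})

sim_power <- mean(rejected)  # proportion of simulated experiments that reject
sim_power
```

The simulated proportion should fall close to the calculated power, with some Monte Carlo error that shrinks as n_sims increases.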

Using the pwr package we get the same result:

pwr.t.test(n = 50, d = 0.08 / 0.25, sig.level = 0.05,
           type = "paired", alternative = "greater")

     Paired t test power calculation 

              n = 50
              d = 0.32
      sig.level = 0.05
          power = 0.7212189
    alternative = greater

NOTE: n is number of *pairs*