P-values

Activity 18

Author

Solution

Overview

The focus of Activity 18 is p-values. The p-value is the probability of observing our dataset, or one that is more extreme, assuming the null hypothesis is true. In some sense you can think of a p-value as a measure of how consistent our observed data are with a null hypothesis (or status quo): a low p-value indicates that the data are likely inconsistent with the null hypothesis.
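
To make the definition concrete, here is a minimal simulation sketch. The coin-flip setup (100 flips, 60 heads, null probability 0.5) is a hypothetical example and not part of the BRFSS data. It estimates the p-value as the share of datasets simulated under the null that are at least as extreme as the observed one, and compares that estimate with the exact binomial calculation.

```r
# Hypothetical example: 60 heads observed in 100 flips; the null says P(heads) = 0.5
set.seed(18)
n_flips  <- 100
observed <- 60
null_p   <- 0.5

# simulate many datasets under the null hypothesis
sim_heads <- rbinom(n = 10000, size = n_flips, prob = null_p)

# share of simulated datasets at least as far from the expected count as ours
p_sim <- mean(abs(sim_heads - n_flips * null_p) >= abs(observed - n_flips * null_p))

# exact two-sided binomial probability for comparison
p_exact <- 2 * pbinom(observed - 1, size = n_flips, prob = null_p, lower.tail = FALSE)

p_sim
p_exact
```

The two values should be close; the simulated one varies slightly with the seed and the number of simulated datasets.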


Needed Packages

The following loads the packages that are needed for this activity.

# load packages
library(tidyverse)
library(skimr)

load("data/cdc.rda")


Exercise 1

Our sample: The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey of 350,000 adults in the United States. As its name implies, the BRFSS is designed to identify risk factors in the adult population and report emerging health trends. The BRFSS Web site (http://www.cdc.gov/brfss) contains a complete description of the survey, including the research questions that motivate the study and many interesting results derived from the data.

The sample data is saved in data/cdc.rda.

  • genhlth is the respondent’s evaluation of their general health: excellent, very good, good, fair, or poor;
  • exerany indicates whether the respondent exercised in the past month (1) or did not (0);
  • hlthplan indicates whether the respondent had some form of health coverage (1) or did not (0);
  • smoke100 indicates whether the respondent had smoked at least 100 cigarettes in their lifetime (1) or not (0);
  • height is respondent’s height in inches;
  • weight is respondent’s weight in pounds;
  • wtdesire is respondent’s desired weight in pounds;
  • age is respondent’s age in years;
  • gender coded m for male and f for female.

Important info

  • To calculate a p-value we need a claim and a sample.

  • You should still write a two-sided hypothesis no matter the direction of the claim.

  • Helpful notation: \(\neq\), \(\mu_{variable}\), \(\pi_{variable}\), \(\beta_{variable}\)

Question 1

Google search claims the average height of men is 66 inches.

What is the null and alternative hypothesis?

\[H_0: \mu_{height} = 66\]

\[H_A: \mu_{height} \ne 66\]

Calculate the p-value for our sample.

cdc_men <- cdc %>% 
  filter(gender == "m")

t.test(x = cdc_men$height, mu = 66)

    One Sample t-test

data:  cdc_men$height
t = 138.21, df = 9568, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 66
95 percent confidence interval:
 70.19135 70.31195
sample estimates:
mean of x 
 70.25165 

Interpret the p-value.

Assuming the average height of adult men in the US is 66 inches, there is almost a 0% chance of observing a sample mean at least as extreme as ours.

Question 2

Claim: 10% of US adults are in fair health.

What is the null and alternative hypothesis?

\[H_0: \pi_{fair} = 0.1\]

\[H_A: \pi_{fair} \ne 0.1\]

Calculate the p-value for our sample.

cdc %>% 
  count(genhlth)
# A tibble: 5 × 2
  genhlth       n
  <fct>     <int>
1 excellent  4657
2 very good  6972
3 good       5675
4 fair       2019
5 poor        677

prop.test(x = 2019, n = 20000, p = 0.1, correct = FALSE)

    1-sample proportions test without continuity correction

data:  2019 out of 20000, null probability 0.1
X-squared = 0.20056, df = 1, p-value = 0.6543
alternative hypothesis: true p is not equal to 0.1
95 percent confidence interval:
 0.09685112 0.10520214
sample estimates:
      p 
0.10095 

Interpret the p-value.

Assuming 10% of US adults consider themselves in fair health, there is a 65.43% chance of observing data at least as extreme as our sample.

Question 3

Claim: The proportion of smokers in excellent health is less than the proportion of smokers in poor health.

What is the null and alternative hypothesis?

\[H_0: \pi_{\text{smoke, excellent}} - \pi_{\text{smoke, poor}} = 0\]

\[H_A: \pi_{\text{smoke, excellent}} - \pi_{\text{smoke, poor}} \ne 0\]

We can always write a two-sided test even if the claim is directional.

Calculate the p-value for our sample.

# get the numerators of the proportions (the "yes" counts):
# count of people in excellent health who smoke
# count of people in poor health who smoke
cdc %>% 
  count(genhlth, smoke100)
# A tibble: 10 × 3
   genhlth   smoke100     n
   <fct>     <fct>    <int>
 1 excellent no        2879
 2 excellent yes       1778
 3 very good no        3758
 4 very good yes       3214
 5 good      no        2782
 6 good      yes       2893
 7 fair      no         911
 8 fair      yes       1108
 9 poor      no         229
10 poor      yes        448

# get the denominators of the proportions:
# count of people in excellent health
# count of people in poor health
cdc %>% count(genhlth)
# A tibble: 5 × 2
  genhlth       n
  <fct>     <int>
1 excellent  4657
2 very good  6972
3 good       5675
4 fair       2019
5 poor        677

prop.test(x = c(1778, 448), n = c(4657, 677), correct = FALSE)

    2-sample test for equality of proportions without continuity correction

data:  c(1778, 448) out of c(4657, 677)
X-squared = 190.51, df = 1, p-value < 2.2e-16
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.3182250 -0.2416793
sample estimates:
   prop 1    prop 2 
0.3817909 0.6617430 

Interpret the p-value.

Assuming there is no difference between the proportion of smokers among US adults who consider themselves in excellent health and the proportion of smokers among US adults who consider themselves in poor health, there is almost a 0% chance of observing a sample difference as extreme as ours.

Note: because the p-value is very small and our sample difference is 0.3818 - 0.6617 = -0.2799, the data support the original claim: a smaller proportion of people in excellent health smoke compared with people in poor health.
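
The note above reasons from a two-sided p-value plus the sign of the sample difference. As a sketch of an alternative, prop.test() also accepts alternative = "less", which tests the directional claim directly. The counts below are copied from the tables above; the variable names are only for illustration.

```r
# counts taken from the tables above: smokers and group sizes
# for excellent health (first) and poor health (second)
smokers <- c(1778, 448)
totals  <- c(4657, 677)

two_sided <- prop.test(x = smokers, n = totals, correct = FALSE)

# alternative = "less" tests whether the first proportion is below the second
one_sided <- prop.test(x = smokers, n = totals, alternative = "less", correct = FALSE)

two_sided$p.value
one_sided$p.value
```

Because the sample difference falls in the claimed direction, the one-sided p-value should be half of the two-sided one.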

Question 4

Claim: US women in good general health weigh less than US women in fair general health.

What is the null and alternative hypothesis?

\[H_0: \mu_{good} - \mu_{fair} = 0\]

\[H_A: \mu_{good} - \mu_{fair} \ne 0\]

Calculate the p-value for our sample.

# first dataset of interest
cdc_good <- cdc %>% 
  filter(genhlth == "good", gender == "f")

# second dataset of interest
cdc_fair <- cdc %>% 
  filter(genhlth == "fair", gender == "f")

t.test(x = cdc_good$weight, y = cdc_fair$weight)

    Welch Two Sample t-test

data:  cdc_good$weight and cdc_fair$weight
t = -5.3901, df = 1829.9, p-value = 7.953e-08
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.400986  -4.851285
sample estimates:
mean of x mean of y 
 154.8629  162.4890 

Interpret the p-value.

Assuming there is no difference in mean weight between US adult women who consider themselves in good general health and those who consider themselves in fair general health, the probability of observing a sample difference at least as extreme as ours is 7.953e-08.

Optional Challenge

For Questions 1 - 4, calculate the test statistic and p-value by hand.

Q1 by hand:

Google search claims the average height of men is 66 inches.

cdc %>% 
  filter(gender == "m") %>% 
  summarize(xbar = mean(height, na.rm = TRUE),
            s = sd(height, na.rm = TRUE),
            n = n(),
            se = s/sqrt(n),
            STAT = (xbar - 66)/se)
# A tibble: 1 × 5
   xbar     s     n     se  STAT
  <dbl> <dbl> <int>  <dbl> <dbl>
1  70.3  3.01  9569 0.0308  138.

# positive statistic: set lower.tail = FALSE to get the area to the right
# multiply by 2 for a two-tailed test
2*pt(q = 138.2091, df = 9568, lower.tail = FALSE)
[1] 0
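
The printed 0 is a result of floating-point underflow rather than an exactly zero probability. As a sketch, pt() accepts log.p = TRUE, which returns the natural log of the tail area and so keeps the magnitude visible. The statistic and degrees of freedom are the ones computed above; the variable names are illustrative.

```r
# test statistic and degrees of freedom from the Q1 output above
t_stat <- 138.2091
df     <- 9568

# natural log of the upper-tail area, computed without underflow
log_upper_tail <- pt(q = t_stat, df = df, lower.tail = FALSE, log.p = TRUE)

# doubling the area adds log(2) on the log scale
log_p_two_sided <- log(2) + log_upper_tail

# express as a power of 10 for readability
log_p_two_sided / log(10)
```

The result is the base-10 exponent of the two-sided p-value, which is far below the 2.2e-16 threshold that t.test() prints.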

Q2 by hand:

Claim: 10% of US adults are in fair health.

cdc %>% 
  summarize(x = sum(genhlth == "fair"),
            n = n(),
            p = x/n,
            se = sqrt(p*(1-p)/n),
            STAT = (p - 0.1)/se)
# A tibble: 1 × 5
      x     n     p      se  STAT
  <int> <int> <dbl>   <dbl> <dbl>
1  2019 20000 0.101 0.00213 0.446

# positive statistic: set lower.tail = FALSE to get the area to the right
# multiply by 2 for a two-tailed test
2*pnorm(q = 0.4459575, lower.tail = FALSE)
[1] 0.6556279
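
This p-value (0.6556) differs slightly from the prop.test() result in Question 2 (0.6543). The likely reason is that the standard error above is built from the sample proportion, whereas a score test uses the null value 0.1. The sketch below, with illustrative variable names, recomputes the statistic with the null standard error; its square can be compared with the X-squared value reported by prop.test().

```r
# counts from the table above; p0 is the value claimed under the null
x  <- 2019
n  <- 20000
p0 <- 0.1

p_hat   <- x / n
se_null <- sqrt(p0 * (1 - p0) / n)   # standard error under the null hypothesis
z       <- (p_hat - p0) / se_null

z^2                                  # compare with X-squared from prop.test()
2 * pnorm(q = abs(z), lower.tail = FALSE)
```

Both versions lead to the same conclusion here; they differ only in which proportion is plugged into the standard error.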

Q3 by hand:

Claim: The proportion of smokers in excellent health is less than the proportion of smokers in poor health.

cdc %>% 
  count(smoke100, genhlth)
# A tibble: 10 × 3
   smoke100 genhlth       n
   <fct>    <fct>     <int>
 1 no       excellent  2879
 2 no       very good  3758
 3 no       good       2782
 4 no       fair        911
 5 no       poor        229
 6 yes      excellent  1778
 7 yes      very good  3214
 8 yes      good       2893
 9 yes      fair       1108
10 yes      poor        448

cdc %>% 
  summarize(x1 = sum(genhlth == "excellent" & smoke100 == "yes"),
            n1 = sum(genhlth == "excellent"),
            x2 = sum(genhlth == "poor" & smoke100 == "yes"),
            n2 = sum(genhlth == "poor" ),
            p1 = x1/n1,
            p2 = x2/n2,
            dif = p1 - p2,
            se = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2),
            STAT = (dif - 0)/se)
# A tibble: 1 × 9
     x1    n1    x2    n2    p1    p2    dif     se  STAT
  <int> <int> <int> <int> <dbl> <dbl>  <dbl>  <dbl> <dbl>
1  1778  4657   448   677 0.382 0.662 -0.280 0.0195 -14.3

# negative statistic: calculate the area to the left
# multiply by 2 for a two-tailed test
2*pnorm(q = -14.3364)
[1] 1.296063e-46
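
The statistic above (-14.3) does not match the square root of the X-squared value from prop.test() in Question 3. A plausible explanation is that the by-hand standard error is unpooled, while the test of equal proportions pools the two groups under the null. The following sketch, with illustrative variable names, uses the pooled standard error so the squared statistic can be compared with X-squared.

```r
x1 <- 1778; n1 <- 4657   # smokers and total, excellent health
x2 <- 448;  n2 <- 677    # smokers and total, poor health

p1 <- x1 / n1
p2 <- x2 / n2

# pooled proportion: under the null both groups share one smoking rate
p_pool  <- (x1 + x2) / (n1 + n2)
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z       <- (p1 - p2) / se_pool

z^2                      # compare with X-squared from prop.test()
2 * pnorm(q = -abs(z))
```

Either standard error gives a p-value that is effectively zero for these data.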

Q4 by hand:

Claim: US women in good general health weigh less than US women in fair general health.

cdc_women <- cdc %>% 
  filter(gender == "f")

cdc_women %>% 
  group_by(genhlth) %>% 
  summarize(xbar = mean(weight, na.rm = TRUE),
            s = sd(weight, na.rm = TRUE),
            n = n())
# A tibble: 5 × 4
  genhlth    xbar     s     n
  <fct>     <dbl> <dbl> <int>
1 excellent  142.  26.2  2359
2 very good  150.  31.2  3590
3 good       155.  36.5  2953
4 fair       162.  42.0  1135
5 poor       165.  44.6   394

# difference in sample means (good minus fair) and its standard error
xbar_dif <- 154.8629 - 162.4890
se <- sqrt(36.49003^2/2953 + 41.95475^2/1135)
STAT <- (xbar_dif - 0)/se

# negative statistic: calculate the area to the left
# multiply by 2 for a two-tailed test
2*pt(q = -5.390119, df = 1134)
[1] 8.560954e-08
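
The by-hand calculation uses df = 1134, which is the smaller group size minus one, while t.test() in Question 4 reported df = 1829.9. As a sketch, the Welch-Satterthwaite approximation below reproduces that larger value from the same summary statistics; the variable names are illustrative.

```r
# sample standard deviations and group sizes used in the code above
s1 <- 36.49003; n1 <- 2953   # women in good health
s2 <- 41.95475; n2 <- 1135   # women in fair health

v1 <- s1^2 / n1
v2 <- s2^2 / n2

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
df_welch

2 * pt(q = -5.390119, df = df_welch)
```

Using the smaller df is the more conservative choice, which is why the by-hand p-value is slightly larger than the one from t.test().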