P-values
Activity 18
Overview
The focus of Activity 18 is p-values. The p-value is the probability, computed assuming the null hypothesis is true, of observing our dataset or one that is more extreme. In some sense, you can think of a p-value as a measure of how consistent our observed data are with a null hypothesis (or status quo): a low p-value indicates that the data are likely inconsistent with the null hypothesis.
Needed Packages
The following loads the packages and data that are needed for this activity.
# load packages
library(tidyverse)
library(skimr)
# load data
load("data/cdc.rda")
Exercise 1
Our sample: The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey of 350,000 adults in the United States. As its name implies, the BRFSS is designed to identify risk factors in the adult population and report emerging health trends. The BRFSS Web site (http://www.cdc.gov/brfss) contains a complete description of the survey, including the research questions that motivate the study and many interesting results derived from the data.
The sample data is saved in data/cdc.rda.
genhlth: respondents were asked to evaluate their general health, responding either excellent, very good, good, fair, or poor;
exerany: indicates whether the respondent exercised in the past month (1) or did not (0);
hlthplan: indicates whether the respondent had some form of health coverage (1) or did not (0);
smoke100: indicates whether the respondent had smoked at least 100 cigarettes in their lifetime (1) or not (0);
height: respondent’s height in inches;
weight: respondent’s weight in pounds;
wtdesire: respondent’s desired weight in pounds;
age: respondent’s age in years;
gender: coded m for male and f for female.
Important info
To calculate a p-value we need a claim and a sample.
You should still write a two-sided hypothesis, no matter the direction of the claim.
Helpful notation: \(\neq\), \(\mu_{variable}\), \(\pi_{variable}\), \(\beta_{variable}\)
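As a rough illustration of what "two-sided" means numerically, the sketch below doubles the tail area beyond a standardized test statistic. The helper name two_sided_p is hypothetical (it is not part of the activity), and the sketch assumes the statistic follows a standard normal distribution under the null hypothesis.

```r
# Hypothetical helper (not from the activity): two-sided p-value for a
# z statistic, assuming a standard normal null distribution.
two_sided_p <- function(z) {
  # area in the tail beyond |z|, doubled to cover both directions
  2 * pnorm(-abs(z))
}

two_sided_p(1.96)   # roughly 0.05
two_sided_p(-1.96)  # same value: the sign of the statistic does not matter
```

Because the function uses the absolute value, a claim stated as "less than" or "greater than" gives the same two-sided p-value.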
Question 1
A Google search claims that the average height of men is 66 inches.
What are the null and alternative hypotheses?
\[H_0: \mu_{height} = 66\]
\[H_A: \mu_{height} \ne 66\]
Calculate the p-value for our sample.
cdc_men <- cdc %>%
  filter(gender == "m")
t.test(x = cdc_men$height, mu = 66)
One Sample t-test
data: cdc_men$height
t = 138.21, df = 9568, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 66
95 percent confidence interval:
70.19135 70.31195
sample estimates:
mean of x
70.25165
Interpret the p-value.
Assuming the average height of adult men in the US is 66 inches, there is almost a 0% chance of observing a sample mean at least as far from 66 as ours (70.25 inches).
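The printed output can also be read programmatically, because t.test() returns a list-like object. The sketch below uses simulated heights, not the BRFSS data; the seed, sample size, mean, and standard deviation are arbitrary assumptions chosen only so the example runs on its own.

```r
# Simulated data for illustration only (NOT the cdc sample)
set.seed(18)
sim_height <- rnorm(n = 200, mean = 70, sd = 3)

# same call pattern as above: one-sample t-test against mu = 66
res <- t.test(x = sim_height, mu = 66)

res$statistic  # t statistic
res$parameter  # degrees of freedom (n - 1)
res$p.value    # two-sided p-value as a plain number
res$conf.int   # 95 percent confidence interval
```

Pulling out res$p.value avoids retyping a rounded number from the console when the p-value is needed in a later calculation.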
Question 2
Claim: 10% of US adults are in fair health.
What are the null and alternative hypotheses?
\[H_0: \pi_{fair} = 0.1\]
\[H_A: \pi_{fair} \ne 0.1\]
Calculate the p-value for our sample.
cdc %>%
  count(genhlth)
# A tibble: 5 × 2
  genhlth       n
  <fct>     <int>
1 excellent  4657
2 very good  6972
3 good       5675
4 fair       2019
5 poor        677
prop.test(x = 2019, n = 20000, p = 0.1, correct = FALSE)
1-sample proportions test without continuity correction
data: 2019 out of 20000, null probability 0.1
X-squared = 0.20056, df = 1, p-value = 0.6543
alternative hypothesis: true p is not equal to 0.1
95 percent confidence interval:
0.09685112 0.10520214
sample estimates:
p
0.10095
Interpret the p-value.
Assuming 10% of US adults consider themselves in fair health, there is a 65.43% chance of observing a sample proportion at least as extreme as ours.
Question 3
Claim: The proportion of smokers in excellent health is less than the proportion of smokers in poor health.
What are the null and alternative hypotheses?
\[H_0: \pi_{smoke excellent} - \pi_{smoke poor} = 0\]
\[H_A: \pi_{smoke excellent} - \pi_{smoke poor} \ne 0\]
We can always write a two-sided test, even if the claim is directional.
Calculate the p-value for our sample.
# get numerators number of "yes" of proportions
# get count of excellent health and yes smoke
# get count of poor health and yes smoke
cdc %>%
  count(genhlth, smoke100)
# A tibble: 10 × 3
   genhlth   smoke100     n
   <fct>     <fct>    <int>
 1 excellent no        2879
 2 excellent yes       1778
 3 very good no        3758
 4 very good yes       3214
 5 good      no        2782
 6 good      yes       2893
 7 fair      no         911
 8 fair      yes       1108
 9 poor      no         229
10 poor      yes        448
# get denominators of proportion
# count of people in excellent health
# count of people in poor health
cdc %>%
  count(genhlth)
# A tibble: 5 × 2
  genhlth       n
  <fct>     <int>
1 excellent  4657
2 very good  6972
3 good       5675
4 fair       2019
5 poor        677
prop.test(x = c(1778, 448), n = c(4657, 677), correct = FALSE)
2-sample test for equality of proportions without continuity correction
data: c(1778, 448) out of c(4657, 677)
X-squared = 190.51, df = 1, p-value < 2.2e-16
alternative hypothesis: two.sided
95 percent confidence interval:
-0.3182250 -0.2416793
sample estimates:
prop 1 prop 2
0.3817909 0.6617430
Interpret the p-value.
Assuming there is no difference between the proportion of smokers among US adults who consider themselves in excellent health and the proportion of smokers among US adults who consider themselves in poor health, there is almost a 0% chance of observing a sample difference at least as extreme as ours.
Note: since the p-value is very small and our sample difference is 0.3818 - 0.6617 = -0.2799, the data support the original claim: a smaller proportion of people in excellent health smoke compared to people in poor health.
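For comparison only, the directional claim can also be tested directly. The sketch below reuses the counts from the output above and the alternative argument of prop.test(); the activity itself asks for the two-sided version, so treat this as an optional side check rather than the intended answer.

```r
# counts taken from the tables above: smokers / group size
x_smoke <- c(1778, 448)  # excellent health, poor health
n_group <- c(4657, 677)

# directional test: H_A is (proportion excellent) - (proportion poor) < 0
one_sided <- prop.test(x = x_smoke, n = n_group,
                       alternative = "less", correct = FALSE)

# two-sided test, as in the activity
two_sided <- prop.test(x = x_smoke, n = n_group, correct = FALSE)

one_sided$p.value
two_sided$p.value / 2
```

When the sample difference falls in the direction of the claim, as it does here, the one-sided p-value is half the two-sided one.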
Question 4
Claim: US women in good general health weigh less than US women in fair general health.
What are the null and alternative hypotheses?
\[H_0: \mu_{good} - \mu_{fair} = 0\]
\[H_A: \mu_{good} - \mu_{fair} \ne 0\]
Calculate the p-value for our sample.
# first dataset of interest
cdc_good <- cdc %>%
  filter(genhlth == "good", gender == "f")
# second dataset of interest
cdc_fair <- cdc %>%
  filter(genhlth == "fair", gender == "f")
t.test(x = cdc_good$weight, y = cdc_fair$weight)
Welch Two Sample t-test
data: cdc_good$weight and cdc_fair$weight
t = -5.3901, df = 1829.9, p-value = 7.953e-08
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-10.400986 -4.851285
sample estimates:
mean of x mean of y
154.8629 162.4890
Interpret the p-value.
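Assuming there is no difference in mean weight between US adult women who consider themselves in good general health and those who consider themselves in fair general health, there is about a 0.000008% chance (p-value = 7.953e-08) of observing a sample difference at least as extreme as ours.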
Optional Challenge
For Questions 1 - 4, calculate the critical value and p-value by hand.
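A critical value is the cutoff a test statistic must exceed in absolute value for the two-sided p-value to fall below a chosen significance level. The sketch below assumes a significance level of 0.05; the activity does not state one, although 0.05 matches the 95 percent confidence intervals printed above. The degrees of freedom are the ones that appear in the Question 1 output and in the by-hand Question 4 calculation.

```r
alpha <- 0.05  # assumed significance level (not stated in the activity)

# proportions (Questions 2 and 3): standard normal cutoff
z_crit <- qnorm(1 - alpha / 2)

# means (Questions 1 and 4): t cutoffs with the degrees of freedom used there
t_crit_q1 <- qt(1 - alpha / 2, df = 9568)
t_crit_q4 <- qt(1 - alpha / 2, df = 1134)

z_crit
t_crit_q1
t_crit_q4
```

A statistic farther from zero than the critical value corresponds to a two-sided p-value below alpha.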
Q1 by hand
A Google search claims that the average height of men is 66 inches.
cdc %>%
  filter(gender == "m") %>%
  summarize(xbar = mean(height, na.rm = TRUE),
            s = sd(height, na.rm = TRUE),
            n = n(),
            se = s/sqrt(n),
            STAT = (xbar - 66)/se)
# A tibble: 1 × 5
   xbar     s     n     se  STAT
  <dbl> <dbl> <int>  <dbl> <dbl>
1  70.3  3.01  9569 0.0308  138.
# positive stat: set lower.tail = FALSE to get area to right
# times 2 for two-tailed test
2*pt(q = 138.2091, df = 9568, lower.tail = FALSE)
[1] 0
Q2 by hand:
Claim: 10% of US adults are in fair health.
cdc %>%
  summarize(x = sum(genhlth == "fair"),
            n = n(),
            p = x/n,
            se = sqrt(p*(1-p)/n),
            STAT = (p - 0.1)/se)
# A tibble: 1 × 5
      x     n     p      se  STAT
  <int> <int> <dbl>   <dbl> <dbl>
1  2019 20000 0.101 0.00213 0.446
# positive stat: need to set lower.tail = FALSE
# times 2 for two-tailed test
2*pnorm(q = 0.4459575, lower.tail = FALSE)
[1] 0.6556279
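The by-hand p-value (0.6556) is close to, but not the same as, the prop.test() result for Question 2 (0.6543). A likely explanation, offered here as a sketch rather than as part of the activity: the standard error above plugs in the sample proportion, while a standard error computed under the null hypothesis plugs in the claimed value 0.1. The variable names below (p0, se_null, z_null, p_null) are illustrative.

```r
# counts from the Question 2 output
x  <- 2019
n  <- 20000
p0 <- 0.1  # proportion claimed under the null hypothesis

p_hat   <- x / n
se_null <- sqrt(p0 * (1 - p0) / n)  # standard error assuming H_0 is true
z_null  <- (p_hat - p0) / se_null

# the squared z statistic can be compared with X-squared in the prop.test output
z_null^2

# two-sided p-value
p_null <- 2 * pnorm(q = abs(z_null), lower.tail = FALSE)
p_null
```

With the null-based standard error, the squared statistic and the p-value line up with the X-squared = 0.20056 and p-value = 0.6543 reported earlier.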
Q3 by hand:
Claim: The proportion of smokers in excellent health is less than the proportion of smokers in poor health.
cdc %>%
  count(smoke100, genhlth)
# A tibble: 10 × 3
   smoke100 genhlth       n
   <fct>    <fct>     <int>
 1 no       excellent  2879
 2 no       very good  3758
 3 no       good       2782
 4 no       fair        911
 5 no       poor        229
 6 yes      excellent  1778
 7 yes      very good  3214
 8 yes      good       2893
 9 yes      fair       1108
10 yes      poor        448
cdc %>%
  summarize(x1 = sum(genhlth == "excellent" & smoke100 == "yes"),
            n1 = sum(genhlth == "excellent"),
            x2 = sum(genhlth == "poor" & smoke100 == "yes"),
            n2 = sum(genhlth == "poor"),
            p1 = x1/n1,
            p2 = x2/n2,
            dif = p1 - p2,
            se = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2),
            STAT = (dif - 0)/se)
# A tibble: 1 × 9
     x1    n1    x2    n2    p1    p2    dif     se  STAT
  <int> <int> <int> <int> <dbl> <dbl>  <dbl>  <dbl> <dbl>
1  1778  4657   448   677 0.382 0.662 -0.280 0.0195 -14.3
# negative stat value: calculate area to left
# times 2 for two-tailed test
2*pnorm(q = -14.3364)
[1] 1.296063e-46
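The by-hand statistic (-14.3) does not square to the X-squared = 190.51 that prop.test() reported for Question 3. A likely explanation, sketched here as an assumption rather than something the activity states: under the null hypothesis of equal proportions, the standard error can be built from a single pooled proportion instead of the two separate sample proportions. The names p_pool, se_pool, z_pool, and p_pooled are illustrative.

```r
# counts from the Question 3 tables
x1 <- 1778; n1 <- 4657  # excellent health: smokers, group size
x2 <- 448;  n2 <- 677   # poor health: smokers, group size

p1 <- x1 / n1
p2 <- x2 / n2

# pooled proportion: one common smoking rate, as H_0 assumes
p_pool  <- (x1 + x2) / (n1 + n2)
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_pool  <- (p1 - p2) / se_pool

# squared z statistic, for comparison with X-squared in the prop.test output
z_pool^2

# negative statistic: area to the left, times 2 for a two-tailed test
p_pooled <- 2 * pnorm(q = z_pool)
p_pooled
```

Either standard error leads to the same conclusion here, since both p-values are essentially zero.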
Q4 by hand:
Claim: US women in good general health weigh less than US women in fair general health.
cdc_women <- cdc %>%
  filter(gender == "f")
cdc_women %>%
  group_by(genhlth) %>%
  summarize(xbar = mean(weight, na.rm = TRUE),
            s = sd(weight, na.rm = TRUE),
            n = n())
# A tibble: 5 × 4
  genhlth    xbar     s     n
  <fct>     <dbl> <dbl> <int>
1 excellent  142.  26.2  2359
2 very good  150.  31.2  3590
3 good       155.  36.5  2953
4 fair       162.  42.0  1135
5 poor       165.  44.6   394
xbar_dif <- 154.8629 - 162.4890
se <- sqrt(36.49003^2/2953 + 41.95475^2/1135)
STAT <- (xbar_dif - 0)/se
# negative stat value: calculate area to left
# times 2 for two-tailed test
2*pt(q = -5.390119, df = 1134)
[1] 8.560954e-08
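The by-hand p-value (8.56e-08) uses df = 1134, the smaller sample size minus one, whereas the Welch output for Question 4 reports df = 1829.9. As a sketch, the Welch-Satterthwaite approximation below recomputes the degrees of freedom from the same rounded summary statistics used above. The variable names (v_good, v_fair, df_welch, p_welch) are illustrative, and the calculation assumes the weight column has no missing values, so that n matches the number of weights actually used.

```r
# summary statistics quoted in the by-hand code above
xbar_good <- 154.8629; s_good <- 36.49003; n_good <- 2953
xbar_fair <- 162.4890; s_fair <- 41.95475; n_fair <- 1135

# each group's contribution to the variance of the difference in means
v_good <- s_good^2 / n_good
v_fair <- s_fair^2 / n_fair

se   <- sqrt(v_good + v_fair)
STAT <- (xbar_good - xbar_fair - 0) / se

# Welch-Satterthwaite approximate degrees of freedom
df_welch <- (v_good + v_fair)^2 /
  (v_good^2 / (n_good - 1) + v_fair^2 / (n_fair - 1))
df_welch

# negative statistic: area to the left, times 2 for a two-tailed test
p_welch <- 2 * pt(q = STAT, df = df_welch)
p_welch
```

Using the smaller df of 1134 is a more conservative shortcut, which is why the by-hand p-value is slightly larger than the one from t.test().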