Lesson 3.3: Hypothesis Testing
Software Lab 3.3
Hypothesis Testing
In the last lab, we used a sample to make inferences on a population proportion by calculating a confidence interval. In this lab, we’ll come at the same question from a different, but related, perspective: hypothesis testing.
As you work through the lab, answer the ungraded exercises in the shaded boxes. Check your answers by consulting the Software Lab 3.3 Solutions.
Remember to complete the graded Software Lab Questions for this section in Moodle.
Getting Started
The Data
We’ll use the same scenario as in the last lab: a total population size of 100,000 US adults, with 62,000 (62%) of those adults thinking that climate change impacts their community, and the remaining 38,000 thinking that climate change does not impact their community.
The name of the data frame is us_adults, and the name of the variable that contains responses to the question “Do you think climate change is affecting your local community?” is climate_change_affects. The file representing the entire population can be found at us_adults [CSV file] (OpenIntro, n.d.).
In this lab, you’ll again work with a simple random sample of size 60 from this population. Create a new computed variable named sample1 using the formula SAMPLE(climate_change_affects,60). You’ll use this sample to make inferences on the population proportion using hypothesis testing. In this case, we have the rare luxury of knowing the true population proportion ($p$) since we have data on the entire population. So, you’ll be able to use this knowledge to learn about the properties of the hypothesis testing procedure.
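For readers who prefer to see the sampling step as R code, the following is a minimal sketch of what the SAMPLE formula does. It rebuilds the population from the counts given above rather than reading the CSV file. The "Yes"/"No" coding and the seed are assumptions made for illustration and may not match the actual file.

```r
# Assumption: responses are coded "Yes"/"No"; the coding in us_adults.csv may differ.
set.seed(1)  # hypothetical seed, used only so the sketch is reproducible

# Rebuild the population described in the lab: 62,000 "Yes" and 38,000 "No".
population <- rep(c("Yes", "No"), times = c(62000, 38000))

# Simple random sample of size 60, drawn without replacement.
sample1 <- sample(population, size = 60)

# Sample proportion of "Yes" responses.
p_hat <- mean(sample1 == "Yes")
p_hat
```

Because the sample is random, a different seed (or jamovi's own SAMPLE function) will generally give a different sample proportion.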
Hypothesis Test for a Proportion
Use Analyses > Exploration > Descriptives to create a frequency table for the sample1 variable to calculate the sample proportion of US adults that think climate change affects their local community, $\hat{p}$.
Using $\hat{p}$ and the null value $p_0 = 0.62$, calculate the test statistic

$$Z = \dfrac{\hat{p}-p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$$

where $q_0 = 1 - p_0$.
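As a rough illustration of the test statistic formula above, here is a short R sketch. The count of 40 "Yes" responses is a made-up value, not a result from the lab data, so substitute the count from your own frequency table.

```r
# Hypothetical sample result: 40 of 60 respondents answered "Yes".
n <- 60
yes_count <- 40

p_hat <- yes_count / n   # sample proportion
p0 <- 0.62               # null value of the population proportion
q0 <- 1 - p0

# One-proportion Z statistic: standard error is computed under the null.
z <- (p_hat - p0) / sqrt(p0 * q0 / n)
z
```

Note that the standard error uses $p_0$ and $q_0$, not $\hat{p}$, because the test statistic is computed assuming the null hypothesis is true.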
Open Analyses > R > Rj Editor and use the R function pnorm to calculate the p-value corresponding to the test statistic from the previous question. Hint: Review Example 2: One-Proportion Z-Test about the Green Party in Supplementary Notes 3.3.
Significance level: $\alpha = 0.05$
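The following sketch shows one way pnorm might be used in the Rj Editor to turn a test statistic into a p-value. It assumes a two-sided alternative ($p \ne p_0$), and the value of z is hypothetical, so replace it with your own test statistic.

```r
# Hypothetical test statistic; replace with the value from your own sample.
z <- 0.74

# Two-sided p-value: area in both tails beyond |z| under the standard normal curve.
p_value <- 2 * pnorm(-abs(z))

# Compare with the significance level to reach a decision.
alpha <- 0.05
reject_null <- p_value < alpha

p_value
reject_null
```

For a one-sided alternative, the factor of 2 would be dropped and the tail chosen to match the direction of the alternative hypothesis.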
Hypothesis Test Errors
Type 1 Error and the Significance Level
If we take multiple simple random samples of size 60 from the population, each sample would result in a slightly different sample proportion, $\hat{p}$, and hence a slightly different test statistic, $Z$, and p-value. Some of those p-values would be greater than the significance level, $\alpha$, resulting in not rejecting the null hypothesis. This would be a correct decision since in this case we know the population proportion is 62%.
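The repeated-sampling idea in the paragraph above can be sketched as a small simulation. When a p-value falls below the significance level even though the null hypothesis is true, the test rejects incorrectly, which is a Type 1 error. The "Yes"/"No" coding, the seed, and the number of repetitions are assumptions chosen for illustration.

```r
set.seed(33)  # hypothetical seed for reproducibility

# Population in which the null hypothesis is true (62% "Yes").
population <- rep(c("Yes", "No"), times = c(62000, 38000))
n <- 60
p0 <- 0.62
alpha <- 0.05

# Draw one simple random sample and return its two-sided p-value.
one_p_value <- function() {
  p_hat <- mean(sample(population, size = n) == "Yes")
  z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
  2 * pnorm(-abs(z))
}

# Repeat the sampling many times and record each p-value.
p_values <- replicate(2000, one_p_value())

# Share of samples that (incorrectly) reject the true null hypothesis.
type1_rate <- mean(p_values < alpha)
type1_rate
```

The estimated rate should land near the significance level, though not exactly on it, since the sample proportion can take only a limited set of values when n = 60.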
Parameters for this section's exercises: $p_0 = 0.62$, $p \ne p_0$, $n = 60$, $\alpha = 0.05$
Note: Hypothesis errors are referred to as Type 1 and Type 2 in the textbook, but Type I and Type II in the Errors and Power app.
Type 2 Error and Power
Suppose your sample didn’t come from a population in which the proportion was 62%, but instead came from a population in which the proportion was 80%. Now, a correct decision would be to reject the null hypothesis in favour of the alternative hypothesis. An incorrect decision would be to not reject the null hypothesis, which would lead to a Type 2 error.
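To make the scenario above concrete, the sketch below simulates samples from a population with a true proportion of 80% and tests each one against the null value of 62%. It draws counts with rbinom, which treats the population as effectively infinite. That is an assumption made for simplicity, as are the seed and the number of repetitions.

```r
set.seed(80)  # hypothetical seed for reproducibility

n <- 60
p0 <- 0.62       # null value being tested
p_true <- 0.80   # true proportion in this scenario
alpha <- 0.05

# Simulate the number of "Yes" responses in 5000 samples of size 60.
yes_counts <- rbinom(5000, size = n, prob = p_true)
p_hats <- yes_counts / n

# Test each sample against the (false) null hypothesis p = 0.62.
z <- (p_hats - p0) / sqrt(p0 * (1 - p0) / n)
p_values <- 2 * pnorm(-abs(z))

# Power: share of samples that correctly reject the null hypothesis.
power <- mean(p_values < alpha)

# Type 2 error rate: share of samples that fail to reject it.
type2_rate <- 1 - power

power
type2_rate
```

Changing n or p_true in this sketch shows how power responds to sample size and to the distance between the true proportion and the null value.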
Parameters for this section's exercises: $p_0 = 0.62$, $p \ne p_0$, $n = 60$, $\alpha = 0.05$
References
CPM Educational Program. (2023). Errors and power [Application]. https://stats.cpm.org/power/
OpenIntro. (n.d.-b). us_adults [Data set]. https://github.com/OpenIntroStat/oilabs-jamovi/raw/main/05b_confidence_intervals/more/us_adults.csv