Lesson 4.1: Inference for Proportions

Supplementary Notes 4.1

Confidence Interval for a Single Population Proportion

Experimental Situation: One categorical population with an unknown proportion (or percentage) p.

Objective: Based on the results of a random sample of n observations from this population, construct a confidence interval for the unknown proportion p.

Assumptions:

  • Independence: The individual responses in the sample are independent of each other.
  • Random: The sample is random.
  • Success/Failure Condition: n\hat{p} \ge 10 and n\hat{q} \ge 10 (\hat{q} = 1 - \hat{p}).
  • 10% Condition: The sample size n is no more than 10% of the population size.

Confidence Interval Construction: The general form for a confidence interval for a proportion p is \hat{p} \pm z^* \times \sqrt{\dfrac{\hat{p}\hat{q}}{n}}, where z^* is the “critical value” z-score from the standard normal distribution corresponding to the specified confidence level.

Confidence Interval Interpretation: We’re …% confident the population proportion is in the interval … to ….
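
As a minimal sketch of the construction above, the following R code computes a 95% confidence interval for a single proportion. The counts (120 successes in 400 observations) are hypothetical and chosen only for illustration; the variable names are likewise our own.

```r
# Hypothetical sample (not from the notes): 120 successes in 400 observations.
x <- 120
n <- 400
conf_level <- 0.95

p_hat <- x / n
q_hat <- 1 - p_hat

# Success/failure condition: both expected counts should be at least 10.
stopifnot(n * p_hat >= 10, n * q_hat >= 10)

# Critical value z* leaves (1 - conf_level) / 2 in each tail.
z_star <- qnorm(1 - (1 - conf_level) / 2)

# Standard error, margin of error, and the interval p_hat +/- ME.
se <- sqrt(p_hat * q_hat / n)
me <- z_star * se
ci <- c(lower = p_hat - me, upper = p_hat + me)
print(round(ci, 4))
```

With these made-up counts, the interval runs from roughly 25.5% to 34.5%, so we would say we are 95% confident the population proportion lies in that range.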

Hypothesis Test for a Single Population Proportion

  1. Hypotheses
    • H0: p = p0 versus HA: p > p0 (upper-sided alternative)
    • H0: p = p0 versus HA: p < p0 (lower-sided alternative)
    • H0: p = p0 versus HA: p ≠ p0 (two-sided alternative)
  2. Model: Normal model for the sampling distribution of \hat{p} that has a mean of p_0 and a SD of \sqrt{p_0q_0/n}, where q_0 = 1-p_0. Assumptions:
    • Independent sample.
    • Random sample.
    • Success/failure condition: np_0 \ge 10 and nq_0 \ge 10.
    • 10% condition: The sample size n is no more than 10% of the population size.
  3. Mechanics:
    • Upper-sided alternative:
      H0: p = p0 versus HA: p > p0
      Calculate test statistic: Z = \dfrac{\hat{p}-p_0}{\sqrt{\dfrac{p_0q_0}{n}}}
      Figure 1: Upper-sided proportion test
      Obtain p-value using R code: 1 - pnorm(Z, mean=0, sd=1)

    • Lower-sided alternative:
      H0: p = p0 versus HA: p < p0
      Calculate test statistic: Z = \dfrac{\hat{p}-p_0}{\sqrt{\dfrac{p_0q_0}{n}}}
      Figure 2: Lower-sided proportion test
      Obtain p-value using R code: pnorm(Z, mean=0, sd=1)
    • Two-sided alternative:
      H0: p = p0 versus HA: p ≠ p0
      Calculate test statistic: Z = \dfrac{\hat{p}-p_0}{\sqrt{\dfrac{p_0q_0}{n}}}
      Figure 3: Two-sided alternative proportion test, where the p-value is the sum of the two shaded areas.
      Obtain p-value using R code: 2 * (1 - pnorm(abs(Z), mean=0, sd=1)) (the absolute value ensures the tail area is doubled correctly whether Z is positive or negative)

  4. Conclusion
    • If the p-value < the significance level \alpha, reject H0 in favour of HA. Conclude that there is sufficient evidence that p > p0 (upper-sided alternative) or p < p0 (lower-sided alternative) or p ≠ p0 (two-sided alternative).
    • If the p-value ≥ the significance level \alpha, do not reject H0. Conclude that there is insufficient evidence that p > p0 (upper-sided alternative) or p < p0 (lower-sided alternative) or p ≠ p0 (two-sided alternative).

Note: Alternatively, in the two-sided case, reject H0 in favour of HA if the test statistic is in the rejection region (either greater than the positive critical value or less than the negative critical value). Do not reject H0 if the test statistic is not in the rejection region (i.e., it is between the negative and positive critical values).
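
The four steps above can be collected into a small R function. This is only a sketch: the function name `one_prop_z_test`, its arguments, and the example data (58 successes in 100 trials, testing p0 = 0.5) are hypothetical and not part of the course materials.

```r
# Z-test for a single proportion (sketch; function name is hypothetical).
one_prop_z_test <- function(x, n, p0,
                            alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  p_hat <- x / n
  q0 <- 1 - p0

  # Success/failure condition uses the null value p0, not p_hat.
  stopifnot(n * p0 >= 10, n * q0 >= 10)

  # Test statistic: standardize p_hat under the null model.
  z <- (p_hat - p0) / sqrt(p0 * q0 / n)

  # P-value depends on the direction of the alternative.
  p_value <- switch(alternative,
    greater   = 1 - pnorm(z),
    less      = pnorm(z),
    two.sided = 2 * (1 - pnorm(abs(z)))
  )
  list(z = z, p_value = p_value)
}

# Hypothetical data: 58 successes in 100 trials, H0: p = 0.5 vs HA: p > 0.5.
res <- one_prop_z_test(58, 100, p0 = 0.5, alternative = "greater")
print(res)
```

Here Z = 1.6 and the upper-tail p-value is about 0.055, so at \alpha = 0.05 we would not reject H0 for these made-up data.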

Determining Sample Size for Desired Accuracy and Confidence

Suppose that we wanted to estimate the percentage of adult Vancouverites that support a complete ban on smoking in public places. How large a random sample would we need to make this estimate?

First, we must decide on an acceptable margin of error and confidence level. Suppose that we want our estimate to be within 4% of the population proportion at a 95% confidence level. For proportions, the margin of error is ME(\hat{p}) = z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}.

In our application ME = 4% = 0.04 and z* = 1.96 (to achieve 95% confidence).

Now we can find the size of the random sample required by solving for n.

  • Square each side: ME^2 = z^{*2} \dfrac{\hat{p}\hat{q}}{n}
  • Multiply each side by n: nME^2 = z^{*2} \hat{p}\hat{q}
  • Divide each side by ME^2: n = \dfrac{z^{*2} \hat{p}\hat{q}}{ME^2}
  • So, in our application: n = \dfrac{1.96^2 \hat{p}\hat{q}}{0.04^2}

Problem! We haven’t sampled yet, so we don’t have a value for \hat{p}. What options do we have?

  1. Be cautious and possibly overstate the size of the sample needed by using \hat{p}=0.5, which gives the largest possible value for the product: \hat{p}\hat{q}=0.25. Convince yourself that all the other choices for \hat{p} give smaller products. For example, \hat{p}=0.4 or \hat{p}=0.6 gives \hat{p}\hat{q}=0.24, and \hat{p}=0.3 or \hat{p}=0.7 gives \hat{p}\hat{q}=0.21.
  2. If it is available, use an approximation for \hat{p} from a pilot study or prior knowledge.
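
One way to see why \hat{p}=0.5 is the cautious choice in option 1 is to complete the square. This short derivation is our own addition, not part of the original notes:

```latex
\hat{p}\hat{q} = \hat{p}(1-\hat{p})
             = \frac{1}{4} - \left(\hat{p} - \frac{1}{2}\right)^2
             \le \frac{1}{4},
\qquad \text{with equality only when } \hat{p} = \frac{1}{2}.
```

Because the squared term is never negative, any value of \hat{p} other than 0.5 makes the product, and hence the required n, smaller.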

Do we know anything about the proportion of Vancouverites that support a complete ban on smoking?

  1. If no, use the cautious value \hat{p}=0.5, which gives n = \dfrac{1.96^2 \hat{p}\hat{q}}{0.04^2} = \dfrac{1.96^2 (0.5)(0.5)}{0.04^2} = 600.25. So, we need to randomly sample 601 (always round up these sample size estimates) adult Vancouverites.
  2. If yes, use your prior knowledge to “sharpen the statistical pencil” in your determination of sample size. Likely, the proportion supporting the ban is greater than 50%. Let’s say it is at least 75%. Using \hat{p}=0.75 gives n = \dfrac{1.96^2 \hat{p}\hat{q}}{0.04^2} = \dfrac{1.96^2 (0.75)(0.25)}{0.04^2} = 450.19. This means we need to randomly sample only 451 adult Vancouverites. That’s quite a reduction from the 601. Of course, if we have doubts about our 75% approximation, then we should use the more cautious sample size of 601 that guarantees the desired 4% margin at 95% confidence.
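
The two calculations above can be reproduced in R. The sketch below wraps the formula n = z^{*2} \hat{p}\hat{q} / ME^2 in a helper; the function name `sample_size_prop` and its argument names are hypothetical. It uses `qnorm()` for z* rather than the rounded 1.96, which does not change the rounded-up answers here.

```r
# Sample size needed for a margin of error `me` at a given confidence level.
# `p_guess` is the planning value for p-hat; 0.5 is the cautious default.
sample_size_prop <- function(me, conf_level = 0.95, p_guess = 0.5) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)
  # Always round up so the margin of error is at most `me`.
  ceiling(z_star^2 * p_guess * (1 - p_guess) / me^2)
}

n_cautious <- sample_size_prop(0.04)                  # 601
n_informed <- sample_size_prop(0.04, p_guess = 0.75)  # 451
print(c(cautious = n_cautious, informed = n_informed))
```

Note that the population size does not appear anywhere in the function, which matches the formula: only the margin of error, the confidence level, and the planning value for \hat{p} matter.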

What if we changed the scope of our study to all of Canada? Would we need a much larger sample size?

No! Nothing in the above sample size calculation would change. A random sample of 601 Canadians will estimate the proportion within 4%, “19 times out of 20.” To understand the intuition here, remember the “soup tasting” analogy in Supplementary Notes 1.2 on sample size: A spoonful from a well-mixed small bowl (i.e., Vancouver) will give you just as accurate an assessment of the soup’s flavour as a spoonful from a large bowl (i.e., Canada).

Inference for the Difference of Two Proportions

Is the ginseng-based COLD-FX® medication effective in reducing the frequency and severity of the common cold? This is an interesting and hotly debated question.

The company that manufactures COLD-FX® (Bausch Health Companies Inc., formerly CV Technologies Inc.) advertises “Trust the Science” and identifies the results from a number of studies as evidence that their product is effective. The Vancouver Sun columnist David Bains, with the support of Dr. James McCormack and Dr. Peter Loewen at UBC, questioned the results from these studies in a series of articles published on Feb. 25, 2006; Feb. 28, 2006; Mar. 8, 2006; Apr. 12, 2006; June 14, 2006; Oct. 12, 2006; and Nov. 11, 2006. Undoubtedly, this debate will extend into the future as the results from a new multi-million-dollar clinical trial come in. If you think that statistics is dull and without controversy (hard to believe for anyone coming this far in the course!), dig out these articles and be prepared to change your mind.

We won’t wade into this controversy, but we’ll use one of the results from a study published in the Canadian Medical Association Journal (Predy, 2005) to explore the topic of finding confidence intervals for the difference between two proportions and testing the hypothesis that two proportions are equal.

Efficacy of an Extract of North American Ginseng Containing Poly-Furanosyl-Pyranosyl-Saccharides for Preventing Upper Respiratory Tract Infections: A Randomized Controlled Trial (Predy, 2005)

Used with permission.

Background: Upper respiratory tract infections are a major source of morbidity throughout the world. Extracts of the root of North American ginseng (Panax quinquefolium) have been found to have the potential to modulate both natural and acquired immune responses. We sought to examine the efficacy of an extract of North American ginseng root in preventing colds.

Methods: We conducted a randomized, double-blind, placebo-controlled study at the onset of the influenza season. A total of 323 subjects 18-65 years of age with a history of at least 2 colds in the previous year were recruited from the general population in Edmonton, Alberta. The participants were instructed to take 2 capsules per day of either the North American ginseng extract or a placebo for a period of 4 months. The primary outcome measure was the number of Jackson-verified colds.

Results: Subjects who did not start treatment were excluded from the analysis (23 in the ginseng group and 21 in the placebo group), leaving 130 in the ginseng group and 149 in the placebo group. (…) The proportion of subjects with 2 or more Jackson-verified colds during the 4-month period (10.0% v. 22.8%, 12.8% difference, 95% CI 4.3-21.3) was significantly lower in the ginseng group than in the placebo group ….

Here, two proportions are being compared: the proportion of the ginseng group getting two or more colds vs. the proportion of the placebo group getting two or more colds. A confidence interval for the difference in these two proportions is given as 4.3% to 21.3%. Further, the proportion getting colds in the ginseng group is judged to be significantly lower than the proportion in the placebo group.

How were the results obtained? Please read on!

Confidence Interval for the Difference Between Two Proportions

The schematic below illustrates the two-proportion experimental situation as presented in the ginseng study.

Figure 4: Inference for two proportions: placebo proportion and ginseng proportion

Our goal is to develop a confidence interval for the difference, p_1-p_2.

This CI will have the usual basic form: Sample estimate ± margin of error, which in this case is \hat{p}_1-\hat{p}_2 ± margin of error.

This margin of error depends on the sampling distribution of \hat{p}_1-\hat{p}_2. Under certain conditions we know that \hat{p}_1 has a normal model with mean p_1 and standard deviation \sqrt{\dfrac{p_1q_1}{n_1}}, and \hat{p}_2 has a normal model with mean p_2 and standard deviation \sqrt{\dfrac{p_2q_2}{n_2}}. It turns out that the difference \hat{p}_1-\hat{p}_2 also has a normal model with mean p_1-p_2 and standard deviation \sqrt{\dfrac{p_1q_1}{n_1}+\dfrac{p_2q_2}{n_2}} (note there is a plus sign rather than a minus sign in the standard deviation).
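
The plus sign follows from the rule that variances of independent random variables add, even when the variables themselves are subtracted. As a brief sketch, assuming the two samples are independent:

```latex
\mathrm{Var}(\hat{p}_1 - \hat{p}_2)
  = \mathrm{Var}(\hat{p}_1) + \mathrm{Var}(\hat{p}_2)
  = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2},
\qquad
\mathrm{SD}(\hat{p}_1 - \hat{p}_2)
  = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}.
```

Subtracting one noisy estimate from another cannot cancel the noise; the uncertainty from both samples accumulates.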

Therefore, a confidence interval for the difference, p_1-p_2, is \hat{p}_1-\hat{p}_2 \pm z^* \times \sqrt{\dfrac{p_1q_1}{n_1}+\dfrac{p_2q_2}{n_2}}. Since p_1 and p_2 are unknown, we estimate this by \hat{p}_1-\hat{p}_2 \pm z^* \times \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1}+\dfrac{\hat{p}_2\hat{q}_2}{n_2}}.

Now we’re ready to sub in the numbers for the ginseng application to get a 95% CI for the difference in the proportions getting two or more colds for the placebo vs. ginseng groups:

  • \hat{p}_1-\hat{p}_2 \pm z^* \times \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1}+\dfrac{\hat{p}_2\hat{q}_2}{n_2}}
  • = 0.228-0.10 \pm 1.96 \times \sqrt{\dfrac{(0.228)(0.772)}{149}+\dfrac{(0.10)(0.90)}{130}}
  • = 0.128 \pm 0.085
  • = 12.8\% \pm 8.5\%

This gives an interval of 4.3% to 21.3%, as given in the Results section of the Predy (2005) article.

Based on this study, we are 95% confident that the proportion of people getting two or more colds is between 4.3 and 21.3 percentage points higher in the placebo population than in the ginseng population. Since this interval doesn’t contain zero, it can also be interpreted that the two proportions are statistically significantly different at the 5% significance level, with the ginseng group proportion lower than the placebo group proportion.
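
The arithmetic above can be checked in R. The sketch below uses the proportions and group sizes reported in Predy (2005), with group 1 as placebo and group 2 as ginseng; the variable names are our own.

```r
# Values reported in Predy (2005): group 1 = placebo, group 2 = ginseng.
p1_hat <- 0.228; n1 <- 149
p2_hat <- 0.100; n2 <- 130

# Point estimate of the difference and its (unpooled) standard error.
diff_hat <- p1_hat - p2_hat
se_diff <- sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# 95% confidence: z* is the 97.5th percentile of the standard normal.
z_star <- qnorm(0.975)
me <- z_star * se_diff
ci <- c(lower = diff_hat - me, upper = diff_hat + me)

print(round(100 * ci, 1))  # 4.3 and 21.3 (percent)
```

The margin of error comes out at about 0.085, matching the 12.8% ± 8.5% shown in the worked steps.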

Sampling Distribution of the Difference Between Two Proportions

We have already used the normal model for the sampling distribution of \hat{p}_1-\hat{p}_2 in the calculation of the confidence interval in the Predy (2005) ginseng study.

What conditions must be satisfied for this normal model to apply?

Figure 5: Inference for two proportions: population 1 and population 2

Basically, we need the same conditions as for the one-proportion case, but now they must apply to each of our two samples, plus we need one more condition: that the two groups are independent. Here is the complete list of necessary conditions:

  • Independence Between Groups: The two groups that we are comparing are independent of each other. This means that there is no linkage or association between the two groups. This would be the case in a completely randomized experiment where the two groups are formed at random, but it would not be the case if we used twin pairs, for example, to form the two groups.
  • Independence Within Groups: Within each group, the individual responses are independent of each other.
  • Random: Each of the two samples is randomly drawn from its respective population.
  • Success/Failure Condition: n_1\hat{p}_1 \ge 10, n_1\hat{q}_1 \ge 10, n_2\hat{p}_2 \ge 10, and n_2\hat{q}_2 \ge 10.
  • 10% Condition: Each of the two sample sizes, n_1 and n_2, is no more than 10% of its respective population size.

Under these conditions we have:

  • \hat{p}_1-\hat{p}_2 has a normal model with mean p_1-p_2 and standard deviation \sqrt{\dfrac{p_1q_1}{n_1}+\dfrac{p_2q_2}{n_2}}.
  • A confidence interval for p_1-p_2 is \hat{p}_1-\hat{p}_2 \pm z^* \times \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1}+\dfrac{\hat{p}_2\hat{q}_2}{n_2}}.
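
Of the conditions listed above, the success/failure condition is the one that can be checked directly from the data. A minimal R sketch follows; the helper name `success_failure_ok` is hypothetical. It is applied to the proportions and group sizes reported for the ginseng study, and then to a made-up small sample for contrast.

```r
# Success/failure check for one or more samples (helper name is hypothetical).
# Returns TRUE only if every n * p_hat and n * q_hat is at least `min_count`.
success_failure_ok <- function(p_hat, n, min_count = 10) {
  all(n * p_hat >= min_count, n * (1 - p_hat) >= min_count)
}

# Ginseng study: placebo (0.228 of 149) and ginseng (0.100 of 130).
ginseng_ok <- success_failure_ok(p_hat = c(0.228, 0.100), n = c(149, 130))

# Hypothetical counter-example: only about 2 expected successes in 100.
small_ok <- success_failure_ok(p_hat = 0.02, n = 100)

print(c(ginseng = ginseng_ok, small = small_ok))
```

The remaining conditions (independence, randomness, and the 10% condition) depend on how the study was designed and cannot be verified by computation alone.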

A Two-Proportion Z-Test

Are the two population proportions equal? We answer this by testing H_0: p_1=p_2 or equivalently, H_0: p_1-p_2=0.

What test statistic do we use to judge the weight of the sample evidence against H0?

If this null hypothesis is true, the difference between the sample proportions, \hat{p}_1-\hat{p}_2, will fluctuate around a mean of zero. If we express the difference, \hat{p}_1-\hat{p}_2, in standardized form by dividing by \sqrt{\dfrac{p_1q_1}{n_1}+\dfrac{p_2q_2}{n_2}}, we’ll be able to judge whether or not \hat{p}_1-\hat{p}_2 is unusually far from the mean of zero.

However, how can we calculate \sqrt{\dfrac{p_1q_1}{n_1}+\dfrac{p_2q_2}{n_2}} when we don’t know the values for p_1 and p_2?

We could use \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1}+\dfrac{\hat{p}_2\hat{q}_2}{n_2}} as an approximation. However, there is one more little wrinkle in calculating the two-proportion test statistic for H_0: p_1=p_2.

This null hypothesis says that both unknown population proportions are equal, so rather than two separate estimates it is better to pool them together by taking the average of the two, weighted by their sample sizes, to get one overall estimate of the equal (under the null) unknown proportions: \hat{p} = \dfrac{n_1\hat{p}_1+n_2\hat{p}_2}{n_1+n_2}.

Then calculate the test statistic as Z = \dfrac{\hat{p}_1-\hat{p}_2}{\sqrt{\dfrac{\hat{p}\hat{q}}{n_1}+\dfrac{\hat{p}\hat{q}}{n_2}}} = \dfrac{\hat{p}_1-\hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}.

The conditions are the same as listed above for the confidence interval for the difference between two proportions, except the success/failure conditions are now n_1\hat{p} \ge 10, n_1\hat{q} \ge 10, n_2\hat{p} \ge 10, and n_2\hat{q} \ge 10.
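
The pooled test statistic can be sketched as an R function. The function name `two_prop_z_test` and the counts in the usage line (45 of 100 versus 30 of 100) are hypothetical. Since n_1\hat{p}_1 and n_2\hat{p}_2 are just the success counts, the pooled proportion is the total number of successes divided by the total sample size.

```r
# Two-proportion z-test with a pooled estimate (sketch; name is hypothetical).
# x1, x2 are success counts; n1, n2 are sample sizes. Two-sided alternative.
two_prop_z_test <- function(x1, n1, x2, n2) {
  p1_hat <- x1 / n1
  p2_hat <- x2 / n2

  # Pooled proportion under H0: p1 = p2.
  p_pool <- (x1 + x2) / (n1 + n2)
  q_pool <- 1 - p_pool

  # Success/failure condition with the pooled estimate.
  stopifnot(min(n1, n2) * min(p_pool, q_pool) >= 10)

  z <- (p1_hat - p2_hat) / sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
  list(p_pool = p_pool, z = z, p_value = 2 * (1 - pnorm(abs(z))))
}

# Hypothetical counts: 45 successes of 100 versus 30 successes of 100.
res <- two_prop_z_test(45, 100, 30, 100)
print(res)
```

For these made-up counts the pooled proportion is 0.375 and Z is about 2.19, which would be significant at the 5% level for a two-sided test.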

Example: Two-Proportion Z-Test

In a study designed to compare the efficacy of “the patch” versus “gum” in helping people quit smoking, 150 smokers were randomly assigned to the patch group (group 1) and 100 smokers to the gum group (group 2).

Do the data (Table 1) provide sufficient evidence to conclude a difference in the percentages quitting smoking with the two methods?

Table 1: Efficacy of Quitting Smoking: Patch or Gum

                Group 1: Patch   Group 2: Gum   Total
Quit smoking          90              50         140
Did not quit          60              50         110
Total                150             100         250

Hypotheses: H_0: p_1 = p_2 versus H_A: p_1 \ne p_2.

Pooled proportion: \hat{p} = \dfrac{n_1\hat{p}_1+n_2\hat{p}_2}{n_1+n_2} = \dfrac{150(90/150)+100(50/100)}{150+100} = \dfrac{140}{250} = 0.56.

Conditions:

  • Independence Between Groups: Yes, since the two groups were created randomly.
  • Independence Within Groups: Reasonable to assume.
  • Random: Unclear how the 250 smokers were selected, but it may have been done randomly.
  • Success/Failure Condition: 150(0.56) = 84 \ge 10, 150(0.44) = 66 \ge 10, 100(0.56) = 56 \ge 10, and 100(0.44) = 44 \ge 10.
  • 10% Condition: Each of the two sample sizes is well below 10% of all smokers using the patch or gum.

Mechanics:

  • Test statistic, Z = \dfrac{90/150-50/100}{\sqrt{(0.56)(0.44)\left(\dfrac{1}{150}+\dfrac{1}{100}\right)}} = 1.5605.
  • P-value = 2 * (1 - pnorm(1.5605, mean=0, sd=1)) ≈ 0.1186. (Note the upper-tail area is doubled because of the two-sided alternative hypothesis.)
Figure 6: Two-sided alternative proportion test: The p-value is the sum of two shaded areas.

So, if the quit rates really are the same for the two groups (H0 true), there is about an 11.86% chance that we would observe sample proportions that differ by 10 percentage points (60% – 50%) or more in either direction. This is not unusual enough to reject H0 at a 5% significance level.

Conclusion:

The sample evidence is not strong enough for us to conclude a difference in the percentages quitting smoking with the two methods.

Alternatively, reject H0 in favour of HA if the test statistic is in the rejection region (either greater than the positive critical value or less than the negative critical value). Do not reject H0 if the test statistic is not in the rejection region (i.e., it is between the negative and positive critical values). The critical value in the smoking example is 1.9600, the 97.5th percentile of the standard normal distribution. Since the test statistic, Z=1.5605 is between –1.9600 and 1.9600, it is not in the rejection region, so we do not reject H0.
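
The whole example can be reproduced in R from the counts in Table 1. The sketch below computes the statistic by hand and then, as a cross-check that is not part of the original notes, compares it with base R's `prop.test()` with the continuity correction turned off. That function reports a chi-squared statistic equal to Z squared.

```r
# Counts from Table 1.
quit <- c(patch = 90, gum = 50)
n <- c(patch = 150, gum = 100)

# Sample proportions and pooled proportion.
p_hat <- quit / n
p_pool <- sum(quit) / sum(n)

# Pooled two-proportion z statistic and two-sided p-value.
z <- unname((p_hat["patch"] - p_hat["gum"]) /
  sqrt(p_pool * (1 - p_pool) * sum(1 / n)))
p_value <- 2 * (1 - pnorm(abs(z)))

# Rejection-region approach: compare |Z| with the 97.5th percentile.
z_crit <- qnorm(0.975)
reject <- abs(z) > z_crit

# Cross-check with prop.test (no continuity correction): X-squared = Z^2.
check <- prop.test(x = quit, n = n, correct = FALSE)

print(round(c(z = z, p_value = p_value, z_crit = z_crit), 4))
print(reject)
```

Both approaches agree: Z is about 1.5605, the p-value is about 0.1186, and `reject` is FALSE, so H0 is not rejected at the 5% level.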

References

Predy, G. N. (2005). Efficacy Of An Extract Of North American Ginseng Containing Poly-furanosyl-pyranosyl-saccharides For Preventing Upper Respiratory Tract Infections: A Randomized Controlled Trial. Canadian Medical Association Journal, 173(9), 1043-1048. https://doi.org/10.1503/cmaj.1041470

License


Introduction to Probability and Statistics Copyright © 2023 by Thompson Rivers University is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
