BIOSTATS 3.4 - CH.20: COMPARING TWO PROPORTIONS

Questions and Answers

What key assumption must be met to safely use t procedures in the great white shark length analysis, given a reasonably normal distribution and the presence of outliers?

  • The outliers must be removed from the dataset.
  • The mean and median of the data must be exactly equal.
  • The sample size must be greater than 100.
  • The outliers should not be extreme and should preserve the symmetry of the distribution. (correct)

In the context of the long-tailed finch study, why is it important to consider the dotplots of both male and female tail feather lengths?

  • To determine whether a two-sample t procedure is appropriate based on the distribution of the data. (correct)
  • To calculate the exact mean length of tail feathers for each gender.
  • To ensure that the data are normally distributed, regardless of sample size.
  • To visually confirm that the sample sizes are equal.

When comparing two independent samples, what condition regarding sample size is typically required for the sampling distribution of the difference to be approximately normal?

  • The samples must be large enough. (correct)
  • The samples must be randomly selected and normally distributed.
  • Both samples must have the same size.
  • At least one sample must be greater than 30.

What condition must be met regarding the number of successes and failures in each sample when using a large sample confidence interval for two proportions?

  • The number of successes and the number of failures must each be at least 10 in each sample. (correct)

In the context of comparing two proportions, what does the 'plus four' method aim to improve?

  • It provides more accurate confidence intervals, especially with small sample sizes or extreme proportions. (correct)

Why is the pooled sample proportion used in hypothesis testing for two proportions under the null hypothesis?

  • Because the null hypothesis assumes the two population proportions are equal, and the pooled proportion estimates this common value. (correct)

Why was gastric freezing abandoned as a treatment for ulcers despite initial positive results?

  • Later studies using a control group showed it was not significantly better than a placebo. (correct)

What is the primary purpose of calculating the Relative Risk Reduction (RRR)?

  • To indicate how much better off one would be by receiving a treatment versus a control. (correct)

Which measure provides a more direct indication of the treatment's practical impact on a per-patient basis?

  • Absolute Risk Reduction (ARR) (correct)

Given a scenario where the Relative Risk Reduction (RRR) is calculated to be 0.25 for a new drug compared to a placebo, how should this be interpreted?

  • The drug reduces the risk of the outcome by 25% compared to the placebo. (correct)

In a study comparing a new treatment to a placebo, if the absolute risk reduction (ARR) is found to be 0.02, what does this indicate?

  • The new treatment reduces the risk by 2 percentage points. (correct)

The number needed to treat (NNT) is calculated as 50 for a certain medication. What is the correct interpretation of this value?

  • You need to treat 50 patients with the medication to prevent one additional negative outcome. (correct)

Which of the following scenarios would justify the use of the 'plus four' method when comparing two proportions?

  • When at least one of the observed counts (successes or failures) is very low. (correct)

When constructing a large sample confidence interval for the difference between two proportions, what does the margin of error primarily depend on?

  • The sample proportions, sample sizes, and the desired level of confidence. (correct)

If a 95% confidence interval for the difference in mean length of great white sharks does not contain zero, what does this indicate about the null hypothesis?

  • The null hypothesis can be rejected at the 5% significance level. (correct)

What is the correct formula for calculating the pooled sample proportion ($\hat{p}$), where count1 and count2 represent the number of successes in two samples, and n1 and n2 represent their respective sample sizes?

  • $\hat{p} = \frac{count1 + count2}{n1 + n2}$ (correct)

Given a Relative Risk Reduction (RRR) of 0.4, what is a correct interpretation of this value in the context of a treatment's effectiveness?

  • The treatment reduces the risk of the outcome by 40%. (correct)

What does a large Number Needed to Treat (NNT) indicate about the effectiveness of a medical intervention?

  • The intervention is not very effective, requiring a large number of patients to be treated to prevent one additional adverse outcome. (correct)

In the large sample CI example for two proportions focusing on the cholesterol-lowering drug Gemfibrozil, how is the standard error (SE) of the difference between the two proportions calculated?

  • With the formula: $SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$ (correct)

In the context of hypothesis testing for two proportions, what is assumed to be true when using a pooled sample proportion?

  • Any difference is due to random variation, assuming the proportions are actually the same. (correct)

Flashcards

Sampling Distribution of the Difference Between Two Proportions

A sampling distribution formed by the difference between two proportions obtained from independent samples.

Large Sample CI for Two Proportions

An interval estimate for the difference between two population proportions, using a normal distribution.

"Plus Four" CI

Addresses situations where the numbers of successes and failures may be small by adding one success and one failure to each sample before computing the sample proportions.

Hypothesis Tests for Two Proportions

A test to determine if there's a statistically significant difference between the proportions of two populations.

Relative Risk Reduction (RRR)

A measure of how much the risk is reduced in a treatment group compared to a control group.

Absolute Risk Reduction (ARR)

The actual difference in rates of outcomes between a control and treatment group.

Number Needed to Treat (NNT)

Indicates the number of patients that need to be treated to prevent one additional bad outcome.

Study Notes

  • Chapter 20 discusses comparing two proportions

Homework #7, Problem 1

  • Lengths of 44 great white sharks in feet are provided
  • The distribution of the data is reasonably normal with one outlier in each direction
  • The outliers are not extreme and preserve the symmetry of the distribution
  • It is safe to use t procedures with 44 observations
  • Calculate a 95% confidence interval for the mean length of great white sharks
  • Determine if there is significant evidence at the 5% level to reject the claim that great white sharks average 20 feet in length
  • Before accepting any conclusions, further data is needed

Homework #7, Problem 2

  • The central tail feathers of long-tailed finches (Poephila acuticauda) are a sexually dimorphic trait that may play a role in sexual selection
  • Longer tail feathers in males cost energy to produce, signaling the male's excellent health
  • The average lengths of the two central feathers (in millimeters) of 20 male and 21 female long-tailed finches are given
  • Treat the data as Simple Random Samples (SRSs) from the population of adult long-tailed finches
  • Create dotplots of both data sets to see if the use of a two-sample t procedure is appropriate
  • Calculate how much longer the central tail feathers of male long-tailed finches are (on average) than those of females
  • Calculate a 95% confidence interval for the difference in population mean length between the male and female adult long-tailed finches

Previous Learning Objectives

  • Applying inference for a population proportion has been previously discussed
  • The sample proportion p̂ and both large sample and more accurate confidence intervals for a proportion have been covered
  • Hypothesis tests for a proportion and choosing the sample size have also been previously discussed

Learning Objectives

  • Two-sample problems for proportions
  • The sampling distribution of the difference between two proportions
  • Large sample confidence intervals for comparing proportions
  • More accurate confidence intervals for comparing proportions
  • Hypothesis tests for comparing proportions
  • Relative risk and odds ratio

Comparing Two Independent Samples

  • Comparing two treatments with two independent samples is often required
  • For large enough samples, the sampling distribution of the difference p̂₁ - p̂₂ is approximately Normal
  • Neither p₁ nor p₂ is known, so both must be estimated from the samples
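
The approximate Normality of p̂₁ - p̂₂ can be checked informally by simulation. The sketch below is only an illustration: the population proportions (0.30 and 0.20), the sample sizes, and the function name `simulate_differences` are assumptions made up for the demonstration, not values from the lesson.

```python
import random
import statistics

def simulate_differences(p1, n1, p2, n2, reps, seed=0):
    """Draw `reps` pairs of independent samples; return each p1_hat - p2_hat."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))  # successes in sample 1
        x2 = sum(rng.random() < p2 for _ in range(n2))  # successes in sample 2
        diffs.append(x1 / n1 - x2 / n2)
    return diffs

# Hypothetical population values, chosen only for illustration.
P1, N1, P2, N2 = 0.30, 100, 0.20, 100
diffs = simulate_differences(P1, N1, P2, N2, reps=2000)

# Theoretical standard deviation of the difference between sample proportions.
theory_sd = (P1 * (1 - P1) / N1 + P2 * (1 - P2) / N2) ** 0.5
print(f"simulated mean = {statistics.mean(diffs):.4f}, theory = {P1 - P2:.4f}")
print(f"simulated SD   = {statistics.stdev(diffs):.4f}, theory = {theory_sd:.4f}")
```

The simulated mean and standard deviation should sit close to p₁ - p₂ and to the theoretical standard deviation, which is the behaviour the large sample methods rely on.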

Large Sample CI for Two Proportions

  • For two independent Simple Random Samples of sizes n₁ and n₂ with sample proportions of successes p̂₁ and p̂₂, an approximate level C confidence interval for p₁ - p₂ is (p̂₁ - p̂₂) ± m, where m is the margin of error
  • m = z* × SE = z* √((p̂₁(1 - p̂₁) / n₁) + (p̂₂(1 - p̂₂) / n₂))
  • C is the area under the standard Normal curve between -z* and z*
  • This method is used when the number of successes and the number of failures are each at least 10 in each sample.

Large Sample CI Example

  • Assess how much the cholesterol-lowering drug Gemfibrozil reduces heart attack risk
  • Incidence of heart attack is compared over a 5-year period for 2 random samples of middle-aged men taking either the drug or a placebo
  • The standard error of the difference p̂₁ - p̂₂ is calculated
  • SE = √((p̂₁(1 - p̂₁) / n₁) + (p̂₂(1 - p̂₂) / n₂))
  • The confidence interval is (p̂₁ - p̂₂) ± z* SE
  • In a study, 56 out of 2051 men on Gemfibrozil had a heart attack (2.73%), while 84 out of 2030 men on a placebo had a heart attack (4.14%)
  • So the 90% CI is (0.0414 - 0.0273) ± 1.645 × 0.0057 = 0.014 ± 0.009
  • We are 90% confident that the percent of middle-aged men who suffer a heart attack is 0.5 to 2.3 percentage points lower when taking the cholesterol-lowering drug than when taking a placebo
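
The Gemfibrozil interval can be reproduced with a short script. This is a minimal sketch using only the Python standard library. The function name `large_sample_ci` is a hypothetical helper, and the count check mirrors the "at least 10" condition stated in these notes.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(x1, n1, x2, n2, level):
    """Large sample CI for p1 - p2 from success counts x1 and x2."""
    if min(x1, n1 - x1, x2, n2 - x2) < 10:
        raise ValueError("each count of successes and failures should be at least 10")
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf((1 + level) / 2)  # critical value z*
    margin = z_star * se
    return (p1 - p2) - margin, (p1 - p2) + margin

# Placebo group first so the difference is p(placebo) - p(drug).
low, high = large_sample_ci(84, 2030, 56, 2051, level=0.90)
print(f"90% CI for p(placebo) - p(drug): {low:.3f} to {high:.3f}")
```

The result should land close to the 0.5 to 2.3 percentage points quoted in the notes. Small differences can arise because the notes round the proportions before computing.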

“Plus Four” CI for Two Proportions

  • The “plus four” method produces more accurate confidence intervals
  • The method acts as if there were four additional observations: one success and one failure in each of the two samples
  • The new combined sample size is n₁ + n₂ + 4, and the proportions of successes are: p̃₁ = (X₁ + 1) / (n₁ + 2) and p̃₂ = (X₂ + 1) / (n₂ + 2)
  • An approximate level C confidence interval is: CI: (p̃₁ - p̃₂) ± z* √((p̃₁(1 - p̃₁) / (n₁ + 2)) + (p̃₂(1 - p̃₂) / (n₂ + 2)))
  • Use this method when C is at least 90% and both sample sizes are at least 5

“Plus Four” CI Example

  • Researchers compared oral health in 46 young adult males wearing a tongue piercing (TP) and a control group of 46 young adult males without tongue piercing
  • They found that 38 individuals in the TP group and 26 in the control group had enamel cracks
  • Goal: estimate with 95% confidence the difference between the proportions of individuals with enamel cracks among young adult males with and without TP
  • Only 8 of the 46 individuals in the TP group had no enamel cracks, which is too low a count for the large sample method, so the plus-four method can be used
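
The notes stop before computing the interval. The sketch below applies the plus-four formulas from the previous section to the tongue piercing counts. The function name `plus_four_ci` is a hypothetical helper, and the printed interval is this script's own calculation rather than a figure from the lesson.

```python
from math import sqrt
from statistics import NormalDist

def plus_four_ci(x1, n1, x2, n2, level=0.95):
    """Plus-four CI for p1 - p2: add one success and one failure to each sample."""
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    se = sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    z_star = NormalDist().inv_cdf((1 + level) / 2)  # critical value z*
    margin = z_star * se
    return (p1 - p2) - margin, (p1 - p2) + margin

# 38 of 46 with tongue piercing (TP) and 26 of 46 controls had enamel cracks.
low, high = plus_four_ci(38, 46, 26, 46, level=0.95)
print(f"95% plus-four CI for p(TP) - p(control): {low:.3f} to {high:.3f}")
```

Because the whole interval lies above zero, it points to a higher proportion of enamel cracks in the TP group, under the assumption that both groups can be treated as independent random samples.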

Hypothesis Tests for Two Proportions

  • The null hypothesis to test is H₀: p₁ = p₂ = p, where p is the common population proportion
  • If H₀ is true, we are sampling twice from the same population and we can pool the information from both samples to estimate p
  • The pooled sample proportion is p̂ = (total successes) / (total observations) = (count₁ + count₂) / (n₁ + n₂)
  • The z-score is z = (p̂₁ - p̂₂) / √(p̂(1 - p̂)(1/n₁ + 1/n₂))
  • The test is appropriate when all counts (successes and failures in each sample) are 5 or more

Hypothesis Test Example

  • Gastric freezing was once a treatment for ulcers
  • Initial results showed the treatment to be safe and to significantly reduce ulcer pain, and it was widely used for years
  • A randomized comparative experiment compared the outcome of gastric freezing with that of a placebo
  • 28 of the 82 patients subjected to gastric freezing improved, while 30 of the 78 in the control group improved
  • H₀: p(gastric freezing) = p(placebo)
  • Hₐ: p(gastric freezing) > p(placebo)
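  • The sample proportions are 28/82 ≈ 0.34 for gastric freezing and 30/78 ≈ 0.38 for placebo; because the observed difference runs opposite to Hₐ, the one-sided P-value is greater than 50%
  • Gastric freezing was therefore not significantly better than a placebo (P-value > 0.1), and the treatment was abandoned
  • Always use a control group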
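
The test statistic and one-sided P-value for the gastric freezing data can be sketched as follows. It uses the pooled proportion and z formula from the previous section. The function name `two_proportion_z_test` is a hypothetical helper, and the alternative is fixed to Hₐ: p₁ > p₂ to match this example.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z statistic and one-sided P-value for Ha: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # estimate of the common p under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail area
    return z, p_value

# 28 of 82 improved with gastric freezing; 30 of 78 improved with placebo.
z, p_value = two_proportion_z_test(28, 82, 30, 78)
print(f"z = {z:.2f}, one-sided P-value = {p_value:.2f}")
```

The z statistic comes out negative because the placebo group improved slightly more often, which is why the upper-tail P-value exceeds 0.5.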

Relative Risk and Odds Ratio

  • In the health sciences, we often compare a given health risk in the treatment group with the same risk in the control group
  • One measure of this is the Relative Risk Reduction (RRR), which indicates how much better off one would be relative to receiving a placebo or control treatment
  • RRR = (p(control) - p(treatment)) / p(control)

Relative Risk Example

  • Determine how much the cholesterol-lowering drug Gemfibrozil helps reduce the risk of heart attack
  • The risk of a heart attack is compared over a 5-year period for two random samples of middle-aged men taking either the drug or a placebo
  • In the drug group 56 out of 2051 had a heart attack
  • In the placebo group 84 out of 2030 had a heart attack
  • RRR = (p(placebo) - p(drug)) / p(placebo) = (0.0414 - 0.0273) / 0.0414 ≈ 0.34
  • Gemfibrozil reduces the risk of a heart attack in middle-aged men by about 34% over a 5-year period of continuous treatment, compared with middle-aged men taking a placebo (RRR = 34%)
  • The risk of a heart attack over that period is 34% smaller in the Gemfibrozil group than in the placebo group
  • The Absolute Risk Reduction (ARR) is simply the absolute difference in outcome rates between the control and treatment groups: ARR = p(control) - p(treatment)
  • The Number Needed to Treat (NNT) is the number of patients that would need to be treated to prevent one additional negative outcome: NNT=1/ARR
  • ARR and NNT are better indicators of treatment efficacy than RRR
  • ARR = p(placebo) - p(drug) = 0.0414 - 0.0273 = 0.0141
  • For the group taking Gemfibrozil, the rate of heart attack was 1.4 percentage points lower than that of the placebo group (ARR = 1.4%)
  • NNT = 1/ARR = 1/0.0141 ≈ 70.9
  • On average, we need to treat 71 men for 5 years with Gemfibrozil to avoid 1 heart attack (NNT = 71)
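
The three risk measures can be computed together from the raw counts. This is a minimal sketch. The function name `risk_measures` is a hypothetical helper, and it works from unrounded proportions, so the NNT may differ slightly in the first decimal from the hand calculation above.

```python
def risk_measures(events_control, n_control, events_treatment, n_treatment):
    """Return (RRR, ARR, NNT) from event counts in control and treatment groups."""
    p_control = events_control / n_control
    p_treatment = events_treatment / n_treatment
    arr = p_control - p_treatment  # absolute risk reduction
    if arr <= 0:
        raise ValueError("treatment shows no risk reduction, so NNT is undefined")
    rrr = arr / p_control          # relative risk reduction
    nnt = 1 / arr                  # number needed to treat
    return rrr, arr, nnt

# Placebo: 84 of 2030 had a heart attack; Gemfibrozil: 56 of 2051.
rrr, arr, nnt = risk_measures(84, 2030, 56, 2051)
print(f"RRR = {rrr:.2f}, ARR = {arr:.4f}, NNT = {nnt:.1f}")
```

The output should agree with the notes: an RRR near 34%, an ARR near 1.4 percentage points, and an NNT of roughly 71.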

Group Project
