BIOSTATS 3.3 - CH. 19: INFERENCE ABOUT A POPULATION PROPORTION

Questions and Answers

Which assumption is crucial when performing inference on proportions?

  • The data used for the estimate are a random sample from the population studied. (correct)
  • The sample size is small, leading to a non-normal sampling distribution.
  • The population is at least 5 times as large as the sample size.
  • The data constitutes a non-random sample from the population.

If you increase your confidence level while keeping the sample size constant, what happens to the width of the confidence interval?

  • The width increases. (correct)
  • The width stays the same.
  • The effect on the width is unpredictable.
  • The width decreases.

In hypothesis testing for a population proportion, what is the role of the null hypothesis?

  • It represents the alternative claim you are trying to find evidence for.
  • It helps in estimating the sample size.
  • It provides a baseline against which the sample data are evaluated. (correct)
  • It is used to calculate the confidence interval.

Why is it important to check conditions before making inferences about a population proportion?

  • To validate assumptions required for the statistical methods to be reliable. (correct)

Consider an experiment where 30 out of 120 patients improved with a new drug. What is the sample proportion?

  • 0.25 (correct)

The sampling distribution of a sample proportion is approximated by a Normal curve when:

  • The sample size is large enough. (correct)

The mean and standard deviation of the sampling distribution are determined by:

  • The population proportion, p, and the sample size, n. (correct)

Under what conditions should the 'large sample method' for calculating confidence intervals be used cautiously?

  • When both the number of successes and failures are very small. (correct)

In an arthritis study, out of 440 patients, 23 reported side effects with a new pain reliever. What is the sample proportion?

  • Approximately 0.052 (correct)

What is the critical value (z*) for a 90% confidence level?

  • 1.645 (correct)

What does the 'plus four' method adjust in the context of estimating population proportions?

  • It adjusts the sample proportion by adding four observations (two successes and two failures). (correct)

When is the 'plus four' method most applicable for constructing confidence intervals?

  • When the sample size is small and the confidence level is at least 90%. (correct)

Under what circumstance is it most appropriate to use p* = 0.5 when choosing sample size for estimating a population proportion?

  • When you have no prior knowledge or educated guess about the population proportion. (correct)

What is the primary reason for choosing a larger sample size when estimating a population proportion?

  • To reduce variability and increase the precision of the estimate. (correct)

If the null hypothesis (H₀: p = p₀) is true, what does the test statistic for hypothesis tests for proportions measure?

  • The standardized value of the sample proportion. (correct)

Which condition needs to be satisfied to ensure the validity of a hypothesis test for a population proportion?

  • The expected counts of both successes and failures must be at least 10. (correct)

What is the P-value in hypothesis testing?

  • The probability of observing a test statistic as extreme as, or more extreme than, the one computed if the null hypothesis is true. (correct)

In a hypothesis test for a population proportion, if the P-value is less than the significance level, what conclusion can be made?

  • Reject the null hypothesis. (correct)

When testing the hypothesis that aphids land on their ventral side 50% of the time versus the alternative that it is greater than 50%, and a test statistic of z = 4.02 is obtained, what does this suggest?

  • The probability of aphids landing on their ventral side is significantly higher than 50%. (correct)

In Mendel's experiment, if the observed proportion of smooth peas deviates slightly from the expected 75%, and the resulting P-value is high (e.g., 0.61), the data is:

  • Consistent with the expectations of a dominant-recessive genetic model. (correct)

Flashcards

Assumptions for inference on proportions

A random sample from the population studied, where the population is at least 20 times the sample size. The sample size is large enough for a normal sampling distribution.

Sample Proportion (p̂)

The number of successes in the sample divided by the total number of observations in the sample.

Confidence Interval for p

A range of values likely to contain the true population proportion, calculated from sample data.

Large Sample Method

A method for computing a confidence interval for a proportion, best used with large samples.

"Plus Four" Method

A method for computing a confidence interval for a proportion that adds four observations (two successes, two failures) to the sample. It is more accurate even for small samples.

Approximate Level C Confidence Interval

A range that estimates the population proportion, calculated by adding and subtracting a margin of error (m) from the sample proportion.

Choosing the Sample Size

Calculating the sample size that will achieve a desired margin of error, informed by initial estimates.

Hypothesis Tests for p

A statistical test used to determine whether there is enough evidence to reject a null hypothesis about a population proportion.

Null Hypothesis (H₀)

H₀: p = p₀. The claim that the population proportion equals a given value p₀; it is assumed true when computing the test statistic.

P-value

The probability of obtaining a test statistic as extreme as, or more extreme than, the one actually observed, assuming the null hypothesis is true.

Study Notes

  • Chapter focuses on inference about a population proportion.
  • The material is based on copyright 2018 W Freeman and Company
  • The content includes previous learning objectives, learning objectives, conditions of inference, sample proportions, sampling distribution, confidence intervals for p, choosing sample size and hypothesis tests

Previous Learning Objectives

  • Covers comparing two means
  • Includes two-sample situations
  • Two sample t procedures are relevant
  • Robustness is assumed
  • Pooled procedures should be avoided
  • Avoid inferences on standard deviations

Learning Objectives

  • How to apply inference for a population proportion
  • Use the sample proportion p̂
  • Calculate large sample confidence intervals for a proportion
  • How to find more accurate confidence intervals for a proportion
  • Selection of the sample size
  • Use of hypothesis tests for a proportion

Conditions for Inference on Proportions

  • The data used for the estimate must come from a random sample of the population
  • The population must be at least 20 times larger than the sample to ensure independence in random sampling
  • The sample size n must be large enough to assume a Normal shape of the sampling distribution
  • The required minimum size of n depends on the type of inference conducted

The Sample Proportion p̂

  • Inference here uses categorical data to estimate the proportion (percentage) of a population with a specific trait
  • If a categorical trait is labelled a "success", the sample proportion of successes is p̂
  • p̂ is calculated as: (count of successes in the sample) / (count of observations in the sample)
  • Example: a group of 120 herpes patients is treated with a new drug, and 30 improve, so p̂ = 30/120
  • Therefore p̂ = 0.25: the proportion of patients improving in the sample is 0.25

Sampling Distribution of p̂

  • The sampling distribution of p̂ is never exactly Normal
  • However, it can be approximated by a Normal curve if the sample is large enough
  • The mean and standard deviation (spread) of the sampling distribution are determined by p and n
  • The population parameter to estimate is p
  • p̂ is approximately N(p, √(p(1 − p)/n)): mean p and standard deviation √(p(1 − p)/n)
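
The Normal approximation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `sampling_distribution` is hypothetical, and the value p = 0.25 is an assumed population proportion chosen only to show the arithmetic (in practice p is unknown).

```python
import math

def sampling_distribution(p, n):
    """Mean and standard deviation of the Normal approximation to p-hat."""
    mean = p                          # the sampling distribution is centered at p
    sd = math.sqrt(p * (1 - p) / n)   # the spread shrinks as n grows
    return mean, sd

# Assumed values for illustration only: p = 0.25, n = 120
mean, sd = sampling_distribution(p=0.25, n=120)
print(f"mean = {mean:.3f}, sd = {sd:.4f}")

# Quadrupling the sample size halves the standard deviation
_, sd_large = sampling_distribution(p=0.25, n=480)
print(f"sd with n = 480: {sd_large:.4f}")
```

Because n sits under the square root, cutting the spread in half requires four times as many observations.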

Confidence Interval for p

  • Because p is unknown, the center and spread of the sampling distribution are unknown
  • A value for p has to be "guessed"
  • Two options exist:
  • Use p̂, the sample proportion. This is the large sample method, which performs poorly unless n is very large
  • Use p̃, an improved estimate of p. This is the plus four method, which is reasonably accurate even for samples as small as 10

Large Sample Confidence Interval for p

  • A level C confidence interval comes from a method that captures the population proportion p in C% of samples
  • For an SRS (simple random sample) of size n with sample proportion p̂ calculated from the data, an approximate level C confidence interval for p is given by
  • CI: p̂ ± m, with m = z* × SE = z* √(p̂(1 − p̂)/n)
  • Use this method when the number of successes and the number of failures are both at least 15
  • Medication side effects example
  • Arthritis is a painful inflammation of the joints
  • Experiments tested the side effects of pain relievers in arthritis patients to determine what proportion of patients suffer side effects
  • Goal: compute a 90% confidence interval for the population proportion of arthritis patients who suffer "adverse symptoms"
  • Serious side effects of ibuprofen:
  • Allergic reactions
  • Muscle cramps, numbness, or tingling
  • Ulcers in the mouth
  • Rapid weight gain (fluid retention)
  • Seizures
  • Black, bloody, or tarry stools
  • Blood in urine or vomit
  • Decreased hearing or ringing in the ears
  • Jaundice
  • Abdominal cramping, indigestion, or heartburn
  • Less serious side effects of ibuprofen
  • Dizziness or headache
  • Nausea, gaseousness, diarrhea, or constipation
  • Depression
  • Fatigue or weakness
  • Dry mouth
  • Irregular menstrual periods
  • From a sample (n = 440), 23 patients reported side effects
  • p̂ = 23/440 ≈ 0.052
  • For a 90% confidence level, z* = 1.645
  • m = 1.645 × √(0.052(1 − 0.052)/440) ≈ 0.017
  • 90% CI for p: 0.052 ± 0.017
  • Thus, with 90% confidence, between 3.5% and 6.9% of arthritis patients taking this pain medication experience some adverse symptoms
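
The large sample interval for the arthritis example can be reproduced with a short Python sketch. The function name `large_sample_ci` is hypothetical, and z* is taken from the standard library's `statistics.NormalDist` rather than from a printed table, so the last digit may differ slightly from the hand calculation, which rounds p̂ to 0.052 first.

```python
import math
from statistics import NormalDist

def large_sample_ci(successes, n, confidence):
    """Large sample CI for a proportion: p-hat +/- z* sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    # z* leaves (1 - confidence)/2 in each tail of the standard Normal curve
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Arthritis example from the notes: 23 of 440 patients, 90% confidence
p_hat, margin, (low, high) = large_sample_ci(23, 440, 0.90)
print(f"p-hat = {p_hat:.3f}, margin = {margin:.3f}")
print(f"90% CI: {low:.3f} to {high:.3f}")
```

The sketch does not check the condition that successes and failures are both at least 15; a caller would need to verify that separately.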

Plus Four Confidence Interval for p

  • The plus four method gives more accurate confidence intervals than the large sample method
  • The method works as if four additional observations were made, where two were successes and two were failures
  • The new sample size: n + 4, and the count of successes: X + 2
  • The "plus four" estimate of p is: p̃ = (count of successes + 2) / (count of all observations + 4)
  • The approximate level C confidence interval is CI: p̃ ± m, with m = z* × SE = z* √(p̃(1 − p̃)/(n + 4))
  • This method is best employed when C is at least 90% and the sample size is at least 10
  • Arthritis example: a 90% confidence interval for the population proportion of arthritis patients who suffer "adverse symptoms"
  • The "plus four" estimate of p is: p̃ = (23 + 2)/(440 + 4) = 25/444 ≈ 0.056
  • The margin of error for an approximate 90% confidence interval using the plus four method is:
  • m = 1.645 × √(0.056(1 − 0.056)/444) ≈ 0.018
  • 90% CI for p: 0.056 ± 0.018
  • Therefore, with 90% confidence, between 3.8% and 7.4% of the population of arthritis patients taking this pain medication experience adverse symptoms
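
The plus four calculation differs from the large sample one only in the adjusted counts, as the following Python sketch shows. The function name `plus_four_ci` is hypothetical, and z* comes from `statistics.NormalDist` instead of a table.

```python
import math
from statistics import NormalDist

def plus_four_ci(successes, n, confidence):
    """Plus four CI: add 2 successes and 2 failures, then proceed as usual."""
    n_adj = n + 4
    p_tilde = (successes + 2) / n_adj
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_tilde * (1 - p_tilde) / n_adj)
    return p_tilde, margin, (p_tilde - margin, p_tilde + margin)

# Arthritis example from the notes: 23 of 440 patients, 90% confidence
p_tilde, margin_pf, (low_pf, high_pf) = plus_four_ci(23, 440, 0.90)
print(f"p-tilde = {p_tilde:.3f}, margin = {margin_pf:.3f}")
print(f"90% CI: {low_pf:.3f} to {high_pf:.3f}")
```

With a sample this large the two methods give nearly the same interval; the adjustment matters most when n is small or p̂ is close to 0 or 1.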

Choosing the Sample Size

  • In some cases, a sample size needs to be chosen to achieve a specified margin of error
  • The sampling distribution of p̂ depends on the unknown population proportion p; therefore, a likely value for p must be guessed, written p*
  • p̂ ~ N(p, √(p(1 − p)/n)) → n = (z*/m)² p*(1 − p*)
  • Make an educated guess, or use p* = 0.5 for the most conservative estimate
  • Example
  • Goal: find the sample size needed for a margin of error of no more than 0.01 (1 percentage point) at a 90% confidence level
  • We could use p* = 0.5, but because the drug has been approved for over-the-counter sale, we can assume that fewer than 10% of patients will suffer symptoms, so p* = 0.1 is a better guess than 0.5
  • For a 90% confidence level, z* = 1.645
  • n = (1.645/0.01)² (0.1)(0.9) ≈ 2435.4
  • To obtain a margin of error of no more than 0.01, we need a sample of at least 2436 arthritis patients (rounding up); using p* = 0.5 instead would have required 6766 patients
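
The sample size formula n = (z*/m)² p*(1 − p*) can be checked with a small Python sketch. The function name `required_sample_size` is hypothetical; z* is passed in explicitly so that the table value 1.645 used in the notes can be supplied directly.

```python
import math

def required_sample_size(margin, z_star, p_guess=0.5):
    """Smallest n with margin of error <= margin, given a guessed proportion."""
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n_exact)  # always round up to a whole number of subjects

# Example from the notes: margin 0.01, 90% confidence (z* = 1.645)
n_informed = required_sample_size(0.01, 1.645, p_guess=0.1)
n_conservative = required_sample_size(0.01, 1.645)  # default guess p* = 0.5
print(n_informed, n_conservative)
```

Because the result is rounded up, small differences in how z* is rounded can shift the answer by a few patients. The default guess of 0.5 maximizes p*(1 − p*), which is why it gives the most conservative (largest) sample size.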

Hypothesis Tests for p

  • When testing, H₀: p = p₀ (a given value being tested)
  • If H₀ is true, the sampling distribution of p̂ is known
  • The test statistic is the standardized value of p̂
  • z = (p̂ − p₀)/√(p₀(1 − p₀)/n)
  • The test is valid when the expected count of successes np₀ and the expected count of failures n(1 − p₀) are each 10 or larger
  • The P-value is the probability, if H₀ were true, of obtaining a test statistic as extreme as the one computed, or more extreme, in the direction of Ha
  • For Ha: p > p₀, the P-value is P(Z ≥ z)
  • For Ha: p < p₀, the P-value is P(Z ≤ z)
  • For Ha: p ≠ p₀, the P-value is 2P(Z ≥ |z|)
  • Aphid example
  • Live aphids dropped upside-down landed on their ventral side in 95% of trials, versus 52.2% of trials for dead aphids
  • Test for evidence (at the 5% significance level) that live aphids land right side up more often than chance
  • "Chance" here means 50% ventral landings
  • Test H₀: p = 0.5 versus Ha: p > 0.5
  • z = (0.95 − 0.5)/√(0.5 × 0.5/20) ≈ 4.02
  • The expected counts of successes and failures are each 20 × 0.5 = 10, so the z procedure is valid
  • The P-value is P(Z ≥ 4.02). From Table B, P = 1 − P(Z < 4.02) < 0.0002, which is highly significant
  • Therefore reject H₀: there is strong evidence against it (P < 0.0002)
  • Thus, the righting behavior of live aphids is better than chance
  • Mendel's pea example
  • The dominant-recessive genetic model predicts that crossing dominant and recessive homozygous parents yields a second generation in which 75% show the dominant trait
  • When Mendel crossed pure-breeding plants producing smooth peas with plants producing wrinkled peas, the second generation (F2) consisted of 5474 smooth peas and 1850 wrinkled peas
  • Test for evidence that the proportion of smooth peas in the F2 population is not 75%
  • Test H₀: p = 0.75 versus Ha: p ≠ 0.75
  • The sample proportion is p̂ = 5474/(5474 + 1850) ≈ 0.7474
  • z = (0.7474 − 0.75)/√(0.75 × 0.25/7324) ≈ −0.513
  • From Table B, P = 2P(Z < −0.51) = 2 × 0.3050 = 0.61 (not significant)
  • Therefore, H₀ cannot be rejected: the data are consistent with a dominant-recessive genetic model
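
Both worked examples follow the same recipe, which the Python sketch below walks through. The function name `one_proportion_z_test` and the `alternative` labels are hypothetical. The aphid count of 19 successes is an assumption inferred from the notes (95% of the 20 trials used in the z formula), and P-values come from `statistics.NormalDist` rather than Table B.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative):
    """z statistic and P-value for H0: p = p0 against the given alternative."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)  # SE uses p0, not p-hat
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: p > p0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":         # Ha: p < p0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":    # Ha: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Aphid example: assumed 19 ventral landings in 20 trials, Ha: p > 0.5
z_aphid, p_aphid = one_proportion_z_test(19, 20, 0.5, "greater")
print(f"aphids: z = {z_aphid:.2f}, P = {p_aphid:.6f}")

# Mendel example: 5474 smooth peas out of 7324, Ha: p != 0.75
z_pea, p_pea = one_proportion_z_test(5474, 5474 + 1850, 0.75, "two-sided")
print(f"peas:   z = {z_pea:.3f}, P = {p_pea:.2f}")
```

The standard error is built from p₀ because the test statistic is computed under the assumption that H₀ is true. The sketch does not verify that np₀ and n(1 − p₀) are at least 10.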

Previous on Biostats: From the exam

  • Sample space: A list or description of all possible outcomes of a random process
  • An event is a subset of the sample space
  • P(A or B) = P(A) + P(B) – P(A and B)
  • The probability that a randomly chosen person will test positive depends on the true-positive rate among patients with the condition and the false-positive rate among healthy people
  • Concepts regarding parameter vs. statistic are relevant
  • Margin of error
