BIOSTATS 3.5 - CH. 21: THE CHI-SQUARE TEST FOR GOODNESS OF FIT

Questions and Answers

A researcher is testing if the number of births is the same across all days of the week. What null hypothesis is appropriate for a chi-square goodness-of-fit test?

  • The proportion of births is equal across all days of the week. (correct)
  • Births are normally distributed across the week.
  • The mean number of births is equal to the median number of births.
  • The number of births on weekends is equal to the number of births on weekdays.

In a chi-square goodness-of-fit test, what do 'observed counts' represent?

  • The standardized residuals.
  • The actual number of observations in each category. (correct)
  • The degrees of freedom.
  • The counts expected under the null hypothesis.

In the chi-square statistic formula, what do large values of the chi-square statistic indicate?

  • That the observed distribution closely matches the expected distribution.
  • Strong evidence in favor of the null hypothesis.
  • That the sample size is too small.
  • Strong deviations from the expected distribution under the null hypothesis. (correct)

The chi-square distributions are a family of distributions described by what parameter?

  • The degrees of freedom. (correct)

A researcher conducts a chi-square goodness-of-fit test and obtains a p-value of 0.015. If the significance level is 0.05, what conclusion can be drawn?

  • Reject the null hypothesis. (correct)

In a genetics experiment, the expected ratio of phenotypes is 1:2:1. If the observed counts are 22, 50, and 28, respectively, which calculation is needed to determine the chi-square statistic for the first category?

  • $(22-25)^2 / 25$ (correct)

What is the purpose of comparing observed and expected counts in a chi-square goodness-of-fit test?

  • To assess how well the observed data fit the expected distribution. (correct)

Consider a scenario where the individual chi-square components are very large. What do these large components indicate?

  • Each category significantly deviates from what is expected under the null hypothesis. (correct)

What does a non-significant p-value in a chi-square goodness-of-fit test imply?

  • The data are not inconsistent with the model. (correct)

Which of the following scenarios violates the conditions for a chi-square goodness-of-fit test?

  • More than 20% of the expected counts are less than 5. (correct)

In a plant genetics experiment, a researcher expects a 3:1 ratio of yellow to green plants. The researcher observes 84 yellow and 16 green plants. What should the researcher do next to test the hypothesis?

  • Conduct a chi-square goodness-of-fit test. (correct)

A chi-square test yields a test statistic of 4.32 with 1 degree of freedom. Using this information, select the statement which best reflects the result.

  • The p-value is less than 0.05, suggesting a statistically significant result. (correct)

A researcher hypothesizes that the distribution of eye colors in a population is 60% brown, 30% blue, and 10% green. After sampling 200 individuals, they observe 110 brown, 50 blue, and 40 green-eyed people. What are the expected counts for each category under the null hypothesis?

  • 120, 60, 20 (correct)

A researcher investigates the distribution of M&M colors and finds a non-significant p-value. What is the most appropriate conclusion?

  • There is insufficient evidence to conclude that the observed distribution differs from the expected distribution. (correct)

What information can be gleaned from the individual components of a chi-square statistic?

  • Which categories contribute most to the overall chi-square value. (correct)

Why is it important to ensure that expected counts are not too small when conducting a chi-square test?

  • Small expected counts can lead to an inflated chi-square statistic and an inaccurate p-value. (correct)

When can the chi-square test for goodness of fit be appropriately used?

  • When the data consist of a single SRS from a population where the variable is categorical. (correct)

In a chi-square test, having all expected counts greater than or equal to 1.0 and no more than 20% of the expected counts less than 5.0 ensures what?

  • The chi-square test is valid. (correct)

What is a common mistake to avoid when interpreting a non-significant result in a chi-square goodness-of-fit test?

  • Assuming the null hypothesis has been validated. (correct)

How does the chi-square goodness-of-fit test extend beyond just assessing a 1:1 ratio?

  • It can accommodate multiple categories and complex ratios like 12:7:4:1. (correct)

Flashcards

Chi-square Test

A test for a categorical variable (one SRS) with any number of levels.

Expected Counts

The number of observations of each type that we would expect to see if the null hypothesis were true.

Observed Counts

The actual number of observations of each type.

Chi-Square Statistic

Compares observed and expected counts. High values mean strong deviations from the expected distribution under the null hypothesis.

Chi-square Distributions

A family of distributions that take only positive values, are skewed to the right, and are described by their degrees of freedom.

Conditions for Chi-Square

Must have a single SRS; the variable must be categorical with mutually exclusive levels; all expected counts must be >= 1.0; and no more than 20% of the expected counts may be < 5.0.

Interpreting Chi-Square

The largest components show which categories differ most from what H0 predicts. Comparing observed and expected counts or proportions helps interpret the result.

Study Notes

  • The chi-square test for goodness of fit is covered in Chapter 21

Comparing Two Proportions

  • Two-sample problems involve proportions
  • The sampling distribution assesses the difference between two proportions.
  • Large sample confidence intervals are used for comparing proportions.
  • More accurate confidence intervals assist in comparing proportions.
  • Hypothesis tests facilitate the comparison of proportions.
  • Relative risk and odds ratio relate to comparing two proportions

Large Sample CI for Two Proportions

  • Approximate level C confidence interval (CI) for p1 − p2, given two independent simple random samples (SRSs) of sizes n1 and n2 with sample proportions of successes p̂1 and p̂2: (p̂1 − p̂2) ± m
  • m is the margin of error.
  • m = z* SE_diff = z* √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
  • C signifies the area under the standard Normal curve between −z* and z*.
  • Use this method when the numbers of successes and failures are each at least 10 in both samples.
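
The interval above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lesson: the function name `two_proportion_ci` and the counts passed to it are hypothetical, and z* is taken from the standard library's Normal distribution rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(count1, n1, count2, n2, level=0.95):
    """Large-sample level-C confidence interval for p1 - p2 (two independent SRSs)."""
    p1_hat = count1 / n1
    p2_hat = count2 / n2
    # z* leaves area `level` between -z* and z* under the standard Normal curve
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    se_diff = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    margin = z_star * se_diff
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Hypothetical counts, for illustration only: 45/100 vs 30/100 successes
lower, upper = two_proportion_ci(45, 100, 30, 100)
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

Both hypothetical samples have at least 10 successes and 10 failures, so the large-sample condition stated above is met.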

Hypothesis Tests for Two Proportions

  • The null hypothesis (H0) is that p1 = p2 = p
  • If H0 is true, both samples come from populations with the same proportion p, so the information from both samples can be pooled to estimate p.
  • The pooled sample proportion is p̂ = total successes / total observations = (count1 + count2) / (n1 + n2)
  • z = (p̂1 − p̂2) / √(p̂(1 − p̂)(1/n1 + 1/n2))
  • This is appropriate when all counts (successes and failures in each sample) are 5 or more.
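
As a rough sketch of the pooled test, the snippet below computes z and a two-sided P-value. The function name and the counts are hypothetical, and the two-sided alternative is an assumption, since the notes do not state the alternative hypothesis.

```python
from math import erfc, sqrt


def two_proportion_z_test(count1, n1, count2, n2):
    """Pooled z statistic for H0: p1 = p2, with a two-sided P-value."""
    p1_hat = count1 / n1
    p2_hat = count2 / n2
    # Pooled estimate of the common proportion p under H0
    p_pooled = (count1 + count2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided tail area of the standard Normal: 2 * P(Z > |z|)
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.3f}, two-sided P = {p_value:.4f}")
```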

Chi-Square Test for Goodness of Fit

  • The chi-square test for goodness of fit is applied to a categorical variable (one SRS) with any number of levels.
  • The null hypothesis may state that all population proportions are equal (uniform hypothesis).
  • A goodness-of-fit hypothesis example question: Are hospital births uniformly distributed in the week? H0: p1 = p2 = p3 = p4 = p5 = p6 = p7 = 1/7
  • H0 may instead set the population proportions to specific values, provided they sum to 1. Co-dominant phenotype example: when crossing homozygous parents expressing two co-dominant phenotypes A and B, the F2 is expected to follow H0: pA = 1/4, pAB = 1/2, pB = 1/4, where AB is an intermediate phenotype.
  • The chi-square test is used when data are categorical. It checks how different the observed data are from expectation under H0.

Chi-Square Statistic

  • The chi-square statistic (X²) compares observed and expected counts.
  • Observed counts: The actual number of observations of each type.
  • Expected counts: The number of observations expected of each type under the null hypothesis.
  • X² = Σ ((observed count − expected count)² / expected count). Each term is calculated separately for one type, and the terms are then summed.
  • Large X² values represent strong deviations from the expected distribution under H0 and so count as evidence against H0.
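
A minimal sketch of this calculation, using the 1:2:1 genetics counts from the quiz above (22, 50, 28). The helper name `chi_square_components` is an illustrative choice, not something defined in the lesson.

```python
def chi_square_components(observed, null_proportions):
    """Return expected counts and per-category (obs - exp)^2 / exp terms."""
    n = sum(observed)
    expected = [n * p for p in null_proportions]
    components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return expected, components


observed = [22, 50, 28]                      # counts from the quiz item
expected, components = chi_square_components(observed, [1 / 4, 1 / 2, 1 / 4])
x2 = sum(components)                         # the chi-square statistic
print("expected:", expected)
print("components:", [round(c, 3) for c in components])
print("X^2 =", round(x2, 3))
```

The first component is (22 − 25)²/25, matching the quiz answer for the first category.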

Chi-Square Distributions

  • Chi-square distributions are a family of right-skewed distributions that take only positive values; each member is specified by its degrees of freedom.
  • Published tables give critical values for selected upper-tail areas of many chi-square distributions; software reports the upper-tail area directly.

Chi-Square Test for Goodness of Fit

  • With k proportions, the chi-square statistic for goodness of fit measures the divergence between observed and expected counts.
  • X² = Σ ((count of outcome i − n·p_i0)² / (n·p_i0)), where n is the sample size and p_i0 is the proportion hypothesized for outcome i under H0. Under H0, X² approximately follows the chi-square distribution with k − 1 degrees of freedom.
  • The P-value is the upper-tail area to the right of X² under the chi-square distribution with df = k − 1.
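
For readers without a chi-square table, the upper-tail area can be computed with the standard library alone. The sketch below is one possible approach, assuming integer degrees of freedom: the helper `chi2_sf` starts from the closed forms for df = 1 and df = 2 and steps up with the incomplete-gamma recurrence. The birth counts are hypothetical and serve only to exercise the uniform H0 with k = 7.

```python
import math


def chi2_sf(x, df):
    """Upper-tail area P(chi-square with df degrees of freedom > x), integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                 # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))      # closed form for df = 1
    # Raise the shape a toward df/2: Q(a+1, y) = Q(a, y) + y^a e^(-y) / Gamma(a+1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical birth counts for the 7 days of the week (not from the lesson)
observed = [90, 110, 105, 100, 108, 95, 92]
k = len(observed)
expected = sum(observed) / k                     # uniform H0: each p_i0 = 1/7
x2 = sum((o - expected) ** 2 / expected for o in observed)
p_value = chi2_sf(x2, df=k - 1)
print(f"X^2 = {x2:.2f}, df = {k - 1}, P = {p_value:.3f}")
```

In practice a statistics package would normally be used for the tail area; the recurrence is shown only to keep the example self-contained.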

Goodness of Fit Example

  • Aphids avoid ladybugs by dropping off leaves. An experiment examines this mechanism.
  • In the experiment, live aphids landed on their ventral side in 95% of the trials (19 out of 20) when dropped upside-down from tweezers; dead aphids landed on their ventral side in 52.2% of the trials (12 out of 23).
  • Question: At a 5% significance level, is there evidence live aphids’ landing right side up (on their ventral side) is not a chance event?
  • H0: pventral = 0.5 and pdorsal = 0.5
  • Ha: H0 is not true.
  • Expected counts are large enough (both equal 10, which is greater than 5.0).
  • X² calculation: Σ (obs − exp)² / exp = (19 − 10)²/10 + (1 − 10)²/10 = 8.1 + 8.1 = 16.2, with df = 1. Since X² > 12.12, P < 0.0005 (software gives P-value = 0.000057).
  • This is a highly significant result: there is very strong evidence that the righting behavior of live aphids is not chance (P < 0.0005). H0 is rejected.
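
The live-aphid calculation can be reproduced as a quick check. This sketch relies on the fact that, for df = 1, the chi-square upper-tail area equals the two-sided standard Normal tail area, which the standard library exposes through `math.erfc`; the variable names are illustrative.

```python
from math import erfc, sqrt

observed = [19, 1]                    # live aphids: ventral, dorsal landings
n = sum(observed)
expected = [n * 0.5, n * 0.5]         # H0: p_ventral = p_dorsal = 0.5

components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(components)

# df = 2 - 1 = 1, so P(chi-square > x2) = P(|Z| > sqrt(x2)) = erfc(sqrt(x2 / 2))
p_value = erfc(sqrt(x2 / 2))
print(f"components = {components}, X^2 = {x2:.1f}, P = {p_value:.6f}")
```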

Interpreting Significant Chi-Square Results

  • The individual values summed in the X² statistic are the chi-square components (or contributions).
  • When the test is statistically significant, the largest components indicate the categories that differ most from what H0 predicts.
  • Compare observed and expected counts to interpret the findings.
  • The actual proportions can be compared qualitatively in a graph.
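
To illustrate reading the components, the sketch below uses the eye-color counts from the quiz above (observed 110, 50, 40; expected 120, 60, 20) and picks out the category with the largest contribution. The variable names are illustrative.

```python
labels = ["brown", "blue", "green"]
observed = [110, 50, 40]
expected = [120, 60, 20]              # 60%, 30%, 10% of n = 200 under H0

# One chi-square component per category
components = {
    label: (o - e) ** 2 / e
    for label, o, e in zip(labels, observed, expected)
}
largest = max(components, key=components.get)

for label in labels:
    print(f"{label:>5}: component = {components[label]:.3f}")
print("largest contribution:", largest)
```

Here the green category dominates the statistic: 40 green-eyed people were observed where only 20 were expected.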

Lack of Significance

  • A non-significant P-value is not conclusive.
  • H0 could be true (or not).
  • This is relevant in the chi-square goodness of fit test where we are often interested in the hypothesis that the data fit a particular model.
  • A significant P-value indicates the data do not follow that model.
  • A non-significant P-value is NOT validation of the null hypothesis.
  • It does NOT indicate that the data follow the hypothesized model; it shows only that the data are not inconsistent with the model.

Lack of Significance Example

  • The Frizzle fowl is a chicken variety with curled feathers. The experiment is a genetic cross between Frizzle and Leghorn (straight-feathered) chickens.
  • The F1 chickens produced all had slightly frizzled feathers.
  • Phenotype counts in the F2 were 23 Frizzled, 50 Slightly frizzled, and 20 Straight.
  • The most likely genetic model is that of a single gene locus with two co-dominant alleles producing a 1:2:1 ratio in F2.
  • This raises the question of whether the data are consistent with such a model.
  • H0: pF = 1/4; psF = 2/4; ps = 1/4
  • Ha: H0 is not true
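  • X² = (23 − 23.25)²/23.25 + (50 − 46.5)²/46.5 + (20 − 23.25)²/23.25 ≈ 0.72, where the expected counts 23.25, 46.5, and 23.25 come from n = 93.
  • The chi-square components are 0.002688 for Frizzled, 0.263441 for Slightly frizzled, and 0.454301 for Straight.
  • With df = 2, the P-value is about 0.70, so the data are not inconsistent with the 1:2:1 model.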
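
The Frizzle calculation can be checked with a short sketch. It assumes df = k − 1 = 2 and uses the closed form for that case, where the chi-square upper-tail area is exp(−X²/2); the variable names are illustrative.

```python
from math import exp

labels = ["Frizzled", "Slightly frizzled", "Straight"]
observed = [23, 50, 20]
null_proportions = [1 / 4, 2 / 4, 1 / 4]          # 1:2:1 model under H0

n = sum(observed)
expected = [n * p for p in null_proportions]
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(components)

# With 3 categories, df = 2, and P(chi-square_2 > x) = exp(-x / 2)
p_value = exp(-x2 / 2)

for label, c in zip(labels, components):
    print(f"{label}: {c:.6f}")
print(f"X^2 = {x2:.2f}, P = {p_value:.2f}")
```

A P-value this large gives no reason to reject the 1:2:1 model, though it does not prove the model correct.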

Conditions for the Goodness of Fit Test

  • The chi-square test for goodness of fit is used when there is a single SRS from a population.
  • The variable is categorical with k mutually exclusive levels.
  • The chi-square test is safely used when all expected counts have values ≥ 1.0, and no more than 20% of the k expected counts have values < 5.0.
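
The expected-count rule lends itself to a small check. The function name `expected_counts_ok` is hypothetical, and the sketch covers only the expected-count condition, not the SRS or categorical-variable requirements.

```python
def expected_counts_ok(expected):
    """True if all expected counts are >= 1.0 and at most 20% are < 5.0."""
    n_small = sum(1 for e in expected if e < 5.0)
    return min(expected) >= 1.0 and n_small / len(expected) <= 0.20


# Expected counts from the aphid and Frizzle examples in these notes
print(expected_counts_ok([10, 10]))                 # both counts are 10
print(expected_counts_ok([23.25, 46.5, 23.25]))

# Hypothetical expected counts that break the rule
print(expected_counts_ok([4, 4, 12]))               # 2 of 3 counts are below 5
print(expected_counts_ok([0.5, 20, 20, 20, 20, 20]))  # one count is below 1
```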

Chi-Square Statistics Example

  • Consider a plant geneticist analyzing 100 progeny from a cross with a 3:1 phenotypic ratio hypothesis of yellow-flowered to green-flowered plants.
  • The observed ratio of 84 yellow: 16 green is compared to the 75 yellow: 25 green ratio predicted by the geneticist’s hypothesis.
  • The core question is whether the observed frequencies significantly deviate from the expected frequencies if the hypothesis were accurate.
  • The null hypothesis in this case is that the population has a 3:1 ratio of yellow-flowered to green-flowered plants.
  • The alternative hypothesis would be that the population has a flower-color ratio which is not 3 yellow: 1 green.
  • With 100 observations: X² = ((84 − 75)² / 75) + ((16 − 25)² / 25) = 4.320.
  • Larger disagreements between observed and expected frequencies result in larger X² values. This calculation measures goodness of fit.
  • The test statistic generalizes to more than 2 categories.
  • An X² value of 0 indicates a perfect fit between the observed and the expected.
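
The flower-color example can be worked through in a few lines. The sketch assumes df = 1 (two categories) and a 5% significance level, as in the quiz item on this statistic; the variable names are illustrative.

```python
from math import erfc, sqrt

observed = {"yellow": 84, "green": 16}
ratio = {"yellow": 3, "green": 1}                 # hypothesized 3:1 ratio

n = sum(observed.values())
total_parts = sum(ratio.values())
expected = {color: n * ratio[color] / total_parts for color in observed}

x2 = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)

# df = 1, so the upper-tail area equals the two-sided Normal tail area
p_value = erfc(sqrt(x2 / 2))
print(f"expected = {expected}")
print(f"X^2 = {x2:.3f}, P = {p_value:.4f}, reject H0 at 5%: {p_value < 0.05}")
```

The P-value falls below 0.05, in line with the quiz answer for a statistic of 4.32 with 1 degree of freedom.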
