Chi-Square (χ²) Test PDF

Summary

This document provides an overview of the Chi-Square (χ²) test, a statistical method used to analyze categorical data. It explains the types of Chi-Square tests, including goodness-of-fit and tests of independence, and covers concepts like degrees of freedom and significance levels. The document also includes example scenarios and formulas related to the calculations of the Chi-Square test.

Full Transcript

CHI-SQUARE (χ²) TEST: CATEGORICAL DATA

χ² (chi-square) test: A statistical test to determine if there is a significant association between two categorical variables. The χ² value is computed as

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Types:
- Goodness-of-Fit Test: Checks whether sample data match a population.
- Test of Independence: Assesses whether two variables are independent.

CATEGORICAL DATA

- Categorical Data
- Frequencies
- Hypothesis Testing Basics
- Degrees of Freedom (df)
- Significance Level (α)

CATEGORICAL DATA

Data that can be divided into groups or categories (e.g., gender, brand preference, satisfaction levels). Categorical data consist of categorical variables, or of data that have been converted into categories; grouped data is one example. More precisely, categorical data can be derived from qualitative observations that are countable, or from quantitative data grouped into given intervals. Example: Yes/No responses, product categories.

CATEGORICAL DATA

Suppose a survey asks 100 people which brand they prefer. The responses are:
- Brand A: 40 people
- Brand B: 35 people
- Brand C: 25 people

HYPOTHESIS TESTING BASICS

Hypothesis: Regular exercise leads to weight loss.
Null Hypothesis: Regular exercise does not lead to weight loss (the average weight of individuals who exercise regularly is equal to that of those who do not).
Alternative Hypothesis: Regular exercise does lead to weight loss (the average weight of individuals who exercise regularly is less than that of those who do not).

HYPOTHESIS TESTING BASICS

Hypothesis testing is a statistical method used to make inferences or decisions about a population based on sample data. It involves formulating and testing assumptions (hypotheses) about the population.

Null Hypothesis (H0): The default assumption that there is no effect, no difference, or no relationship in the population. Example: "There is no preference among brands."
Alternative Hypothesis (Ha): The competing claim, stating that there is an effect, a difference, or a relationship. Example: "There is a preference among brands."

FREQUENCIES

Observed Frequencies (O): These are the actual counts or values collected directly from your data. Example: Suppose a survey asks 100 people which brand they prefer. The responses are:
- Brand A: 40 people
- Brand B: 35 people
- Brand C: 25 people
Here, O_A = 40, O_B = 35, O_C = 25.

FREQUENCIES

Expected Frequencies (E): These are the theoretical values under the assumption that the null hypothesis (H0) is true.
For Goodness-of-Fit Test:

FREQUENCIES

For Test of Independence: To test whether two categorical variables are independent or associated.
[Table columns: Product, Prefer, Not Prefer, Total]
[Comparison table columns: Feature, Goodness-of-Fit Test, Test of Independence]

DEGREES OF FREEDOM

Imagine degrees of freedom (df) as the "choices you have left" when analyzing data after considering certain restrictions or rules. The degrees of freedom (df) is a key concept in statistics that refers to the number of independent pieces of information available to estimate a parameter or calculate a statistic.
- It varies depending on the test being performed.
- It adjusts for the number of constraints or restrictions imposed on the data.
- It helps calculate accurate critical values and p-values for hypothesis tests.

DEGREES OF FREEDOM

Goodness-of-Fit Test:

TYPE I ERROR: FALSE POSITIVES

Definition: Rejecting the null hypothesis (H0) when it is actually true.
Example: Concluding a new drug works better when it doesn't.
Impact: False positive; detecting an effect that doesn't exist.
Controlled by: Significance level (α).

TYPE II ERROR: FALSE NEGATIVES

Definition: Failing to reject the null hypothesis (H0) when it is actually false.
Example: Concluding a new drug has no effect when it actually does.
Impact: False negative; missing a real effect.
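
As an illustration of the frequency and degrees-of-freedom slides above, the following Python sketch computes expected frequencies and df using the standard textbook formulas: E = N × p for a goodness-of-fit test (df = k − 1), and E = (row total × column total) / grand total for a test of independence (df = (r − 1)(c − 1)). The function names, the equal-preference proportions, and the 2×2 table values are assumptions made for illustration; only the brand counts (40, 35, 25) come from the survey example.

```python
def expected_goodness_of_fit(total, proportions):
    """Expected count per category: total sample size times the H0 proportion."""
    return [total * p for p in proportions]


def expected_independence(table):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def df_goodness_of_fit(num_categories):
    """One degree of freedom is lost because the counts must sum to the total."""
    return num_categories - 1


def df_independence(num_rows, num_cols):
    """Row and column totals are fixed, leaving (r - 1)(c - 1) free cells."""
    return (num_rows - 1) * (num_cols - 1)


# Brand survey from the slides; equal preference under H0 is an assumption.
observed = [40, 35, 25]
expected = expected_goodness_of_fit(sum(observed), [1 / 3, 1 / 3, 1 / 3])
gof_df = df_goodness_of_fit(len(observed))

# Hypothetical 2x2 table (rows: two products; columns: Prefer, Not Prefer).
# These counts are invented for illustration and do not come from the slides.
table = [[30, 20],
         [20, 30]]
expected_table = expected_independence(table)
ind_df = df_independence(len(table), len(table[0]))

print(expected, gof_df)
print(expected_table, ind_df)
```

Under equal preference each brand has an expected count of 100/3 ≈ 33.3, and every cell of the hypothetical table has an expected count of 25.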
SIGNIFICANCE LEVEL

The significance level (α) is the probability of rejecting the null hypothesis when it is actually true. It is also known as the probability of a Type I error.
Type I error: Rejecting H0 when H0 is true.
Common Values: The most common significance levels used are:
- α = 0.05: A 5% risk of concluding that there is a significant effect when there is none.
- α = 0.01: A 1% risk of making a Type I error.
- α = 0.10: A 10% risk, often used in exploratory research.

SIGNIFICANCE LEVEL

p ≤ 0.05: the null hypothesis is rejected.
p > 0.05: the null hypothesis is not rejected.

STEPS TO PERFORM CHI SQUARE TEST
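
One common way to carry out these steps is sketched below in Python for the brand survey data (40, 35, 25). It is a minimal sketch, not necessarily the sequence the slides intend: it assumes H0 means equal preference across the three brands and uses α = 0.05, and the helper names are hypothetical. The p-value uses the closed form exp(−χ²/2), which holds only for df = 2; for other df a library routine such as scipy.stats.chi2.sf would be needed.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2):
    """Upper-tail probability of a chi-square variable with exactly 2 df.

    For df = 2 the survival function reduces to exp(-x / 2).
    This shortcut is NOT valid for any other df.
    """
    return math.exp(-chi2 / 2)


# Step 1: state the hypotheses.
# H0: no preference among brands (assumed here to mean equal shares).
# Ha: there is a preference among brands.
observed = [40, 35, 25]  # Brand A, B, C from the survey

# Step 2: expected frequencies under H0.
n = sum(observed)
k = len(observed)
expected = [n / k] * k

# Step 3: test statistic and degrees of freedom.
chi2 = chi_square_statistic(observed, expected)
df = k - 1

# Step 4: p-value and decision at the chosen significance level.
alpha = 0.05
p = p_value_df2(chi2)  # valid because df == 2 in this example
reject_h0 = p <= alpha

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p:.4f}, reject H0: {reject_h0}")
```

With these inputs the statistic works out to 3.5 with 2 degrees of freedom, giving p ≈ 0.174, so H0 is not rejected at α = 0.05.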
