Lecture 7 - Hypothesis Testing for Nominal and Ordinal Variables (Chi Square)
Toronto Metropolitan University
2024
Michael E. Campbell
Summary
This lecture covers hypothesis testing for nominal and ordinal variables, specifically using the Chi-Square test. It explains research and null hypotheses and demonstrates how to test them, providing examples and a detailed step-by-step approach.
Full Transcript
Quantitative Research Methods in Political Science
Lecture 7: Hypothesis Testing for Nominal and Ordinal Variables (Chi Square)
Course Instructor: Michael E. Campbell
Course Number: PSCI 2702 (A)
Date: 10/31/2024

Hypotheses
Hypotheses must be testable (only the most important elements should be identified).
A testable hypothesis should be a clear and measurable statement that can be supported or refuted through experimentation or observation.
A hypothesis is “a statement about the relationship between variables that is derived from a theory. Hypotheses are more specific than theories, and all terms and concepts are fully defined” (Healey, Donoghue, and Prus 2023, 33).

Hypotheses Cont’d
1. First, it is a “statement about the relationship between variables…” Therefore, a hypothesis will express the expected relationship between variables.
2. Second, the statement of this relationship will be “derived from a theory…” Therefore, a theory should underpin the type of relationship you expect to find.
3. Third, “hypotheses are more specific than theories, and all terms and concepts are defined.” Therefore, a hypothesis must be as straightforward as possible (this requires conceptualization and operationalization).

Developing Testable Hypothesis Example
RQ: Does legal enforcement affect the level of unethical behavior in a country?
Must conceptualize…
1. Legal enforcement: systematic laws that are regularly enforced
2. Unethical behavior: the abuse of private office for personal or partisan gain
Operationalize and move forward to variable selection…

Developing Testable Hypothesis Example Cont’d
Systematic Laws that are Regularly Enforced (X)
Abuse of Private Office for Personal or Partisan Gain (Y)

Developing Testable Hypothesis Example Cont’d
A bad hypothesis: “The level of unethical behavior is problematic, and it is possible that it may be higher in countries where governments enforce the laws less, as opposed to countries where governments do more to enforce laws, and because individuals will be less likely to maximize their preferences if they deviate from mandated laws, this may cause them to participate in unethical behavior.”
It would be very difficult to test this empirically…

Developing Testable Hypothesis Example Cont’d
A testable hypothesis: “Transparent Laws with Predictable Enforcement cause lower levels of executive bribery and corruption”
It simply means this: $\uparrow X \rightarrow \downarrow Y$ (i.e., the higher the level of X, the lower the level of Y).
You want to be clear, so that you can test with regularity and results can be replicated.

Null vs. Research Hypotheses
Researchers typically believe that there will be a difference between groups or a relationship between variables…
In our previous example, we hypothesized that there is a relationship between transparent laws with predictable enforcement and executive corruption.
We might also hypothesize that “Students who study in groups score higher on exams than students who study alone.”
These are ‘Research Hypotheses’.
The notation for the Research Hypothesis is ($H_1$).

Null vs. Research Hypotheses Cont’d
A null hypothesis is the exact opposite of a research hypothesis (or vice versa).
It is a statement of “no difference” or “no relationship”.
The notation for the Null Hypothesis is ($H_0$).

Null vs. Research Hypotheses Examples
Example 1:
$H_1$ = Transparent Laws with Predictable Enforcement cause lower levels of Executive Bribery and Corrupt Exchanges
$H_0$ = Transparent Laws with Predictable Enforcement have no effect on levels of Executive Bribery and Corrupt Exchanges
Example 2:
$H_1$ = Students who study in groups will score higher on exams than students who study alone.
$H_0$ = There is no difference in exam scores between students who study in groups and students who study alone.

Hypothesis Testing
Hypothesis Testing: “an inferential statistical procedure designed to test for the relationship between variables, or a difference between groups of cases, at the level of the population” (Healey, Donoghue, and Prus 2023, 213).
Also called ‘Significance Testing’, because we are trying to find statistically significant results.
When we find statistically significant results, we can reject the Null Hypothesis.
It compares empirical reality (i.e., samples) to a standard of what we would expect if no relationship or difference between groups of cases existed in reality.

Five-Step Model for Hypothesis Testing
Research is defined as “any process by which information is systematically and carefully gathered for the purpose of answering questions, examining ideas, or testing theories” (Healey, Donoghue, and Prus 2023, 10).
Not all hypothesis tests/techniques are the same, but they follow the same systematic process:
1. Make assumptions and meet test requirements
2. State the null hypothesis
3. Select the sampling distribution and establish the critical region
4. Compute the test (obtained) statistic
5. Make a decision and interpret the results of the test

Step 1 – Make Assumptions and Meet Test Requirements
Assumptions must always be made when testing hypotheses…
Assumptions may change depending on the technique used.
A constant assumption is that samples must be randomly selected.
Other assumptions concern the shape of the population distribution, levels of measurement, etc.

Step 2 – State the Null Hypothesis
You must state the null hypothesis ($H_0$) (i.e., a statement that indicates “no difference” or “no relationship”).
Opposite the null hypothesis is the research hypothesis ($H_1$), which identifies the expected relationship or difference.
The primary goal of hypothesis testing is to reject the null…
We always begin our analyses assuming that the null hypothesis is true, and we gather empirical evidence to reject it.

Step 3 – Select Sampling Distribution and Establish Critical Region
You must select a sampling distribution depending on the technique/type of test you are conducting (e.g., Z distribution, t distribution, Chi Square distribution, F distribution).
You also need to select your critical region (i.e., the area under the sampling distribution that includes unlikely sample outcomes).
If the sample outcome does not fall in this critical region, we fail to reject the null hypothesis.
To select the critical region, we need to set alpha ($\alpha$) at a specific level (e.g., 0.10, 0.05, 0.01).

Step 4 – Compute the Test Statistic
The sample value must be converted into a test score (e.g., a Z score, a t score, etc.).
The resulting value when we compute a test score is called the ‘obtained score’.
This is different from the critical value (the critical value marks the beginning of the critical region).
We must also take the p value into account…

The p Value
The p value is not the same as alpha ($\alpha$), but the concepts are closely related.
When we set alpha, it tells us the amount of probability, or the proportion of the area, in the critical region of the sampling distribution.
Conversely, the p value is “the probability, or proportion of area in the sampling distribution beyond the obtained score” (Healey, Donoghue, and Prus 2023, 216).
This means that the p value is the product of our actual data, while alpha and its critical value are set in advance: alpha is compared with the p value, and the critical value is compared with the obtained score.
Quite simply, “the critical value is the point in the sampling distribution that is compared to the test (obtained) score to decide if the null hypothesis should be rejected” (Healey, Donoghue, and Prus 2023, 216).

Step 5 – Make a Decision and Interpret the Results of the Test
The decision we must make is “do we reject or fail to reject the null hypothesis?”
Compare the test statistic (i.e., the obtained score) to the critical value.
If the test statistic is larger than the critical value, it falls into the critical region and we can reject the null hypothesis (if smaller, we cannot reject the null hypothesis).
Alternatively, rather than comparing the test statistic, we can look to the p value.
If the p value is smaller than alpha ($\alpha$), we can reject the null.
If the p value is larger than alpha ($\alpha$), we cannot reject the null.

Step 5 – Make a Decision and Interpret the Results of the Test Cont’d
Note: if you are using statistical software, you will almost always use the p value to decide whether to reject the null or not.
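
The two decision rules in Step 5 can be written out as a minimal Python sketch. The function names are hypothetical, the numbers in the usage lines are illustrative rather than taken from the lecture, and treating a tie (p value exactly equal to alpha) as "fail to reject" is an assumption, since the lecture only covers the smaller and larger cases.

```python
def decide_by_p_value(p_value: float, alpha: float = 0.05) -> str:
    """Reject H0 when the p value is smaller than alpha."""
    if not (0.0 <= p_value <= 1.0) or not (0.0 < alpha < 1.0):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    # Assumption: a tie (p_value == alpha) counts as "fail to reject".
    return "reject H0" if p_value < alpha else "fail to reject H0"


def decide_by_critical_value(obtained: float, critical: float) -> str:
    """Reject H0 when the obtained score lies beyond the critical value.

    Assumes an upper-tail critical region, as in chi square. For a
    two-tailed Z or t test, pass absolute values instead.
    """
    return "reject H0" if obtained > critical else "fail to reject H0"


# Illustrative values only (not from the lecture).
print(decide_by_p_value(0.03, alpha=0.05))    # reject H0
print(decide_by_p_value(0.03, alpha=0.01))    # fail to reject H0
print(decide_by_critical_value(2.10, 3.841))  # fail to reject H0
```

The second call shows why the choice of alpha matters: the same p value leads to a different decision once alpha is lowered from 0.05 to 0.01.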
Type I Errors
When we set alpha ($\alpha$), we are defining the critical region, which is the area under the sampling distribution that includes unlikely sample outcomes.
But alpha ($\alpha$) also tells us the probability of rejecting a null hypothesis that is actually true.
A Type I error (also called Alpha Error) occurs when we incorrectly reject the null hypothesis (i.e., we reject a null hypothesis that is true).
It can be thought of as a ‘false positive’.
Example: finding a person guilty in court who is innocent.
To avoid Type I errors, set alpha ($\alpha$) at a small level.

Type I Errors Cont’d
By setting alpha at a lower level, the critical region is smaller and the distance between the mean of the sampling distribution and the critical region is larger.
When a sample outcome falls into the critical region, we can reject the null hypothesis.
By setting alpha ($\alpha$) at a lower level, it becomes harder for a sample outcome to fall into the critical region.
In turn, it becomes more difficult to reject the null hypothesis.

Type II Errors
A Type II error occurs when we fail to reject a null hypothesis that is false.
It can be thought of as a ‘false negative’.
When we decrease the size of alpha ($\alpha$), the non-critical region becomes larger.
Consequently, the sample outcome is less likely to fall into the critical region.
Example: finding a person innocent in court who is guilty.

Type I and II Errors Summarized
Type I Errors are considered worse than Type II Errors.
The two types of error are inversely related.
The more you work to reduce Type I Error, the more likely you are to make Type II Error (and vice versa).
For a given sample, it is impossible to reduce the risk of both errors at the same time.
There will always be risk when conducting hypothesis tests.
Most common alpha levels = 0.10, 0.05, and 0.01.

Chi Square

Introduction to Chi Square
One of the most frequently used hypothesis tests.
Chi square tests can use variables at any level of measurement (nominal, ordinal, interval).
Most appropriate for nominal and ordinal level variables.
Is ‘Non-Parametric’, i.e., it requires no assumptions about the shape of the population distribution.

Bivariate Tables
To compute Chi Square ($\chi^2$), we use bivariate tables.
Bivariate tables “display the scores of cases on two different variables at the same time” (Healey, Donoghue, and Prus 2023, 220).
They are used to determine if there is a significant relationship between variables.
There are two dimensions to bivariate tables (horizontal and vertical).
We refer to the horizontal dimension as Rows.
We refer to the vertical dimension as Columns.
The intersections of the rows and columns are called Cells.
Each column or row represents a score on a variable, and the cells represent the various combined scores on each variable.

Bivariate Table Example
In this example, a researcher is studying the relationship between people’s place of birth and their likelihood of volunteering.
Variables used:
1. Place of birth (X)
2. Level of involvement in volunteer association (Y)
Each variable has two categories:
1. One can be born in Canada or abroad
2. One can have a high or low level of participation in volunteer associations

Bivariate Table Example Cont’d
The independent variable is placed in the columns, the dependent variable in the rows.
Place of birth (X) in columns.
Participation in volunteer association (Y) in rows.
Subtotals are added for each row and column; these are the marginals.
When you title a bivariate table, the DV should come first.
In this example, the table is called “Level of Participation in Volunteer Associations by Place of Birth for 100 Citizens”.

Bivariate Table Example Cont’d
Notice that the cells of the table are empty…
We need to classify each member of the sample in terms of their place of birth and their level of participation.
Since each variable has only two scores, there are four possible combinations:
1. people born in Canada with high levels of participation
2. people born in Canada with low levels of participation
3. people born outside Canada with high levels of participation
4. people born outside Canada with low levels of participation

The Logic of Chi Square
Chi Square has multiple uses (goodness of fit tests, tests for homogeneity, assessing model fit, and tests for independence).
We are concerned with Tests for Independence.
Independence (in this context) refers to the relationship between variables.
Two variables are considered independent “if the classification of a case into a particular category of one variable has no effect on the probability that the case will fall into any particular category on the second variable” (Healey, Donoghue, and Prus 2023, 222).
In other words, variables are ‘independent’ if they have no relationship to one another.
Note: do not confuse ‘independence’ with ‘independent variable’; they are not the same thing.

The Logic of Chi Square Cont’d
If variables are independent of one another, it means the cell frequencies are determined by random chance.
In our example, this means about half the respondents born in Canada would rank high on participation, while the other half would rank low.
The same pattern would present itself for people not born in Canada.
This pattern means that place of birth has no effect on volunteer participation (i.e., the variables are independent).

The Logic of Chi Square Cont’d
When we use Chi Square, the null hypothesis is that the variables are independent.
Always remember, the null hypothesis is a statement of “no difference” or “no relationship”, i.e., independence.
The test begins by computing the cell frequencies that we would expect to find if only random chance were operating.
These are referred to as expected frequencies ($f_e$).
These expected frequencies are compared cell by cell with the observed frequencies ($f_o$).
If the null hypothesis is true, we should see little difference between expected and observed frequencies.
If the null hypothesis is false, we should see a large difference between expected and observed frequencies.
That is to say, “the greater the differences between expected ($f_e$) and observed ($f_o$) frequencies, the less likely variables are independent and the more likely we will be able to reject the null hypothesis” (Healey, Donoghue, and Prus 2023, 222-223).

Computing Chi Square
When we use Chi Square, it produces a test statistic.
Notation for this is: $\chi^2$(obtained)
The value of $\chi^2$(obtained) is compared to $\chi^2$(critical) (see Appendix C or p. 506 in the textbook).
Chi Square is “a family of sampling distributions based on degrees of freedom – there is a unique number for each number of degrees of freedom” (Healey, Donoghue, and Prus 2023, 223).
Unlike the t distribution, chi square calculates degrees of freedom (df) based on the number of rows and columns.

Computing Chi Square Cont’d
To calculate df when using Chi Square, the formula is:
df = (rows - 1)(columns - 1)
For example, if you have a table with two rows and two columns, the degrees of freedom will always be 1:
df = (2 - 1)(2 - 1)
df = (1)(1)
df = 1

Computing Chi Square Cont’d
Figure 7.2 shows us the distributions associated with different degrees of freedom.
The shaded area represents where the critical region begins (i.e., what alpha is set at).
Here, alpha is set at 0.05.
If you look at Appendix C (see textbook) and look at the df associated with the different alphas, you will find various corresponding critical values (i.e., the point at which the critical region begins).
For 1 df, the critical value is 3.841.
For 5 df, the critical value is 11.070.

Computing the Test Statistic
To use the chi square sampling distribution to conduct a hypothesis test, you need to compute the test statistic, $\chi^2$(obtained).
The formula for this is:
$$\chi^2(\text{obtained}) = \sum \frac{(f_o - f_e)^2}{f_e}$$
In this equation:
$f_o$ = the cell frequencies observed in the bivariate table
$f_e$ = the cell frequencies that would be expected if the variables were independent
You must work cell-by-cell to solve the formula.

Computing the Test Statistic Cont’d
To find the expected frequency for each cell…
The formula for this is:
$$f_e = \frac{\text{Row marginal} \times \text{Column marginal}}{N}$$

Computing Test Statistic Example
Let’s say you have a bivariate table that shows 100 social work graduates, whether their undergraduate programs were accredited or not accredited, and whether these graduates were hired in social work positions within three months of graduation.

Computing Test Statistic Example Cont’d
The independent variable is Accreditation Status, so it is in the columns.
The dependent variable is Employment Status, so it is in the rows.

Computing Test Statistic Example Cont’d
First, compute the expected frequencies, using the formula…
Upper Left: $f_e$ = 22
Upper Right: $f_e$ = 18
Lower Left: $f_e$ = 33
Lower Right: $f_e$ = 27

Computing Test Statistic Example Cont’d
You can see the row and column marginals are the same as before.
The row and column marginals for expected frequencies must always equal those of the observed frequencies.
With all this information, we can now compute our chi square value…

Computing Test Statistic Example Cont’d
Remember, you must use every cell in the bivariate table…
To compute the test statistic, use the formula:
$$\chi^2(\text{obtained}) = \sum \frac{(f_o - f_e)^2}{f_e}$$
In this equation:
$f_o$ = the cell frequencies observed in the bivariate table
$f_e$ = the cell frequencies that would be expected if the variables were independent
$\chi^2$(obtained) = 10.78

The Chi Square Test for Independence
To conduct the Chi Square Test for Independence, we will use data from the previous example…

Step 1 – Make Assumptions and Meet Test Requirements
1. Select a random sample using EPSEM (equal probability of selection method)
2. Measure variables at the nominal or ordinal level
Model:
Random Sampling
Level of Measurement is nominal or ordinal

Step 2 – State the Null Hypothesis
($H_0$): The two variables are independent
($H_1$): The two variables are dependent
Or…
($H_0$): Accreditation Status has no effect on Employment Status
($H_1$): Accreditation Status affects Employment Status

Step 3 – Select the Sampling Distribution and Establish Critical Region
Since we are using Chi Square, we select the Chi Square distribution.
Chi Square is positively skewed, so the critical region is established in the upper tail of the sampling distribution (i.e., the right of the sampling distribution).
Find the degrees of freedom (since it is a 2x2 table, we know df = 1).
Set alpha at 0.05 (but we can set it at 0.10 or 0.01).
Model:
Sampling Distribution = $\chi^2$ (chi square) distribution
Alpha ($\alpha$) = 0.05
Degrees of Freedom = 1
$\chi^2$(critical) = 3.841

Using Distribution of Chi Square
Note: to find $\chi^2$(critical), use Appendix C in the textbook. Look for df 1 and alpha ($\alpha$) 0.05.

Step 4 – Compute the Test Statistic
Run calculations to determine $\chi^2$(obtained)…
We see that $\chi^2$(obtained) is 10.78.

Step 5 – Make a Decision and Interpret the Results of the Test
Compare the test statistic with the critical region…
$\chi^2$(critical) = 3.841
$\chi^2$(obtained) = 10.78
The test statistic, $\chi^2$(obtained), is larger than the critical value, $\chi^2$(critical), meaning it falls in the critical region…
Therefore, we can reject the null hypothesis!
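
The calculation behind Steps 3 to 5 can be retraced with a short Python sketch. One caveat: the observed cell counts are not reproduced in this transcript, so the table below is a reconstruction (an assumption) chosen to match the marginals implied by the expected frequencies (22, 18, 33, 27) and the column percentages quoted later in the lecture; it should be checked against the textbook table. The function names are hypothetical.

```python
# Observed counts: ASSUMED reconstruction, not given in the transcript.
# Rows: working as a social worker / not working as a social worker.
# Columns: accredited program / non-accredited program.
observed = [[30, 10],
            [25, 35]]

# Critical value quoted in the lecture (Appendix C, df = 1, alpha = 0.05).
CHI2_CRITICAL = 3.841


def expected_frequencies(table):
    """f_e = (row marginal * column marginal) / N for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (f_o - f_e)^2 / f_e, working cell by cell."""
    expected = expected_frequencies(table)
    return sum((fo - fe) ** 2 / fe
               for obs_row, exp_row in zip(table, expected)
               for fo, fe in zip(obs_row, exp_row))


def degrees_of_freedom(table):
    """df = (rows - 1)(columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


expected = expected_frequencies(observed)
chi2_obtained = chi_square(observed)
df = degrees_of_freedom(observed)
reject_null = chi2_obtained > CHI2_CRITICAL

print(expected)                  # [[22.0, 18.0], [33.0, 27.0]]
print(df)                        # 1
print(round(chi2_obtained, 2))   # 10.77
print(reject_null)               # True
```

With these counts the unrounded sum is about 10.77; the 10.78 reported in the lecture is what results when each cell's contribution is rounded to two decimals before adding (2.91 + 3.56 + 1.94 + 2.37). Either way the obtained value is well beyond 3.841, so the sketch reaches the same decision as the lecture.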
This means the pattern of cell frequencies in Table 7.5 is unlikely to have happened by random chance.
Therefore, the variables are dependent (there is a relationship between them).
This tells us that the probability of securing employment in the field of social work is dependent on whether the program the student was in was accredited or not.

Column Percentages
There is one problem: a significant chi square test tells us that the variables are likely dependent on each other in the population, but it provides no other details of the relationship.
Chi square does not tell us if it is the students in accredited programs or non-accredited programs who are more likely to find employment…
To determine this, we need to compute column percentages.
To do this, we divide each cell frequency by the total number of cases in its column (i.e., the column marginal) and multiply by 100.

Column Percentages Cont’d
Column percentages give us more details about the relationship and tell us how X affects Y.
We see that 55% of students from accredited programs are working as social workers, while only 22% of students from non-accredited programs are working as social workers.

Limitations of Chi Square
It is difficult to interpret bivariate tables when variables have too many categories.
As a general rule, both variables should have four or fewer scores/response categories.
Sample size also affects chi square.
When sample size is small, it can no longer be assumed that the sampling distribution of all possible sample outcomes is described by the chi square distribution.
A sample is considered small when a high percentage of cells have expected frequencies ($f_e$) of five or less.
When sample sizes are large, the probability of rejecting the null hypothesis increases, so even trivial relationships can come out as statistically significant.
This happens because $\chi^2$(obtained) increases at the same rate as the sample size.

Computing Chi Square in SPSS
It is much easier to compute Chi Square in SPSS.
For example, let’s go back to our first hypothesis: “Transparent Laws with Predictable Enforcement cause lower levels of executive bribery and corruption”
Note: the original variables were transformed (they now have only three categories).

Computing Chi Square in SPSS Cont’d
As we can see here, in 84.8% of countries where there is “No Enforcement” of laws, executive corruption is common, compared to only 12.5% of countries where there is “High Enforcement”.
Conversely, in only 3% of countries where there is “No Enforcement” is corruption “Rare”, compared to 58.8% of countries where there is “High Enforcement”.

Computing Chi Square in SPSS Cont’d
We also see that the p value is smaller than 0.001, indicating statistical significance.
This tells us that we can reject the null hypothesis, i.e., the variables are dependent (related to one another).
Notice we didn’t need to use $\chi^2$(critical) to determine this, only the p value.
Note: as you can see, no cells have an expected count less than 5, meaning the small-sample limitation described above does not apply here.
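
Column percentages and a p value can also be obtained outside SPSS. The sketch below is a hedged illustration in Python, not a reproduction of the SPSS output: the enforcement dataset is not included in this transcript, so it reuses the social work example, with observed counts that are a reconstruction (an assumption) consistent with the expected frequencies and percentages quoted in the lecture. The function names are hypothetical, and the p value shortcut is valid only for df = 1, where the chi square variable is a squared standard normal score.

```python
import math

# Observed counts: ASSUMED reconstruction, not given in the transcript.
# Rows: working as a social worker / not working as a social worker.
# Columns: accredited program / non-accredited program.
observed = [[30, 10],
            [25, 35]]
row_labels = ["working as social worker", "not working as social worker"]


def column_percentages(table):
    """Divide each cell by its column marginal and multiply by 100."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / total for cell, total in zip(row, col_totals)]
            for row in table]


def chi_square(table):
    """Sum of (f_o - f_e)^2 / f_e over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n
            total += (fo - fe) ** 2 / fe
    return total


def p_value_df1(chi2):
    """Area beyond the obtained score for df = 1 ONLY.

    With one degree of freedom, chi square is a squared Z score, so the
    upper-tail area equals erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


percentages = column_percentages(observed)
p_value = p_value_df1(chi_square(observed))

for label, row in zip(row_labels, percentages):
    print(label, [round(p, 1) for p in row])
print(p_value < 0.05)  # True
```

Reading down each column, the percentages sum to 100, which is what lets the accredited and non-accredited groups be compared directly even though the groups differ in size. For tables with more than one degree of freedom, a statistics library or the textbook's Appendix C would be needed in place of `p_value_df1`.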