3003PSY Survey Design and Analysis in Psychology PDF
Document Details
Uploaded by MesmerizedPeridot
Griffith University
Summary
This document provides a mini-lecture on two-way chi-square tests for survey design and analysis in psychology. It covers the concept of independence of variables presented in contingency tables. The lecturer shows how to calculate expected frequencies and conduct chi-square tests by hand and uses example data.
Full Transcript
3003PSY Survey Design and Analysis in Psychology
TWO-WAY CHI-SQUARE

TWO-WAY CHI SQUARE AKA CHI SQUARE TEST OF INDEPENDENCE
- Relations between variables can also be tested with the chi-square test
- Good news: the formula for chi-square is the same
- Less than good news: there is a new formula for fe
  - you’ll love it in the end
- The null hypothesis in this case is that the two variables are distributed independently
  - referred to as the χ² test of independence
  - also: χ² contingency
- One variable must not depend on another

CONTINGENCY TABLES
            Red M&Ms    Blue M&Ms    Green M&Ms
Female         19           17            5
Male           16           18            9
- The ‘cells’ of the above table (called a contingency or crosstabulation table) contain the frequencies of individuals in that particular sample and that particular category

TWO-WAY CHI SQUARE AKA CHI SQUARE TEST OF INDEPENDENCE
- Two variables are considered to be independent when the frequency distribution for one variable has the same shape for all levels of the second variable
- e.g., salary is independent of gender
[Figure: Independent (i.e., salary levels are not dependent on gender). Frequency plots for Males and Females.]
- e.g., salary is dependent on gender
[Figure: Dependent (i.e., salary level depends on gender). Frequency plots for Males and Females.]

TWO-WAY CHI SQUARE AKA CHI SQUARE TEST OF INDEPENDENCE
One-way chi-square (aka goodness-of-fit) and two-way chi-square (aka test of independence) test two somewhat different questions
- One-way: generally, does the observed data fit the model?
- Two-way: are the two variables independent?
TWO-WAY CHI-SQUARE: AN EXAMPLE
- Let's revisit our GRE-Q and Stats Exam performance
- This time we only know if students pass or not
- In this hypothetical exam the mark for a passing grade is 65%
- RQ: Is there a relationship between class attendance (dichotomous) and passing/failing the exam in our sample?
  - are attendance and pass/fail status distributed independently or not?

            Less than Perfect    Perfect
            Attendance           Attendance
Failed             6                 0
Passed             5                 9

EXPECTED FREQUENCIES
            Less than Perfect    Perfect
            Attendance           Attendance    Total
Failed             6                 0            6
Passed             5                 9           14
Total             11                 9           20
We are interested in the marginal totals of the table, that is, the row and column totals and the grand total.

EXPECTED FREQUENCIES
            Less than Perfect    Perfect
            Attendance           Attendance    Total
Failed                                            6
Passed                                           14
Total             11                 9           20
Let’s look just at the margins for a minute
- By ignoring the frequencies within the body of the table and just focusing on the marginals, we are viewing the data from the perspective of the null (H0)

THE LOGIC OF THE TEST OF INDEPENDENCE
- If H0 is true, then the marginals are all we need to understand the variables as they would be distributed independently
- How do we obtain the necessary expected frequencies?
  - we use the frequencies expected under independence

EXPECTED FREQUENCIES
            Less than Perfect    Perfect
            Attendance           Attendance    Total
Failed                                            6
Passed                                           14
Total             11                 9           20
We know that 30% of the sample failed and that 70% of the sample passed.
If the variables are independent, the relative proportions of those who passed/failed (70:30) should be mirrored in each level of the Attendance variable, i.e., 70% of those with perfect attendance would pass and 30% would fail, and 70% of those with less than perfect attendance would pass and 30% would fail.

CALCULATING fe FOR TWO-WAY TABLE
            Less than Perfect Attendance    Perfect Attendance
Failed      fe = (col1 x row1)/grand tot    fe = (col2 x row1)/grand tot    (Row1) 6
Passed      fe = (col1 x row2)/grand tot    fe = (col2 x row2)/grand tot    (Row2) 14
            (Col1) 11                       (Col2) 9                        (Grand Total) 20
- We only need the marginal totals and the grand total
- The expected frequencies (i.e., fe) represent the null
- For each cell: fe = (column total x row total)/grand total

CALCULATING fe FOR TWO-WAY TABLE
            Less than Perfect Attendance    Perfect Attendance
Failed      fe = (11 x 6)/20                fe = (9 x 6)/20                 (Row1) 6
Passed      fe = (11 x 14)/20               fe = (9 x 14)/20                (Row2) 14
            (Col1) 11                       (Col2) 9                        (Grand Total) 20
For each cell: fe = (column total x row total)/grand total
We have essentially duplicated the table and replaced the obtained frequencies (i.e., our actual data) with the expected frequencies that represent the way the data ought to look under the null (i.e., when H0 is true).

CALCULATING fe FOR TWO-WAY TABLE
            Less than Perfect Attendance    Perfect Attendance
Failed      fe = 3.3                        fe = 2.7                        (Row1) 6
Passed      fe = 7.7                        fe = 6.3                        (Row2) 14
            (Col1) 11                       (Col2) 9                        (Grand Total) 20
- Note that these all still sum to the same marginal totals
  - they have to, by definition, as they are a restatement of the original data
- They have redistributed the 20 people across the categories of both variables at once, in proportion to the marginal totals
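The rule fe = (column total x row total)/grand total can be sketched in a few lines of Python. This is only an illustration of the hand calculation above, not part of the lecture; the function name `expected_frequencies` and the variable names are hypothetical.

```python
def expected_frequencies(observed):
    """Return the expected frequencies under independence.

    observed: list of rows, each a list of cell counts.
    (Hypothetical helper, written for this illustration only.)
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # fe = (column total x row total) / grand total, for every cell
    return [[c * r / grand_total for c in col_totals] for r in row_totals]


# Observed counts from the attendance example:
# rows = Failed, Passed; columns = Less than Perfect, Perfect attendance
observed = [[6, 0],
            [5, 9]]

expected = expected_frequencies(observed)
for label, row in zip(["Failed", "Passed"], expected):
    print(label, [round(fe, 1) for fe in row])
# Failed [3.3, 2.7]
# Passed [7.7, 6.3]
```

Only the marginal totals enter the calculation, which is why the expected table reproduces the same row totals, column totals and grand total as the observed data.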
OBSERVED FREQ. : EXPECTED FREQ.
            Less than Perfect    Perfect
            Attendance           Attendance    Total
Failed          6 : 3.3            0 : 2.7        6
Passed          5 : 7.7            9 : 6.3       14
Total             11                 9           20

CALCULATING CHI-SQUARE
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
            Less than Perfect    Perfect
            Attendance           Attendance    Total
Failed           2.21               2.7           6
Passed           0.95               1.16         14
Total             11                 9           20
E.g., ((5 - 7.7) x (5 - 7.7))/7.7 = 0.95

CALCULATING CHI-SQUARE
$$\chi^2_{obt} = \sum \frac{(f_o - f_e)^2}{f_e} = 2.21 + 0.95 + 2.7 + 1.16 = 7.02$$
$$\chi^2_{crit} = 3.84$$
For a two-way chi-square, df = (Rows – 1)(Columns – 1); in our case: df = (2 – 1)(2 – 1) = 1
- Critical value for df = 1 is 3.84
  - p = .05, assuming H0 is true
- Our obtained value of chi square of 7.02 is larger than 3.84.
- Therefore we reject the null hypothesis of independence and conclude that our sample represents a population in which class attendance and grade (fail/pass) are associated
  - i.e., not independent

CONCLUSION
- If we had predicted that attendance status and exam performance (pass/fail) were related, we would be pleased to have found evidence in support of the prediction
- Remember, statistical output has always had meaning taken out before we get it
  - we need to supply the meaning

CROSSTABS
  /TABLES=Pass BY Attendance
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ
  /CELLS=COUNT
  /COUNT ROUND CELL.
- In this mini-lecture we used hand calculations
- Sometimes this is the easier approach (esp. when you don't have the raw data in SPSS)
- In tutes you will use SPSS

SUMMARY
- The two-way chi square tests the association between two categorical variables
- The two-way chi square is also referred to as the Test of Independence
- The expected frequencies of the two-way chi square are set to test for independence of the two variables
- The calculation of expected frequencies involves the marginal totals and the grand total
- The obtained chi square is tested against the chi square distribution
- The calculation of degrees of freedom involves the number of categories of both variables, not sample size
- Calculating by hand is better than SPSS when you only have the cell frequencies
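The whole hand calculation (expected frequencies, chi-square, degrees of freedom, comparison with the critical value) can be sketched in Python as a check on the arithmetic. This is an illustration under stated assumptions, not the lecturer's method: the function name `chi_square_independence` is hypothetical, the critical value 3.84 is taken from the lecture, and the p-value line uses a closed form that holds only for df = 1.

```python
import math


def chi_square_independence(observed):
    """Return (chi-square, df) for a two-way table of observed counts.

    Hypothetical helper for illustration; assumes no row or column
    total is zero (otherwise an expected frequency would be 0).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = col_totals[j] * row_totals[i] / grand_total
            chi_sq += (fo - fe) ** 2 / fe  # one cell's contribution

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df


# rows = Failed, Passed; columns = Less than Perfect, Perfect attendance
observed = [[6, 0],
            [5, 9]]
chi_sq, df = chi_square_independence(observed)

CRITICAL_DF1_05 = 3.84  # critical value for df = 1 at p = .05 (from the lecture)
# For df = 1 only, the upper-tail probability has a closed form via erfc.
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("reject H0" if chi_sq > CRITICAL_DF1_05 else "retain H0")
```

Because the code does not round each cell's contribution before summing, it gives about 7.01 rather than the 7.02 obtained by hand; the conclusion (reject H0) is the same.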