Questions and Answers
What methodological weakness significantly undermines the study's validity, rendering it almost worthless for understanding preferences between humans and robots?
The human participants did not respond honestly to the question to avoid undesirable consequences.
Explain the fundamental approximation underlying the $\chi^2$ test and why it might not hold true when there is only 1 degree of freedom.
The $\chi^2$ test assumes the binomial distribution approximates a normal distribution for large N. This approximation often fails with only 1 degree of freedom, especially in 2x2 contingency tables.
In the context of the $\chi^2$ test, what problem does Yates' continuity correction address, and how does it attempt to resolve it?
Yates' correction addresses the problem of the goodness of fit statistic tending to be "too big" when N is small and df = 1, leading to inflated alpha values. It resolves this by subtracting 0.5 from the absolute difference between observed and expected values in the $\chi^2$ formula.
Describe the potential impact of ignoring the continuity correction when conducting a $\chi^2$ test with one degree of freedom. Focus on how it affects the p-value and the likelihood of Type I error.
Ignoring the correction leaves the $\chi^2$ statistic too large when df = 1 (especially for small N), so the p-value comes out too small. The test then rejects the null hypothesis more often than the nominal alpha level implies, inflating the Type I error rate.
Explain the difference between achieving a statistically significant result and obtaining a result with scientific value, using the example of the flawed human-robot preference study.
Statistical significance only indicates that the observed association is unlikely under the null hypothesis; it says nothing about whether the data are meaningful. In the human-robot study, the association was statistically significant, but because participants did not answer honestly (a reactivity effect), the result says nothing about genuine preferences and so has little scientific value.
Flashcards
Reactivity Effect
The effect where participants alter their behavior because they know they are being studied.
Significance vs. Value
A statistically significant result doesn't guarantee scientific value if there are methodological flaws.
Continuity Correction
A small adjustment used in chi-square tests with 1 degree of freedom to correct for the approximation of a continuous distribution.
Chi-Square Distribution
The sampling distribution of the $\chi^2$ statistic when the null hypothesis is true; it describes the sum of squares of df independent standard normal variables.
Yates Correction Formula
$\chi^2 = \sum_i \frac{(|O_i - E_i| - 0.5)^2}{E_i}$, i.e. the goodness-of-fit statistic with 0.5 subtracted from each absolute difference.
Study Notes
Goodness of Fit Test & Calculating Expected Frequencies
- One approach to calculating expected frequencies relies on the null hypothesis to determine what to expect
- The calculation is based on the probabilities the null hypothesis specifies as true
- The expected frequency for an option is the null probability of choosing that option multiplied by the sample size
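As an illustration of this arithmetic, here is a minimal Python sketch; the null probabilities and sample size are made-up values, not figures from the notes:

```python
# Expected frequency under the null: E_i = P_i * N.
null_probs = [0.5, 0.25, 0.25]  # hypothetical null probabilities (sum to 1)
n = 120                          # hypothetical sample size

expected = [p * n for p in null_probs]
print(expected)  # → [60.0, 30.0, 30.0]
```

These expected counts are what the observed frequencies are compared against in the goodness-of-fit statistic.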
Contingency Table & Chi-Square Statistic
- The challenge is the null hypothesis does not specify a particular value for probability, which needs estimation from data
- Estimate of probability involves dividing the row total by the total sample size
- The expected frequency can be expressed as the row total multiplied by the column total, divided by total observations
- A test statistic can be defined using the same strategy as the goodness of fit test
- The chi-square statistic is the sum of squared differences between observed and expected frequencies, each divided by the expected frequency
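The two formulas above can be sketched in a few lines of Python; the observed counts are hypothetical illustration values:

```python
# E_ij = (row_i total * col_j total) / N, then
# X^2 = sum over cells of (O_ij - E_ij)^2 / E_ij.
observed = [[20, 30],
            [40, 10]]  # hypothetical 2x2 contingency table

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # estimated expected frequency
        chi_sq += (o - e) ** 2 / e
print(round(chi_sq, 3))  # → 16.667
```

This is the same statistic that `chisq.test()` reports as "X-squared" (before any continuity correction).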
Understanding Degrees of Freedom
- Large chi-square values suggest the null hypothesis poorly describes data
- Small values suggest a good fit
- The null hypothesis should be rejected if the chi-square value is too large
- Degrees of freedom relate to the amount of data being analyzed minus constraints
- In a contingency table with r rows and c columns, there are r × c observed frequencies to analyze
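The "data minus constraints" counting can be sketched numerically (a Python illustration; the function name is mine). For the test of independence, the constraints are the fixed grand total plus the (r − 1) row probabilities and (c − 1) column probabilities estimated from the data:

```python
# df = (number of observed frequencies) - (number of constraints).
def df_contingency(r, c):
    cells = r * c                        # r x c observed frequencies
    constraints = 1 + (r - 1) + (c - 1)  # fixed total + estimated probabilities
    return cells - constraints

print(df_contingency(2, 2))  # → 1
# The count agrees with the usual shortcut (r - 1) * (c - 1):
assert df_contingency(3, 4) == (3 - 1) * (4 - 1)
```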
Constraints & Experimental Design
- Column totals fixed in advance by the experimenter's design count as constraints
- Free parameters in the null hypothesis affect the number of constraints
- Each free parameter estimated from the data acts as an additional constraint
- Probabilities carry a built-in constraint because they must sum to one
Test Implementation in R
- The associationTest() function in the lsr package simplifies testing
- A formula is needed to specify variables for cross-tabulation
- A data frame name containing such variables is also needed
- Similar testing can be done through chisq.test()
Interpreting Association Test Results
- The output of the chi-square test includes variables, hypotheses, observed and expected contingency tables
- Statistical significance is determined through the X-squared statistic, degrees of freedom, and p-value
- Effect size is quantified by Cramér's V
- A significant association indicates that preferences likely differ among groups
- Statistical significance does not guarantee scientific value if the study has methodological flaws
Yates' Correction (Continuity Correction)
- The chi-squared tests rely on assumptions and approximations
- Specifically, they rely on the binomial distribution being well approximated by a normal distribution when N is large
- With only 1 degree of freedom the goodness-of-fit statistic tends to be too big, so the p-value tends to be too small
- The correction subtracts 0.5 from each difference between observed and expected values
- Redefines the goodness of fit statistic
- It is not derived from principled theory; Yates simply examined the behavior of the test and observed that the corrected version performed better
- Continuity correction is explicitly noted in the output
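The effect of the correction is easy to see side by side. This Python sketch (with the same hypothetical 2x2 counts as above) computes the statistic with and without the 0.5 adjustment:

```python
# Yates' correction: subtract 0.5 from each |O - E| before squaring,
# which shrinks the statistic and counteracts too-small p-values at df = 1.
observed = [[20, 30],
            [40, 10]]  # hypothetical counts
row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)

plain, corrected = 0.0, 0.0
for i in range(2):
    for j in range(2):
        e = row_totals[i] * col_totals[j] / n
        diff = abs(observed[i][j] - e)
        plain += diff ** 2 / e
        corrected += (diff - 0.5) ** 2 / e

print(plain > corrected)  # → True: the corrected statistic is smaller
```

A smaller statistic means a larger p-value, so the corrected test is more conservative.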
Effect Size Measures
- Reporting effect size indicates the strength of association or deviation
- Common measures include phi statistic and Cramer's V
- The phi statistic is obtained by dividing $\chi^2$ by the sample size, then taking the square root
- Cramer's V adjusts for contingency table size, proposed by Cramér
Advantages of Cramer's V
- Cramér's V generalizes phi: divide $\chi^2$ by the sample size times (min(rows, columns) − 1), then take the square root
- V ranges from 0 (no association) to 1 (perfect association)
- The core R packages do not have these functions
- The cramersV() function is available in the lsr package
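The two formulas can be sketched in Python (the $\chi^2$ value and sample size below are hypothetical, standing in for whatever the test produced):

```python
import math

# phi = sqrt(X^2 / N); Cramer's V = sqrt(X^2 / (N * (k - 1))), k = min(r, c).
chi_sq = 10.0      # hypothetical chi-square statistic
n = 100            # hypothetical sample size
rows, cols = 2, 2

phi = math.sqrt(chi_sq / n)
k = min(rows, cols)
v = math.sqrt(chi_sq / (n * (k - 1)))

print(round(phi, 3), round(v, 3))  # → 0.316 0.316
```

For a 2x2 table, k − 1 = 1, so phi and V coincide; V only diverges from phi for larger tables, which is exactly the adjustment it exists to make.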
Key Test Assumptions
- Expected frequencies should be sufficiently large
- Goal is to have all expected frequencies larger than 5, or at least most above 5
- Should be no expected frequencies below 1
- Guidelines are rough and somewhat conservative
- Observations should be independent of one another
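The expected-frequency rule of thumb is mechanical enough to automate. A small Python sketch (the helper name is mine) flags both thresholds:

```python
# Rough check of the rule of thumb: warn if any expected count is
# below 5; treat anything below 1 as a serious problem.
def check_expected(expected):
    flat = [e for row in expected for e in row]
    return {"all_at_least_5": min(flat) >= 5,
            "any_below_1": min(flat) < 1}

print(check_expected([[30, 20], [30, 20]]))    # fine on both counts
print(check_expected([[4.2, 0.8], [6.1, 8.9]]))  # violates both guidelines
```

When the check fails, the notes below suggest the Fisher exact test as an alternative.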
Non-Independence Issues
- The Chi-square test assumes observations are independent
- Non-independence "stuffs things up", causing false rejection or false retention of the null
- Extreme (and extremely silly) examples exist
- The cards experiment illustrates correlated observations leading to false retention of the null
- Potential alternatives include the McNemar or Cochran tests, or the Fisher exact test
R Functions
- goodnessOfFitTest() & associationTest() offer detailed output
- chisq.test() is more terse with output
- The goodness-of-fit test and test of independence are underpinned by the same mathematics
- chisq.test() can run either based on input type
- Input a frequency table for a goodness of fit test
- Input a cross-tabulation for a test of independence
Fisher Exact Test Basics
- Used when cell counts are too low for the chi-square approximation to be trusted
- The motivating example is a field study of the emotional state of people accused of witchcraft
- It is not easy to find people in the process of being set on fire, so cell counts in such data are often very small
- Unlike the chi-square test and others, Fisher's test has no test statistic; it calculates the p-value "directly"
- The test works fine even for the very small sample sizes such experiments produce
Fisher Analysis Basics
- Under the null hypothesis, with the margins fixed, the cell frequencies follow a hypergeometric distribution
- The p-value is the probability of observing the obtained table or one "more extreme"
- The conceptual difficulty is deciding which contingency tables count as more extreme than the observed one
- Tables with lower probability than the observed table are treated as more extreme
- fisher.test() provides a basic implementation
- fisher.test() has basic implementation
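The hypergeometric calculation can be sketched directly in Python for a 2x2 table (the counts are made up; "more extreme" here means any table whose probability is no larger than the observed one, one common convention for the two-sided test):

```python
from math import comb

# Fisher's exact test for a 2x2 table [[a, b], [c, d]] with fixed margins:
# the first cell follows a hypergeometric distribution under the null.
def fisher_exact_p(a, b, c, d):
    r1, r2 = a + b, c + d        # row totals
    c1 = a + c                   # first column total
    n = r1 + r2
    def prob(x):                 # P(first cell = x) with margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)  # feasible values of the first cell
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs + 1e-12)

print(round(fisher_exact_p(2, 3, 4, 1), 4))  # → 0.5238
```

No normal approximation is involved, which is why small cell counts pose no problem.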
McNemar Test Situations
- Example scenario: being hired to demonstrate how effective advertisements are
- Participants report whether they intend to vote both before and after seeing the advertisements
- Real studies would include many other conditions, but these notes consider one simple experiment
- Data is expressed via a contingency table
McNemar Test Set-up
- If the observations were independent, the null hypothesis could be tested with an ordinary $\chi^2$ test
- But with 100 participants each answering twice, there are 200 observations, and they are not independent
- Each person contributes an answer to both the "before" column and the "after" column, so the paired answers are related
- If voter A says "yes" the first time and voter B says "no", you would expect voter A to be more likely than voter B to say "yes" the second time
- Applying the standard $\chi^2$ test of independence to such data therefore violates its assumptions
McNemar Test: Table Setup
- McNemar's solution starts by tabulating the data in a slightly different way
- It is exactly the same data, rewritten so that each of the 100 participants appears in only one cell, satisfying the independence assumption
- The rewritten table cross-classifies each person's "before" answer against their "after" answer (yes→yes, yes→no, no→yes, no→no)
- A $\chi^2$-style goodness-of-fit approach can then be applied; the tricky part is working out what the null hypothesis should be
McNemar Null Hypothesis
- The null hypothesis is that the "before" and "after" responses have the same distribution
- Equivalently, the row totals and column totals of the rewritten table come from the same distribution
- This property of the null hypothesis is called marginal homogeneity
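The resulting test statistic depends only on the discordant cells of the rewritten table, and the usual form includes a continuity correction. A Python sketch with hypothetical before/after counts:

```python
# McNemar's test uses only the discordant cells:
# b = "yes before, no after", c = "no before, yes after".
# With the continuity correction: X^2 = (|b - c| - 1)^2 / (b + c),
# compared against a chi-square distribution with df = 1.
def mcnemar_statistic(b, c):
    return (abs(b - c) - 1) ** 2 / (b + c)

b, c = 5, 25  # hypothetical: 5 switched yes->no, 25 switched no->yes
print(round(mcnemar_statistic(b, c), 3))  # → 12.033
```

The concordant cells (people who gave the same answer twice) carry no information about a before/after shift, which is why they drop out of the statistic.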
Description
Explore the calculation of expected frequencies using the null hypothesis. Learn how to apply the Chi-Square statistic. Understand degrees of freedom in statistical tests.