Questions and Answers
Explain the difference between measuring effect size relative to difference scores versus original variables. Why might a researcher choose one over the other?
Measuring effect size relative to difference scores focuses on the magnitude of change within the specific context of the comparison. Measuring effect size relative to original variables assesses the practical significance of the change in relation to the overall variability of the original data. If the practical consequences really matter, the effect size relative to original variables may be preferred.
Describe a scenario where a variable might be highly non-normal, and explain why this non-normality might occur.
Response time (RT) data is often non-normal because it represents the minimum time it takes for one of many potential triggers to elicit a response. This often results in a skewed distribution.
What is the purpose of using a QQ plot? How does it help in assessing the normality of a sample?
A QQ plot is used to visually check if a sample violates the assumption of normality. It plots observed quantiles against theoretical quantiles from a normal distribution. Systematic deviations from a straight line suggest non-normality.
Explain why the Central Limit Theorem often leads to real-world quantities being normally distributed. What condition needs to be met for this to occur?
The central limit theorem says that the sum (or average) of a large number of independent contributions tends toward a normal distribution. Many real-world quantities are effectively averages of lots of separate factors, so they end up approximately normal. The key condition is that the quantity really does arise as the sum or mean of many roughly independent effects.
If you suspect that your data violates the normality assumption required for a t-test, what are two methods you could use to assess whether this assumption is seriously violated?
You can draw a QQ plot, checking whether the points deviate systematically from a straight line, and you can run a Shapiro-Wilk test, where a small value of W (and a small p-value) indicates a departure from normality.
In the context of ANOVA, explain why the term 'analysis of variance' can be considered misleading.
Although ANOVA works by partitioning and comparing variances, the hypothesis it actually tests is about means: the null hypothesis is that all of the group means are identical. The variances are the tool used to test that hypothesis, not the thing being tested, so the name can mislead.
Describe a scenario where a one-way ANOVA would be an appropriate statistical test. What specific type of data is needed for this test?
A clinical trial comparing mood improvement across three conditions (a new drug, an existing drug, and a placebo) is a natural fit. One-way ANOVA requires a continuous outcome variable and a single categorical grouping variable that divides the observations into two or more groups.
In a clinical trial studying the effectiveness of an antidepressant, what is the purpose of including a placebo group and an existing drug (like Anxifree) group, in addition to the new drug (Joyzepam) group?
The placebo group controls for improvement that occurs simply because participants believe they are being treated, while the existing drug group shows whether the new drug outperforms the current treatment rather than merely outperforming no treatment at all.
Explain the role of 'post hoc tests' in ANOVA. Why are they used, and what problem do they address?
A significant ANOVA result only tells you that the group means are not all identical; it does not tell you which groups differ. Post hoc tests follow up with pairwise comparisons, and they address the multiple comparisons problem by correcting for the inflated Type I error rate that comes from running many tests.
Briefly describe the relationship between ANOVA and t-tests. In what situation(s) would ANOVA be preferred over a series of t-tests?
With exactly two groups, a one-way ANOVA is equivalent to a Student t-test (the F-statistic equals t²). ANOVA is preferred when there are three or more groups, because running every pairwise t-test separately inflates the overall Type I error rate.
Flashcards
Cohen's d (Original Variables)
Measures effect size compared to original variables, not just difference scores. Use when practical consequences relative to original scales matter.
Normality Assumption Rationale
Many real-world variables tend to be normally distributed due to the central limit theorem, especially if they're averages of many factors.
Non-Normality of Response Time (RT)
Response time data is often non-normal because the response occurs at the first trigger out of many possibilities.
QQ Plot
A plot of each observation's theoretical quantile under normality (x-axis) against its actual quantile in the sample (y-axis), used to visually check whether a sample violates the normality assumption.
QQ plot (Normality)
If the data are normally distributed, the dots form a straight line; systematic deviations from that line suggest non-normality.
One-Way ANOVA
A test comparing the means of several groups defined by a single categorical grouping variable, using a continuous outcome variable.
ANOVA Purpose
Tests the null hypothesis that all group means are equal against the alternative that at least one group mean differs.
Clinical Trial Example
A trial comparing a new antidepressant (Joyzepam) against an existing drug (Anxifree) and a placebo; a natural setting for a one-way ANOVA with three groups.
ANOVA Developer
Sir Ronald Fisher, who developed the analysis of variance in the early 20th century.
One-Way ANOVA Focus
Focuses on a single grouping variable (factor); the "one-way" refers to there being only one predictor.
Study Notes
Paired Samples T-Test
- When run with t.test(), a paired samples t-test expects two variables, x and y, plus the argument paired = TRUE.
- Because there is no "id" variable, the function simply assumes that the first element of x and the first element of y come from the same subject, the second elements likewise, and so on.
- The command t.test(x = chico$grade_test2, y = chico$grade_test1, paired = TRUE) performs a paired samples t-test on Dr. Chico's class data.
- The output includes the t-statistic, degrees of freedom, p-value, confidence interval, and the mean of the differences.
- Results match those calculated in Section 13.5
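- As a quick sketch (assuming the chico data frame from the book, with columns grade_test1 and grade_test2), the paired test is equivalent to a one-sample t-test on the difference scores:
t.test( x = chico$grade_test2, y = chico$grade_test1, paired = TRUE )  # paired version, as above
t.test( x = chico$grade_test2 - chico$grade_test1, mu = 0 )            # same t, df and p-value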
Effect Size - Cohen's d
- Cohen's d is a common measure of effect size for t-tests.
- In the context of a Student's t-test, it is calculated as the difference between the two sample means divided by an estimate of the standard deviation: d = (𝑋₁ - 𝑋₂) / 𝜎̂
- Cohen's d has a natural interpretation.
- It describes the difference in means as the number of standard deviations separating them.
Interpreting Cohen's d
- Around 0.2 represents a small effect.
- Around 0.5 represents a moderate effect.
- Around 0.8 represents a large effect.
- Context is important when thinking about the size.
- A small effect can have practical importance.
- A large effect might not matter in some situations.
- The cohensD() function in the lsr package computes it.
- Several versions of Cohen's d exist; the method argument to cohensD() selects among them.
Cohen's d - One Sample
- When running a t-test with oneSampleTTest(), independentSamplesTTest(), or pairedSamplesTTest(), no new commands are needed: these functions automatically produce an estimate of Cohen's d as part of the output.
- If using t.test(), the cohensD() function (in the lsr package) is needed.
- In the case of a one-sample t-test comparing a sample mean X to a population mean µ₀, Cohen's d is calculated as: d = (𝑋 - μ₀) / 𝜎̂
- x is a numeric vector with sample data.
- mu is the mean against which x is being compared (defaults to mu = 0)
Calculating Cohen's d - Zeppo's class
- cohensD(x = grades, mu = 67.5) calculates the effect size for data from Dr. Zeppo's class.
- Results indicate students achieved 72.3%, about 0.5 standard deviations higher than the expected level (67.5%).
- This is considered a moderate effect size.
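- A by-hand check, as a sketch (assuming the grades vector from Dr. Zeppo's data is loaded):
d <- ( mean( grades ) - 67.5 ) / sd( grades )  # (sample mean - mu0) / estimated sd
print( d )                                     # approximately 0.5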
Cohen's d - Student t-test
- Focuses on the situation analogous to that of the Student's independent samples t-test.
- Several versions of d can be used.
- The method argument to the cohensD() function picks one of the versions.
- Population effect size: δ = (𝜇₁ - 𝜇₂) / 𝜎
- 𝜇₁ and 𝜇₂ are the population means for groups 1 and 2
- 𝜎 is the population standard deviation, assumed common to both groups
Estimating d
- d = (𝑋₁ - 𝑋₂) / 𝜎̂_p
- 𝑋₁ and 𝑋₂ are the sample means
- 𝜎̂_p is the pooled standard deviation
- This commonly used version is sometimes referred to as the Hedges' g statistic.
- It corresponds to method = "pooled", the default in the cohensD() function.
- Glass' ∆ is used when you only want to use one of the two groups as the basis for calculating the standard deviation.
- Use method = "x.sd" or method = "y.sd" with cohensD() when one of the two groups is regarded as a purer reflection of "natural variation" than the other.
- method = "raw" is when correction omitted, divide N
- Primarily to calculate the effect size in the sample, rather than estimating an effect size in the population.
- method = "corrected" Multiplies the value of d by (N − 3)/(N – 2.25)
- Based on Hedges and Olkin (1985), pointing out there is a small bias in the usual (pooled) estimation for Cohen's d.
- Command example cohensD(formula = grade ~ tutor, data = harpo, method = "pooled" )
- This outputs Cohen's d: [1] 0.7395614
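- A by-hand version of the pooled calculation, as a sketch (assuming the harpo data frame with grade and tutor columns):
g  <- split( x = harpo$grade, f = harpo$tutor )  # split the grades by tutor
n1 <- length( g[[1]] ); n2 <- length( g[[2]] )
sp <- sqrt( ( (n1-1)*var(g[[1]]) + (n2-1)*var(g[[2]]) ) / (n1 + n2 - 2) )  # pooled sd estimate
abs( mean( g[[1]] ) - mean( g[[2]] ) ) / sp      # should match method = "pooled"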
Cohen's d - Welch test
- The situation here is more like the Welch test: two independent samples, where the corresponding populations do not have equal variances.
- A new measure is needed: δ′ = (𝜇₁ - 𝜇₂) / 𝜎′, written with a prime to keep it distinct from the measure δ above.
- Cohen (1988) suggests defining the new population standard deviation 𝜎′ by averaging the two population variances: 𝜎′ = √( (𝜎₁² + 𝜎₂²) / 2 ).
- method = "unequal" calculates this version of d:
cohensD( formula = grade ~ tutor, data = harpo, method = "unequal" )
- This is the version of Cohen's d that gets reported by the independentSamplesTTest() function whenever it runs a Welch t-test.
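- A by-hand sketch of the "unequal" version (same assumptions about the harpo data frame):
g <- split( x = harpo$grade, f = harpo$tutor )
s.prime <- sqrt( ( sd(g[[1]])^2 + sd(g[[2]])^2 ) / 2 )  # average the two variances, then take the square root
abs( mean( g[[1]] ) - mean( g[[2]] ) ) / s.prime        # should match method = "unequal"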
Cohen's d - paired samples test
- What should we do for a paired samples t-test?
- If you want to measure effect sizes relative to the distribution of difference scores, the measure of d to calculate is just d = D̄ / 𝜎̂_D, where D̄ is the mean difference and 𝜎̂_D is the standard deviation of the difference scores:
cohensD( x = chico$grade_test2, y = chico$grade_test1, method = "paired" )
- If instead you care about effect sizes relative to the original variables, use the same versions of Cohen's d that you would use for a Student or Welch test.
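- Equivalently, by hand, as a sketch (assuming the chico data frame):
diff <- chico$grade_test2 - chico$grade_test1  # the difference scores
mean( diff ) / sd( diff )                      # same value as method = "paired"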
Checking the Normality of the sample
- t-tests assume the data are normally distributed.
- The central limit theorem (Section 10.3.3) implies that many real-world quantities are approximately normally distributed, which often makes t-tests safe to use.
- Normality is not guaranteed, however; variables can be highly non-normal.
- Response time (RT) data, for example, is systematically non-normal.
QQ plots
- A "quantile-quantile” plot (QQ plot) checks if the sample breaks normality
- Visually determines if there are any systematic violations on sample
- Each observation is plotted as a single dot.
- The x co-ordinate is the theoretical quantile that the observation should fall in, if the data were normally distributed (with mean and variance estimated from the sample)
- The y co-ordinate is the actual quantile of the data within the sample
- If the data are normal, the dots should form a straight line
- Example generating normal data:
normal.data <- rnorm( n = 100 )  # generate N = 100 normally distributed numbers
hist( x = normal.data )          # draw a histogram of these numbers
qqnorm( y = normal.data )        # draw the QQ plot
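- For contrast, a sketch of what the QQ plot looks like for skewed (non-normal) data:
skewed.data <- rexp( n = 100 )  # exponentially distributed numbers, heavily right-skewed
hist( x = skewed.data )         # histogram shows the long right tail
qqnorm( y = skewed.data )       # the dots curve systematically away from the straight line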
Shapiro-Wilk tests
- Shapiro-Wilk test (Shapiro & Wilk, 1965) tests something a bit more formal
- The null hypothesis being tested is that the set of N observations is normally distributed.
- The test statistic, conventionally denoted W, is calculated as:
W = ( Σᵢ aᵢ x₍ᵢ₎ )² / Σᵢ ( xᵢ − x̄ )²
- where x₍ᵢ₎ is the i-th smallest observation in the sample, x̄ is the sample mean, and the aᵢ are weighting constants.
Shapiro-Wilk behavior
- Small values of W indicate a departure from normality.
- In R, the test is run with the shapiro.test() function, which reports W along with a p-value.
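- Example, using the normal.data vector generated above:
shapiro.test( x = normal.data )  # prints W and a p-value; a large p-value gives no evidence of non-normality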
Testing non-normal data with Wilcoxon tests
- What if the data are substantially non-normal, but you still want to run something like a t-test?
- This is the situation where Wilcoxon tests are useful.
Wilcoxon tests
- The Wilcoxon test has two forms: one-sample and two-sample.
- They are used in exactly the same situations as the corresponding t-tests.
Wilcoxon vs t-test
- The t-test assumes normality; the Wilcoxon test does not.
- In fact, the Wilcoxon test makes no assumptions about what kind of distribution is involved, which is why tests like it are called nonparametric tests.
Two-sample Wilcoxon test
- Suppose a test of awesomeness is given to two groups of people, "A" and "B".
- File: awesome.Rdata
- It contains a single data frame, called "awesome".
- Code example:
load( "awesome.Rdata" )
print( awesome )
How the two-sample Wilcoxon test works
- To run the test, construct a table that compares every observation in group A against every observation in group B.
- When there are no ties, this is simple.
- Each time the group A datum is larger, place a check mark in the table.
Wilcoxon test statistic
- The test statistic W is the number of check marks.
- The interpretation of W is qualitatively the same as for t or z.
- Two-sided test: reject the null if W is very large or very small.
- Directional (one-sided) hypothesis: use only one of the two extremes.
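- The check mark logic is easy to sketch directly in R (with hypothetical score vectors for groups A and B):
A <- c( 6.4, 10.7, 11.9, 7.3, 10.0 )   # hypothetical group A scores
B <- c( 14.5, 10.4, 12.9, 11.7, 13.0 ) # hypothetical group B scores
W <- sum( outer( A, B, ">" ) )         # one check mark for every A datum larger than a B datum
print( W )                             # matches the W from wilcox.test( x = A, y = B ) when there are no ties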
Structure of wilcox.test()
- The wilcox.test() function should feel familiar by now: it is organized using the usual formula and data arguments:
wilcox.test( formula = scores ~ group, data = awesome )
- Just as with the t.test() function, the alternative argument can switch to a one-sided test, and there are a few extra arguments as well.
Separate variables instead of grouped data
- The wilcox.test() function also allows x and y arguments, for when the scores for each group are stored as separate variables:
load( "awesome2.Rdata" )
wilcox.test( x = score.A, y = score.B )
- The results are the same as in the previous run.
One sample Wilcoxon test
- The one-sample Wilcoxon test is equivalent to the paired samples Wilcoxon test.
- Suppose you are interested in the effect on students' happiness of taking a statistics class.
One-sample vs paired samples Wilcoxon test
- There is no fundamental difference between doing a paired samples test and doing a one-sample test on the change scores.
- In effect, you tabulate the positive values in the sample, much as the two-sample version tabulates check marks. Code:
wilcox.test( x = happiness$change, mu = 0 )
- The output is labelled a Wilcoxon signed rank test.
One-sample test output and summary
- As the output shows, there is a significant effect: taking the class does appear to affect happiness.
- Switching between the one-sample and paired samples versions of the test does not alter the answers.
Summary - Tests
- A one-sample t-test is used to compare a single sample mean against a hypothesized population mean.
- An independent samples t-test is used to compare the means of two groups, testing the null hypothesis that they have the same mean. It comes in two forms: the Student test assumes the groups have the same standard deviation; the Welch test does not.
Paired samples test and remaining topics
- A paired samples t-test is used when you have two scores for each person and want to test that the two means are the same; it is equivalent to taking the difference between the two scores for each person and running a one-sample t-test on the difference scores.
- Effect size calculations for the difference between means can be done using Cohen's d.
- QQ plots (along with the Shapiro-Wilk test) can be used to check data for normality.
- If the data are non-normal, Wilcoxon tests can be used instead of t-tests.
Description
This content explains effect size calculation with difference scores and original variables, scenarios causing non-normality, and the use of QQ plots for normality assessment. It also clarifies the Central Limit Theorem's role in normal distributions and methods to assess violations of normality assumptions for t-tests.