Paired Samples T-Test & Cohen's d

Questions and Answers

When paired 'before' and 'after' data are analysed with a one-sample Wilcoxon test, what data transformation is performed first, and why is this transformation necessary?

The data are transformed into change scores by calculating the difference between paired observations. This is necessary to assess whether the median difference is significantly different from zero.

In the context of the Wilcoxon signed-rank test, explain how the test statistic V is calculated when analyzing the impact of a statistics class on student happiness, given 'before' and 'after' happiness scores.

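V is calculated by first determining the change scores (difference between 'after' and 'before' scores). Then each positive change score is tabulated against every change score in the sample, marking each comparison in which the positive score is at least as large in absolute value. V is the number of marks, which equals the sum of the ranks of the positive differences.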

In the context of the Wilcoxon test, if you observe a very large or very small value for the test statistic V, what conclusion would you draw regarding the null hypothesis, assuming a two-sided test?

You would reject the null hypothesis. A very large or very small V indicates a significant difference, suggesting the medians of the two groups are not equal.

Explain the purpose of using the `wilcox.test` function in R with the `x` and `y` arguments, providing an example scenario where this would be appropriate.

It is used to perform the Wilcoxon rank-sum test to compare two independent samples. For example, comparing scores from group A to scores from group B to determine if there is a significant difference between the two groups.

Explain why a paired-samples Wilcoxon test using 'before' and 'after' measurements is fundamentally similar to a one-sample Wilcoxon test using change scores.

Both tests evaluate whether there is a significant difference between two related sets of measurements. Calculating the difference between 'after' and 'before' for each pair creates change scores, which turns the paired data into a single sample, so the one-sample test on the change scores is equivalent to the paired-samples test.

Flashcards

Wilcoxon Rank Sum Test

A non-parametric test comparing two independent groups.

One-Sample Wilcoxon Test

A non-parametric test assessing the difference between two related samples or a single sample against a hypothesized median.

Change Scores

The differences calculated by subtracting paired observations (after - before).

Wilcoxon Tabulation

Tabulating positive change scores against the entire sample to calculate the Wilcoxon test statistic.

Wilcoxon Test Statistic (V)

The test statistic derived from the Wilcoxon test. Used to determine statistical significance.

Study Notes

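The `t.test()` function is not set up to handle data in long form
It expects two separate variables, x and y
It requires specifying paired=TRUE
The first element of x must correspond to the first element of y (the same person), because the function takes no "id" variable to match them
A paired samples t-test is then run as a single call with x, y and paired=TRUE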
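
A minimal sketch of such a call, assuming base R's `t.test()` is the function these notes describe. The `before` and `after` vectors are made-up scores, ordered so that each position holds the same person in both vectors. The second call runs the equivalent one-sample test on the difference scores.

```r
# Hypothetical scores for the same five people, listed in the same order.
before <- c(62, 70, 55, 68, 74)
after  <- c(66, 73, 54, 75, 79)

# Paired test: x and y are separate vectors, matched by position.
paired.result <- t.test(x = after, y = before, paired = TRUE)

# Equivalent one-sample test on the difference scores.
change <- after - before
one.sample.result <- t.test(x = change, mu = 0)

print(paired.result)
print(one.sample.result)
```

Both calls should report the same t statistic, degrees of freedom and p-value, since the paired test works on the differences internally.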

Effect Size

Cohen's d is a commonly used measure of effect size for a t-test
Cohen's d is calculated by dividing the difference between means by an estimate of the standard deviation
$d = \frac{(\text{mean 1}) - (\text{mean 2})}{\text{std dev}}$
Cohen's d is usually interpreted against a rough guide: about 0.2 is conventionally a small effect, 0.5 a moderate effect, and 0.8 a large effect

Cohen's d from one sample

Cohen's d formula: $d = \frac{\bar{X} - \mu_0}{\hat{\sigma}}$
In the `cohensD()` function, x is a numeric vector containing the sample data
mu is the mean against which the mean of x is compared; the default value is mu = 0
Example R code uses data stored in a grades vector and compares it against a mean of 67.5
Results indicate that the psychology students achieved grades (mean = 72.3%) about 0.5 standard deviations higher than expected (67.5%), a moderate effect size
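
The one-sample formula can be sketched in base R as follows. The helper name `one.sample.d` and the data are hypothetical (this is not the lesson's grades vector, and not the `cohensD()` function itself); `sd()` supplies the estimate $\hat{\sigma}$.

```r
# Hypothetical helper: one-sample Cohen's d, (sample mean - mu) / sample sd.
one.sample.d <- function(x, mu = 0) {
  (mean(x) - mu) / sd(x)
}

# Made-up sample of six scores, compared against a mean of 67.5.
scores <- c(70, 75, 68, 80, 72, 77)
d <- one.sample.d(scores, mu = 67.5)
print(d)
```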

Cohen's d from a Student t test

Hedges' g statistic is the most commonly used version of Cohen's d for a Student t-test; it corresponds to method = "pooled" in the cohensD() function and is the default
The true population effect size is defined using the means of both populations and their common standard deviation: $\delta = \frac{\mu_1 - \mu_2}{\sigma}$
The sample Cohen's d is calculated from the sample means and the pooled standard deviation: $d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$
Other method options include:
- Using only one of the groups as the basis for the standard deviation (method = "x.sd" or method = "y.sd")
- Omitting the bias correction from the usual pooled standard deviation, dividing by N rather than N - 2 (method = "raw")
- Applying a small correction to the pooled estimate (method = "corrected")
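
A minimal base-R sketch of the pooled version, assuming the usual pooled estimate $s_p = \sqrt{\frac{(N_1 - 1)\hat{\sigma}^2_1 + (N_2 - 1)\hat{\sigma}^2_2}{N_1 + N_2 - 2}}$. The helper `pooled.d` and both data vectors are hypothetical, and the sign of the difference is kept rather than dropped.

```r
# Hypothetical helper: Cohen's d using the pooled standard deviation.
pooled.d <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  # Pooled sd: variances weighted by their degrees of freedom.
  s.p <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / s.p
}

# Made-up scores for two independent groups.
group.a <- c(1, 2, 3, 4, 5)
group.b <- c(3, 4, 5, 6, 7)
d.pooled <- pooled.d(group.a, group.b)
print(d.pooled)
```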

Cohen's d from a Welch test

Cohen advises averaging the two population variances, which gives the effect size $\delta' = \frac{\mu_1 - \mu_2}{\sigma'}$
$\sigma' = \sqrt{\frac{\sigma^2_1 + \sigma^2_2}{2}}$
To calculate d for this version (method = "unequal"), substitute the sample means $\bar{X}_1$ and $\bar{X}_2$ and the corrected sample standard deviations $\hat{\sigma}_1$ and $\hat{\sigma}_2$ into the equation for $\delta'$
The formula for d is $d = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\hat{\sigma}^2_1 + \hat{\sigma}^2_2}{2}}}$
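
The same substitution can be sketched in base R. The helper `unequal.d` and the data are hypothetical; `var()` gives the corrected sample variances $\hat{\sigma}^2_1$ and $\hat{\sigma}^2_2$.

```r
# Hypothetical helper: Cohen's d with the two sample variances averaged.
unequal.d <- function(x, y) {
  (mean(x) - mean(y)) / sqrt((var(x) + var(y)) / 2)
}

# Made-up groups with clearly different spreads and sizes.
group.a <- c(1, 2, 3, 4, 5)
group.b <- c(2, 6, 10)
d.unequal <- unequal.d(group.a, group.b)
print(d.unequal)
```

Unlike the pooled estimate, the two variances get equal weight here regardless of group size.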

Cohen's d from a paired-samples test

The calculation uses method = "paired", which divides the mean of the difference scores by their estimated standard deviation
$d = \frac{\bar{D}}{\hat{\sigma}_D}$
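
A minimal base-R sketch of the paired calculation, using made-up `before` and `after` scores; the helper `paired.d` is hypothetical and simply applies the formula to the difference scores.

```r
# Hypothetical helper: Cohen's d for paired data, mean(D) / sd(D).
paired.d <- function(x, y) {
  D <- x - y          # difference score for each pair
  mean(D) / sd(D)
}

# Made-up scores for the same five people, in the same order.
before <- c(62, 70, 55, 68, 74)
after  <- c(66, 73, 54, 75, 79)
d.paired <- paired.d(after, before)
print(d.paired)
```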

Checking Normality of a Sample

All of the t-tests covered here assume that the data are normally distributed
The Central Limit Theorem means that many real-world quantities are approximately normally distributed, especially when a variable is effectively an average of many different things
Normality can be checked using QQ plots and the Shapiro-Wilk test

QQ Plots

Quantile-quantile (QQ) plots give a visual check for systematic violations of normality
Each observation is plotted as a single dot, with the theoretical quantile on the x-axis and the sample quantile on the y-axis
If the data are normal, the dots fall approximately along a straight line
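
A short sketch using base R's `qqnorm()` and `qqline()` on simulated data; the seed, sample sizes and distributions are arbitrary choices for illustration.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
normal.data <- rnorm(100)          # simulated normal sample
skewed.data <- rexp(100)           # simulated right-skewed sample

# Theoretical quantiles go on the x-axis, sample quantiles on the y-axis.
qqnorm(normal.data, main = "Normal sample")
qqline(normal.data)                # reference line through the quartiles

qqnorm(skewed.data, main = "Skewed sample")
qqline(skewed.data)
```

The dots for the normal sample should hug the reference line, while the skewed sample should curve away from it in the upper tail.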

Shapiro-Wilk Tests

A formal test of whether a sample is normally distributed.
The null hypothesis is that a set of N observations is normally distributed; the test statistic is W, and small values of W indicate a departure from normality.
$W = \frac{\left(\sum_{i=1}^{N} a_i X_i\right)^2}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$
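
A minimal sketch with base R's `shapiro.test()`, again on simulated data with an arbitrary seed, so the exact numbers carry no meaning beyond illustration.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
normal.data <- rnorm(100)          # simulated normal sample
skewed.data <- rexp(100)           # simulated right-skewed sample

normal.check <- shapiro.test(normal.data)
skewed.check <- shapiro.test(skewed.data)

# W close to 1 is consistent with normality; a small p-value rejects it.
print(normal.check)
print(skewed.check)
```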

Testing non-normal Data with Wilcoxon Tests

Wilcoxon tests are used when a t-test is undesirable because the data are substantially non-normal
The tests come in one-sample and two-sample forms
The tests assume no specific distribution, which makes them nonparametric
The tests are usually less powerful than the t-test, so they may have a higher Type II error rate
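
The sketch below uses base R's `wilcox.test()` on made-up data chosen to have no ties and no zero differences. It computes V by hand in two ways (sum of ranks of the positive change scores, and the tabulation count), then runs the one-sample, paired-samples and two-sample forms. All variable names and values are hypothetical.

```r
# Made-up 'before' and 'after' scores for the same eight people.
before <- c(12, 15, 9, 20, 14, 11, 17, 13)
after  <- c(15, 14, 14, 27, 12, 15, 23, 21)
change <- after - before

# V as the sum of the ranks (by absolute size) of the positive changes.
V.by.ranks <- sum(rank(abs(change))[change > 0])

# V by tabulation: for each positive change, count the changes whose
# absolute value it equals or exceeds.
V.by.tabulation <- sum(outer(change[change > 0], abs(change), ">="))

# One-sample test on the change scores, and the equivalent paired test.
one.sample <- wilcox.test(x = change, mu = 0)
paired     <- wilcox.test(x = after, y = before, paired = TRUE)

# Two-sample (rank-sum) test for two independent made-up groups.
group.a  <- c(14.5, 10.4, 12.4, 11.7, 13.0)
group.b  <- c(10.1, 10.9, 11.2, 9.6, 10.6)
rank.sum <- wilcox.test(x = group.a, y = group.b)

print(c(V.by.ranks, V.by.tabulation))
print(one.sample)
print(paired)
print(rank.sum)
```

The two hand calculations should agree with the V reported by both the one-sample and the paired call.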

Summary of Key Points

One sample t-tests are used to compare a single sample mean against a hypothesised value for the population mean
Independent samples t-tests compare the means of two groups, testing the null hypothesis that they have the same mean
- Student t-tests assume groups have the same standard deviation
- Welch t-tests do not
Paired samples t-tests are used with two scores from each person, testing the null hypothesis that the two scores have the same mean; the test is equivalent to taking the difference between the two scores for each person, and then running a one sample t-test on the difference scores
Effect size can be calculated with the Cohen's d statistic
Check for normality of a sample using QQ plots and the Shapiro-Wilk test
Use Wilcoxon tests instead of t-tests if data are non-normal
