Questions and Answers
Why is it important to adjust p-values when conducting multiple hypothesis tests?
Adjusting p-values is crucial to control the overall Type I error rate across all tests, ensuring that the probability of making at least one false positive conclusion remains at an acceptable level.
Briefly explain the logic behind the Bonferroni correction method for multiple comparisons.
The Bonferroni correction reduces the significance threshold for each individual test by dividing the desired overall alpha level by the number of tests conducted, i.e., $\alpha/m$. This ensures that the total probability of making a Type I error across all tests does not exceed the specified alpha level.
A researcher is conducting 5 independent t-tests with a desired alpha level of 0.05. Using the Bonferroni correction, what should be the adjusted alpha level for each individual test?
The adjusted alpha level would be 0.01 ($0.05 / 5 = 0.01$).
What is a potential drawback of using the Bonferroni correction, and why does this occur?
The Bonferroni correction is very conservative: because every p-value is multiplied by the full number of tests, it inflates the Type II error rate (reduced power), making real effects harder to detect, especially when many tests are run. This is why the Holm correction, which controls the Type I error rate equally well but with more power, is often preferred.
How would you implement the Bonferroni correction when using the pairwise.t.test() function in R?
Set the p.adjust.method argument to "bonferroni", i.e. pairwise.t.test(outcome, group, p.adjust.method = "bonferroni"). The posthocPairwiseT() function in the lsr package offers the same option.
In the multiple regression model provided, what does the coefficient b1 represent, and how does it relate to the outcome variable?
b1 is the regression coefficient for the first predictor, X1: it gives the expected change in the outcome variable associated with a one-unit increase in X1, holding all other predictors in the model constant.
Explain, in your own words, how the lm() function in R is used to perform multiple regression, and what the basic formula looks like when including two predictor variables.
lm() fits a linear model from a formula that names the outcome on the left-hand side and the predictors, joined by +, on the right. With two predictors the call takes the form lm(outcome ~ predictor1 + predictor2, data = my.data.frame).
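A hedged sketch of such a model; regression.2 appears in the next question, and the variable and data frame names (dan.grump, dan.sleep, baby.sleep, parenthood) are assumptions in the style of the lsr book, not confirmed by this page:

```r
# Multiple regression with two predictors (illustrative names only).
regression.2 <- lm(dan.grump ~ dan.sleep + baby.sleep, data = parenthood)
summary(regression.2)  # prints the estimated coefficients, incl. the intercept
```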
In the context of the provided R output, interpret the meaning of the intercept coefficient (125.96557) in the 'regression.2' model.
The intercept is the predicted value of the outcome when every predictor in the model equals zero: the model predicts an outcome of about 125.97 for an observation scoring zero on both predictors (often an extrapolation rather than a realistic scenario).
Describe how the concept of minimizing sum squared residuals applies to both simple linear regression and multiple regression. What is the key difference in its application between the two?
In both cases the coefficients are chosen to make the sum of squared residuals, Σ(Yᵢ − Ŷᵢ)², as small as possible. The key difference is geometric: simple regression fits the best straight line through a two-dimensional scatter of points, whereas multiple regression fits the best plane (or hyperplane, with more than two predictors) through a higher-dimensional cloud of data.
How does the 3D plot described in the text help visualize a multiple regression model with two predictor variables and one outcome variable?
Each observation is a point in three-dimensional space, with the two predictors on the horizontal axes and the outcome on the vertical axis; the fitted regression model appears as a flat plane through this cloud of points, and the residuals are the vertical distances from each point to the plane.
Flashcards
Multiple Comparisons Correction
Adjusting p-values to control the overall error rate when conducting multiple tests.
Bonferroni Correction
A simple method to correct for multiple comparisons by multiplying each p-value by the number of tests.
Family-wise Error Rate (FWER)
The probability of making at least one Type I error across an entire family of tests; corrections aim to keep this probability below a chosen level (alpha).
Bonferroni Formula
The corrected p-value is p′ = m × p, where m is the number of separate tests.
Bonferroni in R
Use pairwise.t.test() with p.adjust.method = "bonferroni", or posthocPairwiseT() in the lsr package.
Multiple Regression
A regression model that uses two or more predictor variables to predict a single outcome variable.
Dependent Variable
The outcome variable that the regression model tries to predict.
Independent Variables
The predictor variables used to explain or predict the outcome variable.
Estimated Coefficients
The b values in the fitted regression model, chosen so that the sum of squared residuals is as small as possible.
Residual
The difference between the observed value of the outcome and the value predicted by the model.
Study Notes
- Introduces an adjustment to p-values when several hypothesis tests are conducted at once, known as a correction for multiple comparisons or simultaneous inference.
- Several methods exist for performing this adjustment; they are discussed in Chapter 16.8.
Bonferroni Corrections
- Adjustment method that involves multiplying all raw p-values by the number of separate tests (m).
- The corrected p-value will be p' = m × p.
- Using the Bonferroni correction: reject the null hypothesis if p' < α.
- The logic ensures that the total Type I error rate across m tests is no more than α.
- Method is simple and general.
- In R, the Bonferroni correction can be implemented using pairwise.t.test() with p.adjust.method = "bonferroni", or posthocPairwiseT() in the lsr package.
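A minimal, hedged sketch (the clin.trial data frame with mood.gain and drug columns is an assumed example in the style of the lsr book, not output shown in these notes):

```r
# Bonferroni-corrected pairwise comparisons (illustrative names).
pairwise.t.test(clin.trial$mood.gain, clin.trial$drug,
                p.adjust.method = "bonferroni")

# The same adjustment applied by hand to a vector of raw p-values:
p.raw <- c(0.012, 0.030, 0.240)        # example values
p.adjust(p.raw, method = "bonferroni") # equivalent to pmin(3 * p.raw, 1)
```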
Holm Corrections
- An alternative to the Bonferroni correction.
- Employs a sequential testing approach, adjusting p-values based on their rank.
- The j-th largest p-value is adjusted to p′ⱼ = j × pⱼ or p′ⱼ = p′ⱼ₊₁, whichever is larger.
- Sort all p-values in order from smallest to largest.
- Multiply the smallest p-value by m.
- The remaining values enter a two-stage process: multiply the second-smallest p-value by m − 1 (the third-smallest by m − 2, and so on), then compare the result with the previous adjusted p-value.
- Use the greater of the two values as the adjusted p.
- Holm correction: more powerful due to a lower Type II error rate, while maintaining the same Type I error rate.
- Holm correction: the default setting in pairwise.t.test() and posthocPairwiseT().
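A small sketch of the sequential logic, checked against R's built-in p.adjust() (the raw p-values are made up):

```r
p.raw <- c(0.010, 0.020, 0.060)  # three made-up raw p-values, m = 3
m <- length(p.raw)

# Holm by hand: sort ascending, multiply by m, m-1, ..., 1, then take a
# running maximum so adjusted p-values never decrease; cap at 1.
ord <- order(p.raw)
p.holm <- pmin(cummax((m - seq_len(m) + 1) * p.raw[ord]), 1)[order(ord)]

p.holm                            # 0.03 0.04 0.06
p.adjust(p.raw, method = "holm")  # agrees with the manual version
```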
Assumptions of One-Way ANOVA
- A statistical test that relies on assumptions about the data, including normality, homogeneity of variance, and independence.
- The statistical model underpinning ANOVA under the null hypothesis is Yᵢₖ = μ + ϵᵢₖ.
- Under the alternative hypothesis it is Yᵢₖ = μₖ + ϵᵢₖ, with a separate mean μₖ for each group.
- Testing procedures rely on specific assumption about residuals (ϵᵢₖ ~ Normal(0, σ²)).
- Normality: residuals are normally distributed, assessed via QQ plots or Shapiro-Wilk tests.
- Homogeneity of Variance: population standard deviation is the same across all groups.
- Independence: knowing one residual provides no information about any other residual.
Checking Homogeneity of Variance
- Levene's Test: commonly used in the literature; closely related to the Brown-Forsythe test.
- Levene's test works on a transformed variable, Zᵢₖ = |Yᵢₖ − Ȳₖ| (absolute deviations from the group means); the null hypothesis is that the population means of Z are identical for all groups.
- The leveneTest() function, found in the car package, is used to perform Levene's test.
- Use leveneTest(my.anova), passing the original aov object as input.
- The Brown-Forsythe test differs in that it constructs the transformed variable Z using deviations from the group medians: Zᵢₖ = |Yᵢₖ − medianₖ(Y)|.
- Although R reports the test statistic as an F-value, it is sometimes denoted W in the literature.
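A hedged sketch, assuming my.anova is an existing aov object (e.g. built from the illustrative clin.trial example above):

```r
# Levene / Brown-Forsythe tests on an existing aov object.
library(car)

leveneTest(my.anova)                 # default centers on the median
                                     # (the Brown-Forsythe variant)
leveneTest(my.anova, center = mean)  # classic Levene's test, mean-centered
```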
Removing the Homogeneity of Variance Assumption
- Welch One-Way Test: rescues the ANOVA when homogeneity of variance is violated; implemented using the oneway.test() function.
- Important arguments include:
  - formula: outcome ~ group.
  - data: specifies the data frame containing the variables.
  - var.equal: if FALSE, the Welch test runs; if TRUE, a regular ANOVA runs.
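A minimal sketch, reusing the illustrative clin.trial / mood.gain / drug names from above:

```r
# Welch one-way test (var.equal = FALSE is the default):
oneway.test(mood.gain ~ drug, data = clin.trial, var.equal = FALSE)

# Setting var.equal = TRUE reproduces the ordinary one-way ANOVA:
oneway.test(mood.gain ~ drug, data = clin.trial, var.equal = TRUE)
```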
Checking the Normality Assumption
- Residuals are extracted with the R function residuals(), applied to the aov object.
- The extracted residuals can then be passed to hist(), qqnorm(), and shapiro.test() to generate a histogram, draw a QQ plot, and run a formal test.
- The Shapiro-Wilk test supports the hypothesis that normality is not violated if the p-value is > 0.05.
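A short sketch of this workflow, again assuming the my.anova object from the earlier examples:

```r
# Extract residuals from the aov object and inspect them.
res <- residuals(my.anova)

hist(res)          # histogram: look for rough symmetry / bell shape
qqnorm(res)        # QQ plot: points should hug a straight line
shapiro.test(res)  # p > 0.05 => no evidence that normality is violated
```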
Removing the Normality Assumption
- The easiest solution is to switch to a non-parametric test such as the Kruskal-Wallis test, which does not rely on the assumption of normality.
- The test ranks all Yᵢₖ values and analyzes the ranked data rather than the raw scores.
- With Rᵢₖ denoting the rank of Yᵢₖ, the average rank in group k is R̄ₖ = (1/Nₖ) Σᵢ Rᵢₖ, and the grand mean rank is R̄ = (1/N) Σᵢ Σₖ Rᵢₖ.
- Kruskal-Wallis statistic: K = (N − 1) · RSSb / RSStot, where RSSb and RSStot are the between-group and total sums of squares of the ranks.
- Equivalent computational form when there are no ties: K = [12 / (N(N+1))] Σₖ (Sₖ² / Nₖ) − 3(N + 1), where Sₖ is the sum of the ranks in group k.
- The sampling distribution of K is approximately chi-square with (G − 1) degrees of freedom; the test is one-sided (reject H₀ when K is sufficiently large).
- When there are tied observations, K is divided by a tie-correction factor; R applies this correction automatically.
How to Run the Kruskal-Wallis Test
- Easily performed in R with the kruskal.test() function.
- The model can be specified as a formula, with mood.gain as the outcome and drug as the grouping variable (see the sketch below).
- Alternatively, specify the outcome and grouping variable as separate input arguments, kruskal.test(x, g).
- Or bundle the groups together as a list:
  mood.gain <- list(placebo, joyzepam, anxifree)
  kruskal.test(mood.gain)
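A minimal sketch of the formula interface, assuming the clin.trial data frame from the illustrative example above (placebo, joyzepam, and anxifree would be its drug groups):

```r
# Kruskal-Wallis test via the formula interface (illustrative names).
kruskal.test(mood.gain ~ drug, data = clin.trial)
```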
Relationship Between ANOVA and the Student t Test
- An ANOVA with two groups is nearly identical to the Student t test, differing only in how the result is reported.
- The residual degrees of freedom match: the t test gives t(16) = -1.3068, and the corresponding ANOVA has 16 residual degrees of freedom.
- The p-values are identical: p = 0.21 for both tests.
- The two statistics are directly related by F = t²: here F(1, 16) = (-1.3068)² ≈ 1.71.
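A small sketch of this equivalence (the data frame d and its outcome and group columns are assumptions for illustration):

```r
# With exactly two groups, a Student t test and a one-way ANOVA agree.
t.out <- t.test(outcome ~ group, data = d, var.equal = TRUE)  # Student t
a.tab <- summary(aov(outcome ~ group, data = d))[[1]]

t.out$p.value              # identical to the ANOVA p-value below
a.tab[["Pr(>F)"]][1]
unname(t.out$statistic^2)  # equals the ANOVA F statistic: F = t^2
a.tab[["F value"]][1]
```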
Description
Overview of Bonferroni and Holm corrections for multiple comparisons, the assumptions of one-way ANOVA (normality, homogeneity of variance, independence) and how to check them in R, and alternatives such as the Welch one-way test and the Kruskal-Wallis test. The Bonferroni method multiplies raw p-values by the number of tests, while the Holm method uses a sequential testing approach.