Bonferroni and Holm Corrections

Questions and Answers

Why is it important to adjust p-values when conducting multiple hypothesis tests?

Adjusting p-values is crucial to control the overall Type I error rate across all tests, ensuring that the probability of making at least one false positive conclusion remains at an acceptable level.

Briefly explain the logic behind the Bonferroni correction method for multiple comparisons.

The Bonferroni correction reduces the significance threshold for each individual test by dividing the desired overall alpha level by the number of tests conducted, i.e. $\alpha/m$. This ensures that the total probability of making a Type I error across all tests does not exceed the specified alpha level.

A researcher is conducting 5 independent t-tests with a desired alpha level of 0.05. Using the Bonferroni correction, what should be the adjusted alpha level for each individual test?

The adjusted alpha level would be 0.01 ($0.05 / 5 = 0.01$).

What is a potential drawback of using the Bonferroni correction, and why does this occur?

A potential drawback of the Bonferroni correction is that it can be overly conservative, reducing statistical power. This occurs because dividing the alpha level by the number of tests makes it more difficult to reject the null hypothesis, increasing the chance of making a Type II error.

How would you implement the Bonferroni correction when using the pairwise.t.test() function in R?

To implement the Bonferroni correction, set the p.adjust.method argument to "bonferroni" within the pairwise.t.test() function. For example: pairwise.t.test(outcome, group, p.adjust.method = "bonferroni"), where outcome is the response vector and group is the grouping factor.
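
For concreteness, here is a minimal runnable sketch using simulated data (the lesson's own dataset is not shown in this excerpt, so the variable names below are purely illustrative):

```r
# Simulated data: one outcome measured across three groups
set.seed(1)
outcome <- c(rnorm(20, mean = 10), rnorm(20, mean = 11), rnorm(20, mean = 13))
group   <- factor(rep(c("A", "B", "C"), each = 20))

# All pairwise t-tests, with each p-value multiplied by the number of comparisons
pairwise.t.test(outcome, group, p.adjust.method = "bonferroni")
```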

In the multiple regression model provided, what does the coefficient b1 represent, and how does it relate to the outcome variable?

b1 represents the coefficient associated with dan.sleep. It quantifies the change in dan.grump for each one-unit increase in dan.sleep, holding the other predictors constant.

Explain, in your own words, how the lm() function in R is used to perform multiple regression, and what the basic formula looks like when including two predictor variables.

The lm() function fits a linear model. For multiple regression, the formula places the outcome variable on the left of a ~, followed by the predictor variables separated by + symbols. For example: outcome ~ predictor1 + predictor2.
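
A minimal sketch of this call, using simulated stand-ins for the lesson's variables (the data frame name parenthood and the coefficient values are assumptions for illustration, not taken from the lesson's output):

```r
# Simulated stand-ins for the lesson's variables
set.seed(42)
dan.sleep  <- rnorm(100, mean = 7, sd = 1)
baby.sleep <- rnorm(100, mean = 8, sd = 2)
dan.grump  <- 125 - 9 * dan.sleep + rnorm(100, sd = 5)
parenthood <- data.frame(dan.grump, dan.sleep, baby.sleep)

# Multiple regression: outcome on the left of ~, predictors joined by +
regression.2 <- lm(dan.grump ~ dan.sleep + baby.sleep, data = parenthood)
summary(regression.2)
```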

In the context of the provided R output, interpret the meaning of the intercept coefficient (125.96557) in the 'regression.2' model.

The intercept (125.96557) is the predicted value of dan.grump when both dan.sleep and baby.sleep are zero. It represents a baseline level of grumpiness.

Describe how the concept of minimizing the sum of squared residuals applies to both simple linear regression and multiple regression. What is the key difference in its application between the two?

In both cases, the goal is to find coefficients that minimize the sum of the squared differences between observed and predicted values. In simple regression, we fit a line; in multiple regression, we fit a plane (or a hyperplane with more than two predictors).
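
The following sketch (simulated data again) makes this concrete: the coefficients returned by lm() produce a smaller sum of squared residuals than a perturbed set of coefficients would:

```r
# Fit a two-predictor model to simulated data
set.seed(7)
x1 <- rnorm(50); x2 <- rnorm(50)
y  <- 2 + 3 * x1 - 1.5 * x2 + rnorm(50)
fit <- lm(y ~ x1 + x2)

# Sum of squared residuals at the least-squares estimates
ssr.fitted <- sum(residuals(fit)^2)

# Any other coefficient values give a larger SSR
b <- coef(fit) + c(0.5, 0, 0)   # perturb the intercept
ssr.other <- sum((y - (b[1] + b[2] * x1 + b[3] * x2))^2)
c(fitted = ssr.fitted, perturbed = ssr.other)
```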

How does the 3D plot described in the text help visualize a multiple regression model with two predictor variables and one outcome variable?

The 3D plot displays each observation as a point in a three-dimensional space defined by the two predictors and the outcome. The regression model appears as a plane that best fits these points, showing the relationship between the predictors and the outcome.

Flashcards

Multiple Comparisons Correction

Adjusting p-values to control the overall error rate when conducting multiple tests.

Bonferroni Correction

A simple method to correct for multiple comparisons by multiplying each p-value by the number of tests.

Family-wise Error Rate (FWER)

The probability of making at least one Type I error across all tests; corrections aim to keep this probability below a chosen level (alpha).

Bonferroni Formula

If you are doing m separate tests, multiply all raw p-values by m.

Bonferroni in R

Use the pairwise.t.test() function, setting p.adjust.method = "bonferroni".

Multiple Regression

A statistical method to model the relationship between a dependent variable and multiple independent variables.

Dependent Variable

The variable we are trying to predict or explain in a regression model.

Independent Variables

Variables used to predict or explain the dependent variable.

Estimated Coefficients

The estimated values that minimize the sum of squared residuals in a regression model.

Residual

The difference between the observed value and the value predicted by the regression model.

Study Notes

  • Introduces an adjustment to p-values for situations where several hypothesis tests are run at once, known as a correction for multiple comparisons or simultaneous inference.
  • Multiple methods exist for performing this adjustment; they are discussed in Section 16.8.

Bonferroni Corrections

  • Adjustment method that involves multiplying all raw p-values by the number of separate tests (m).
  • The corrected p-value will be p' = m × p.
  • Using the Bonferroni correction: reject the null hypothesis if p' < α.
  • The logic ensures that the total Type I error rate across m tests is no more than α.
  • Method is simple and general.
  • In R, the Bonferroni correction can be implemented using pairwise.t.test() with p.adjust.method = "bonferroni" or posthocPairwiseT() in the lsr package.
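
As a sketch of the arithmetic, the same adjustment can be applied to any vector of raw p-values with base R's p.adjust() function (the p-values below are made up for illustration):

```r
# Raw p-values from m = 5 hypothetical tests
raw.p <- c(0.003, 0.012, 0.020, 0.044, 0.310)
m <- length(raw.p)

# Bonferroni: multiply each p-value by m, capping at 1
manual.p <- pmin(raw.p * m, 1)

# Same result via the built-in adjustment function
adjusted.p <- p.adjust(raw.p, method = "bonferroni")

# Reject H0 for tests whose adjusted p-value falls below alpha
adjusted.p < 0.05
```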

Holm Corrections

  • An alternative to the Bonferroni correction.
  • Employs a sequential testing approach, adjusting p-values based on their rank.
  • The adjustment for the j-th largest p-value is p′ⱼ = j × pⱼ or p′ⱼ = p′ⱼ₊₁, whichever is larger.
  • In practice: sort all p-values from smallest to largest, then multiply the smallest p-value by m.
  • The remaining values enter a two-stage process: multiply the second smallest p-value by m − 1 (the third by m − 2, and so on), then compare the result with the previous adjusted p-value.
  • Use the greater of the two values as the adjusted p-value.
  • Holm correction: more powerful due to a lower Type II error rate, while maintaining the same Type I error rate.
  • Holm correction: the default setting in pairwise.t.test() and posthocPairwiseT().
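
The two-stage procedure can be implemented by hand and checked against R's built-in version; a sketch using the same made-up p-values as above:

```r
raw.p <- c(0.003, 0.012, 0.020, 0.044, 0.310)
m <- length(raw.p)

# Holm by hand: sort ascending, multiply the j-th smallest p-value by (m - j + 1),
# then enforce monotonicity with running maxima (the "whichever is larger" step)
ord <- order(raw.p)
scaled <- raw.p[ord] * (m - seq_len(m) + 1)
holm.sorted <- pmin(cummax(scaled), 1)

# Restore the original ordering
manual.holm <- numeric(m)
manual.holm[ord] <- holm.sorted

# Agrees with R's built-in implementation
all.equal(manual.holm, p.adjust(raw.p, method = "holm"))
```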

Assumptions of One-Way ANOVA

  • Statistical test that relies on assumptions about data, including normality, homogeneity of variance, and independence.
  • The statistical model underpinning ANOVA can be written as Yᵢₖ = μ + ϵᵢₖ (the null hypothesis: all group means are equal).
  • Alternatively, Yᵢₖ = μₖ + ϵᵢₖ (the alternative hypothesis: the group means μₖ differ).
  • Testing procedures rely on specific assumption about residuals (ϵᵢₖ ~ Normal(0, σ²)).
  • Normality: residuals are normally distributed, assessed via QQ plots or Shapiro-Wilk tests.
  • Homogeneity of Variance: population standard deviation is the same across all groups.
  • Independence: knowing one residual provides no information about any other residual.

Checking Homogeneity of Variance

  • Levene's Test: the test most commonly used in the literature; closely related to the Brown-Forsythe test.
  • Levene's test works with the transformed variable Zᵢₖ = |Yᵢₖ − Ȳₖ|, the absolute deviation of each observation from its group mean.
  • The null hypothesis is that the population means of Z are identical for all groups.
  • The leveneTest() function, found in the car package, performs Levene's test.
  • The original aov object can be passed directly as input: leveneTest(my.anova).
  • The Brown-Forsythe test differs in constructing the transformed variable from deviations from the group medians: Zᵢₖ = |Yᵢₖ − medianₖ(Y)|.
  • R reports the test statistic as an F-value, though in the literature it is sometimes denoted W.
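
A runnable sketch of both variants on simulated data (my.anova here is the illustrative name used above, not an object from the lesson):

```r
library(car)  # provides leveneTest()

# Simulated three-group data with one noisier group
set.seed(3)
outcome  <- c(rnorm(20, sd = 1), rnorm(20, sd = 1), rnorm(20, sd = 2))
group    <- factor(rep(c("g1", "g2", "g3"), each = 20))
my.anova <- aov(outcome ~ group)

# Test via the fitted aov object; car's default centers on the
# group medians, i.e. the Brown-Forsythe variant
leveneTest(my.anova)

# center = mean gives the classic Levene test
leveneTest(outcome ~ group, center = mean)
```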

Removing the Homogeneity of Variance Assumption

  • Welch One-Way Test: rescues the ANOVA when homogeneity of variance is violated; implemented using the oneway.test() function.
  • Important arguments include:
    • formula: outcome ~ group.
    • data: specifies the data frame containing the variables.
    • var.equal: if FALSE (the default), the Welch test runs; if TRUE, a regular ANOVA runs.
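
A sketch of both settings on simulated data with deliberately unequal group variances (all names illustrative):

```r
# Simulated data where the homogeneity assumption is clearly violated
set.seed(5)
my.data <- data.frame(
  outcome = c(rnorm(15, mean = 0, sd = 1),
              rnorm(15, mean = 1, sd = 3),
              rnorm(15, mean = 2, sd = 5)),
  group   = factor(rep(c("a", "b", "c"), each = 15))
)

# Welch test: var.equal = FALSE is the default
oneway.test(outcome ~ group, data = my.data)

# var.equal = TRUE reproduces the standard one-way ANOVA F-test
oneway.test(outcome ~ group, data = my.data, var.equal = TRUE)
```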

Checking the Normality Assumption

  • Residuals are extracted with the R function residuals().
  • The extracted residuals can be passed to hist(), qqnorm(), and shapiro.test() to produce a histogram, a QQ plot, and a formal test.
  • The Shapiro-Wilk test supports the hypothesis that normality is not violated when its p-value is > 0.05.
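
A minimal sketch of the workflow, assuming my.anova is a fitted aov object (for example, the one from the Levene sketch above):

```r
# Extract the residuals from the fitted model
res <- residuals(my.anova)

hist(res)          # rough check of the shape of the distribution
qqnorm(res)        # points should fall close to a straight line
qqline(res)
shapiro.test(res)  # p > 0.05: no evidence that normality is violated
```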

Removing the Normality Assumption

  • The easiest solution is switching to a non-parametric test such as the Kruskal-Wallis test, which does not assume normality.
  • The test ranks all Yᵢₖ values and analyses the ranked data.
  • With Rᵢₖ denoting the rank of Yᵢₖ, the average rank for group k is R̄ₖ = (1/Nₖ) Σᵢ Rᵢₖ, and the grand mean rank is R̄ = (1/N) Σᵢ Σₖ Rᵢₖ.
  • Kruskal-Wallis statistic: K = (N − 1) × RSSb/RSStot.
    • Equivalent computational form when there are no ties: K = [12/(N(N + 1))] Σₖ (Rₖ²/Nₖ) − 3(N + 1), where Rₖ is the sum of ranks in group k.
  • The sampling distribution of K is approximately chi-square with G − 1 degrees of freedom; the test is one-sided (reject H₀ when K is sufficiently large).
  • When there are tied observations, K is divided by a correction factor, 1 − Σⱼ(tⱼ³ − tⱼ)/(N³ − N), where tⱼ is the number of observations tied at the j-th distinct value; the computational form above assumes no ties.
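
The no-ties formula can be verified by hand against kruskal.test() on simulated (tie-free) data; a minimal sketch:

```r
# Simulated continuous data: three groups, no tied values
set.seed(11)
y <- c(rnorm(10, mean = 0), rnorm(10, mean = 1), rnorm(10, mean = 2))
g <- factor(rep(1:3, each = 10))

N  <- length(y)
r  <- rank(y)               # rank all Y values
Rk <- tapply(r, g, sum)     # sum of ranks per group
Nk <- tapply(r, g, length)  # group sizes

# No-ties computational form of the Kruskal-Wallis statistic
K <- 12 / (N * (N + 1)) * sum(Rk^2 / Nk) - 3 * (N + 1)
K

kruskal.test(y ~ g)         # reported statistic matches K (no ties here)
```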

How to Run the Kruskal-Wallis Test

  • Easily performed in R with the kruskal.test() function.
  • Specify the outcome and grouping variable via a formula, e.g. kruskal.test(mood.gain ~ drug).
  • Alternatively, supply the outcome and the grouping variable as two separate input arguments: kruskal.test(x, g).
  • Or bundle the groups together as a list:
    • mood.gain <- list(placebo, joyzepam, anxifree)
    • kruskal.test(mood.gain)

Relationship Between ANOVA and the Student t-Test

  • An ANOVA with two groups is essentially identical to a Student t-test, with a few cosmetic differences.
    • The residual df from the ANOVA matches the df of the t-test: in the example, t(16) = −1.3068.
    • The p-values of the two analyses are identical: p = 0.21.
    • The F-statistic is the square of the t-statistic: F(1, 16) = 1.71 ≈ (−1.3068)².
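
A quick simulated demonstration of the equivalence, using 9 observations per group so that the residual df is 16 as in the example:

```r
# Two-group data: ANOVA and Student t-test are the same analysis
set.seed(9)
y <- c(rnorm(9, mean = 0), rnorm(9, mean = 1))
g <- factor(rep(c("A", "B"), each = 9))

summary(aov(y ~ g))              # F-statistic on (1, 16) df
t.test(y ~ g, var.equal = TRUE)  # t-statistic on 16 df, identical p-value

# F equals the square of t
t.stat <- t.test(y ~ g, var.equal = TRUE)$statistic
t.stat^2
```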

Description

Overview of Bonferroni and Holm corrections. These adjustments to p-values control the error rate across multiple tests. The Bonferroni method multiplies raw p-values by the number of tests, while the Holm method uses a sequential testing approach.
