Questions and Answers
Explain why the interpretation of a p-value as "the probability that the null hypothesis is true" is incorrect from a frequentist perspective.
Under the frequentist interpretation of probability, the null hypothesis is simply either true or false; it is not a repeatable event, so it cannot be assigned a probability of being true.
Even if a Bayesian approach is used (which does allow probabilities to be assigned to hypotheses), why doesn't the p-value represent the probability that the null hypothesis is true?
Because that interpretation is inconsistent with how the p-value is actually calculated: the p-value is the probability of observing data at least as extreme as the actual data, assuming the null hypothesis is true. That is a statement about the data given the null, not about the null given the data, and the two quantities are generally different.
In hypothesis testing, what is the primary piece of information that must always be reported, regardless of the specific test being conducted, and why is it so important?
The p-value, along with whether or not the outcome was significant, because it directly addresses the likelihood of observing the data (or more extreme data) if the null hypothesis were true.
Explain the convenience that p-values offer in hypothesis testing, allowing researchers to avoid pre-specifying an alpha level.
Because p-values can be interpreted directly, the researcher does not need to specify an alpha level before running the test; readers can compare the reported p-value against whatever error tolerance they consider appropriate.
Briefly explain the contention regarding whether to report the exact p-value obtained from a hypothesis test or to simply state whether p < α for a predetermined significance level. What are the benefits of reporting the exact p-value?
Some argue for simply reporting whether p falls below conventional thresholds (.05, .01, .001), since fixing these levels in advance keeps researchers honest. Reporting the exact p-value, however, lets readers apply their own standard of evidence and judge borderline results such as p = .06 for themselves.
Flashcards
P-value Misinterpretation
The incorrect interpretation of a p-value as the probability that the null hypothesis is true.
Frequentist Hypothesis Testing
An approach in which hypotheses are simply either true or false, so probabilities cannot be assigned to them.
Reporting Hypothesis Test Results
Always report the p-value and whether the outcome was significant.
Significance Level (alpha)
The Type I error rate the researcher is willing to tolerate; results with p < α are declared significant.
P-value Convenience
Because p-values can be interpreted directly, no alpha level needs to be specified before running the test.
Study Notes
- A common but incorrect interpretation of the p-value is "the probability that the null hypothesis is true."
- This is wrong for two reasons:
Frequentist Approach
- The frequentist approach doesn't allow assigning probabilities to the null hypothesis; it's either true or not.
Bayesian Approach
- Even in the Bayesian approach, which does allow probabilities to be assigned to hypotheses, the p-value does not correspond to the probability that the null is true; that reading is inconsistent with how the p-value is calculated.
- You should never interpret a p-value this way.
Reporting Hypothesis Results
- Several pieces of information usually need to be reported, and these vary from test to test.
- A particularly detailed example of reporting can be seen in Section 12.1.9.
- Regardless of the test, you must always report something about the p-value and whether or not the outcome was significant.
- Exactly how to do this is a matter of some disagreement.
The Issue of Exact vs. Inequality Reporting
- P-values can be interpreted directly, which means an alpha level does not need to be specified before running the test.
- This "softens" the decision-making process, which has advantages: it avoids treating p = .051 as fundamentally different from p = .049.
- The flexibility of p-values is both an advantage and a disadvantage, because it can give the researcher too much freedom.
- Researchers could change their mind about how much error they are willing to tolerate after seeing the data.
- That creates a temptation to manipulate the standard of evidence, which biases the reading of the data.
- Specifying the alpha level in advance keeps the researcher honest.
Two Possible Solutions
- It is rare for a researcher to specify a single alpha level ahead of time.
- Conventionally, researchers rely on three standard significance levels: .05, .01 and .001.
- Reporting which of these levels the result falls below (e.g., p < .01) indicates how strong the evidence against the null is.
- Since these levels are fixed in advance, researchers cannot pick an alpha level to suit the data.
- Others prefer reporting the exact p-value, letting readers decide for themselves how to interpret a borderline result such as p = .06.
- In practice, "p < .001" is common for very small p-values, since software often does not print exact values below that threshold (see the sketch below).
- The human mind also struggles to process numbers like .0000000001 meaningfully.
- At that point, the claim is in effect that the alternative hypothesis is a near certainty.
- But statistical tests rely on simplifications, approximations and assumptions, so extremely small p-values should not be taken literally.
- Given that, "p < .001" is usually the strongest claim a study can defensibly support.
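As a concrete illustration of the "p < .001" convention, base R's format.pval() truncates p-values below a chosen threshold when formatting them for reporting. This is a minimal sketch; the example p-values are made up.

```r
# Minimal sketch: formatting p-values for reporting with base R's
# format.pval(). Values below eps are printed as "<0.001".
# The example p-values are made up for illustration.
p <- c(0.049, 0.051, 0.00012, 1e-10)
format.pval(p, digits = 3, eps = 0.001)
# tiny values print as "<0.001" rather than a literal 1e-10
```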
Hypothesis Tests
- R provides the binomial test as the function binom.test().
- In practice, a single R command runs the test (see the sketch below).
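A minimal sketch of that command, using made-up data (62 successes in 100 trials) and the null value of .5:

```r
# Binomial test of H0: theta = 0.5 using base R's binom.test().
# The counts are hypothetical, chosen only for illustration.
result <- binom.test(x = 62, n = 100, p = 0.5)
print(result)     # full test output, including the exact p-value
result$p.value    # the p-value on its own
```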
The Power Function
- The major design principle in hypothesis testing is controlling the Type I error rate.
- Fixing α = .05 attempts to ensure that only 5% of true null hypotheses are incorrectly rejected.
- Type II errors should not be ignored either; the researcher also wants to minimize β, the Type II error rate.
- Equivalently, maximizing the power of the test, defined as 1 - β, is a secondary goal.
Defining the Error
- A Type II error means retaining (failing to reject) a false null hypothesis.
- A single value of β cannot generally be calculated, because the alternative hypothesis corresponds to many possible values of the true parameter θ.
- Rejecting the null becomes more probable the more wrong the null actually is.
- So the power of a test depends on the true value of θ (see the sketch below).
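To make this concrete, here is a minimal sketch that computes the power of a two-sided binomial test of $latex H_0: \theta = .5$ at several true values of θ. The sample size and alpha level are made-up choices.

```r
# Power of the binomial test of H0: theta = .5 as a function of the
# true theta. n = 100 and alpha = .05 are illustrative choices.
n     <- 100
alpha <- 0.05
x     <- 0:n

# Rejection region: outcomes whose p-value under H0 falls below alpha
pvals  <- sapply(x, function(k) binom.test(k, n, p = 0.5)$p.value)
reject <- x[pvals < alpha]

# Power at each true theta = probability of landing in that region
theta <- seq(0.5, 0.8, by = 0.05)
power <- sapply(theta, function(th) sum(dbinom(reject, n, th)))
round(data.frame(theta, power), 3)
```

At θ = .5 the "power" is just the Type I error rate; it climbs towards 1 as the true θ moves away from the null value.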
Effect Size and George Box
- If the true state of the world is very different from what the null hypothesis predicts, power is high; if it is similar, power is low.
- Quantifying this similarity means measuring the effect size, for which multiple definitions exist.
Cohen and Ellis
- The goal is to capture how big the difference is between the true population parameters and the values the null hypothesis assumes.
- If $latex \theta_0 = 0.5$ marks the null hypothesis value, a simple effect size is $latex \theta - \theta_0$.
- Reporting an effect size alongside the hypothesis test is standard practice.
- A hypothesis test tells you whether an observed result is "real"; the effect size tells you whether you should care (see the sketch below).
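A minimal sketch of both ideas for the binomial case, using a made-up estimate of θ. Cohen's h is shown as one common standardized effect size for proportions.

```r
# Raw effect size (theta - theta0) and Cohen's h for a proportion.
# theta_hat is a hypothetical observed proportion.
theta0    <- 0.5
theta_hat <- 0.62

raw_effect <- theta_hat - theta0
cohens_h   <- 2 * asin(sqrt(theta_hat)) - 2 * asin(sqrt(theta0))
c(raw = raw_effect, cohens_h = cohens_h)
```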
Maximizing Power
- Scientists want their experiments to work, so they aim to maximize the power of their studies.
- Power can be increased through clever experimental design or, more commonly, by increasing the sample size.
- Before running a study, it is useful to know how much power you are likely to have, so you don't run an underpowered experiment.
Power Analysis
- Power analysis involves estimating the sample size an experiment needs in order to have adequate power.
- This is helpful: it tells you in advance whether the experiment is likely to succeed.
- Some argue that a power analysis should be a required component of experimental design.
- In practice, though, power analyses are sometimes carried out mainly to satisfy a grant application, serving no purpose beyond the application itself.
- The catch is that a power analysis requires a guess at the effect size, which can rarely be calculated for any particular setting in advance (see the sketch below).
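As a sketch of what a power analysis looks like in practice, base R's power.prop.test() solves for the sample size needed in a two-sample comparison of proportions. The effect size (p1 versus p2) is necessarily a guess.

```r
# Minimal power-analysis sketch with base R's power.prop.test().
# p1 and p2 encode a guessed effect size; leaving n unspecified
# makes the function solve for the required sample size per group.
power.prop.test(p1 = 0.50, p2 = 0.65,
                sig.level = 0.05,  # alpha level
                power     = 0.80)  # desired power (1 - beta)
```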
Issues in NHST
- The framework described here is orthodox null hypothesis significance testing (NHST).
- It has dominated inferential statistics since the early 20th century, so it is applied very consistently across fields.
- It is essential to understand, though it has genuine flaws.
Mashup History
- NHST is a mashup of Fisher's and Neyman's approaches to hypothesis testing.
- Fisher: determine whether the null hypothesis is inconsistent with the data, in which case it can safely be rejected; no alternative hypothesis is required.
- Neyman: hypothesis testing is a guide to action, and it requires a specified alternative so that the test's power can be assessed.
- One symptom of the mashup is the p-value itself, defined in terms of data at least as extreme as what was observed.
- The approach remains controversial as a result.
Why Treating the p-value as the Probability of $latex H_0$ is Terrible
- The p-value should not be read as the probability that $latex H_0$ is true; that is not what it measures.
- Likewise, retaining the null should not be confused with showing that no effect exists.
Traps During Implementation
- The orthodox NHST approach has real drawbacks and can be misleading if applied mechanically.
- The issue is not researcher stupidity or an inability to work with statistics; it is that tests get run without thinking about what $latex H_0$ actually states.
- A null hypothesis should not be stated without checking that it is a sensible claim about the data at hand.
Example Problems
- For example, when comparing females and males, the analysis must test the correct $latex H_0$ for the question being asked.
- If the analysis does not actually test that question, the answer it produces has no value.
- It is essential to know whether each test's answer is possible and makes sense in the real world.
Quick Notes
- A recap of the chapter:
- Research hypotheses and statistical hypotheses; null and alternative hypotheses (Section 11.1).
- Type I and Type II errors (Section 11.2).
- Test statistics and sampling distributions (Section 11.3).
- Hypothesis testing as a decision-making process (Section 11.4).
- p-values as "soft" decisions (Section 11.5).
- Reporting the results of a hypothesis test (Section 11.6).
- Effect size and power (Section 11.8).
- A few issues to consider regarding hypothesis testing (Section 11.9).
- Chapter 17 returns to the theory of statistical tests with further $latex H_0$ examples.
- The aim throughout is to explain why something is being tested and what the data are meant to show.
Description
P-values are often misinterpreted as the probability the null hypothesis is true. From a frequentist view, hypotheses are fixed, and p-values reflect the data's compatibility with the null. Even Bayesians don't see p-values as P(null true). Reporting the p-value is crucial as it conveys the evidence against the null hypothesis.