Understanding P-values and Hypothesis Testing

Questions and Answers

Explain why the interpretation of a p-value as "the probability that the null hypothesis is true" is incorrect from a frequentist perspective.

The frequentist approach does not allow assigning probabilities to the null hypothesis; it's either true or not. It cannot have a probabilistic chance of being true.

Even if a Bayesian approach is used (which does allow probabilities to be assigned to hypotheses), why doesn't the p-value represent the probability that the null hypothesis is true?

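The p-value is calculated as the probability of observing data at least as extreme as the actual data, assuming the null hypothesis is true. That is a statement about the data given the null, not about the null given the data, so reading it as the probability that the null is true is inconsistent with how it is calculated.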

In hypothesis testing, what is the primary piece of information that must always be reported, regardless of the specific test being conducted, and why is it so important?

The p-value, because it directly addresses the probability of observing the data (or more extreme data) if the null hypothesis were true.

Explain the convenience that p-values offer in hypothesis testing, allowing researchers to avoid pre-specifying an alpha level.

P-values allow researchers to determine the significance of their results without needing to set an alpha level beforehand.

Briefly explain the contention regarding whether to report the exact p-value obtained from a hypothesis test or to simply state whether p < α for a predetermined significance level. What are the benefits of reporting the exact p-value?

Some prefer reporting only whether p < α, while others advocate for providing the exact p-value. Reporting the exact p-value provides more information about the statistical evidence.

Flashcards

P-value Misinterpretation

The incorrect interpretation of a p-value as the probability that the null hypothesis is true.

Frequentist Hypothesis Testing

A frequentist tool in which hypotheses are either true or false, so probabilities cannot be assigned to them.

Reporting Hypothesis Test Results

Always report the p-value and whether the outcome was significant.

Significance Level (alpha)

The threshold used to decide whether a result is statistically significant.

P-value Convenience

Computing a p-value means you don’t have to specify an alpha level to run the test.

Study Notes

  • A common but incorrect interpretation of the p-value is "the probability that the null hypothesis is true."
  • This is wrong for two reasons:

Frequentist Approach

  • The frequentist approach doesn't allow assigning probabilities to the null hypothesis; it's either true or not.

Bayesian Approach

  • Even in the Bayesian approach, which does allow probabilities to be assigned to hypotheses, the p-value does not correspond to the probability that the null is true; that reading is inconsistent with how the p-value is calculated.
  • You should never interpret a p-value this way.
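
The distinction behind both reasons can be written out explicitly. In this sketch, $T$ is a test statistic, $t_{\text{obs}}$ is its observed value, and larger values count as more extreme; these symbols are assumptions chosen for illustration and do not come from the notes.

```latex
% The p-value conditions on the null hypothesis being true:
p = P(T \ge t_{\text{obs}} \mid H_0)

% The quantity it is often mistaken for conditions on the data instead.
% Under Bayes' rule it also requires a prior probability P(H_0):
P(H_0 \mid \text{data}) = \frac{P(\text{data} \mid H_0)\, P(H_0)}{P(\text{data})}
```

The p-value calculation never uses a prior for $H_0$, so it cannot produce the second quantity.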

Reporting Hypothesis Results

  • Several pieces of information usually need to be reported, and they vary from test to test.
  • A particularly detailed example of reporting can be seen in Section 12.1.9.
  • Regardless of the test, you must always report something about the p-value and whether the outcome was significant.
  • Exactly how to do this is subject to some disagreement.

The Issue of Exact vs. Inequality Reporting

  • P-values are convenient because they can be interpreted directly: you do not need to specify an alpha level in order to run the test.
  • This "softens" the decision process and avoids treating p = .051 in a fundamentally different way from p = .049.
  • This flexibility is both an advantage and a disadvantage, because it can give the researcher too much freedom.
  • Researchers could change their minds about how much error they will tolerate after seeing the data.
  • The temptation to do so can lead to a biased reading of the results.
  • Specifying the alpha level in advance keeps the researcher honest.

Two Possible Solutions

  • It is rare for a researcher to specify a single alpha level ahead of time.

  • By convention, researchers rely on three standard significance levels: .05, .01 and .001.

  • When reporting results, indicate which (if any) of these levels allow you to reject the null.

  • Because these levels are fixed by convention, people cannot choose their alpha level after looking at the data.

  • Even so, some prefer to report exact p-values and let readers decide how to interpret a value such as p = .06.

  • In practice, "p < .001" is commonly written for very small p-values, partly because software may not print them.

  • Very small p-values can also mislead: on seeing a number like .0000000001, it is hard to suppress the feeling that the alternative hypothesis is a near certainty.

  • That feeling is usually wrong, because every statistical test relies on simplifications, approximations, and assumptions.

  • It is therefore probably not reasonable to leave any analysis with more confidence than p < .001 implies.
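
The two reporting conventions above can be sketched side by side. This is a minimal illustration rather than a reporting standard: the function name `report_p` and its output format are assumptions, and only the three conventional levels come from the notes.

```python
def report_p(p, levels=(0.05, 0.01, 0.001)):
    """Return (exact-style string, strictest conventional level passed or None).

    Hypothetical helper for illustration; the name and format are assumptions.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Very small values are written as an inequality, not as a long decimal.
    if p < 0.001:
        exact = "p < .001"
    else:
        exact = "p = " + f"{p:.3f}".lstrip("0")
    # Strictest conventional level at which the null would be rejected.
    passed = [a for a in sorted(levels) if p < a]
    strictest = passed[0] if passed else None
    return exact, strictest


for value in (0.06, 0.021, 1e-10):
    print(report_p(value))
```

Returning both forms leaves the choice between exact and inequality reporting to the writer, which mirrors the disagreement described above.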

Hypothesis Tests

  • R provides a function named binom.test() that runs a binomial test.
  • In practice, a single R command is enough to run the test.
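
To show what such a test computes, here is a minimal Python sketch of an exact two-sided binomial test. It is not R's binom.test() itself: the function name, the tie tolerance, and the example data (62 successes in 100 trials) are assumptions made for illustration. It uses one common definition of "at least as extreme", namely every outcome that is no more probable under the null than the observed one.

```python
from math import comb


def binom_test_two_sided(x, n, theta0=0.5):
    """Exact two-sided binomial test p-value for x successes in n trials."""
    # Probability of each possible outcome if the null (theta = theta0) is true.
    null_probs = [
        comb(n, k) * theta0**k * (1 - theta0) ** (n - k) for k in range(n + 1)
    ]
    observed = null_probs[x]
    # Sum the outcomes that are no more probable than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(q for q in null_probs if q <= observed * (1 + 1e-7))


# Hypothetical data: 62 successes in 100 trials, null value theta0 = 0.5.
p = binom_test_two_sided(62, 100)
print(f"p = {p:.4f}")
```

Note that the p-value is built entirely from probabilities computed under the null, which is why it cannot be read as the probability that the null is true.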

The Power Function

  • The major design principle in hypothesis testing is to control the Type I error rate.
  • Fixing α = .05 attempts to ensure that only 5% of true null hypotheses are incorrectly rejected.
  • That does not mean Type II errors can be ignored: we also want to minimize β, the Type II error rate.
  • This secondary goal is usually framed as maximizing the power of the test, defined as 1 - β.

Defining the Error

  • A Type II error means retaining (failing to reject) a null hypothesis that is false.
  • It is usually not possible to calculate a single value of β.
  • This is because the alternative hypothesis corresponds to many possible values of θ.
  • For each of those values, one can calculate the probability of rejecting the null when it is false.
  • The power of a test therefore depends on the true value of θ.

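
The dependence of power on θ can be made concrete with a short sketch. It assumes a binomial setting with N = 100 trials, a null value of 0.5, and an equal-tailed rejection region at α = .05; those numbers and all function names are illustrative assumptions, not values taken from the notes.

```python
from math import comb


def binom_pmf(k, n, theta):
    """Probability of exactly k successes in n trials with success rate theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def critical_region(n, theta0=0.5, alpha=0.05):
    """Equal-tailed region: reject the null if X <= lower or X >= upper."""
    lower, cum = -1, 0.0
    for k in range(n + 1):  # grow the lower tail while it stays within alpha/2
        cum += binom_pmf(k, n, theta0)
        if cum > alpha / 2:
            break
        lower = k
    upper, cum = n + 1, 0.0
    for k in range(n, -1, -1):  # grow the upper tail the same way
        cum += binom_pmf(k, n, theta0)
        if cum > alpha / 2:
            break
        upper = k
    return lower, upper


def power(theta, n, theta0=0.5, alpha=0.05):
    """Probability of rejecting the null when the true value is theta."""
    lower, upper = critical_region(n, theta0, alpha)
    return sum(
        binom_pmf(k, n, theta) for k in range(n + 1) if k <= lower or k >= upper
    )


for theta in (0.5, 0.55, 0.6, 0.7):
    print(theta, round(power(theta, n=100), 3))
```

Evaluated at θ equal to the null value, the same function gives the actual Type I error rate of the test; everywhere else it gives 1 - β for that particular θ.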
Effect Size and George Box

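  • If the true state of the world is very different from what the null hypothesis predicts, power is high; if it is similar to the null, power is low.
  • A statistic that quantifies how similar the two are is a measure of effect size, and it has been defined in several ways.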

Cohen and Ellis

  • An effect size measure captures how big the difference is between the true population parameters and the values assumed by the null hypothesis.

  • If $\theta_0 = 0.5$ is the value assumed by the null hypothesis and $\theta$ is the true value, a simple effect size is $\theta - \theta_0$.

  • Reporting an effect size is standard practice and should accompany a hypothesis test.

  • A hypothesis test tells you whether an observed effect is "real" (not just due to chance).

  • The effect size tells you whether you should care.
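
The difference between "real" and "worth caring about" shows up clearly with a very large sample. The sketch below reports the estimated effect size next to a p-value. It makes several assumptions for illustration: the sample proportion stands in for the unknown true θ, the p-value uses a normal approximation rather than an exact test, and the data are hypothetical.

```python
from math import erfc, sqrt


def effect_and_p(x, n, theta0=0.5):
    """Return (estimated effect size, approximate two-sided p-value)."""
    theta_hat = x / n  # sample proportion, standing in for the true theta
    effect = theta_hat - theta0  # effect size as a simple difference
    se = sqrt(theta0 * (1 - theta0) / n)  # standard error under the null
    z = effect / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return effect, p


# Hypothetical data: a tiny effect measured in a very large sample.
effect, p = effect_and_p(501_500, 1_000_000)
print(f"effect size = {effect:.4f}, p = {p:.4f}")
```

With these numbers the test would reject the null at the .01 level, yet the estimated difference from 0.5 is only 0.0015, so whether the result matters is a separate question from whether it is significant.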

Maximizing Power

  • Scientists aim to maximize the power of their experiments: they want the experiment to work, which means rejecting the null when it is false.
  • Power can be increased through clever design or through a larger sample size.
  • It would be useful to know in advance how much power you are likely to have, to avoid running an experiment with a low chance of success.
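
The effect of sample size on power can be sketched for the binomial case. The code assumes a true value of θ = 0.7, a null value of 0.5, and rejection whenever an exact two-sided p-value is at most α = .05; these settings and the function names are illustrative assumptions.

```python
from math import comb


def pmf(k, n, theta):
    """Probability of exactly k successes in n trials with success rate theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def power(n, theta, theta0=0.5, alpha=0.05):
    """Chance of rejecting the null with n observations when theta is true."""
    null = [pmf(k, n, theta0) for k in range(n + 1)]

    def p_value(x):
        # Two-sided: outcomes no more probable under the null than x.
        return sum(q for q in null if q <= null[x] * (1 + 1e-7))

    # Add up the probability, under the true theta, of every rejecting outcome.
    return sum(pmf(x, n, theta) for x in range(n + 1) if p_value(x) <= alpha)


for n in (10, 20, 50, 100):
    print(n, round(power(n, theta=0.7), 3))
```

Because the outcomes are discrete, exact power does not have to rise smoothly with every single added observation, but the overall trend across these sample sizes is upward.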

Power Analysis

  • Power analysis involves guessing the effect size in order to estimate how large the sample needs to be.

  • This is helpful, since it can tell you whether you can run the experiment successfully.

  • Some argue that power analysis should be a required component of experimental design.

  • In practice, a power analysis is sometimes carried out only because a grant application demands one.

  • A power analysis needs an effect size measure that can be calculated for the particular setting, which is not always easy to determine.
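
A rough power analysis for a single proportion can be sketched with the usual normal-approximation formula. Everything specific here is an assumption made for illustration: the guessed true value of 0.7, the 80% power target, and the function name. An exact calculation would give somewhat different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def approx_sample_size(theta_guess, theta0=0.5, alpha=0.05, target_power=0.80):
    """Approximate N for a two-sided test of one proportion (normal approx.)."""
    if theta_guess == theta0:
        raise ValueError("the guessed effect size must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # cutoff for the test
    z_power = NormalDist().inv_cdf(target_power)  # quantile for desired power
    numerator = z_alpha * sqrt(theta0 * (1 - theta0)) + z_power * sqrt(
        theta_guess * (1 - theta_guess)
    )
    # Smaller guessed effects (theta_guess near theta0) demand larger samples.
    return ceil((numerator / (theta_guess - theta0)) ** 2)


print(approx_sample_size(0.7))
print(approx_sample_size(0.6))
```

The answer is only as good as the guessed effect size, which is the practical difficulty the notes point to.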

Issues in NHST

  • The approach described in these notes is the orthodox framework for null hypothesis significance testing (NHST).
  • NHST has dominated inferential statistics since the 20th century.
  • It is therefore essential to understand, even though it is flawed and can be difficult to apply.

Mashup History

  • NHST is a mash-up of two different approaches to hypothesis testing, Fisher's and Neyman's.

  • Fisher: the goal is to determine whether the null hypothesis is inconsistent with the data, so that it can safely be rejected; no alternative hypothesis is required.

  • Neyman: hypothesis testing is a guide to action, and it requires a specified alternative in order to assess the test's power.

  • One result of this mishmash is that the p-value is defined in terms of extreme data.

  • Controversies remain about this combined approach.

Why the $H_0$ Statistic is Terrible

  • The p-value should not be confused with the probability that $H_0$ is true; what people take it to state is often not true.
  • Failing to reject the null should not be confused with showing that an effect does not exist.

Traps During Implementation

  • The orthodox approach to null hypothesis significance testing has serious drawbacks and can be misleading.
  • Falling into these traps does not imply stupidity or an inability to work with statistics, but avoid treating the p-value as the probability that $H_0$ is true.
  • A conclusion should not be stated without first checking that it is consistent with the data.

Example Problems

  • It is essential to check that the data (for example, data comparing females and males) are tested against the correct $H_0$.
  • The analysis itself must be checked for correctness, or the resulting answer has no value.
  • It is essential to know whether each test answer is possible and makes sense in the real world.

Quick Notes

  • A recap of the notes:

  • Research hypotheses and statistical hypotheses. Null and alternative hypotheses (Section 11.1)

  • Type I and Type II errors (Section 11.2)

  • Test statistics and sampling distributions (Section 11.3)

  • Hypothesis testing as a decision making process (Section 11.4)

  • p-values as "soft" decisions (Section 11.5)

  • Reporting the results of a hypothesis test (Section 11.6)

  • Effect size and power (Section 11.8)

  • A few issues to consider regarding hypothesis testing (Section 11.9)

  • Chapter 17 revisits the theory of statistical tests using a few $H_0$ examples

  • The idea is to explain why something is being tested and what the data are meant to show

Description

P-values are often misinterpreted as the probability the null hypothesis is true. From a frequentist view, hypotheses are fixed, and p-values reflect the data's compatibility with the null. Even Bayesians don't see p-values as P(null true). Reporting the p-value is crucial as it conveys the evidence against the null hypothesis.
