Hypothesis Testing and Descriptive Statistics Formulas

Questions and Answers

When comparing two independent samples with unknown variances, which test is appropriate if the variances are assumed to be unequal?

  • Standard t-test
  • Z-test
  • Paired t-test
  • Welch's t-test (correct)

In hypothesis testing, if z_calculated falls in the rejection region, what decision should be made regarding the null hypothesis?

  • Modify the null hypothesis.
  • Reject the null hypothesis. (correct)
  • Fail to reject the null hypothesis.
  • Accept the null hypothesis.

What is the purpose of performing a post-hoc test (like Tukey HSD) after an ANOVA?

  • To reduce the probability of Type I error.
  • To determine if the overall ANOVA result is significant.
  • To identify which specific group means are different from each other. (correct)
  • To validate the assumptions of ANOVA.

In linear regression, what does the coefficient of determination ($r^2$) represent?

  • The proportion of variance in the dependent variable explained by the independent variable(s). (correct)

What does a statistically significant interaction term in a two-way ANOVA suggest?

  • The effects of one factor depend on the level of the other factor. (correct)

When conducting a hypothesis test comparing the means of two dependent samples, what is the null hypothesis typically?

  • The mean difference between the paired observations is zero. (correct)

In the context of linear regression, what does the F-test assess?

  • The overall significance of the regression model. (correct)

If Levene's test indicates a violation of the assumption of homogeneity of variances, which statistical test might be more appropriate than the standard ANOVA?

  • Welch's ANOVA. (correct)

What is the purpose of linearizing a non-linear equation before performing regression analysis?

  • To simplify the calculations and apply linear regression techniques. (correct)

In multiple linear regression, what does a partial regression coefficient represent?

  • The change in the dependent variable associated with a one-unit change in the independent variable, holding all other independent variables constant. (correct)

Flashcards

Moyenne (Mean)

A measure of the average value in a dataset.

Variance

A measure of the spread of data around the mean.

Ecart-type (Standard Deviation)

Square root of the variance, indicating data dispersion.

Student's t-test

A statistical test to compare an observed mean to a theoretical value.

Null Hypothesis (H0)

Hypothesis stating no significant difference between groups.

Alternative Hypothesis (H1)

Hypothesis stating there is a significant difference between groups.

Type I Error (False Positive)

Rejecting the null hypothesis when it is actually true.

Type II Error (False Negative)

Failing to reject the null hypothesis when it is false.

ANOVA (Analysis of Variance)

A statistical test used to compare the means of two or more groups.

Linear Regression

A statistical method used to model the relationship between a dependent variable and one or more independent variables.

Study Notes

Hypothesis Testing

  • Hypothesis tests are statistical methods used to make decisions or inferences about population parameters based on sample data.

Descriptive Statistics Reminders

  • These formulas provide ways to compute key measures like the mean, variance, and standard deviation for both ungrouped and grouped data.

Mean (Average)

  • For a series of values: x̄ = (∑xi) / N, where ∑xi is the sum of all values and N is the number of values.
  • For grouped data: x̄ = (∑xi·ni) / N, where xi is the value, ni is the frequency, and N is the total frequency.

Variance

  • For a population: σ² = ∑(xi – x̄)² / N; measures the spread around the population mean.
  • For a sample: S² = ∑(xi – x̄)² / (N – 1); estimates the population variance from sample data. The divisor N – 1 is a correction for unbiased estimation.
  • In general: var = SCE / ddl, where SCE (somme des carrés des écarts) is the sum of squared deviations and ddl (degrés de liberté) is the degrees of freedom (N for a population, N – 1 for a sample).

Standard Deviation

  • Standard Deviation S = √var: Measures the typical deviation of values from the mean, essentially the square root of the variance.
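The formulas above can be checked with a minimal Python sketch (the sample values are made up for illustration):

```python
import math

# Hypothetical sample values (illustrative only)
x = [4.0, 5.0, 6.0, 5.0, 10.0]
N = len(x)

mean = sum(x) / N                         # x̄ = Σxi / N
sce = sum((xi - mean) ** 2 for xi in x)   # SCE = Σ(xi − x̄)²
var_pop = sce / N                         # population variance σ² (ddl = N)
var_sample = sce / (N - 1)                # sample variance S² (ddl = N − 1)
std_sample = math.sqrt(var_sample)        # S = √var
```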

Student's t-Test

  • t-Tests are a class of hypothesis tests used to compare means. The appropriate test depends on the nature of the data (paired, independent) and knowledge about variances.

Comparing a Sample Mean to a Theoretical Value

  • Use this when you want to determine if the mean of a sample is statistically different from a known or hypothesized population mean (µ0).
Variance of Population Known
  • H0: µ1 = µ0 (Null hypothesis: no significant difference).
  • H1: µ1 ≠ µ0 (Alternative hypothesis: there is a significant difference).
  • Test statistic: zob = |x̄ − µ0| / (σ / √n)
  • Decision:
    • If zob < z théo: Fail to reject H0 (no significant difference).
    • If zob > z théo: Reject H0 (significant difference).
Variance of Population Unknown
  • H0: µ1 = µ0 (Null hypothesis: no significant difference).
  • H1: µ1 ≠ µ0 (Alternative hypothesis: there is a significant difference).
  • Test statistic: tob = |x̄ − µ0| / (S / √n)
  • Decision:
    • If tob < t théo: Fail to reject H0 (no significant difference).
    • If tob > t théo: Reject H0 (significant difference).
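A minimal sketch of the unknown-variance case in Python, using hypothetical sample values and the standard two-sided table value t théo ≈ 2.262 for ddl = 9 at α = 0.05:

```python
import math

# Hypothetical sample (illustrative): compare the sample mean to µ0 = 5.0
x = [5.1, 4.9, 5.3, 5.2, 4.8, 5.0, 5.4, 5.1, 4.7, 5.2]
mu0 = 5.0
n = len(x)

mean = sum(x) / n
s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))   # sample S
t_ob = abs(mean - mu0) / (s / math.sqrt(n))                  # test statistic

t_theo = 2.262   # t table, ddl = n − 1 = 9, α = 0.05 two-sided
decision = ("Reject H0 (significant difference)" if t_ob > t_theo
            else "Fail to reject H0 (no significant difference)")
```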

Comparing Two Independent Sample Means

  • This is used to see if the means of two independent groups are significantly different.
Variances of Populations Known
  • H0: µ1 = µ2 (Null hypothesis: no significant difference between means).
  • H1: µ1 ≠ µ2 (Alternative hypothesis: means are different).
  • Test statistic: zcal = (x̄1 − x̄2) / √(σ1²/n1 + σ2²/n2)
  • Decision:
    • If zcal < z théo: Fail to reject H0 (no significant difference).
    • If zcal > z théo: Reject H0 (significant difference).
Variances of Populations Unknown
  • H0: µ1 = µ2 (Null hypothesis: no significant difference between means).
  • H1: µ1 ≠ µ2 (Alternative hypothesis: means are different).
  • Test statistic: tcal = (x̄1 − x̄2) / √(Sc²(1/n1 + 1/n2))
    • Sc² = (S1²(n1 − 1) + S2²(n2 − 1)) / (n1 + n2 − 2) (pooled variance estimate)
  • Decision:
    • If tcal < t théo: Fail to reject H0 (no significant difference).
    • If tcal > t théo: Reject H0 (significant difference).
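The pooled-variance case can be sketched directly from the formulas above (hypothetical data; compare tcal to the t table with ddl = n1 + n2 − 2):

```python
import math

# Hypothetical independent samples (illustrative)
x1 = [12.0, 14.0, 11.0, 13.0, 15.0]
x2 = [10.0, 9.0, 11.0, 10.0, 8.0, 12.0]
n1, n2 = len(x1), len(x2)

m1 = sum(x1) / n1
m2 = sum(x2) / n2
s1_sq = sum((v - m1) ** 2 for v in x1) / (n1 - 1)
s2_sq = sum((v - m2) ** 2 for v in x2) / (n2 - 1)

# Pooled variance Sc² = (S1²(n1−1) + S2²(n2−1)) / (n1 + n2 − 2)
sc_sq = (s1_sq * (n1 - 1) + s2_sq * (n2 - 1)) / (n1 + n2 - 2)
t_cal = (m1 - m2) / math.sqrt(sc_sq * (1 / n1 + 1 / n2))
ddl = n1 + n2 - 2   # degrees of freedom for the t table
```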

Comparing Two Dependent Sample Means (Paired)

  • Used when comparing means of two related samples (e.g., before and after treatment on the same subjects).
  • H0: µdiff = 0 (Null hypothesis: no average difference between paired observations).
  • H1: µdiff ≠ 0 (Alternative hypothesis: there is a difference).
  • Test statistic: tcal = x̄diff / (Sdiff / √n)
  • Decision:
    • If tcal < t théo: Fail to reject H0 (no significant difference).
    • If tcal > t théo: Reject H0 (significant difference).
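A sketch of the paired case, again with made-up before/after values:

```python
import math

# Hypothetical paired measurements (e.g. before/after on the same subjects)
before = [7.0, 6.5, 8.0, 7.5, 6.0]
after = [7.8, 7.0, 8.4, 8.1, 6.9]
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_d = sum(diffs) / n
s_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
t_cal = mean_d / (s_d / math.sqrt(n))   # compare to the t table, ddl = n − 1
```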

Conditions for Applying the t-Test

Normality

  • Data should be approximately normally distributed
    • Assess visually (histograms, Q-Q plots).
    • Shapiro-Wilk test for normality.

Homogeneity of Variance (Homoscedasticity)

  • For independent two-sample t-tests, the variances of the two groups should be approximately equal.
    • Test using Fisher-Snedecor (F-test)
    • H0: σ1² = σ2² (variances are equal).
    • F = Variance max / Variance min
  • Conclusion:
    • If variances are equal: Use the regular Student's t-test.
    • If variances are unequal: Use the modified Student's t-test (Welch's t-test) to correct for unequal variances.
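The F ratio itself is simple to compute; a small sketch with hypothetical variances:

```python
def f_ratio(s1_sq, s2_sq):
    """Fisher-Snedecor statistic for H0: σ1² = σ2² (always ≥ 1)."""
    return max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

F = f_ratio(2.5, 2.0)
# If F exceeds the F-table value, variances differ: use Welch's t-test
# instead of the pooled Student's t-test.
```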

ANOVA (Analysis of Variance) Tests

  • ANOVAs are used for comparing means of two or more groups.

Conditions for Performing ANOVA

  • Normality: The data within each group should be approximately normally distributed.
  • Homogeneity of variances: Variances should be approximately equal across all groups.
    • Use the F test (for two groups) or Bartlett's test (for more than two groups) to verify homogeneity of variances; note that Bartlett's test is sensitive to departures from normality.

One-Way ANOVA (ANOVA 1 Facteur)

  • Used to compare the means of several groups based on a single factor (independent variable).
  • Hypothesis:
    • H0: m1 = m2 = … = mC (Null hypothesis: no significant difference between the group means).
ANOVA Table Structure
  • Factorial (Between-groups) Variation:
    • SCE = ∑ ni(mi – M)², summed over groups, where ni is the number of individuals in group i, mi is the mean of group i, M is the overall mean, and C is the number of groups.
    • ddl = C – 1 (degrees of freedom)
    • CM = SCE factorielle / (C – 1)
  • Residual (Within-groups) Variation:
    • SCE = SCEtotal – SCEfactorielle
    • ddl = N-C (degrees of freedom)
    • CM = SCE résiduelle / (N-C)
  • Total Variation:
    • SCE = ∑(xi – M)²
    • ddl = N-1 (degrees of freedom)
  • F-statistic: F = CM Fact / CM Res
  • Decision:
    • If Fcal < F tab: Fail to reject H0 (no significant difference).
    • If Fcal > F tab: Reject H0 (significant difference).
Post-Hoc Comparisons (Tukey's HSD)
  • Used after a significant ANOVA result to determine which specific groups differ significantly from each other.
    • Examples: m1 ≠ m2 ≠ m3, m1 = m2 ≠ m3, m1 ≠ m2 = m3
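The ANOVA table above can be computed by hand; a sketch with three hypothetical groups:

```python
# Hypothetical data for three groups (illustrative), following the table above
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 14.0],
    [9.0, 8.0, 10.0, 9.0],
]
N = sum(len(g) for g in groups)
C = len(groups)
M = sum(sum(g) for g in groups) / N                        # overall mean

# Factorial (between-groups): SCE = Σ ni(mi − M)²
sce_fact = sum(len(g) * (sum(g) / len(g) - M) ** 2 for g in groups)
# Total: SCE = Σ(xi − M)²; residual by subtraction
sce_total = sum((x - M) ** 2 for g in groups for x in g)
sce_res = sce_total - sce_fact

cm_fact = sce_fact / (C - 1)
cm_res = sce_res / (N - C)
F = cm_fact / cm_res   # compare to the F table with (C−1, N−C) ddl
```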

Two-Way ANOVA (ANOVA 2 Facteurs)

  • Used to examine the effects of two factors (independent variables) on a dependent variable.
  • Hypotheses:
    • H0: μ1 = μ2 = µ3 (The first factor has no significant effect on the dependent variable).
    • H0: μ'1 = μ'2 (The second factor has no significant effect on the dependent variable).
    • H0: Interaction: The interaction between the two factors has no significant effect.
ANOVA Table
  • Total: Variation in entire data set
  • Cellular: Represents variation considering all individual cells
  • Factor A: Effect of first factor
  • Factor B: Effect of second factor
  • Interaction A/B: Interaction between factors
  • Residual: Error within groups
  • Decision-Making: Analogous to One-Way ANOVA: compare Fcal to Ftab for main effects and interaction.
  • Post-hoc Comparisons (Tukey's HSD)
    • If H0 is rejected, this identifies which specific groups are different in the test.

Linear Regression

  • Linear regression models the linear relationship between a dependent variable 'y' and one or more independent variables 'x'.

Conditions for Application

  • Verify the normality of the observations (y).
  • Error on x must be negligible.
  • Check data for homoscedasticity.

Equation of the Model

  • y = ax + b + ε
    • a : slope
    • b : y-intercept
    • ε : Residual error

Calculating a and b

  • a (Slope):
    • a = Sxy / S²x, i.e. a = cov(x, y) / var(x)
  • b (Y-intercept, ordonnée à l'origine):
    • b = ȳ – a·x̄

Coefficient of Correlation

  • Denoted r; indicates the strength and direction of a linear relationship.
  • Ranges from -1 to +1.

Evaluating Regression Quality

  • Decomposition of variance: ∑(yi – ȳ)² = ∑(ŷi – ȳ)² + ∑(yi – ŷi)²
  • Coefficient of determination: r² = SCEregression / SCEtotal (multiplied by 100 when expressed as a percentage); indicates the proportion of variance in y explained by x.
  • Test of the Slope: Used to test if there is a significant linear relationship.
    • H0: β = 0 → the variable (y) is linearly independent of (x)
    • ANOVA table: used to test the slope.
ANOVA Table for Regression
  • Regression, Residual and Total variations are calculated and assessed via the F statistic.
  • Decision:
    • If P > 0.05: fail to reject H0; y is linearly independent of x.
    • If P < 0.05: reject H0; y depends linearly on x.
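The slope, intercept, and r² formulas can be sketched as follows (hypothetical data):

```python
# Hypothetical (x, y) data (illustrative)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n

cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)

a = cov_xy / var_x   # slope: a = cov(x, y) / var(x)
b = my - a * mx      # intercept: b = ȳ − a·x̄

# r² = SCE_regression / SCE_total
sce_total = sum((y - my) ** 2 for y in ys)
sce_reg = sum((a * x + b - my) ** 2 for x in xs)
r2 = sce_reg / sce_total
```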

Multiple Linear Regression

  • Multiple linear regression extends simple linear regression to include several independent variables.
  • Equation of the model:
    • y = a1x1 + a2x2 + a3x3 + … + apxp + b
  • Example with two variables:
    • y = f(x1, x2) = a1x1 + a2x2 + b
  • Evaluation:
    • Coefficient of determination: r² = SCEregression / SCEtotal (multiplied by 100 when expressed as a percentage).
  • Test of the slopes:
    • An ANOVA table is used, as in simple regression.
  • Decision:
    • If P > 0.05: fail to reject H0; y is linearly independent of the predictors.
    • If P < 0.05: reject H0; y depends linearly on (x1, x2), with at least one nonzero coefficient: β1 ≠ 0 and β2 ≠ 0, β1 = 0 and β2 ≠ 0, or β1 ≠ 0 and β2 = 0.
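For two predictors, the coefficients can be obtained from the normal equations on centered data; a sketch using data generated from a known (hypothetical) model y = 2·x1 + 3·x2 + 1:

```python
# Hypothetical data (illustrative)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
y = [2 * a + 3 * b + 1 for a, b in zip(x1, x2)]
n = len(y)
m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n

# Centered sums of squares/products for the normal equations
S11 = sum((a - m1) ** 2 for a in x1)
S22 = sum((b - m2) ** 2 for b in x2)
S12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
S1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
S2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))

# Solve the 2×2 system by Cramer's rule
det = S11 * S22 - S12 ** 2
a1 = (S1y * S22 - S2y * S12) / det
a2 = (S2y * S11 - S1y * S12) / det
b = my - a1 * m1 - a2 * m2
```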

Non-Linear Regression

  • Linearization provides a way to apply linear regression techniques to non-linear data.
  • Exponential model: y = b·e^(ax)
  • Linearization and equation of the model:
    • ln y = ln(b·e^(ax))
    • ln y = ln b + ax
    • Y = B + aX, with Y = ln y, B = ln b, and X = x
    • Calculate a and B using the linear regression equations, substituting Y for y (and X for x).
  • Inverse model: y = ax / (b + x)
  • Linearization and equation of the model:
    • 1/y = (b + x) / (ax)
    • 1/y = b/(ax) + x/(ax)
    • 1/y = (b/a)·(1/x) + 1/a
    • Y = AX + B, with Y = 1/y, X = 1/x, A = b/a, and B = 1/a
    • Calculate A and B using the linear regression equations, substituting Y for 1/y and X for 1/x.
  • Michaelis–Menten model: V = Vmax·[S] / (Km + [S])
    • Linearization (Lineweaver–Burk form): 1/V = A·(1/[S]) + B, with A = Km/Vmax and B = 1/Vmax
    • Calculate A and B the same way as above.
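A sketch of the exponential case: generate data from a known model, linearize with ln, fit, and back-transform (the parameter values b = 2, a = 0.5 are made up):

```python
import math

# Hypothetical data generated from y = b·e^(a·x) with b = 2, a = 0.5
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Linearize: Y = ln y = ln b + a·x, then fit Y against x
Ys = [math.log(y) for y in ys]
n = len(xs)
mx = sum(xs) / n
mY = sum(Ys) / n

a = (sum((x - mx) * (Y - mY) for x, Y in zip(xs, Ys))
     / sum((x - mx) ** 2 for x in xs))   # slope of the linearized model
B = mY - a * mx                          # intercept B = ln b
b = math.exp(B)                          # back-transform: b = e^B
```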
