Statistics Chapter 20: Nonparametric Tests


Questions and Answers

What is the calculated mean ($\mu_W$) in the provided information?

13.2288

105 (correct)

What does the p-value of 0.0004 indicate in this context?

The null hypothesis is accepted.

There is weak evidence against the null hypothesis.

There is strong evidence to reject the null hypothesis. (correct)

The results are inconclusive.

What is the significance of the z-score of 3.33?

It indicates a low probability of observing such a value under the null hypothesis. (correct)

It represents the average of the dataset.

It shows that the data is normally distributed.

It corresponds to the highest value in the dataset.

Which test is a nonparametric alternative to the one-way ANOVA?

Kruskal-Wallis test

What is the first step in calculating the test statistic for testing three or more population medians?

Pool the observations from all samples and rank them

What assumption does the ANOVA F-test make about the populations being tested?

They must be normally distributed.

What is used as the test statistic in the Kruskal-Wallis test?

Ranks of the samples

What does a substantial deviation in ranked sums from one another suggest?

Not all population medians are the same

Which of the following is NOT a characteristic of the Kruskal-Wallis test?

It tests for equality of population means.

In the hypotheses for testing population medians, which is the null hypothesis?

$H_0: m_1 = m_2 = m_3 = m_4$

Which conclusion can be drawn if the alternative hypothesis $H_A$ is accepted?

At least one population median is different

What is the formula used to calculate the z-score?

$z = \frac{W - \mu_W}{\sigma_W}$

What is the significance of calculating ranked sums $R_i$ for each sample?

To compare medians across samples

What is being tested in the example involving monthly sales for store layouts?

Comparison among store layout designs

When pooling observations from all samples, what is the subsequent step?

Calculate the ranked sum for each sample

What indicates that the sample medians might differ when testing multiple populations?

Some ranked sums are significantly different

What is the null hypothesis (H0) regarding the GDP growth rate in the example?

The GDP growth rate is random

Which statistic is used to analyze the growth rate in the provided example?

Median growth rate

What does the alternative hypothesis (HA) suggest about the GDP growth rate?

The GDP growth rate is predictable

In the example provided, what is the significance of analyzing runs above and below the median?

To assess the randomness of the GDP growth rate

What is the purpose of setting up a hypothesis test in the context of GDP growth analysis?

To identify patterns within the growth data

In statistical testing, what does the term 'runs' refer to?

A series of consecutive data points in one direction

When performing hypothesis testing, what role does the median growth rate serve?

It defines the baseline for random behavior

How can one determine if the GDP growth rate is random using the runs test?

By analyzing the frequency and sequence of increases and decreases

What is the purpose of rejecting the null hypothesis in this scenario?

To suggest that at least one population median differs from the others.

What is the calculated p-value in this example?

0.009

What significance level is used for testing in this example?

5%

In the provided data, how is the test statistic for three or more population medians calculated?

Using the chi-squared distribution formula.

What does the notation $P(\chi^2_3 \geq 11.465)$ represent?

The probability associated with the test statistic under the null hypothesis.

What is the degrees of freedom (df) used in this analysis?

3

What conclusion can be drawn if the null hypothesis is rejected?

At least one median is distinct.

How many different layouts are implied in the analysis based on the given context?

4 (since df = k − 1 = 3, and the null hypothesis lists four medians)

What does the null hypothesis (𝐻0) state in the context of testing three or more population medians?

All population medians are equal.

What is the focus of the alternative hypothesis (𝐻𝐴) when testing multiple populations?

Not all population medians are equal.

In the test statistic formula $H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$, what does $R_i$ represent?

The rank sum of the ith sample.

What does the variable '𝑘' signify in the context of testing three or more population medians?

The number of populations being compared.

When can the test statistic 𝐻 be approximated by a chi-square distribution?

If 𝑛𝑖 ≥ 5 for all i.

In the formula for the test statistic, what does '𝑛' represent?

The sum of the sizes of all samples.

In the context of hypothesis testing, which of the following is true about the degrees of freedom (𝑑𝑓)?

𝑑𝑓 = 𝑘 - 1.

Which of the following is NOT required for conducting a test of three or more population medians?

Equal sample sizes across populations.

What does the symbol $p$ represent in the context of the Sign Test?

The population proportion of plus signs

Which formula correctly represents the test statistic used in the Sign Test?

$z = \frac{\bar{p} - 0.5}{0.5/\sqrt{n}}$

When is the test for the Sign Test considered valid?

When $n \geq 10$

In the example involving the pizza chain, what scale is used for the customer ratings?

1 to 5 scale

What does the variable $p̄$ represent?

The estimator of the population proportion of plus signs

What rating corresponds to the worst possible feedback in the customer study?

1

In the Sign Test analysis, what are $X$ and $n$ specifically referring to?

$X$ is the total number of positive ratings and $n$ is the sample size

What is the implication of the null hypothesis in the Sign Test?

There is no difference in the proportions of ratings

Study Notes

Chapter 20: Nonparametric Tests

This chapter covers nonparametric tests used in business statistics.
Nonparametric tests are used when the assumptions that parametric tests make about the underlying populations are not met.
These tests are particularly useful for small sample sizes or when data doesn't originate from a normal distribution.
They are also called distribution-free tests.

20.1 Testing a Population Median

Parametric tests in earlier chapters make assumptions about underlying populations.
These assumptions, if not met, can lead to misleading results.
Nonparametric tests use fewer assumptions and are suitable when sample sizes are small or data doesn't originate from a normal distribution.
Wilcoxon signed-rank test: Makes no assumptions about the distribution of the population other than continuity and symmetry. The hypothesis test is about the population median; the null and alternative hypotheses depend on the form of the test (two-tailed or one-tailed).

20.1 Testing a Population Median - Calculations

Calculate differences (𝑑ᵢ = 𝑥ᵢ − 𝑚₀) from the hypothesized median (𝑚₀)
Take the absolute value of each difference.
Discard any zero differences.
Rank the absolute differences from smallest to largest.
Assign average ranks for ties.
Sum the ranks of positive differences (T⁺) and negative differences (T⁻).

20.1 Testing a Population Median - Normal Approximation

If the sample size (n) is less than or equal to 10, use special tables to find the p-value.
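If n is greater than 10, use the normal approximation for T = T⁺: $\mu_T = \frac{n(n+1)}{4}$, $\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}$, with test statistic $z = \frac{T - \mu_T}{\sigma_T}$.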
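
The ranking steps and the normal approximation for the signed-rank test can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the chapter's own code: the sample values and the hypothesized median of 10 are made up, the helper names `average_ranks` and `signed_rank_test` are hypothetical, and n is assumed to count only the nonzero differences.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from smallest to largest (1-based); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def signed_rank_test(sample, m0):
    """Return (T+, T-, z) for H0: median = m0, using the normal approximation."""
    diffs = [x - m0 for x in sample if x != m0]     # discard zero differences
    ranks = average_ranks([abs(d) for d in diffs])  # rank |d_i|, averaging ties
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    n = len(diffs)                                  # assumption: n excludes zeros
    mu_t = n * (n + 1) / 4
    sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t_plus - mu_t) / sigma_t
    return t_plus, t_minus, z


# Hypothetical data: 12 observations, hypothesized median m0 = 10.
sample = [12, 15, 9, 20, 18, 7, 14, 22, 11, 16, 25, 13]
t_plus, t_minus, z = signed_rank_test(sample, m0=10)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(t_plus, t_minus, round(z, 4), round(p_two_tailed, 4))
```

A quick sanity check on any run is that T⁺ + T⁻ equals n(n+1)/2, the sum of all ranks.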

20.2 Testing Two Population Medians

T-tests assume normal populations for matched-pairs or independent samples.
Wilcoxon signed-rank test: A nonparametric equivalent for matched-pairs.
Wilcoxon rank-sum test (Mann-Whitney): Used for independent samples.
These tests are less powerful than their parametric counterparts if the normality assumption is valid.
For matched-pairs, calculate differences (D= X - Y) between paired samples.

20.2 Testing Two Population Medians - Wilcoxon Signed-Rank Test

The Wilcoxon test for a matched-pairs sample is nearly identical to its use for a single sample.
Calculate differences (D= X - Y) between paired samples.
Follow the same procedure as a single population test to compute T+.

20.2 Testing Two Population Medians - Wilcoxon Rank Sum Test

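Pool both samples, rank the combined observations from smallest to largest, and calculate the rank sum for each sample; the test statistic W is the rank sum of the first sample (W₁).
For samples with n₁ ≤ 10 and n₂ ≤ 10, use special tables or software to find p-values.
For samples with n₁ ≥ 10 and n₂ ≥ 10, use the normal approximation: $\mu_W = \frac{n_1(n_1 + n_2 + 1)}{2}$, $\sigma_W = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$, with test statistic $z = \frac{W - \mu_W}{\sigma_W}$.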
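
A minimal Python sketch of the rank-sum calculation follows. It takes W to be the rank sum of the first sample, whose mean under the null hypothesis is n₁(n₁ + n₂ + 1)/2. The two samples are invented for illustration and the function names are hypothetical. With ten observations in each sample, the mean and standard deviation come out to 105 and about 13.2288, which coincide with the values quoted in the questions above.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from smallest to largest (1-based); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def rank_sum_test(sample1, sample2):
    """Return (W, mu_W, sigma_W, z), where W is the rank sum of sample1."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank the pooled data
    w = sum(ranks[:n1])                                   # ranks belonging to sample1
    mu_w = n1 * (n1 + n2 + 1) / 2
    sigma_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu_w) / sigma_w
    return w, mu_w, sigma_w, z


# Hypothetical data: two independent samples of ten observations each.
sample1 = [31, 35, 28, 40, 38, 42, 36, 39, 41, 37]
sample2 = [25, 30, 27, 33, 29, 26, 32, 24, 34, 23]
w, mu_w, sigma_w, z = rank_sum_test(sample1, sample2)
p_right_tailed = 1 - NormalDist().cdf(z)
print(w, mu_w, round(sigma_w, 4), round(z, 4), round(p_right_tailed, 4))
```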

20.3 Testing Three or More Population Medians

The ANOVA F-test assumes normally distributed populations with equal variances.
Kruskal-Wallis test: A nonparametric alternative to one-way ANOVA.
The Kruskal-Wallis test is based on ranks and extends the Wilcoxon rank-sum test to three or more samples.

20.3 Testing Three or More Population Medians - Calculations

Combine observations from all k samples and rank them from 1 to n.
Calculate the ranked sum (Rᵢ) for each sample.
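Calculate $H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$, where $n_i$ is the size of sample i and $n$ is the total number of observations.
When $n_i \geq 5$ for every sample, H approximately follows a chi-square distribution with k − 1 degrees of freedom.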
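
The three steps above can be sketched in Python as shown below. The three samples are made up for illustration and the function names are hypothetical. The sketch stops at H and its degrees of freedom; the p-value would then be read from a chi-square table or computed by software.

```python
def average_ranks(values):
    """Rank values from smallest to largest (1-based); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(samples):
    """Return (rank_sums, H, df) for a list of k independent samples."""
    pooled = [x for s in samples for x in s]
    ranks = average_ranks(pooled)          # rank the pooled data from 1 to n
    n = len(pooled)
    rank_sums = []
    offset = 0
    for s in samples:                      # R_i: sum of the ranks in sample i
        rank_sums.append(sum(ranks[offset:offset + len(s)]))
        offset += len(s)
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return rank_sums, h, len(samples) - 1  # df = k - 1


# Hypothetical data: k = 3 samples with five observations each.
samples = [
    [27, 31, 35, 29, 33],
    [40, 44, 38, 42, 46],
    [50, 36, 48, 52, 54],
]
rank_sums, h, df = kruskal_wallis_h(samples)
print(rank_sums, round(h, 4), df)
```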

20.4 Spearman Rank Correlation Test

Spearman's rank correlation measures the correlation between two variables.
It is interpreted similarly to the Pearson coefficient (between -1 and +1).
Based on ranks of observations, not raw data.
A nonparametric alternative to Pearson's correlation, when normality assumption isn't met.

20.4 Spearman Rank Correlation Test - Calculations

Rank the observations for each variable from smallest to largest.
Calculate the difference (dᵢ) between the ranks for each pair of observations.

20.4 Spearman Rank Correlation Coefficient

$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$
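
As a small illustration of the coefficient, the Python sketch below ranks two made-up variables and applies the formula. The data and the function names are hypothetical, and the simple ranking used here assumes there are no tied values within either variable.

```python
def simple_ranks(values):
    """Rank values from smallest to largest (1-based). Assumes no ties."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


def spearman_rs(x, y):
    """Spearman rank correlation from the rank differences d_i."""
    rank_x, rank_y = simple_ranks(x), simple_ranks(y)
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical paired observations.
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
r_s = spearman_rs(x, y)
print(round(r_s, 4))
```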

20.5 Sign Test

Used for matched-pairs ordinal data.
Only interested in the direction (positive or negative) of the difference between paired observations.
Does not use the magnitude of the difference.
Valid when the sample size (n) ≥ 10

20.5 Sign Test - Test Statistic

Calculate the sample proportion of positive signs, $\hat{p} = X/n$.
Calculate the z-statistic: $z = \frac{\hat{p} - 0.5}{0.5/\sqrt{n}}$.
Determine the p-value using a standard normal distribution table.
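
The sign-test calculation can be sketched in Python as follows. The paired ratings are invented, the function name is hypothetical, and dropping pairs with no change is an assumption of this sketch rather than something stated in the notes.

```python
import math
from statistics import NormalDist


def sign_test(before, after):
    """Return (X, n, z), where X is the number of plus signs among n pairs."""
    # Assumption: pairs with no change carry no sign and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    x = sum(1 for d in diffs if d > 0)       # count of plus signs
    p_hat = x / n                            # sample proportion of plus signs
    z = (p_hat - 0.5) / (0.5 / math.sqrt(n))
    return x, n, z


# Hypothetical matched-pairs ratings on a 1 to 5 scale.
before = [3, 2, 4, 3, 1, 2, 3, 4, 2, 3, 3, 2]
after = [4, 3, 5, 4, 2, 1, 4, 5, 3, 2, 4, 1]
x, n, z = sign_test(before, after)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(x, n, round(z, 4), round(p_two_tailed, 4))
```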

20.6 Runs Test

Determines if a sequence of observations is random.
Applicable to both categorical and numerical data.
For numerical data, first convert each observation to a category (e.g., above or below the median).
Compute the number of runs (R).

20.6 Runs Test - Test Statistic

Let n₁ and n₂ be the number of occurrences of each category, n = n₁ + n₂, and R the number of runs.
Calculate the mean and standard deviation of R: $\mu_R = \frac{2 n_1 n_2}{n} + 1$, $\sigma_R = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n)}{n^2 (n - 1)}}$.
Calculate the z-score: $z = \frac{R - \mu_R}{\sigma_R}$.
Determine the p-value using a standard normal table.
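
A minimal Python sketch of the runs test follows. The sequence of labels is made up (for example, "A" for above the median and "B" for below), and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def runs_test(labels):
    """Return (R, mu_R, sigma_R, z) for a sequence with exactly two categories."""
    categories = sorted(set(labels))
    if len(categories) != 2:
        raise ValueError("the runs test needs exactly two categories")
    n1 = sum(1 for v in labels if v == categories[0])
    n2 = sum(1 for v in labels if v == categories[1])
    n = n1 + n2
    # A new run starts at every position where the label changes.
    runs = 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    mu_r = 2 * n1 * n2 / n + 1
    sigma_r = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))
    z = (runs - mu_r) / sigma_r
    return runs, mu_r, sigma_r, z


# Hypothetical sequence of 20 observations coded as A or B.
labels = "AAAABBBBBAAABBBBAAAB"
runs, mu_r, sigma_r, z = runs_test(labels)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(runs, mu_r, round(sigma_r, 4), round(z, 4), round(p_two_tailed, 4))
```

Too few runs (a strongly negative z) suggests clustering, while too many runs suggests systematic alternation; either pattern is evidence against randomness.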


Description

This chapter focuses on nonparametric tests used in business statistics, particularly their application when sample sizes are small or data is not normally distributed. It explains the advantages of using tests such as the Wilcoxon signed-rank test that require fewer assumptions compared to parametric tests. Learn how these distribution-free tests can assist in making valid statistical conclusions when traditional methods fall short.