Hypothesis Testing PDF
This PDF document provides an introduction to hypothesis testing, a crucial aspect of inferential statistics. It covers the fundamentals, including defining hypotheses, determining the types of hypothesis tests, and explaining the concept of level of significance. The document also addresses errors in hypothesis testing and introduces concepts like Z-tests.
# MODULE SEVEN - Inferential Statistics

- Hypothesis Testing
- Types of Hypothesis Tests
- Level of Significance
- Errors in Hypothesis Testing
- Z-tests on Means
- Approaches to Hypothesis Testing

## LEARNING OBJECTIVES

At the end of this lesson, students are expected to:

1. Give the meaning of hypothesis
2. Explain why there is a need to test hypotheses
3. Define important terms in hypothesis testing
   - Statistical hypotheses
   - Null hypothesis ($H_0$)
   - Alternative hypothesis ($H_1$)
4. Determine the types of hypothesis tests
   - One-tailed right directional
   - One-tailed left directional
   - Two-tailed non-directional
5. Explain the meaning of level of significance
   - α = 0.01
   - α = 0.05
6. Formulate null and alternative hypotheses
7. Identify the test statistic to be used in testing the significance of difference between means
   - z-test
   - t-test
8. Perform a simple test of hypothesis using the z-test statistic

The measures of variability describe how the data deviate from the center of the distribution, or from the mean. These are the range, standard deviation, variance, and coefficient of variation. A small measure of variability means the data are clustered closely around the mean: the group is fairly homogeneous, the values are close to one another, and performance is consistent or less variable.

On the other hand, the measures of skewness tell you how the data behave relative to the center of the distribution. They measure the degree of symmetry or asymmetry of the distribution about the center. In short, the coefficient of skewness tells you if the distribution is normal relative to the center, approximates the normal distribution, or is skewed to the left or to the right. Kurtosis tells you whether the peakedness of a distribution is comparable to that of the normal distribution.

The topic under study goes deeper than the plain description of the sample or population.
You will learn why groups are different, why one group behaves differently from the other, whether one group is really performing better than the other group, or why the sample is different from the population. Doing these things requires knowledge of "Inferential Statistics", that is, deciding between REALITY and COINCIDENCE!

You make hypotheses every day, often without realizing it. If you think you will pass ELEMSTA, if you assume that more than 80% of CSB students are right-brained, if you believe online enrolment is better than the old method of enrolment: these and other everyday assumptions are perfect examples of hypotheses. For starters, study the definition of hypothesis.

## Definition

**WHAT IS A HYPOTHESIS?**

- An assumption about the population parameter
- An educated guess about the population parameter

This is the beginning of Inferential Statistics, so a review of past lessons will help you understand the new lesson. All previous discussions are classified under descriptive statistics. For example, the measures of central tendency (the mean, the median, and the mode) describe the center of the distribution. They give you one value which, in turn, depicts the overall performance of a group. The average grade obtained by a group of students in statistics is the mean. The middle grade that appears after arranging the grades from lowest to highest is the median. The grade which occurs most often is the mode.

## Definitions

**HYPOTHESIS TESTING:** This is the process of making an inference or generalization about population parameters based on the results of a study on samples.

**STATISTICAL HYPOTHESIS:** It is a guess or prediction made by the researcher regarding the possible outcome of the study.

Hypothesis testing is deciding between what is reality and what is coincidence!

## Definitions

**NULL HYPOTHESIS ($H_0$):** The hypothesis that the researcher usually hopes to reject.
It always contains the "=" sign.

**ALTERNATIVE HYPOTHESIS ($H_1$):**

- Challenges $H_0$
- Never contains the "=" sign
- Uses the "<", ">", or "≠" sign
- Generally represents the idea which the researcher wants to prove

## Definitions

**THE NULL HYPOTHESIS ($H_0$)**

Example:

- $H_0$: The average GPA of this class is 3.5.
- $H_0$: μ = 3.5

**THE ALTERNATIVE HYPOTHESIS ($H_1$)**

$H_1$: The average GPA of this class is:

- a) higher than 3.5 ($H_1$: μ > 3.5)
- b) lower than 3.5 ($H_1$: μ < 3.5)
- c) not equal to 3.5 ($H_1$: μ ≠ 3.5)

These two types of hypotheses are the ones which you will formulate in the succeeding discussions. You will have no problem with the null hypothesis, since it always uses the "=" sign. Your only trouble is the alternative hypothesis, since you have three choices: "≠", "<", and ">". But don't worry, there are glaring clues in the wording of the problem.

For practice, construct the null and alternative hypotheses appropriate for each of the following research problems.

**Title 1: An evaluation of the effectiveness of online learning.**

**Problem:** The researcher wants to know if online learning has increased the average GPA of CSB students from 80%.

- $H_0$: μ = 80; Online learning has not increased the average GPA of CSB students.
- $H_1$: μ > 80; Online learning has increased the average GPA of CSB students.

This is so because the researcher is interested in knowing if online learning has increased the average GPA of CSB students (>, because of the word "increased").

**Title 2: An assessment of the services in the newly renovated CSB canteen.**

**Problem:** The management of the CSB canteen wants to know if the services in the newly renovated canteen are different from the services before the renovation took place.

- $H_0$: μ_Now = μ_Before; The services in the newly renovated CSB canteen have not changed (meaning, the services now are just the same as before).
- $H_1$: μ_Now ≠ μ_Before; The services in the newly renovated CSB canteen have changed (meaning, the services are different now than before).

This time the researcher is not interested in knowing which services are better, before or after the renovation. He/she is just interested in knowing if there is a change (≠, because the researcher only asks whether the services are "different").

**Title 3: Do tutorial services offered by the SMS-Math Area help students?**

**Problem:** The mathematics coordinator wants to know if the number of failures in all math courses is lower than 30% of the total enrollment.

- $H_0$: p = 0.30; The proportion of failures in all math courses is 30%.
- $H_1$: p < 0.30; The proportion of failures in all math courses is lower than 30%.

The Math coordinator wants to know if the number of failures has been reduced, that is, if it is lower than the usual 30% of the total enrollment (<, because of the word "lower").

Now you already have some idea of how to formulate the null and alternative hypotheses. Have you seen the glaring clues? What are these clues? Can you give some more of these clues?

## The Types of Hypothesis Tests

1. **One-tailed left directional test:** This is used if $H_1$ uses the "<" symbol.
   - α = 0.05
   - Acceptance region = 0.95
   - Rejection region = 0.05 (entirely in the left tail)
   - Critical value is obtained from the table
2. **One-tailed right directional test:** This is used if $H_1$ uses the ">" symbol.
   - α = 0.05
   - Acceptance region = 0.95
   - Rejection region = 0.05 (entirely in the right tail)
   - Critical value is obtained from the table
3. **Two-tailed non-directional test:** This is used if $H_1$ uses the "≠" symbol.
   - α = 0.05, split equally between the two tails (α/2 = 0.025)
   - Acceptance region = 0.95
   - Rejection region = 0.025 in each tail
   - Critical value is obtained from the table

Therefore:

1. If $H_1$ uses the "≠", the test is two-tailed non-directional
2. If $H_1$ uses the "<", the test is one-tailed left directional
3.
If $H_1$ uses the ">", the test is one-tailed right directional

Can you give the type of hypothesis test for each of our example research problems on the previous page?

## Level of Significance

The level of significance is the area of the rejection region, designated by the Greek letter α (alpha). The area of the acceptance region is 1 − α, so if α = 0.05, the acceptance region is 0.95. The typical values of α are 0.01 and 0.05, but you are not prevented from using 0.02, 0.03, etc. In your research, however, you just have to use α = 0.05.

Hypothesis testing is decision-making. You need to decide whether to reject or not to reject the null hypothesis. When you reject $H_0$, you conclude that it is "wrong." When you do not reject $H_0$, it does not mean it is correct; you simply don't have enough evidence to reject it.

## Decisions Made Regarding $H_0$ (Reject $H_0$ / Do not reject $H_0$)

- If you reject $H_0$, you conclude that it is wrong!
- If you do not reject $H_0$, it doesn't mean it is correct; you simply don't have enough evidence to reject it!

In making decisions, you oftentimes commit errors. In statistics, they are referred to as Type I and Type II errors.

## Errors in Hypothesis Testing

**Type I (α error):** Rejecting a true $H_0$!

**Type II (β error):** Accepting a false $H_0$!

The probability of committing a Type I error (rejecting a true null hypothesis) is designated by α, while the probability of committing a Type II error (accepting a false null hypothesis) is designated by β. An α of 0.01 (compared with 0.05) means the researcher is being relatively careful: he/she is only willing to risk being wrong once in 100 times in rejecting a null hypothesis which is true. If $H_0$ is rejected at α = 0.05, the perceived difference is significant, but if it is rejected at α = 0.01, the difference is highly significant.
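The link between α, the type of test, and the critical value can be sketched in a few lines of Python. This is only an illustration, assuming Python 3.8 or later for `statistics.NormalDist`; the function name `critical_z` and the tail labels `"left"`, `"right"`, and `"two"` are hypothetical and are not part of this module.

```python
from statistics import NormalDist


def critical_z(alpha: float, tail: str) -> float:
    """Return the critical z value for a test at significance level alpha.

    tail is "left", "right", or "two" (labels invented for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        # The whole rejection region (area alpha) sits in the left tail.
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        # The whole rejection region (area alpha) sits in the right tail.
        return std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # alpha is split equally, so each tail holds alpha / 2.
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for alpha in (0.01, 0.05):
    print(alpha, critical_z(alpha, "right"), critical_z(alpha, "two"))
```

For a two-tailed test the function returns the positive critical value; the negative one is its mirror image. Printed z tables round these values, for example to about 1.65 (one-tailed) and 1.96 (two-tailed) at α = 0.05.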
## Testing the Significance of Difference Between Means

**z-test:**

- n ≥ 30
- σ is known

**t-test:**

- n < 30
- σ is unknown

**F-test (ANOVA)**

To reiterate, the z-test is used when n is large (n ≥ 30) and σ (the population standard deviation) is known. Three types of hypotheses can be tested by the z-test. Each tests the significance of the difference between:

- a population or hypothesized mean and a sample mean, that is, Population mean vs Sample mean
- two sample means when the two sample standard deviations are known, that is, Sample mean 1 vs Sample mean 2
- two sample means when the population standard deviation is known, that is, Sample mean 1 vs Sample mean 2

Following are the three formulas. There are again clues so that you will not be confused as to which formula to use in the problem.

## Testing the Significance of Difference Between Means

"n is large or when n ≥ 30 and σ is known"

**z-test**

- n ≥ 30
- σ is known
- Sample mean 1 vs Sample mean 2, and the two sample standard deviations are known.

$Z = \frac{X_1 - X_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$

- $X_1$ is the mean of sample 1
- $X_2$ is the mean of sample 2
- $n_1$ and $n_2$ are the sample sizes
- $S_1$ and $S_2$ are the sample standard deviations

Using Microsoft Excel: Go to... "Z-Test: Two-sample For Means"

## Testing the Significance of Difference Between Means

"n is large or when n ≥ 30 and σ is known"

**z-test**

- n ≥ 30
- σ is known
- Sample mean 1 vs Sample mean 2, and the population standard deviation is known.

$Z = \frac{X_1 - X_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

- $X_1$ is the mean of sample 1
- $X_2$ is the mean of sample 2
- $n_1$ and $n_2$ are the sample sizes
- σ is the population standard deviation

The two approaches in hypothesis testing are shown below. The first is the critical value approach, which compares the computed value of the test statistic with the critical value. The second compares the _p_-value (the tail area beyond the computed value) with α. The 5-step solution for each approach is also indicated.

## Approaches in Hypothesis Testing

- **Critical value approach**
- **p-value approach**

### Computed vs.
Critical - 5-step solution

1. $H_0$:
2. $H_1$:
3. α = ; Cri-value =
4. Decision rule: Reject $H_0$ if |Comp-value| ≥ |Cri-value|
5. Decision:
6. Conclusion:

### p-value vs. α - 5-step solution

1. $H_0$:
2. $H_1$:
3. α = ; p-value =
4. Decision rule: Reject $H_0$ if p-value < α
5. Decision:
6. Conclusion:

Since this is the first time that you will attempt to test hypotheses, there is a need to learn the critical value approach; the _p_-value approach will be discussed in the next module. Although all the values for both approaches can be obtained by generating the Microsoft Excel printout (discussed later in this module), it is also advantageous to know the step-by-step procedure for getting the computed and the critical values. The reason for this is that Microsoft Excel can only process problems for two samples using z-tests. The following shows you how to get the critical value.

## Finding Critical Values: One-Tailed

What is _z_, given α = 0.05?

Notice that in the critical value approach, the comparison is between z-computed and z-critical (tabular). In this text, the decision rule (step 4) is based on the absolute values of z-computed and z-critical. The table below is constructed so that you don't have to go back to the areas-under-the-normal-curve table. You will always refer to this table whenever you use the z-test in hypothesis testing.

## Z - Table (Critical Values)

| Test       | α = 0.01 | α = 0.05 |
| ---------- | -------- | -------- |
| One-tailed | ±2.33    | ±1.65    |
| Two-tailed | ±2.58    | ±1.96    |

## Some Examples

**Example 1.** The average score in the final examination in College Algebra at XYZ College is known to be 80 with a standard deviation of 10. A random sample of 39 students was taken from this year's batch and it was found that they have a mean score of 84. Test at α = 0.05.

a. Is this an indication that this year's batch performed better in College Algebra than the previous batches?

**Solution:**

a.
Since _n_ = 39, σ is known, and only one sample mean is given, use the z-test. The test statistic is:

$Z = \frac{(x - \mu)\sqrt{n}}{\sigma}$

Substituting μ = 80, _x_ = 84, _n_ = 39, and σ = 10 in the formula, you obtain:

$Z = \frac{(84 - 80)\sqrt{39}}{10} = 2.50$

Below is the 5-step solution:

1. $H_0$: μ = 80; This year's batch is as good as the previous batches in College Algebra.
2. $H_1$: μ > 80; This year's batch is better in College Algebra than the previous batches.
3. α = 0.05; one-tailed test; $Z_{tab}$ is +1.65 (Can you give the reason why?)
4. Decision rule: Reject $H_0$ if |$Z_c$ (2.50)| ≥ |$Z_{tab}$ (1.65)|, that is, if 2.50 ≥ 1.65.
5. Decision: Reject $H_0$, because 2.50 > 1.65.
6. Conclusion: This year's batch is better in College Algebra than the previous batches.

The curve is shown below:

b. Is the mean score in College Algebra of this year's batch different from 80? Test at α = 0.05.

**Solution:**

b. Since you use the same problem, _Z_ = 2.50. Below is the 5-step solution:

1. $H_0$: μ = 80; The mean score of this year's batch is equal to 80.
2. $H_1$: μ ≠ 80; The mean score of this year's batch is not equal to 80.
3. α = 0.05; two-tailed test; $Z_{tab}$ = +1.96 (because _Z_ is +)
4. Decision rule: Reject $H_0$ if |$Z_c$ (2.50)| ≥ |$Z_{tab}$ (1.96)|, that is, if 2.50 ≥ 1.96.
5. Decision: Reject $H_0$, since |$Z_c$| (2.50) > |$Z_{tab}$| (1.96).
6. Conclusion: The mean score of this year's batch is not equal to 80.

**Example 2.** The dean of XYZ College wants to know which method is better in teaching College Algebra. He/she took a random sample of 40 students handled by only one teacher in lecture and laboratory, and found it to have a mean final grade of 83 with a standard deviation of 7. Fifty students from a group handled by two different teachers in lecture and laboratory were randomly taken and it was found that they have a mean final grade of 87 with a standard deviation of 10. Does this indicate that a two-teacher setup is better than a one-teacher setup?
Test at α = 0.01.

**Solution:** Since two independent samples are being compared, and _n_ ≥ 30 for both, the test statistic to be used is

$Z = \frac{X_1 - X_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$

Substituting $X_1$ = 83, $X_2$ = 87, $S_1$ = 7, $S_2$ = 10, $n_1$ = 40, and $n_2$ = 50 in the formula, you get

$Z = \frac{83 - 87}{\sqrt{\frac{7^2}{40} + \frac{10^2}{50}}} = -2.23$

Below is the 5-step solution:

1. $H_0$: μ_one-teacher = μ_two-teachers; One-teacher setup is as good as the two-teacher setup.
2. $H_1$: μ_one-teacher < μ_two-teachers; One-teacher setup is inferior to the two-teacher setup.
3. α = 0.01; one-tailed test; $Z_{tab}$ = -2.33 (Do you know why?)
4. Decision rule: Reject $H_0$ if |$Z_c$ (-2.23)| ≥ |$Z_{tab}$ (-2.33)|, that is, if 2.23 ≥ 2.33.
5. Decision: Do not reject $H_0$, because 2.23 < 2.33.
6. Conclusion: At α = 0.01, there is not enough evidence that the two-teacher setup is better; the one-teacher setup is taken to be as good as the two-teacher setup.

**Example 3.** An ELEMSTA teacher wants to know if students without a calculator got significantly lower scores in the ELEMSTA midterm exam than those with a calculator. To verify his/her claim, he/she did the following:

Step 1. He/she took a sample of 40 students without calculator and recorded their midterm exam scores:

Step 2. He/she then took a sample of 50 students with calculator and got the following midterm exam scores:

Step 3. He/she computed the mean and standard deviation of the midterm scores of each group of students. Verify the following:

Step 4. He/she chose α = 0.05.

Step 5. He/she computed the Z statistic for two samples. The computation is as follows; verify if correct:

Let μ₁ be the mean score of those without calculator and μ₂ the mean score of those with calculator.

1. $H_0$: μ₁ = μ₂; Students without calculator performed equally well with those with calculator.
2. $H_1$: μ₁ < μ₂; Students without calculator performed lower than those with calculator.
3. α = 0.05; one-tailed left directional; $Z_{tab}$ = -1.65
4.
Decision rule: Reject $H_0$ if |$Z_c$ (-2.76)| ≥ |$Z_{tab}$ (-1.65)|, that is, if 2.76 ≥ 1.65.
5. Decision: Reject $H_0$ because 2.76 > 1.65.
6. Conclusion: Students without calculator performed lower than those with calculator.

# SEVENTH WEEK OUTPUT

Each group will formulate hypotheses drawn from their research project:

1. At least two hypotheses on testing the significance of difference between means
2. At least two hypotheses on testing the significance of difference between proportions

## RUBRIC FOR SEVENTH WEEK OUTPUT

| Seventh Week Output | 1-2 points | 3-4 points | 5 points |
| ------------------- | ---------- | ---------- | -------- |
| Hypothesis for means | Cannot be statistically tested; data cannot be gathered by the questionnaire; $H_0$ and $H_1$ are incorrectly stated. | Can be statistically tested; data can be gathered by the questionnaire; $H_0$ and $H_1$ are incorrectly stated. | Can be statistically tested; data can be gathered by the questionnaire; $H_0$ and $H_1$ are correctly stated. |
| Hypothesis for proportions | Cannot be statistically tested; data cannot be gathered by the questionnaire; $H_0$ and $H_1$ are incorrectly stated. | Can be statistically tested; data can be gathered by the questionnaire; $H_0$ and $H_1$ are incorrectly stated. | Can be statistically tested; data can be gathered by the questionnaire; $H_0$ and $H_1$ are correctly stated. |
| Promptness | Submits the output one week or later after the deadline. | Submits the output the following meeting after the deadline. | Submits the output on time. |

**Total Points: 15**
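As a check on the arithmetic in Examples 1 and 2 of this module, the two z statistics and the critical value decision rule can be sketched in Python. This is a minimal illustration rather than part of the required output; the function names `one_sample_z`, `two_sample_z`, and `reject_h0` are hypothetical, and the critical values are the rounded ones from the module's z table.

```python
from math import sqrt


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z statistic for a sample mean vs a hypothesized population mean."""
    return (sample_mean - pop_mean) * sqrt(n) / pop_sd


def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """z statistic for two sample means with known sample standard deviations."""
    return (mean1 - mean2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def reject_h0(z_computed, z_critical):
    """Decision rule used in this module: compare absolute values.

    This assumes the sign of z_computed already agrees with the
    direction stated in H1, as it does in Examples 1 and 2.
    """
    return abs(z_computed) >= abs(z_critical)


# Example 1: mu = 80, sigma = 10, n = 39, sample mean = 84, alpha = 0.05
z1 = one_sample_z(84, 80, 10, 39)
print(round(z1, 2), reject_h0(z1, 1.65))   # one-tailed right directional
print(round(z1, 2), reject_h0(z1, 1.96))   # two-tailed

# Example 2: one-teacher (83, 7, n = 40) vs two-teacher (87, 10, n = 50), alpha = 0.01
z2 = two_sample_z(83, 87, 7, 10, 40, 50)
print(round(z2, 2), reject_h0(z2, -2.33))  # one-tailed left directional
```

The same helper functions can be reused to verify the Z statistic in Example 3 once the group means and standard deviations from Step 3 are filled in.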