Statistical Significance (Unit 4) PDF
Document Details
Uploaded by PainlessJacksonville4088
Summary
This document provides an introduction to statistical significance. It discusses the objectives, topics, and presentation of statistical hypothesis testing. The document explains key concepts like null and alternative hypotheses, type I and type II errors, and significance levels.
Full Transcript
Unit 4 Statistical Significance

In this lesson, you will be introduced to tests of significance. Several terms need to be defined first so that you can follow the tests themselves.

1. Objectives
In this unit, the student is expected to be able to:
A. Define the null hypothesis and the alternative hypothesis
B. Understand the steps of hypothesis testing
C. Differentiate a null hypothesis from an alternative hypothesis
D. Define type I error and type II error
E. Define the power of a test
F. Describe the procedure of hypothesis testing

2. Topics
A. Null hypothesis and alternative hypothesis; error, P-values and power
   H0 and Ha
   Type I error
   Type II error
   Power
B. Statistical tests for one or two groups
   T-test
   Chi-square test

3. Presentation
A. Null hypothesis and alternative hypothesis; error, P-values and power

Definition of hypothesis: A statistical hypothesis is an assumption about a population parameter. Hypothesis testing refers to the formal procedures used by statisticians to accept or reject statistical hypotheses. The methodology employed by the analyst depends on the nature of the data used and the reason for the analysis.

Why use hypothesis testing: Hypothesis testing is used to draw conclusions about a larger population from a test performed on sample data.

Types of hypothesis:
Null hypothesis, denoted by H0
Alternative hypothesis, denoted by Ha

The best way to determine whether a statistical hypothesis is true would be to examine the entire population. Why this is difficult: since that is often impractical, researchers typically examine a random sample from the population. If the sample data are not consistent with the statistical hypothesis, the hypothesis is rejected.

Null hypothesis, H0: usually the hypothesis that sample observations result purely from chance.
The null hypothesis is often formulated in negative form and is set up to be rejected. Why? The argument is that it is easier to show that a statement is false than to prove that it is true.

Alternative hypothesis, Ha: the hypothesis that is accepted if the null hypothesis is rejected.

Decision error
When testing a hypothesis, two types of error can be committed:
Type I error: committed if the null hypothesis is rejected when it is true.
– Its probability is denoted by Pr(type I error) = α
– In many medical and pharmaceutical studies this value is set at α = 0.05 (5%) or α = 0.01 (1%)
Type II error: committed if the null hypothesis is not rejected when it is false.
– Its probability is denoted by Pr(type II error) = β
– The value of β is conventionally set at 20%, i.e. β = 0.20

Significance level and power
Definition: the significance level is the value α, that is, the probability of committing a type I error. The significance level is fixed before the study starts; typically α = 0.05 is chosen.
Definition: the power of a hypothesis test is 1 − β = 1 − Pr(type II error), the probability of rejecting the null hypothesis when it is false. A power of about 80% means that, if the null hypothesis is in fact false, the test has about an 80% chance of rejecting it.
Definition: the P-value is the probability, if the null hypothesis is true, of getting a value of the test statistic as extreme as or more extreme than the one observed. For an upper-tailed test with test statistic Z and observed value z:
$P = \Pr(Z \ge z \mid H_0 \text{ is true})$
Decision:
1. If P ≤ α, then H0 is rejected; we say the result is statistically significant.
2. If P > α, then H0 is not rejected; we say the result is not statistically significant.

Hypothesis testing, significance level and power

B. Statistical tests for one or two groups
The main aim of statistical analysis is to use the information gained from a sample of individuals to make inferences about the population of interest.
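As an illustration of the decision rule in part A (reject H0 when P ≤ α), the following minimal Python sketch computes an upper-tailed P-value for a z statistic and applies the rule. The function names and the observed value z = 2.0 are assumptions chosen for illustration; they do not come from the lesson.

```python
from statistics import NormalDist


def upper_tail_p_value(z_observed: float) -> float:
    """P(Z >= z_observed) for a standard normal Z: an upper-tailed P-value."""
    return 1.0 - NormalDist().cdf(z_observed)


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 when P <= alpha, otherwise do not reject."""
    if p_value <= alpha:
        return "reject H0 (statistically significant)"
    return "do not reject H0 (not statistically significant)"


# Hypothetical observed z statistic, chosen only for illustration.
z_obs = 2.0
p = upper_tail_p_value(z_obs)
print(f"z = {z_obs}, P = {p:.4f}: {decide(p)}")
```

Note that α is passed in as a fixed argument rather than derived from the data, which mirrors the requirement that the significance level be chosen before the study starts.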
There are two basic approaches to statistical analysis: estimation (with confidence intervals) and hypothesis testing (with P-values).

One-sample T-test
General conditions for this test:
1. The study variable is continuous (quantitative).
2. The study variable follows the normal distribution.
Example: We have a theory that the mean age of university students is 20 or 24 (the theory concerns the mean age of all university students, not the precisely known mean age of the sample). The test is the one-sample T-test, or single-mean T-test.
Other examples: the percentage of men in the university is 30% or 50%; the average height in society is 170 cm.

Application (research question): Assume that patients with acute myocardial infarction (AMI) have a mean triglyceride level of 155.23 mg/dl. An investigator claims that patients with chronic coronary heart disease (CCHD) have a higher triglyceride level. To confirm this, he randomly selected 27 patients with CCHD and measured their triglyceride levels. The results were x̄ = 196.49 mg/dl and s = 88.4 mg/dl. Can we conclude that the triglyceride level in patients with CCHD is higher than in patients in the AMI group? Assume that triglyceride level follows a normal distribution.

Solution: Since the investigator claims that the mean triglyceride level of patients with CCHD is higher than 155.23:
1. Specify the significance level: α = 0.05
2. State the hypotheses: H0: µ = 155.23 vs. Ha: µ > 155.23
3. Calculate the test statistic (one-tailed T-test):
$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{196.49 - 155.23}{88.4/\sqrt{27}} = 2.4253$
4. Determine the tabular value: $t_{n-1,\,1-\alpha} = t_{26,\,0.95} = 1.7056$ (read from the t table with n − 1 = 26 degrees of freedom and α = 0.05)
5. Compare the value observed from the sample with the tabular value: $t = 2.4253 > t_{n-1,\,1-\alpha} = 1.7056$
6. Conclusion: two methods.
a. The observed value is greater than the tabular value at the 5% significance level, so we reject H0.
b. Using the P-value and comparing it with the fixed α = 0.05: $P = P(t_{n-1} \ge t) = P(t_{26} \ge 2.4253) \approx 0.011$ (the t table brackets it between 0.01 and 0.025). Method b is the one found in scientific articles.
7. Decision and interpretation: Since P-value ≈ 0.011 < α = 0.05, we reject the null hypothesis H0 at the 5% significance level and conclude that the mean triglyceride level in patients with CCHD is significantly higher than in patients with AMI.

T-test for two independent samples
General conditions:
1. The study variable is numeric (for example, measured on a ratio scale).
2. The comparison variable is categorical with only two values (dichotomous).
3. The cases are independent.
4. If the two samples come from one population, we expect the difference between the two sample means to be zero.

Example (research question): Is there a significant difference in height between men and women? Is there a difference in the percentage of people with heart disease between married and unmarried people? Is there a difference in weight between people from different areas? Is there a difference in income between men and women?

Important conditions for the validity of the test:
1. The trait studied is normally distributed in both populations, or the sample size is large (more than 50 cases).
2. The two samples were drawn randomly from the two studied populations.
3. The two samples are independent, meaning that the values of the characteristic in one sample are not related to the values of the characteristic in the other sample.
4. Homogeneity of variance, which means that the variance of the trait is equal in the two populations. A violation of this last condition raises the probability of type I and type II errors.

Application (research question): Comparing leg ulcer-free weeks between two groups, intervention and control, using the two-independent-samples t-test.
Data: ask the coordinator for the data set.
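Both t-tests described in this part can be sketched from summary statistics alone. The minimal Python sketch below uses only the standard library: it obtains the upper-tail probability of the t distribution by numerical integration (Simpson's rule), applies it to the one-sample triglyceride application, and then to a pooled-variance two-sample test. All function names are assumptions, and the two-sample group summaries are invented placeholders, not the leg ulcer study data.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T >= t), using Simpson's rule on [0, t]; steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def one_sample_t(mean: float, mu0: float, sd: float, n: int):
    """t statistic and degrees of freedom for H0: mu = mu0."""
    return (mean - mu0) / (sd / math.sqrt(n)), n - 1


def pooled_two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# One-sample test with the summary statistics from the triglyceride application.
t1, df1 = one_sample_t(mean=196.49, mu0=155.23, sd=88.4, n=27)
p1 = t_upper_tail(t1, df1)  # one-tailed, because Ha is mu > 155.23
print(f"one-sample: t = {t1:.4f}, df = {df1}, one-sided P = {p1:.4f}")

# Two-sample test: INVENTED placeholder summaries (mean, SD, n per group).
t2, df2 = pooled_two_sample_t(20.0, 6.0, 30, 17.0, 6.0, 30)
p2 = 2 * t_upper_tail(abs(t2), df2)  # two-sided P-value
print(f"two-sample: t = {t2:.4f}, df = {df2}, two-sided P = {p2:.4f}")
```

The pooled statistic relies on the homogeneity-of-variance condition listed above; when that condition is doubtful, the pooled formula in this sketch is not the appropriate one.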
Run the t-test for two independent samples using IBM SPSS:
Results:
Results evaluation:
First section: Group statistics
Second section:
1. Levene's Test for Equality of Variances (H0 here is that the two variances are equal). This is a test of homogeneity of variance. If P > 0.05, there is no evidence against equal variances in the two populations, and the result is read from the first line of the table.
Otherwise If P0.05) means that there is no relationship between the two variables and that all the difference between the ratios may be due to chance only. When the result is P