Biostatistics 7th Seminar (2017-2018) - PDF
University of Debrecen Faculty of Medicine
Summary
This document is a biostatistics seminar. The content includes z-test and t-test examples for biostatistical analysis. The seminar focuses on the calculation of statistical values for different biological experiments. It includes solved examples of z-test and t-test.
Full Transcript
7th seminar
▪ z-test (one sample)
▪ One-sample t-test
▪ Examples
Week 9

z-test
❖ Introduction.
The z-test can be applied if the standard deviation (σ) of the normally distributed population is known.
test statistic: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
steps:
- choose a significance level (α)
- state the null and the alternative hypotheses (H0, HA)
- calculate the z test statistic (zcalculated)
- find the critical value of z in the standard normal (z) table (zcritical)
- compare the test statistic (zcalculated) to the critical value (zcritical) and decide whether to reject or fail to reject the null hypothesis
- write a conclusion

z-test – Example 1.
The blood glucose concentration of the students of the University of Debrecen follows a normal distribution with a standard deviation of 0.4 mM. It is assumed that the mean of the distribution is 4.4 mM.
A physician takes a simple random sample of 16 from this population and determines that the mean blood glucose concentration is 4.65 mM. Can the physician conclude at α = 0.05 that the mean blood glucose concentration is different from the assumed value of 4.4 mM? What is the p-value of the test statistic?

Solution
assumption: the data are obtained from a population whose blood glucose concentration follows a normal distribution with a known standard deviation of 0.4 mM.
two-tailed test (the question is whether the mean differs from 4.4 mM in either direction)
data obtained: n = 16, $\bar{x}$ = 4.65 mM, μ = 4.4 mM, σ = 0.4 mM
hypotheses: H0: μ = 4.4 mM, HA: μ ≠ 4.4 mM
calculate the value of the test statistic: $z_{calc} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{4.65 - 4.4}{0.4 / \sqrt{16}} = 2.5$

distribution of the test statistic: if H0 is true, the test statistic follows a standard normal distribution.
(figure: a standard normal distribution curve, or simply a Z-distribution curve)
decision rule: at α = 0.05 find the critical values (two-tailed test). How to find the critical values z1 and z2?
- the area under the curve from -∞ to z1 is α/2 = 0.025.
- the area under the curve from -∞ to z2 is 1 − α/2 = 0.975.
Locate these in the statistical table of the standard normal distribution: z1 = -1.96, z2 = 1.96.

statistical decision: is the calculated value of the test statistic in the rejection region? Yes: zcalc = 2.5 > z2 = 1.96. Reject the null hypothesis in favor of the alternative hypothesis; HA: μ ≠ 4.4 mM might be true.
conclusion: the data support the research hypothesis that the mean blood glucose concentration is different from 4.4 mM.
type I error: rejecting H0 when it is true. The probability of committing a type I error is α, i.e., the level of significance.

determination of the p-value: determine the probability of obtaining a value for the test statistic as extreme as or more extreme than the one obtained, assuming the null hypothesis is true.
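As a cross-check of Example 1, the test statistic, the two-sided critical value and the two-sided p-value can be computed directly. This is a minimal Python sketch, not part of the seminar material: the function name `two_sided_z_test` is hypothetical, and `statistics.NormalDist` from the standard library stands in for the printed z table.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-sample, two-sided z-test with known population sigma.

    Returns (z, z_crit, p_value, reject), where reject is True
    when |z| exceeds the upper critical value.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Upper critical value; the lower one is -z_crit by symmetry.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Area in both tails beyond |z|.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit


# Example 1: n = 16, sample mean 4.65 mM, mu0 = 4.4 mM, sigma = 0.4 mM
z, z_crit, p_value, reject = two_sided_z_test(4.65, 4.4, 0.4, 16)
print(f"z = {z:.2f}, critical = ±{z_crit:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

The values should agree with the hand calculation (z = 2.5, critical values ±1.96) and with a two-sided p-value of about 0.0124.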
To find the p-value, the area of the shaded parts of the graph (the two tails beyond ±2.5) has to be determined. Look up the probability that the standard normal distribution takes values smaller than -2.5: 0.0062 is the area of the left shaded part. Due to the symmetry of the normal distribution, the area of the right shaded part is the same. Therefore, the p-value of the statistical test is 2 × 0.0062 = 0.0124.
You can also decide about the null hypothesis by comparing the p-value with α: p < 0.05 → the null hypothesis has to be rejected. In other words, the p-value is the smallest level of significance at which the null hypothesis is rejected.

z-test – Example 2.
The diastolic blood pressure of the students of the University of Debrecen follows a normal distribution with a standard deviation of 14 mmHg. A physician takes a simple random sample of 12 from this population and determines that the mean diastolic blood pressure is 96 mmHg.
Can the physician conclude at α = 0.05 that the mean diastolic blood pressure is larger than 90 mmHg (the criterion for elevated blood pressure)? Find the p-value of the test statistic.

Solution
assumption: the data are obtained from a population whose diastolic blood pressure follows a normal distribution with a known standard deviation of 14 mmHg.
one-tailed test (larger than 90 mmHg?)
data obtained: n = 12, $\bar{x}$ = 96 mmHg, μ = 90 mmHg, σ = 14 mmHg
hypotheses: H0: μ ≤ 90 mmHg, HA: μ > 90 mmHg
calculate the value of the test statistic: $z_{calc} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{96 - 90}{14 / \sqrt{12}} = 1.48$

distribution of the test statistic: if H0 is true, the test statistic follows a standard normal distribution.
decision rule: at α = 0.05 find the critical value (one-tailed test). How to find z1? The area under the curve from -∞ to z1 is 1 − α = 0.95. Locate this in the table of the standard normal distribution: zcrit = 1.65 (more precisely 1.645, since the table has no exact entry for 0.95). Thus, the critical value is 1.65.

statistical decision: is the calculated value of the test statistic in the rejection region? No: zcalc = 1.48 < zcrit = 1.65, so it lies in the acceptance region. Do not reject H0; H0: μ ≤ 90 mmHg might be true.
conclusion: the data do not support the research hypothesis that the mean diastolic blood pressure is larger than 90 mmHg.
type II error: failing to reject H0 when it is false (its probability is denoted β).

determination of the p-value: the probability of obtaining a value for the test statistic at least as extreme as the one actually calculated (taking into account whether the test is one- or two-sided). Here this is the area of the shaded part of the graph, i.e. the right tail beyond zcalc. Using the table of the standard normal distribution: $p = 1 - \Phi(1.48) = 1 - 0.9306 = 0.0694$. The p-value of the statistical test is 0.0694. Since the p-value is larger than the level of significance, this confirms that the null hypothesis should not be rejected.
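The one-sided calculation of Example 2 can be checked the same way. Again this is only a sketch under stated assumptions: `upper_tailed_z_test` is a hypothetical helper name, and the standard library's `statistics.NormalDist` replaces the table lookup. Because the code does not round z to 1.48 before the lookup, its p-value (about 0.069) differs slightly in the third decimal from the table-based 0.0694.

```python
from math import sqrt
from statistics import NormalDist


def upper_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-sample, right-tailed z-test (HA: mu > mu0), sigma known.

    Returns (z, z_crit, p_value, reject).
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Single critical value: area to its left is 1 - alpha.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Area in the right tail beyond z.
    p_value = 1 - std_normal.cdf(z)
    return z, z_crit, p_value, z > z_crit


# Example 2: n = 12, sample mean 96 mmHg, mu0 = 90 mmHg, sigma = 14 mmHg
z2, z2_crit, p2, reject2 = upper_tailed_z_test(96, 90, 14, 12)
print(f"z = {z2:.2f}, critical = {z2_crit:.3f}, p = {p2:.3f}, reject H0: {reject2}")
```

Since the p-value exceeds α = 0.05 and z stays below the critical value of about 1.645, the sketch reaches the same decision as the slide: H0 is not rejected.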
The smallest value of the significance level at which the null hypothesis would be rejected is 6.94%.

t-distribution
❖ Introduction.
- in practice, we rarely know σ; instead, we calculate the SD of our sample and use it to estimate σ
- this adds another element of uncertainty to the inference
- z does not capture this additional uncertainty, so we use a modification of the z-distribution called the t-distribution
- the t-distribution was discovered by William Sealy Gosset (1876-1937), who published his work under the pseudonym Student
- when n is very large, SD is a very good estimate of σ and the corresponding t-distributions are very close to the normal distribution
- the t-distributions become wider for smaller sample sizes, reflecting the lack of precision in estimating σ from SD
test statistic: $t = \dfrac{\bar{x} - \mu}{SD / \sqrt{n}}$
degrees of freedom (d.f.): d.f. = n − 1

❖ Properties of the t-distribution.
- t-distributions are similar to the z-distribution, but have broader tails
- there are many t-distributions (a family); each has a different number of degrees of freedom (d.f.)
- as d.f. increases, t becomes increasingly like z

One sample t-test – Example 1.
A study conducted a few years ago claims that adult men spend an average of 11 hours a week watching sports on television. A recent sample of 16 adult men showed that the mean time they spend per week watching sports on television is 9 hours with a standard deviation of 2.1 hours.
Test at the 1% significance level whether all adult men currently spend less than 11 hours per week watching sports on television.

Solution
assumption: the data are obtained from a normally distributed population. The population variance is not known → it must be estimated from the sample drawn.
one-tailed test (spend less than 11 hours)
data obtained: n = 16, $\bar{x}$ = 9 h, μ = 11 h, SD = 2.1 h
hypotheses: H0: μ ≥ 11 h, HA: μ < 11 h
calculate the value of the test statistic: $t_{n-1} = \dfrac{\bar{x} - \mu}{SD / \sqrt{n}} = \dfrac{9 - 11}{2.1 / \sqrt{16}} \approx -3.81$

distribution of the test statistic: if H0 is true, the test statistic follows a t-distribution with n − 1 degrees of freedom.
decision rule: at α = 0.01 find the critical value (one-tailed, left-tailed test). How to find tcrit? Use the statistical table of the t-distribution, and recall that the t-distribution is symmetric.
- degrees of freedom (d.f.): n − 1 = 16 − 1 = 15
- from the table, for d.f. = 15: t0.01 = 2.602
- since the t-distribution is symmetric: tcrit = -2.602 (the acceptance region lies to the right of -2.602)

statistical decision: is the calculated value of the test statistic in the rejection region? Yes: tcalc = -3.81 < tcrit = -2.602. Reject H0 in favor of the alternative hypothesis; HA: μ < 11 h might be true.
conclusion: adult men spend less than 11 hours per week watching sports on television.
possible error: type I error (H0 was rejected).
Determination of the p-value: tcalc is between -t0.0005 and -t0.001, thus 0.0005