Biostatistics 8th Seminar (Paired t-test, Unpaired t-test, F-test)
University of Debrecen Faculty of Medicine
Summary
This document is a seminar presentation about biostatistics. It explains the concepts of paired t-test, unpaired t-test, and F-test, and presents example applications of these methods. The document also provides the relevant theory and calculations of different statistical tests.
Full Transcript
8th seminar
▪ Paired t-test.
▪ Unpaired t-test. F-test
▪ Example
Week 10

Paired t-test
❖ Introduction.
▪ The aim of the paired t-test is to get rid of possible extrinsic factors that can change the values of the sample.
▪ Instead of performing the analysis with individual observations, we use the difference between pairs of observations as the variable of interest.
▪ The same subjects may be measured before and after receiving some treatment.
▪ The objective in paired comparison tests is to eliminate a maximum number of sources of extraneous variation by making the pairs similar with respect to as many variables as possible.
▪ The test statistic is calculated as:
$$ t_{n-1} = \frac{\bar{d}}{SD_d / \sqrt{n}} $$
▪ Degrees of freedom: (n – 1)
▪ Hypotheses:
two-tailed: H0: µd = 0 (or µafter = µbefore), HA: µd ≠ 0 (or µafter ≠ µbefore), i.e. under H0 the mean of the population does not change (for example in a certain treatment)
one-tailed: H0: µd ≤ 0, HA: µd > 0, or H0: µd ≥ 0, HA: µd < 0

Paired t-test – Example 1.
A researcher wanted to find the effect of a special diet on systolic blood pressure. She selected a sample of seven adults and put them on this dietary plan for three months. The following table gives the systolic blood pressures of these seven adults before and after the completion of this plan. Using the 5% significance level, can we conclude that the mean blood pressure is influenced by the diet?

Solution
▪ Introduction to the solution: the question asks whether the population mean before the treatment (µb) differs from the population mean after the treatment (µa). Symbolically, the assertion "no difference" is written as µb = µa or µb − µa = 0, which is equivalent to µd = µ0 = 0 (d = difference). The research question is whether the mean blood pressure before and after the diet is different, in either direction, so a two-tailed test is needed.
▪ Determine the differences (after − before!):

|      | Before | After  | Delta (after − before) |
|------|--------|--------|------------------------|
|      | 210    | 193    | −17                    |
|      | 180    | 186    | 6                      |
|      | 195    | 186    | −9                     |
|      | 220    | 223    | 3                      |
|      | 231    | 220    | −11                    |
|      | 199    | 183    | −16                    |
|      | 224    | 233    | 9                      |
| mean | 208.43 | 203.43 | −5.00                  |
| SD   | 18.10  | 21.08  | 10.79                  |
| n    | 7      | 7      | 7                      |
| SEM  | 6.84   | 7.97   | 4.08                   |

▪ Assume that the population of paired differences is approximately normally distributed. Here $\bar{d} = -5$ and $SD_d = 10.79$.
▪ Hypotheses: H0: µd = 0, HA: µd ≠ 0
▪ Calculate the value of the test statistic:
$$ t_{n-1} = \frac{\bar{d}}{SD_d / \sqrt{n}} = \frac{-5}{10.79 / \sqrt{7}} = -1.23 $$
▪ Distribution of the test statistic: if H0 is true, the test statistic follows a t distribution with n − 1 degrees of freedom.
▪ Decision rule: at α = 0.05 find the critical values x1 and x2. Two-tailed test:
- degrees of freedom: n − 1, thus 7 − 1 = 6
- recall that the t distribution is symmetric!
- you can only get x2 from the table
- recall that α/2 has to be considered → α/2 = 0.025
thus x2 = 2.447 and x1 = −2.447. tcrit = 2.447
▪ Statistical decision: is the calculated value of the test statistic in the rejection region? No, because −2.447 < −1.23 < 2.447. Do not reject H0; µb = µa (or µd = 0) might be true.
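The arithmetic of Example 1 can be retraced with a short script. This is only a minimal sketch using the Python standard library: the function name `paired_t` and the constant `T_CRIT` are illustrative choices, not part of the seminar material, and the critical value 2.447 is simply copied from the t table used above rather than computed.

```python
import math
import statistics

# Systolic blood pressures from Example 1 (same subject order in both lists).
before = [210, 180, 195, 220, 231, 199, 224]
after = [193, 186, 186, 223, 220, 183, 233]


def paired_t(before, after):
    """Return (mean difference, SD of differences, degrees of freedom, t)."""
    diffs = [a - b for a, b in zip(after, before)]  # after - before
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD, divisor n - 1
    t_stat = mean_d / (sd_d / math.sqrt(n))
    return mean_d, sd_d, n - 1, t_stat


mean_d, sd_d, df, t_stat = paired_t(before, after)

# Two-tailed critical value for df = 6 and alpha = 0.05, taken from the table.
T_CRIT = 2.447
reject_h0 = abs(t_stat) > T_CRIT

print(f"mean d = {mean_d:.2f}, SD d = {sd_d:.2f}, df = {df}, t = {t_stat:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

With the data above, the script gives a mean difference of −5.00, an SD of 10.79 and t ≈ −1.23, so H0 is not rejected, in line with the hand calculation.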
[Figure: density f(t) of the t distribution with 6 degrees of freedom. The rejection regions (α/2 = 0.025 each) lie below t = −2.447 and above t = 2.447, the acceptance region (1 − α = 0.95) lies between them, and the calculated value t = −1.23 falls in the acceptance region.]
▪ Conclusion: the data do not support the research hypothesis that the diet influenced the systolic blood pressure.
▪ Possible error: type II error

Paired t-test – Example 2.
A researcher conducted a study to determine the effectiveness of a reading intervention. The pre- and post-assessment data for eight students are shown in the following table. Assuming the differences are normally distributed, determine whether the intervention produces higher post-assessment scores. Use α = 0.05.

Solution
▪ Introduction to the solution: let µ1 be the expected value of the mean pre-assessment score and µ2 the expected value of the mean post-assessment score. Since we investigate whether the scores increased, the null hypothesis assumes that there was no increase in the scores. Symbolically: µ1 ≥ µ2 or µ1 − µ2 ≥ 0, which is equivalent to µd ≤ 0 (with d = after − before).
▪ Since the score of every student is available before and after the reading intervention, a paired t-test has to be carried out.
▪ Determine the differences (after − before!):

|      | Before | After | Delta (after − before) |
|------|--------|-------|------------------------|
|      | 44     | 55    | 11                     |
|      | 55     | 68    | 13                     |
|      | 25     | 40    | 15                     |
|      | 54     | 55    | 1                      |
|      | 63     | 75    | 12                     |
|      | 38     | 52    | 14                     |
|      | 31     | 49    | 18                     |
|      | 34     | 48    | 14                     |
| mean | 43.00  | 55.25 | 12.25                  |
| SD   | 13.31  | 11.26 | 5.01                   |
| n    | 8      | 8     | 8                      |
| SEM  | 4.71   | 3.98  | 1.77                   |

▪ Assume that the population of paired differences is approximately normally distributed. Here $\bar{d} = 12.25$ and $SD_d = 5.01$.
▪ Hypotheses: H0: µd ≤ 0, HA: µd > 0
▪ Calculate the value of the test statistic:
$$ t_{n-1} = \frac{\bar{d}}{SD_d / \sqrt{n}} = \frac{12.25}{5.01 / \sqrt{8}} = 6.92 $$
▪ Distribution of the test statistic: if H0 is true, the test statistic follows a t distribution with n − 1 degrees of freedom.
▪ Decision rule: at α = 0.05 find the critical value. One-tailed (right-sided) test. How to find x1 (the critical value)?
- use the statistical table for the t distribution!
- the degrees of freedom are n − 1 = 8 − 1 = 7!
- recall that the t distribution is symmetric!
- recall that α has to be considered → (1 − α) = 0.95
thus x1 = 1.895. tcrit = 1.895
▪ Statistical decision: is the calculated value of the test statistic in the rejection region? Yes, because 6.92 > 1.895. Reject H0 and favor the alternative hypothesis HA: µd > 0, which means that µ2 > µ1 might be true.
[Figure: density of the t distribution. The rejection region (α = 0.05) lies to the right of the critical value 1.895, and the calculated value 6.92 falls inside it.]
▪ Conclusion: the data support the research hypothesis that the reading intervention produces higher post-assessment scores.
▪ Possible error: type I error

Unpaired t-test
❖ Introduction.
▪ When population variances are unknown, and we wish to estimate the difference between two population means with a confidence interval, we can use the t distribution as a source of the reliability factor if certain assumptions are met.
▪ We must know, or be willing to assume, that the two sampled populations are normally distributed.
▪ If the assumption of equal population variances is justified, the two sample variances that we compute from our two independent samples may be considered as estimates of the same quantity, the common variance.
▪ Samples are taken from the two populations and their means are investigated. The two samples need not have the same number of elements (n1, n2).
▪ Hypotheses:
two-tailed: H0: µ1 = µ2, HA: µ1 ≠ µ2
one-tailed: H0: µ1 ≤ µ2, HA: µ1 > µ2, or H0: µ1 ≥ µ2, HA: µ1 < µ2
▪ Test statistic:
$$ t_{n_1+n_2-2} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \text{where } s_p^2 = \frac{(n_1-1)\,SD_1^2 + (n_2-1)\,SD_2^2}{n_1+n_2-2} $$
▪ Degrees of freedom: (n1 + n2 – 2)

F-test
❖ Introduction.
▪ An F-test is used to test if the variances of two populations are equal.
▪ Null hypothesis: H0: σ1² = σ2² (always two-tailed!)
▪ Test statistic:
$$ F_{n_1-1,\,n_2-1} = \frac{SD_1^2}{SD_2^2} $$
where SD1 > SD2 (the larger SD goes into the numerator), because every value in the F table is greater than 1.
▪ The degrees of freedom of the numerator (n1 – 1) and of the denominator (n2 – 1) must be considered separately.
▪ If the significance level is 5% → the F table with α = 0.025 must be used to check the equality of the standard deviations.

Unpaired t-test – Example 1.
In a study researchers investigate whether vitamin D changes the number of epileptic seizures. Patients were randomly divided into two groups: patients in group A received vitamin D every day while patients in group B received placebo. All other circumstances were the same. The number of epileptic seizures was determined in both groups of patients, and the mean, the standard deviation (SD) and the number of patients (n) are summarized in the following table:

|                            | Sample 1 (Vitamin D treated group) | Sample 2 (Placebo treated group) |
|----------------------------|------------------------------------|----------------------------------|
| Mean of epileptic seizures | 15                                 | 24                               |
| SD of epileptic seizures   | 2.8                                | 3.5                              |
| Number of patients         | 8                                  | 13                               |

Can we conclude at a level of significance of 5% that vitamin D changes the number of epileptic seizures?

Solution
▪ Perform the test of hypothesis assuming that the two populations from which the samples are drawn are normally distributed and that the samples are independent. Also assume that the population variances are equal. This assumption must be tested using the variance ratio test (or F-test) at α = 0.05.
▪ Perform the F-test. Assumptions: normally distributed quantity, samples are independent.
▪ Hypotheses for the F-test: H0: σ1² = σ2², HA: σ1² ≠ σ2²
▪ Calculate the value of the test statistic:
$$ F_{calc} = \frac{3.5^2}{2.8^2} = 1.5625 $$
▪ Distribution of the test statistic: if H0 is true, the statistic follows an F distribution whose numerator and denominator degrees of freedom belong to the samples in the numerator and the denominator of the ratio.
▪ Decision rule: at α = 0.05 find the critical value (F.975 or Fcrit), two-tailed test. Please note: the larger SD goes into the numerator. Here sample 2 has the larger SD, thus the df of the numerator is n2 − 1 = 12, whereas the df of the denominator is n1 − 1 = 7.
[Table for F-test, indexed by denominator d.f. and numerator d.f.]
▪ Since F_calc falls in the acceptance region (F_calc […] µ2. Its negation is µ1 ≤ µ2.
Thus, the hypotheses are:
H0: µ1 ≤ µ2 (or µ1 − µ2 ≤ µ0 = 0)
HA: µ1 > µ2 (or µ1 − µ2 > µ0 = 0) (right-tailed test)
▪ Calculate the value of the test statistic:
$$ t_{n_1+n_2-2} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \text{where } s_p^2 = \frac{(n_1-1)\,SD_1^2 + (n_2-1)\,SD_2^2}{n_1+n_2-2} $$
$$ s_p^2 = \frac{(n_1-1)\,SD_1^2 + (n_2-1)\,SD_2^2}{n_1+n_2-2} = 9.75 $$
$$ t_{n_1+n_2-2} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = 4.108 $$

Unpaired t-test – Example 2.
▪ Distribution of the test statistic: if H0 is true, the test statistic follows a t distribution with n1 + n2 − 2 degrees of freedom.
▪ Decision rule: at α = 0.05 find the critical value. One-tailed (right-sided) test. How to find the critical value x1?
- use the statistical table for the t distribution!
- the degrees of freedom are n1 + n2 − 2 = 45
- recall that the t distribution is symmetric!
- you can get the value for x1 directly
- recall that α has to be considered → (1 − α) = 0.95
- the intersection of df = 40 and α = 0.05 is 1.684 and the intersection of df = 50 and α = 0.05 is 1.676; the actual df lies between the two degrees of freedom found in the table, thus x1 = 1.68
▪ Statistical decision: is the calculated value of the test statistic in the rejection region? Yes, because 4.108 > 1.68. Reject H0 and favor the alternative hypothesis HA: µD > 0 (µD = µ1 − µ2), which means that µ1 > µ2.
▪ Conclusion: at the α = 0.05 level of significance, the data provide sufficient evidence to conclude that the true average time (in hours) spent by teenage boys playing video games per week is greater than the true average time (in hours) spent by teenage girls playing video games per week.
▪ Possible error: type I error

Unpaired t-test – Example 3.
A new chemotherapeutic agent against lung cancer is tested in patients and the effect of the new drug is compared to control patients receiving standard therapy.
The size of the tumor was determined in both groups of patients, and the mean, the standard deviation (SD) and the number of patients (n) in each sample were as follows:

|      | Sample from patients treated with the new chemotherapeutic agent | Sample from control patients |
|------|------------------------------------------------------------------|------------------------------|
| mean | 6 cm                                                             | 12 cm                        |
| SD   | 2 cm                                                             | 3 cm                         |
| n    | 10                                                               | 16                           |

Did the new chemotherapeutic drug significantly decrease the tumor size compared to the control group of patients? Let the level of significance be 5%.

Solution
▪ Perform the test of hypothesis assuming that the two populations from which the samples are drawn are normally distributed and that the samples are independent. Also assume that the population variances are equal. This assumption must be tested using the variance ratio test (or F-test) at α = 0.05.
▪ Perform the F-test.
▪ Hypotheses for the F-test: H0: σ1² = σ2², HA: σ1² ≠ σ2²
▪ Test statistic:
$$ F_{calc} = \frac{SD_2^2}{SD_1^2} = \frac{3^2}{2^2} = 2.25 $$
▪ Distribution of the test statistic: if H0 is true, the statistic follows an F distribution whose numerator and denominator degrees of freedom belong to the samples in the numerator and the denominator of the ratio.
▪ Decision rule: at α = 0.05 find the critical value (F), two-tailed test. Please note: the larger SD goes into the numerator. Here sample 2 has the larger SD, thus the df of the numerator is n2 − 1 = 15, whereas the df of the denominator is n1 − 1 = 9.
[Table for F-test, indexed by denominator d.f. and numerator d.f.]
▪ Since F_calc falls in the acceptance region (F_calc
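
The variance ratio and the pooled t statistic defined in the unpaired t-test and F-test introductions can be computed directly from the summary statistics (mean, SD, n). The following is a rough sketch under stated assumptions: the helper names `variance_ratio` and `pooled_t` are hypothetical, the inputs are the Example 3 values from the table, and the critical values still have to be read from the F and t tables. The pooled t statistic is only appropriate if the F-test does not reject the equality of the variances.

```python
import math


def variance_ratio(sd_a, n_a, sd_b, n_b):
    """F statistic with the larger SD in the numerator.

    Returns (F, numerator df, denominator df).
    """
    if sd_a >= sd_b:
        return sd_a**2 / sd_b**2, n_a - 1, n_b - 1
    return sd_b**2 / sd_a**2, n_b - 1, n_a - 1


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) two-sample t statistic.

    Returns (pooled variance, degrees of freedom, t).
    """
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    t_stat = (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, df, t_stat


# Example 3 summary statistics: treated group (sample 1) vs control (sample 2).
f_calc, df_num, df_den = variance_ratio(2, 10, 3, 16)
sp2, df, t_stat = pooled_t(6, 2, 10, 12, 3, 16)

print(f"F = {f_calc:.2f} with ({df_num}, {df_den}) degrees of freedom")
print(f"pooled variance = {sp2:.3f}, df = {df}, t = {t_stat:.3f}")
```

For Example 3 the first line reports F = 2.25 with (15, 9) degrees of freedom, matching the hand calculation above. Calling `variance_ratio(2.8, 8, 3.5, 13)` with the Example 1 values likewise returns 1.5625 with (12, 7) degrees of freedom.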