Introduction to Biostatistics: The Journey Continues - Week 5 - UW
University of Washington
2024
Summary
This document is a series of lecture slides from week 5 of a Biostatistics course at the University of Washington. It includes an agenda for the week and then discusses various statistical topics and tests, such as ANOVA, Chi-square, and correlation.
Full Transcript
Introduction to Biostatistics: The Journey Continues
COPHP Quantitative Methods, Winter 2024, Week 5

Agenda
- Brief review of one- vs. two-sided testing
  - One- vs. two-tailed
- Additional measures of association and tests of statistical significance
  - ANOVA
  - Chi-square
  - Correlation

One- vs. two-sided testing (or p-values)
Graphical Representation: SR Example

“New” Measures and Statistical Tests

ANOVA
- Exposure: categorical (more than 2 levels)
- Outcome: continuous
- H0: μ1 = μ2 = μ3
- HA: at least one of the means is different
  - You won’t know which mean is different
- Omnibus or global test
  - A statistical test that assesses the significance of several parameters at once

ANOVA test
- Measure of association: difference in means
- ANOVA generates an F-statistic, but we will just interpret the p-value for this test

Chi-Square Test
Chi-square test for:
- 2 binary variables
- 1 binary and 1 categorical variable (with more than 2 levels)
- 2 categorical variables (both with more than 2 levels)

Chi-Square
Assesses whether two categorical variables are associated with each other, so it doesn’t matter (mathematically) which variable is your exposure and which is your outcome.
- H0: the variables are independent
  - The distribution of one variable is the same across levels of the other variable
  - One variable does not predict the other
- HA: the variables are dependent
  - The distribution of one variable differs across levels of the other variable
  - One variable does predict the other

Chi-square Test
Chi-squared tests are used with categorical variables. Data are displayed in contingency tables.

                          DV or Outcome
IV or exposure    Outcome 1   Outcome 2   Outcome 3   Total
Group 1                  15           8           7      30
Group 2                  16           6           8      30
Group 3                  11           9          10      30
Total                    42          23          25      90 (N)

Chi-square Interpretation
- Chi2 test statistic
  - FYI: degrees of freedom = (#Rows - 1)(#Columns - 1)
  - Follows a chi-square distribution
- Note: You will NOT need to interpret the chi2 test statistic itself, but you WILL interpret the p-value of the test
- Chi2 tests are non-directional: they tell you that there is a difference between groups, but not which group(s) are different (so the chi2 test is also considered a global test)

Case 3 Example

                Private   Medicaid   Other Gov   No insurance   Total
No visits             2          7           3              2      14
1-4 visits           13         41           1              2      57
5-9 visits          109        134          27              5     275
10-14 visits        225        215          40              9     489
15+                  48         28          10              0      86
Total               397        425          81             18     921

Wait, are there multiple tests for when both your exposure and outcome are binary?
Yup! You can use a chi-square test OR a two-sample z-test in these cases OR calculate an Odds Ratio/Risk Ratio
- Just specify which one you are using!
- The exposure/outcome designation does matter for the two-sample z-test and OR/RR
- Likely use the chi-square if you are unsure of the causal relationship

Scatter plots
Scatter plots show patterns in data.
- Example: positive association between height and weight.
- Example: no association between hours spent studying and hours spent exercising.
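
To make the chi-square statistic and the degrees-of-freedom formula concrete, here is a minimal sketch that applies them to the 3 x 3 Group by Outcome contingency table shown earlier in these slides. It is an illustration rather than the course's own software workflow: the function names `chi_square_independence` and `chi2_p_value_even_df` are hypothetical, and the p-value helper relies on a closed form that only holds when the degrees of freedom are even (true here, since df = 4). In practice a statistics package would report the p-value directly.

```python
from math import exp, factorial


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # Degrees of freedom = (#Rows - 1)(#Columns - 1), as on the slide.
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_p_value_even_df(stat, df):
    """Upper-tail chi-square p-value; closed form valid only for even df (assumption)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive, even df")
    half = stat / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


# Counts from the Group x Outcome table in the slides (rows: Group 1 to 3).
observed_counts = [
    [15, 8, 7],
    [16, 6, 8],
    [11, 9, 10],
]

stat, df = chi_square_independence(observed_counts)
p_value = chi2_p_value_even_df(stat, df)
print(f"chi2 = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

With a p-value this far above 0.05, the interpretation in the slides' terms would be that the data do not provide evidence against independence of group and outcome.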

Correlation coefficient
Pearson Correlation Coefficient: r
- Measures the strength of a linear association
  - How well a straight line fits the data
- Constrained to -1 ≤ r ≤ 1
  - 1 or -1 = perfect linear association
  - 0 = no linear association (null value)

Case 3 Example

Strength of Correlation

Summary table (each cell: measure of association / test of statistical significance)

Independent Variable        Dependent Variable (Outcome)
(Exposure or Predictor)     Binary                                           Continuous
Constant                    proportion / one-sample z-test                   mean / one-sample t-test
Binary                      RR or OR / χ2 test;                              difference in means / two-sample t-test
                            difference in proportions / two-sample z-test
Categorical                 Chi-square                                       difference in means / ANOVA
                            (can be used with categorical outcomes too!)
Continuous                  Stay tuned…for logistic regression in Case 6     Correlation (r) / t-test

Four Questions for Assessing Relationships of Interest
1. Is there a difference between groups (an association between the exposure and outcome)? If so,
2. In what direction is it?
3. How big is it? (Magnitude of the effect)
4. Is it statistically significant?
Use the measure of association to answer the first 3 questions, and then test for statistical significance.

Causality Caution
Observing an association between two variables does not imply that one variable caused the other.

Practice Time!

You are interested in comparing the GPA of UW undergraduate students to the GPA of UW graduate students. From a survey of 250 undergraduates and 195 graduate students, you compare GPAs using a two-sample t-test. You find that mean GPA is higher among UW graduate students. Now, you want to know if the proportion of UW graduate students who have GPAs above 3.5 is similar to the national average of 35% (based on comprehensive reporting from all US graduate programs). What test do you run?
a. One-sample t-test
b. Two-sample t-test
c. One-sample z-test
d. Two-sample z-test

Your public health team wants to compare average household incomes across three Washington counties: King County, Pierce County, and Thurston County.
You elect to run a(n) _______ test and get a p-value of 0.04. You conclude that __________
a. Chi-square; there is a difference in average income across these three counties.
b. ANOVA; the evidence suggests that the average household income is the same for each of these counties.
c. T-test; there is a statistically significant difference between the average household incomes comparing any two of these counties.
d. ANOVA; at least one of these counties has a statistically significantly different average income.
e. None of the above.

A clinical trial at Harborview Medical Center is testing a patient-centered program designed to reduce repeat visits for drug overdose to the Emergency Department (ED). Patients are enrolled in the study after an ED visit for overdose, where they are randomized to this new program or the standard of care and then followed for 6 months. The primary outcome for the trial is any ED visit for overdose within the follow-up period (yes/no). Which of the following could you calculate?
a) Chi-square statistic
b) ANOVA
c) Difference in proportion
d) Risk Ratio
e) All of the above

Biostats = New favorite thing?
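
Supplementary sketch for the correlation slides: the Pearson correlation coefficient r can be computed directly from its definition, which makes it easier to see why it is constrained to lie between -1 and 1. The function name `pearson_r` and the height and weight values below are hypothetical, made up purely for illustration; they are not course or Case 3 data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical, made-up values for illustration only (not course data).
heights_cm = [150, 160, 165, 172, 180, 188]
weights_kg = [52, 58, 63, 70, 77, 85]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```

A value of r close to 1 here reflects the positive, nearly straight-line pattern in the made-up data; as the Causality Caution slide notes, it says nothing about whether one variable causes the other.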