Bivariate Analyses Lecture 4 PDF
Emlyon Business School
This document is a lecture on bivariate analyses. It covers various statistical methods in an undergraduate level data analysis course. It is from a data and content analysis module.
Bivariate analyses
DATA & CONTENT ANALYSIS
Lecture 4
Module 4 - Bivariate analyses

Bivariate Analysis

- When doing a survey, bivariate analysis refers to searching for relationships between two variables.
- One tries to explain the results observed for one variable (the dependent variable) through the values of another variable (the independent variable).
- This is where hypothesis testing is used, to make sure that the links observed are not due to chance but are statistically significant.

Who chose Coke and who chose Pepsi?

We want to know who buys or drinks Coca-Cola and who buys or drinks Pepsi-Cola in order to better target these segments:
- Are Coke drinkers younger or older than Pepsi drinkers?
- Are Coke drinkers more urban than Pepsi drinkers?
- Are Coke drinkers more often male?
- Etc.

More generally, we want to find out whether there is a relationship between two variables. To verify this, we use hypothesis testing.

Hypothesis testing

- Hypothesis testing makes it possible to verify that the relationship observed in the data is not due to chance.
- The test varies depending on the nature of the variables.
- The test is validated by comparing a calculated value with a theoretical (or critical) value.
- This comparison is expressed as a probability (the p-value).
- When the p-value is lower than 0.05 (5%), we can say that there is a statistically significant relationship between the two variables.
- The threshold is set at p = 0.05 as a general rule in the social sciences (in the medical field, this threshold will be much lower).

Three situations emerge depending on the nature of the variables

- Quali-numeric: ANOVA or test of comparison of means
  - T-test: when the qualitative variable has only two modalities
  - ANOVA: when the qualitative variable has more than two modalities
- Numeric-numeric: coefficient of correlation r
- Quali-quali: crosstab / Chi²

Chi-square Distribution

Decision rule:
- If the p-value is lower than or equal to 5%, the conclusion is that there is a relationship between the two variables.
- If the p-value is larger than 5%, the two variables are considered independent.

[Figure: Chi-square distribution with the critical value (5%) marked]

Normal Distribution (for Pearson r)

- When the absolute value of the calculated t is greater than 1.96, we can say that there is a relationship between the two variables.
- If the value of t lies between -1.96 and +1.96, we conclude that there is no relationship between the two variables.

[Figure: normal distribution with critical values at t = -1.96 and t = +1.96, 2.50% in each tail]

F Distribution of Fisher (ANOVA)

- If the p-value < 0.05, the two variables are related.
- If the p-value > 0.05, the two variables are not related.

[Figure: F distribution with the p > 0.05 and p < 0.05 regions marked]

[...] 0.30 is significant in the social sciences

Please note: Sphinx Campus gives a diagnostic on the level of significance of the correlation using the t value (do not consider the adjectives 'weak' or 'strong').
- If t >= 1.96, the relationship is significant.
- If -1.96
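
The decision rules in this lecture (choose the test from the nature of the variables, then compare the p-value with 5% or the t value with 1.96) can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure used by Sphinx Campus. The function names and the sample data are hypothetical. The conversion from r to t uses the standard formula t = r * sqrt((n - 2) / (1 - r^2)), which the slides do not state and is assumed here. The 1.96 threshold is a large-sample approximation, so the result for a sample this small is only indicative.

```python
import math

ALPHA = 0.05        # significance threshold quoted in the lecture
T_CRITICAL = 1.96   # two-sided critical value quoted in the lecture


def choose_test(x_is_numeric, y_is_numeric, n_modalities=None):
    """Map the nature of the two variables to the test named in the lecture."""
    if x_is_numeric and y_is_numeric:
        return "correlation r"
    if not x_is_numeric and not y_is_numeric:
        return "crosstab / Chi2"
    # One qualitative and one numeric variable: the number of modalities
    # of the qualitative variable decides between t-test and ANOVA.
    if n_modalities is None:
        raise ValueError("n_modalities is required for a quali-numeric pair")
    return "t-test" if n_modalities == 2 else "ANOVA"


def is_significant_p(p_value, alpha=ALPHA):
    """Lecture rule: a relationship exists when the p-value is at most 5%."""
    return p_value <= alpha


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """Assumed standard conversion from r to a t value (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def is_significant_t(t, critical=T_CRITICAL):
    """Lecture rule: significant when t falls outside -1.96 .. +1.96."""
    return abs(t) >= critical


if __name__ == "__main__":
    # Hypothetical survey data: respondent age and weekly cola purchases.
    ages = [18, 22, 25, 31, 38, 44, 52, 60]
    purchases = [9, 8, 8, 6, 5, 5, 3, 2]
    r = pearson_r(ages, purchases)
    t = t_from_r(r, len(ages))
    print(choose_test(True, True))
    print(round(r, 3), round(t, 2), is_significant_t(t))
```

For a quali-quali or quali-numeric pair, a statistics package would supply the p-value, and `is_significant_p` would then apply the same 5% rule.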