PSGY1014 Lecture 8: Regression - 2024-2025 PDF

Summary

These are the lecture notes for PSGY1014 Lecture 8 on regression analysis. The lecture covers correlation, simple linear regression, multiple regression, and how to run and report both analyses in SPSS.

Full Transcript

PSGY1014 Lecture 8: Regression

Overview
- What is regression analysis?
- Relationship with correlations
- Simple linear regression analysis
- Conducting a regression analysis in SPSS
- Multiple regression analyses
- Conducting a multiple regression analysis in SPSS

Correlation
- Last week, you learned about correlation coefficients.
- Correlations examine the association (or relationship) between two variables.
- This association can be positive or negative:
  - Positive: if variable X increases, variable Y increases too.
  - Negative: if variable X increases, variable Y decreases.
- Correlation coefficients are based on pairs of observations, for example:
  - someone’s height and weight;
  - someone’s shoe size and reading skills;
  - the number of ice cream cones sold and the number of drowning deaths on a day.
- Correlation coefficients produce a line of best fit and then calculate how well this line describes the association.
- How well the line describes the association is reflected by the magnitude of the coefficient.
- If one or both of the variables are measured on an ordinal scale (i.e., a categorical rather than continuous variable), then you need to conduct a non-parametric analysis: the Spearman coefficient of correlation (ρ, rho).
- However, if both variables are measured on an interval or ratio scale (i.e., a continuous rather than a categorical variable), then you need to conduct a parametric analysis: the Pearson coefficient of correlation (r).
- Both correlational analyses produce such a magnitude.
- If the observations are far from the line, the magnitude will be small. There might be many other variables at play.
- However, if the observations are close to the line, the magnitude will be large. There might not be many other variables at play.
- The proportion of variance in Y that is explained by X is the square of the magnitude (R²).

Regression
- Regression uses the same data as correlation.
- Correlation considers how closely the data points fall near the line
  of best fit.
- In contrast, regression describes the characteristics of this straight line.
- It uses both the slope of the line and the point at which it intercepts the y-axis.
- We can develop a regression equation to predict the y value based on the x value.
- Linear regression analyses can be used to predict the value of a variable (Y) based on the value of another variable (X).
  - Outcome variable: the variable we want to predict (DV).
  - Predictor variable: the variable we are using to predict the other variable’s value (IV).
- For example, height can predict weight; exam performance can be predicted from revision time; cigarette consumption can be predicted from smoking duration.
- If there are two or more predictors, rather than just one, we need to use multiple regression analyses.

To calculate a regression line, you need:
- The intercept: the point at which the line cuts across the y-axis, denoted by the letter b.
- The slope of the regression line, denoted by the letter a. The slope is simply the number of units that the regression line moves up the vertical axis for each unit of movement along the horizontal axis.

[Figure: Grade (vertical axis) plotted against Revision time (horizontal axis)]

y = ax + b

With this equation, we can predict the grade from the number of hours of revision.

Basically, you are developing a simple model. To test whether this simple model describes the data well enough, several pieces of information must be reported:
1. How much variance of the outcome variable is explained by the predictor variable (R²).
2. Whether the amount of variance explained is significant (analysis of variance, ANOVA).
3. What the slope (a) and intercept (b) of the regression equation are, and whether the slope is significant.

Regression in SPSS
1. Select ‘Analyze’, ‘Regression’, and ‘Linear...’.
2. Move the dependent variable (the outcome variable) to the ‘Dependent’ box and the independent variable (the predictor variable) to the ‘Independent(s)’ box.
3. Click on ‘Statistics’ and ensure that confidence intervals are checked.

Regression in SPSS: Output
- About half of the variance in scores is explained by revision time (other factors might include individual differences, measurement error, etc.).
- As the ANOVA is significant, this shows that hours of revision predicts grade.
- The intercept is +37.137. This is the point at which the regression line cuts the vertical axis, i.e., the score one gets without revision.
- The slope is 0.946. It means that for each additional hour of revision, the grade increases by 0.946.
- You can predict someone’s exam mark (Score) from the hours of revision (H) with the following regression equation:

  Score = 0.946*H + 37.137

- Use the UNSTANDARDIZED coefficients!
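The lecture runs this analysis in SPSS, but the arithmetic behind y = ax + b can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `fit_line` and `predict_score` are hypothetical, and the five data points are made up (they are not the Moodle dataset). The only lecture values used are the unstandardized coefficients 0.946 and 37.137.

```python
# Minimal sketch: ordinary least squares for one predictor, in plain Python.
# The data are hypothetical; they are NOT the lecture's dataset.

def fit_line(xs, ys):
    """Return (a, b, r_squared) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x and sum of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                # slope: change in y per unit of x
    b = mean_y - a * mean_x      # intercept: predicted y when x = 0
    # R squared: share of the variance in y explained by the line.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_resid / ss_total
    return a, b, r_squared


def predict_score(hours, a=0.946, b=37.137):
    """Predict an exam score from revision hours.

    The defaults are the unstandardized coefficients quoted in the lecture.
    """
    return a * hours + b


hours = [0, 5, 10, 15, 20]       # hypothetical revision times
grades = [38, 41, 47, 52, 55]    # hypothetical grades
a, b, r2 = fit_line(hours, grades)
print(f"slope a = {a:.3f}, intercept b = {b:.3f}, R^2 = {r2:.3f}")
print(f"Lecture equation, 10 hours of revision: {predict_score(10):.3f}")
```

The slope and intercept returned by `fit_line` correspond to the unstandardized B values in the SPSS Coefficients table, which is why those are the ones to plug into the prediction equation.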
Reporting Results in APA format
“A linear regression analysis was conducted to examine the relationship between study time (i.e., the predictor variable) and exam scores (i.e., the outcome variable). The model explained a significant portion of the variance in the exam scores, R² = .497, F(1, 24) = 23.67, p < .001. Without learning, students would score 37.137, but with every hour of learning, their score would increase by 0.946 (t = 4.87, p < .001). These results show that study time predicts exam score.”

Multiple Regression
Simple linear regression analyses tell us about:
- the effect of one variable (IV) on another variable (DV)
- the line of best fit to the data
If we have multiple independent variables, we can look at:
- how they each contribute to the dependent variable (DV)
- the plane of best fit

Multiple regressions use a very similar equation to simple regressions:

Z = ax + by + c

Multiple Regression in SPSS
- Select > Analyze > Regression > Linear…
- 99% of the variance in the grades is explained by revision time and IQ.
- As the ANOVA is significant, this shows that revision time and IQ predict the grades.
- The SPSS output lists the intercept and the regression coefficients.
- You can predict someone’s exam mark (Score) from the hours of revision they did (H) and their IQ (IQ) with the following regression equation:

  Score = (1.051*H) + (0.895*IQ) - 66.016

Reporting Results in APA format
“A multiple regression analysis was conducted to examine the relationship between study time and IQ (i.e., the predictor variables) on the one hand and exam scores (i.e., the outcome variable) on the other hand. The model explained a significant portion of the variance in the exam scores, R² = .990, F(2, 23) = 1090.55, p < .001.
Both study time (t = 36.54, p < .001) and IQ (t = 32.97, p < .001) significantly predicted the exam scores, with the exam scores increasing by 1.051 for every hour of revision and by 0.895 for every IQ point. These results show that exam score is affected not only by someone’s IQ but also by the amount of revision they do.”

Summary
- Linear regression analysis is an excellent way to begin making simple quantitative models and fitting them to your data.
- To assess how well the model fits the dataset: the B value in the SPSS output (the unstandardized slope, written a in y = ax + b, not the intercept b) is the gradient of the regression line and the strength of the relationship between the predictor and outcome variable. If it is significant (p < .05 in the SPSS output), then you can say the predictor variable significantly predicts the outcome.

Exercise
- On Moodle, the datasets of the two examples have been given.
- Use SPSS (or JASP) to see whether you can replicate the outcomes of the simple and multiple regression analyses.
- There is also a dataset for the exercise. In this dataset, you have the variables self-esteem, financial security, social support, and life satisfaction.
- Conduct a multiple regression analysis to examine which factors influence life satisfaction and answer the questions in the exercise document.

JASP

Questions?
Socrative Room: JANSSEN2363

Next Week
November 19: How to draw graphs with Excel

Thank you!
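As a worked check on the two examples in these notes, the short Python sketch below plugs values into the multiple regression equation and recomputes the overall F statistic from R². It is not part of the lecture: the helper names `predict_score` and `f_from_r_squared` are hypothetical, the example student (10 hours, IQ 100) is made up, and the formula F = (R²/k) / ((1 - R²)/df_residual) is the standard overall F test rather than something stated on the slides.

```python
# Sketch only: helper names are hypothetical. The coefficients, R squared
# values, and degrees of freedom are the rounded figures quoted in the lecture.

def predict_score(hours, iq):
    """Score = (1.051*H) + (0.895*IQ) - 66.016, from the multiple regression slide."""
    return 1.051 * hours + 0.895 * iq - 66.016


def f_from_r_squared(r_squared, n_predictors, df_residual):
    """Overall F statistic: explained variance per predictor divided by
    unexplained variance per residual degree of freedom."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / df_residual
    return explained / unexplained


# Hypothetical student: 10 hours of revision and an IQ of 100.
print(f"Predicted score: {predict_score(10, 100):.3f}")

# Simple regression: R^2 = .497 with F(1, 24); the lecture reports F = 23.67.
print(f"F from rounded R^2 (simple): {f_from_r_squared(0.497, 1, 24):.2f}")

# Multiple regression: R^2 = .990 with F(2, 23); the lecture reports F = 1090.55.
print(f"F from rounded R^2 (multiple): {f_from_r_squared(0.990, 2, 23):.2f}")
```

Because R² is rounded to three decimals in the reports, the recomputed F values only approximate the reported ones. The gap is larger for the multiple regression, where 1 - R² is very small and therefore sensitive to rounding.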
