Data Analysis in Psychology: Correlation and Regression (CHIRUMBOLO 2023)
Chirumbolo
2023
Summary
These lecture notes cover correlation and regression in psychology. They discuss the levels of scientific research (descriptive, correlational, and experimental studies) and use a range of examples to show how variables relate to each other. The notes are aimed at undergraduate psychology students.
Full Transcript
Data Analysis in Psychology
Correlation and Regression
Statistical Methods

Bivariate Correlation

The levels of scientific research
Level of research: Descriptive, Correlational, Experimental

Descriptive
- Describes a given phenomenon: a representation, an accurate picture of the phenomenon
- Exploratory → little knowledge of the phenomenon
- Statistical analyses: descriptive statistics (→ frequencies, tables, graphs, %, mean …)
- Examples: opinion polls, surveys, attitudes

Correlational
- Describes/discovers/studies relationships between variables
- Exploratory → association between variables
- Conclusive → theory testing
- Predictive → prediction of an outcome from an exogenous variable
- Statistical analyses: correlations, regression techniques
- Methods: surveys, questionnaires, structured interviews, tests
- Examples: relationships between psychological variables and sport/academic achievement (e.g., personality traits, self-efficacy, self-esteem, self-control, perseverance, cognitive abilities, intelligence, social and background factors, norms, SES, education …)
- Examples: relationships between psychological variables and psycho-social well-being. What psychological factors promote well-being?

Experimental
- Studies CAUSAL relationships between variables
- How, when, and whether an INDEPENDENT variable (X) CAUSES a change in a DEPENDENT variable (Y) (X → Y)
- Manipulation of the IVs (→ experimental conditions) → cause
- Measurement of the DV (→ measures) after the manipulation of the IV
- Statistical analyses: inferential statistics (t-test, Analysis of Variance …)
- Methods: experimental research designs to manipulate the IV and quantitative methods to assess the DV

Level of research
Very often, research is in practice a combination of two (or more) levels: Descriptive, Descriptive/Correlational, Correlational, Experimental/Correlational, Experimental.

Correlation
- Correlation means association.
- Correlation is a measure of the extent to which two variables are related.
- There are three possible results of a correlational analysis: a positive correlation, a negative correlation, and no correlation.

Positive correlation
A positive correlation is a relationship between two variables in which both variables move in the same direction: one variable increases as the other increases, or one decreases as the other decreases.
- An example of positive correlation is height and weight → taller people tend to be heavier.
- Self-efficacy and performance → ?
- Anxiety and depression → ?
- Coke advertising and buying → ?
In a scatterplot, a positive correlation creates a positive slope: the points slant upwards.

Negative correlation
A negative correlation is a relationship between two variables in which an increase in one variable is associated with a decrease in the other.
An example of negative correlation would be height above sea level and temperature.
As you climb the mountain (increase in height), it gets colder (decrease in temperature).
- Prejudice and open-mindedness → ?
- Depression and self-esteem → ?
In a scatterplot, a negative correlation creates a negative slope: the points slant downwards.

Example from the literature (figure caption)
Cognitive Behavioral Therapy Is Associated With Enhanced Cognitive Control Network Activity in Major Depression and Posttraumatic Stress Disorder.
Activation in cognitive control regions correlates with depression severity across major depressive disorder (MDD) and posttraumatic stress disorder (PTSD) groups at baseline.
(A) Brain regions showing significant correlation with Montgomery-Åsberg Depression Rating Scale (MADRS) scores are shown in axial slice view.
(B) Brain regions showing significant correlation with MADRS scores after regressing out Anxious Arousal subscale of Mood and Anxiety Symptoms Questionnaire (MASQ-AA) scores are shown in axial slice view.
(C) Brain regions showing significant correlation with MASQ-AA scores are shown in axial slice view in Montreal Neurological Institute
(D) Brain regions showing significant correlation with MASQ-AA scores after regressing out MADRS scores.
DLPFC, dorsolateral prefrontal cortex. HC, healthy control subjects.

Zero correlation
A zero correlation exists when there is no relationship between two variables: one variable does not tend to either increase or decrease as the other changes. For example, there is no relationship between the amount of tea drunk and level of intelligence.

Scatterplots
As we saw, correlations can be visualized by drawing a scatterplot (also known as a scattergram, scatter graph, scatter chart, or scatter diagram).
A scatterplot is a graphical display that shows the relationship or association between two numerical variables, with each pair of scores represented as a point (or dot).
A scatterplot indicates the direction of the correlation between the variables (and also the strength, as we shall see).
When you draw a scatterplot, it doesn't matter which variable goes on the x-axis and which goes on the y-axis.
In correlations we are always dealing with paired scores, so the values of the two variables taken together are used to make the diagram.

Many uses of correlations
- Prediction
- Validity
- Reliability
- Theory testing
- Bases of more complex analyses such as Factor Analysis, Multiple Regression …

Correlation coefficients: determining correlation strength
Instead of drawing a scatterplot, a correlation can be expressed numerically as a standardized coefficient, ranging from -1 to +1.
When working with continuous variables, the most commonly used correlation coefficient is Pearson's r.

The correlation coefficient (r) indicates the extent to which two variables are associated (related).
- Values over zero indicate a positive correlation, while values under zero indicate a negative correlation.
- The more the correlation tends towards -1, the stronger the negative correlation → as one variable goes up, the other goes down.
- The more the correlation tends towards +1, the stronger the positive correlation → as one variable goes up, the other goes up.
- The values of -1 and +1 indicate a perfect correlation. The latter case (+1) is observed when you correlate a variable with itself.
- Due to random measurement error, it is almost impossible to obtain a perfect correlation if you measure and correlate two different variables.
- ±1 are theoretical bounds.

How to interpret the strength of the correlation
There is no definitive rule for determining what size of correlation is considered strong, moderate, or weak.
The interpretation of the coefficient very much depends on the topic of study and on the field.
Useful guideline (derived from Cohen):

Correlation studies linear relationships
If r = 0, this means only that there is no linear correlation.
Note: r = 0 does not mean that there is no relationship whatsoever, only that there is no linear one → the relationship could, for example, be quadratic.

Strength of the relationship and fit of the line
- Coefficient of determination: r²
- The proportion (or %) of common variance between two variables
- It reflects how close the points are to the fitted line
- It does not reflect how steep the slope is!

Correlation vs causation
Causation means that one variable (often called the predictor variable or independent variable) causes the other (often called the outcome variable or dependent variable).
Experiments can be conducted to establish causation. An experiment isolates and manipulates the independent variable to observe its effect on the dependent variable, and controls the environment so that extraneous variables can be eliminated or controlled (e.g., threats to internal validity).
A correlation between variables does not automatically mean that a change in one variable is the cause of a change in the other variable. A correlation only shows whether there is a relationship between variables.
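The points made above about Pearson's r can be checked numerically. The following is a minimal sketch in plain Python, not the procedure used in the lecture: the function name `pearson_r` and all scores are made up for illustration. It computes r from paired scores and shows that a steep and a shallow straight line both give r = 1, that a purely quadratic relationship gives r = 0, and that r² is obtained simply by squaring r.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two equal-length lists of paired scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores, chosen only to illustrate the ideas.
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]    # steep straight line
y_shallow = [0.5 * v for v in x]     # shallow straight line
y_quadratic = [v ** 2 for v in x]    # fully determined by x, but not linear
y_noisy = [-5, -4, 0, 1, 2, 6, 6]    # roughly linear, with some scatter

r_noisy = pearson_r(x, y_noisy)

print("steep line:    r =", round(pearson_r(x, y_linear), 3))
print("shallow line:  r =", round(pearson_r(x, y_shallow), 3))
print("quadratic:     r =", round(pearson_r(x, y_quadratic), 3))
print("noisy linear:  r =", round(r_noisy, 3), " r^2 =", round(r_noisy ** 2, 3))
```

Both straight lines are perfectly correlated with x regardless of slope, whereas the quadratic scores are completely determined by x yet show no linear correlation at all.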

Correlation IS NOT causation → no causation without MANIPULATION
No MANIPULATION? No causation!!! No Martini? No Party!!!

Ice cream consumption and shark attacks are correlated … ?
- Warm temperature → buying and eating ice cream
- Warm temperature → people swimming in the sea → shark attacks

Correlation IS NOT causation: doing exciting activities and marital satisfaction are correlated due to the intervention of a third variable (spurious correlation).

TPB: intention to engage in a given behavior (e.g., physical exercise)
- Attitude (+): favorable orientation toward the behavior. "I like to do physical exercise"
- Subjective norms (+): what significant others think about the behavior. "My girlfriend thinks it would be good if I do physical exercise"
- Perceived behavioral control (+): the possibility to perform the behavior. "I can do physical exercise"

Intention to engage in physical exercise
What does a correlation matrix look like in SPSS?
… how to run it with jamovi …

https://www.tylervigen.com/spurious-correlations
https://www.thenationalnews.com/uae/nicolas-cage-movies-linked-to-drownings-and-other-spurious-correlations-1.450759#:~:text=In%20the%20case%20of%20that,making%20the%20correlation%20statistically%20significant.

Data Analysis in Psychology
Linear Regression

Linear regression is a basic and commonly used type of predictive analysis.
Regression models describe the relationship between variables by fitting a straight line to the observed data.
Regression allows you to estimate how a dependent variable changes as the independent variable(s) change.
The overall idea of regression is to examine two things:
(1) Does a set of predictor variables (X) do a good job of predicting an outcome (dependent) variable (Y)?
(2) Which variables in particular are significant predictors of the outcome variable, and in what way do they affect it? The impact is indicated by the magnitude and sign of the "b" estimates, called regression coefficients.

These regression estimates are used to explain the relationship between one dependent variable (Y) and one or more independent variables (X).
- Simple linear regression
- Multiple linear regression
The simplest form of the regression equation, with one dependent and one independent variable, is defined by the formula y = a + b*x, where y = estimated dependent variable score, a = intercept (constant), b = regression coefficient, and x = score on the independent variable.

Naming the variables
There are many names for a regression's dependent variable (Y). It may be called an outcome variable, criterion variable, or endogenous variable.
The independent variables (X) can be called exogenous variables, predictor variables, covariates, or regressors.

Three major uses for regression analysis are (1) determining the strength of predictor(s), (2) forecasting an effect, and (3) trend forecasting.
First, regression might be used to identify the strength of the effect that the independent variable(s) have on a dependent variable.
Second, it can be used to forecast effects or the impact of changes.
That is, regression analysis helps us to understand how much the dependent variable changes with a change in one or more independent variables.
Third, regression analysis predicts trends and future values. It can be used to get point estimates of Y (→ correlation cannot do this).

Types of linear regression
- Simple linear regression: 1 dependent variable (interval or ratio), 1 independent variable (interval, ratio, or dichotomous dummy). y = a + b*x
- Multiple linear regression: 1 dependent variable (interval or ratio), 2+ independent variables (interval, ratio, or dichotomous). y = a + b1*x1 + b2*x2 + … + bn*xn

Least-squares regression
The most common method for fitting a regression line is the method of least squares.
This method calculates the best-fitting line for the observed data by minimizing the sum of the squares of the vertical deviations from each data point to the line.
Because the deviations are first squared and then summed, positive and negative values do not cancel each other out.

Model fitting
When selecting the model for the analysis, an important consideration is model fit, typically expressed as R².
R² → the proportion of variance in the outcome explained by the set of predictors.

What is the difference between correlation and linear regression?
Correlation quantifies the direction and strength of the relationship between two numeric variables, X and Y, and always lies between -1.0 and 1.0.
Simple linear regression relates X to Y through an equation of the form Y = a + bX.

Key similarities
- Both quantify the direction and strength of the relationship between two numeric variables.
- When the correlation (r) is negative/positive, the regression slope (b) will be negative/positive.
- The correlation squared (r² or R²) has special meaning in simple linear regression: it represents the proportion of variation in Y explained by X.
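The least-squares idea and the key similarities listed above can be illustrated with a short sketch. This is plain Python written for illustration only, not jamovi's implementation: the function names and the five paired scores are hypothetical. It fits y = a + b*x using the closed-form least-squares solution for one predictor, then compares the slope's sign with the sign of r and compares R² with r².

```python
from math import sqrt


def fit_least_squares(x, y):
    """Return (a, b) for the line y = a + b*x that minimizes the
    sum of squared vertical deviations (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


def sum_squared_residuals(x, y, a, b):
    """Sum of squared vertical deviations of the points from the line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def r_squared(x, y, a, b):
    """Proportion of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sum_squared_residuals(x, y, a, b) / ss_total


def pearson_r(x, y):
    """Pearson's r, used here only for comparison with the regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores (not data from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_least_squares(x, y)
r = pearson_r(x, y)
fit_r2 = r_squared(x, y, a, b)

print("a =", round(a, 3), " b =", round(b, 3))
print("r =", round(r, 3), " r^2 =", round(r ** 2, 3), " R^2 =", round(fit_r2, 3))
```

For these made-up scores the fitted line is y = 2.2 + 0.6x, the slope has the same sign as r, and r² and R² both equal 0.6. Nudging a or b away from the fitted values can only increase the sum of squared residuals, which is what "best-fitting" means here.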

Key differences
- Regression attempts to establish how X predicts Y, and the results of the analysis (→ the equation) will change if X and Y are swapped. With correlation, the X and Y variables are interchangeable.
- Correlation is a single statistic, whereas regression produces an entire equation.

Linear Regression: Example
TPB: intention to engage in physical exercise
- Attitude (+): favorable orientation toward the behavior. "I like to do physical exercise"
- Subjective norms (+): what significant others think about the behavior. "My girlfriend thinks it would be good if I do physical exercise"
- Perceived behavioral control (+): the possibility to perform the behavior. "I can do physical exercise"

Simple Linear Regression (bivariate linear regression)
While correlation is only used to describe the relationship, regression allows you to take things one step further: from description to prediction.
Regression allows you to model the relationship between variables, which enables you to make predictions about what one variable will do based on another.
Regression equation: Y = a + bX [→ from X scores you can predict Y scores]
Attitude → Behavioral Intention

Simple Linear Regression: jamovi output
- Model fit, ANOVA table
- Regression coefficients
- Regression equation: Int = -2.984 + 0.243 Att (general form: Y = a + bX)
- Scatterplot and fitted line

Linear Regression … how to run it with jamovi …
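To make the step from description to prediction concrete, the regression equation reported above (Int = -2.984 + 0.243 Att) can be turned into a small prediction function. This is only a sketch: the coefficients are copied from the transcript, while the function name and the attitude scores fed into it are hypothetical, since the transcript does not state the range of the attitude scale.

```python
# Coefficients as reported in the jamovi output: Int = -2.984 + 0.243 Att
INTERCEPT = -2.984   # a: predicted intention when attitude is 0
SLOPE_ATT = 0.243    # b: change in predicted intention per 1-point change in attitude


def predict_intention(attitude):
    """Point estimate of behavioral intention from an attitude score (Y = a + bX)."""
    return INTERCEPT + SLOPE_ATT * attitude


# Hypothetical attitude scores; the actual scale range is an assumption.
for att in (20, 30, 40):
    print("Att =", att, "-> predicted Int =", round(predict_intention(att), 3))
```

Each additional point of attitude raises the predicted intention by 0.243, which is how the regression coefficient b is read.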

Research Methods & Statistics for the Social Sciences
Multiple Linear Regression
- Extends the bivariate model to more than one predictor (two or more).
- More specifically, it enables you to predict the value of one variable based on the values of several predictors.
- The variable you want to predict is called the outcome variable (or DV), usually indicated with Y.
- The variables you base your prediction on are called the predictor variables (or IVs), usually indicated with X.
- Multiple regression equation: Y = a + b1X1 + b2X2 + b3X3 + … + bnXn

Multiple Linear Regression: jamovi output
- Model fit, ANOVA table
- Regression coefficients
- Regression equation: Int = -6.017 + 0.162 Att + 0.392 SN + 0.380 BC

Effect size in regression: f-squared (f²)

Linear Regression … how to run it with jamovi …

The End

Some tutorials
Correlation coefficient intuition: https://www.youtube.com/watch?v=-Y-M9aD_ccQ&t=429s
Correlation and causality: https://www.youtube.com/watch?v=ROpbdO-gRUo&t=60s
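
Returning to the multiple regression equation reported in the jamovi output (Int = -6.017 + 0.162 Att + 0.392 SN + 0.380 BC), the sketch below shows how a prediction is obtained from several predictors and how f² can be computed from R². The coefficients come from the transcript. The function names, the input scores, and the R² value are hypothetical, because the transcript gives neither the scale ranges nor the model fit. The formula f² = R² / (1 - R²) is Cohen's standard definition; the transcript names the effect size but does not spell it out.

```python
# Coefficients as reported: Int = -6.017 + 0.162 Att + 0.392 SN + 0.380 BC
INTERCEPT = -6.017
B_ATT = 0.162   # attitude
B_SN = 0.392    # subjective norms
B_BC = 0.380    # perceived behavioral control


def predict_intention_multiple(att, sn, bc):
    """Point estimate of intention from three predictors (Y = a + b1X1 + b2X2 + b3X3)."""
    return INTERCEPT + B_ATT * att + B_SN * sn + B_BC * bc


def cohens_f_squared(r_squared):
    """Cohen's f-squared effect size for a regression model: R^2 / (1 - R^2)."""
    if not 0 <= r_squared < 1:
        raise ValueError("R squared must be in the interval [0, 1)")
    return r_squared / (1 - r_squared)


# Hypothetical predictor scores and a hypothetical R^2, for illustration only.
example_prediction = predict_intention_multiple(att=30, sn=10, bc=12)
example_f2 = cohens_f_squared(0.40)

print("predicted Int =", round(example_prediction, 3))
print("f^2 for R^2 = 0.40:", round(example_f2, 3))
```

Each coefficient is the change in predicted intention for a one-point change in that predictor while the other two are held constant. By Cohen's conventional benchmarks, f² values of about 0.02, 0.15, and 0.35 are read as small, medium, and large effects.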