Advanced Statistics Lecture PDF
Document Details
Uploaded by WellPositionedAcademicArt
2025
Summary
This document is a lecture on advanced statistics, specifically focusing on multiple regression with interaction terms. The lecture notes provide an outline, discuss statistical interaction, and explain how to interpret models with interaction terms. It covers topics like centering explanatory variables and illustrates these concepts with examples.
Full Transcript
Advanced Statistics
January 23, 2025

Outline: multiple regression with an interaction term; interaction between two continuous variables; centering; t table.

Statistical interaction

The effect of an independent variable on the dependent variable may be influenced by another independent variable. For example, studies have found that as people get older, the effect of education on income increases.

[Diagram: Age (X2), Education (X1), Income (Y)]

The interaction is between X1 (education level, a continuous/interval variable) and X2 (age, a continuous/interval variable). Note that in this example the variable 'monthly income' is named 'salary'.

How do we display an interaction effect in the prediction equation?

To see whether the effect of education on income depends on age, we add an interaction term of education and age to the regression equation:

$$\hat{y} = a + b_1 \cdot \text{education} + b_2 \cdot \text{age} + b_3 \cdot (\text{education} \times \text{age})$$

[Figure: the model in SPSS]

How do we interpret a model with an interaction term?

When estimating a model with an interaction term, pay attention to two things:
1) The p-value for the interaction effect. If it is not significant, it makes sense to drop the interaction term and return to a model without interaction.
2) The coefficients of the main effects have a special meaning.

In the equation above, b3 is the interaction effect. The coefficients b1 and b2 are effects at a specific value: each is the effect of its predictor when the other predictor is zero.

Interpretation: Constant

$$\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2$$

$$\text{salary} = a + b_1 \cdot \text{education} + b_2 \cdot \text{age} + b_3 \cdot (\text{education} \times \text{age})$$

$$\text{salary} = -862.7 + 330.2 \cdot \text{education} + 18.1 \cdot \text{age} + 12.3 \cdot (\text{education} \times \text{age})$$

Constant: the constant is the predicted value when all variables are set to 0 (age = 0 and education = 0). The predicted average monthly salary for someone with age = 0 and education = 0 is -862.7.

Interpretation: Effect of education

We were initially interested in the effect of education on income.
Now we want to know whether the effect of education on income is influenced by age. In other words, does the effect of education on income differ as age increases? We therefore interpret the effect of education in relation to the interaction effect.

Interpretation: Effect of education

$$\text{salary} = a + b_1 \cdot \text{education} + b_2 \cdot \text{age} + b_3 \cdot (\text{education} \times \text{age})$$

Setting age = 0:

$$\text{salary} = -862.7 + 330.2 \cdot \text{education} + 18.1 \cdot 0 + 12.3 \cdot (\text{education} \times 0)$$

Effect of education: b1 is the effect of education on salary when age = 0. For people aged 0, salary increases by 330.2 for each additional unit of education.

Now set age = 1:

$$\text{salary} = -862.7 + 330.2 \cdot \text{education} + 18.1 \cdot 1 + 12.3 \cdot (\text{education} \times 1)$$

$$\text{salary} = -844.6 + 342.5 \cdot \text{education}$$

Effect of education: the interaction coefficient, 12.3, is the amount added to the education coefficient for each one-unit increase in age. The effect of education on salary therefore increases by 12.3 when age increases by one unit: 330.2 + 12.3 = 342.5.

Predicted average scores for individuals

Predicted value is monthly salary (in euros).

| Education | Age | ŷ = a + b1x1 + b2x2 + b3x1x2 |
|---|---|---|
| 4 years | 35 years | -862.7 + (330.2*4) + (18.1*35) + (12.3*4*35) = 2813.6 |
| 4 years | 55 years | -862.7 + (330.2*4) + (18.1*55) + (12.3*4*55) = 4159.6 |
| 5 years | 35 years | -862.7 + (330.2*5) + (18.1*35) + (12.3*5*35) = 3574.3 |
| 5 years | 55 years | -862.7 + (330.2*5) + (18.1*55) + (12.3*5*55) = 5166.3 |

Older people with 4 years of education have a higher predicted income than younger people with 4 years of education. Older people with 5 years of education have a higher predicted income than younger people with 5 years of education.

Plotting the interaction

[Figure: plot of the interaction] Significant difference?
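The predicted salaries and the age-dependent effect of education can be reproduced with a short script. This is a minimal Python sketch, assuming the rounded coefficients quoted in the lecture; the function names `predict_salary` and `education_slope` are hypothetical and are not part of the lecture or of SPSS.

```python
# Rounded coefficients of the original (uncentered) model, as quoted in the lecture.
A = -862.7      # constant
B_EDU = 330.2   # education
B_AGE = 18.1    # age
B_INT = 12.3    # education x age interaction


def predict_salary(education, age):
    """Predicted monthly salary from the model with an interaction term."""
    return A + B_EDU * education + B_AGE * age + B_INT * education * age


def education_slope(age):
    """Effect of one extra unit of education at a given age: b1 + b3 * age."""
    return B_EDU + B_INT * age


# Predicted salaries for the four education/age combinations in the table.
for edu, age in [(4, 35), (4, 55), (5, 35), (5, 55)]:
    print(f"education={edu}, age={age}: {predict_salary(edu, age):.1f}")

# The education effect grows by 12.3 for every additional unit of age.
for age in (0, 1, 35, 55):
    print(f"education effect at age {age}: {education_slope(age):.1f}")
```

The gap between the 5-year and 4-year predictions at a given age equals `education_slope(age)`, which is one way to see that the interaction makes the education effect depend on age.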
Centering the explanatory variables

Main effects in a model with an interaction are usually not meaningful (especially those of continuous variables), because they refer only to the effect of one predictor when the other predictor is zero (e.g., 0 years of education). To make the coefficients easier to interpret, we can center the variables before constructing the interaction term. Centering a variable means subtracting its mean from every value, so that the centered variable has a mean of zero. The main effects in the interaction model then describe the effect of one predictor at the mean of the other predictor.

Centering variables in SPSS

[Figure: centering variables in SPSS]

Let's rerun the regression after centering

[Figure: regression output for the original models and the models with centered variables]

NOTE: In the dataset, the mean age was 47.56 and the mean of the education variable was 4.56.

Constant: setting cent_age = 0 and cent_education = 0 corresponds to the mean values of 47.56 for age and 4.56 for education. In Model 2, the predicted average monthly salary for an individual of mean age (47.56) and mean education (4.56) is 4170.8.

Effect of education: this is the effect of education on salary when cent_age = 0. The effect of education on salary is 914.2 for people aged 47.56 (the mean age).
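Centering moves the zero point of each predictor but does not change the fitted model, so the centered coefficients can be derived from the original ones. The Python sketch below substitutes education = cent_education + mean and age = cent_age + mean into the original equation and collects terms. It assumes the rounded coefficients and means quoted in the lecture, so the results should land close to, but not exactly on, the reported 4170.8 and 914.2; the variable names are hypothetical.

```python
# Rounded coefficients of the original (uncentered) model, as quoted in the lecture.
a, b_edu, b_age, b_int = -862.7, 330.2, 18.1, 12.3

# Sample means reported in the lecture.
mean_edu, mean_age = 4.56, 47.56

# Substitute education = cent_education + mean_edu and age = cent_age + mean_age
# into y = a + b_edu*education + b_age*age + b_int*education*age, then collect terms.
a_cent = a + b_edu * mean_edu + b_age * mean_age + b_int * mean_edu * mean_age
b_edu_cent = b_edu + b_int * mean_age   # effect of education at the mean age
b_age_cent = b_age + b_int * mean_edu   # effect of age at the mean education
b_int_cent = b_int                      # the interaction coefficient is unchanged

print(f"constant:    {a_cent:.1f}")
print(f"education:   {b_edu_cent:.1f}")
print(f"age:         {b_age_cent:.1f}")
print(f"interaction: {b_int_cent:.1f}")
```

The derivation shows why only the constant and the main effects change after centering, while the interaction coefficient stays at 12.3.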
Predicted average scores for individuals

Predicted value is monthly salary (in euros).

| Model | Education entered | Age entered | ŷ = a + b1x1 + b2x2 + b3x1x2 |
|---|---|---|---|
| Original model | 4 | 55 years | -862.7 + (330.2*4) + (18.1*55) + (12.3*4*55) = 4159.6 |
| Model with centered variables, raw values entered | 4 → 4.56 + 4 | 55 years → 47.56 + 55 | 4170.8 + (914.2*4) + (74.1*55) + (12.3*4*55) = 14609.1 |
| Model with centered variables, centered values entered | -0.56 → 4.56 - 0.56 | 7.44 years → 47.56 + 7.44 | 4170.8 + (914.2*(-0.56)) + (74.1*7.44) + (12.3*(-0.56)*7.44) = 4158.9 |

Entering the raw values 4 and 55 into the centered model gives 14609.1, because the centered model reads them as deviations from the means: an education of 4.56 + 4 = 8.56 and an age of 47.56 + 55 = 102.56. To predict for someone with 4 years of education at age 55, enter the centered values instead: 4 - 4.56 = -0.56 and 55 - 47.56 = 7.44. The result, 4158.9, matches the original model's 4159.6 apart from rounding in the coefficients.
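The comparison between the two models can be checked numerically. The Python sketch below is an illustration under the lecture's rounded coefficients and means; `predict_original` and `predict_centered` are hypothetical helper names. The centered helper subtracts the means from the raw inputs before applying the centered coefficients, which is the step that is easy to forget.

```python
MEAN_EDU, MEAN_AGE = 4.56, 47.56  # sample means reported in the lecture


def predict_original(education, age):
    """Original model: coefficients apply to raw education and age."""
    return -862.7 + 330.2 * education + 18.1 * age + 12.3 * education * age


def predict_centered(education, age):
    """Centered model: raw inputs are centered before the coefficients are applied."""
    cent_edu = education - MEAN_EDU
    cent_age = age - MEAN_AGE
    return 4170.8 + 914.2 * cent_edu + 74.1 * cent_age + 12.3 * cent_edu * cent_age


# Same person (education 4, age 55) under both parameterizations.
print(f"original model: {predict_original(4, 55):.1f}")
print(f"centered model: {predict_centered(4, 55):.1f}")

# Feeding 4 and 55 to the centered coefficients as if they were already centered
# is the same as predicting for education 8.56 and age 102.56.
print(f"raw values misread as centered: {predict_centered(4 + MEAN_EDU, 55 + MEAN_AGE):.1f}")
```

Keeping the centering step inside the prediction function means callers always pass raw education and age, which avoids mixing the two scales.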