Econometrics Lecture 7: Functional Forms

Document Details


Uploaded by JollyMoldavite4497

Universitat Pompeu Fabra

2024

Tags

econometrics, functional forms, regression analysis, economic models

Summary

This document presents a lecture on functional forms in econometrics. It covers polynomials and logarithmic transformations as approaches to modeling non-linear relationships. The lecture notes are relevant to students studying undergraduate econometrics.

Full Transcript


Lecture 7: Functional Forms
25117 - Econometrics, Universitat Pompeu Fabra, November 11th, 2024

What we learned in the last lesson

- Hypothesis tests and confidence intervals for a single regression coefficient are carried out using essentially the same procedures as in the simple linear regression model. For example, a 95% confidence interval for β1 is given by β̂1 ± 1.96 SE(β̂1).
- Hypotheses involving more than one restriction on the coefficients are called joint hypotheses. Joint hypotheses can be tested using an F-statistic, which is compared against the F(q, n-k-1) distribution.
- Regression specification proceeds by first determining a base specification chosen to address concerns about omitted variable bias. The base specification can be modified by including additional regressors that control for other potential sources of omitted variable bias.
- Simply choosing the specification with the highest R^2 can lead to regression models that do not estimate the causal effect of interest.

Introduction

- The regression function so far has been linear in the X's... but the linear approximation is not always a good one!
- The multiple regression model can handle regression functions that are nonlinear in one or more X.
- If the relation between Y and X is nonlinear:
  - The effect on Y of a change in X depends on the value of X, that is, the marginal effect of X is not constant.
  - A linear regression is misspecified: the functional form is wrong.
  - The estimator of the effect of X on Y is biased: in general, it isn't even right on average.
- The solution is to estimate a regression function that is nonlinear in X.

A linear fit on a (possibly) linear relationship [figure]

A linear fit on a non-linear relationship [figure]

The general nonlinear population regression function

The general nonlinear population regression function is defined as

  Yi = f(X1i, X2i, ..., Xki) + ui

Assumptions:
- E(ui | X1i, X2i, ..., Xki) = 0, so f(.) is the conditional expectation of Y given the X's.
- (Yi, X1i, X2i, ..., Xki) are i.i.d.
- Big outliers are rare (same idea as with linear OLS; the precise mathematical condition depends on the specific f(.)).
- No perfect multicollinearity (same idea; the precise statement depends on the specific f(.)).

The expected difference in Y associated with a difference in X1, ceteris paribus, is

  ∆Y = f(X1 + ∆X1, X2, ..., Xk) - f(X1, X2, ..., Xk)

Nonlinear functions of a single independent variable

We'll look at two complementary approaches:
- Polynomials in X: the population regression function is approximated by a quadratic, cubic, or higher-degree polynomial.
- Logarithmic transformations: Y and/or X is transformed by taking its logarithm, which gives the coefficients a "percentages" interpretation (i.e., elasticities and semi-elasticities) that makes sense in many applications.

The choice of specification (functional form) should be guided by judgment (which interpretation makes the most sense in your application?), by tests, and by plotting predicted values.

Polynomials in X

Approximate the population regression function by a polynomial:

  Yi = β0 + β1 Xi + β2 Xi^2 + β3 Xi^3 + ... + βr Xi^r + ui

- This is just the linear multiple regression model, except that the regressors are powers of X!
- Estimation, hypothesis testing, etc. proceed as in the multiple regression model, using OLS.
- The individual coefficients are difficult to interpret, but the regression function itself is interpretable.

For example, let's estimate the following quadratic specification:

  TestScorei = β0 + β1 Incomei + β2 Incomei^2 + ui

Polynomials in X [figure]

Polynomials in X

- We can plot the predicted values...
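As a concrete sketch (synthetic data, NOT the California dataset; the simulated coefficients are chosen only to mimic the lecture's reported estimates of roughly 3.85 and -0.042), a quadratic specification is just OLS with Income and Income^2 as regressors, and its predicted changes depend on the starting income level:

```python
import numpy as np

# --- Fitting: a quadratic in X is just multiple regression on (1, X, X^2) ---
# Synthetic data; coefficient values are illustrative assumptions.
rng = np.random.default_rng(0)
income = rng.uniform(5, 55, 400)            # hypothetical income, $K per capita
score = 620 + 3.85 * income - 0.042 * income**2 + rng.normal(0, 5, 400)

X = np.column_stack([np.ones_like(income), income, income**2])
b0, b1, b2 = np.linalg.lstsq(X, score, rcond=None)[0]   # plain OLS

# --- Interpretation: the predicted change depends on where you start ---
def delta_testscore(lo, hi, b1=3.85, b2=-0.042):
    """Predicted change in TestScore when income moves from lo to hi ($K)."""
    return (b1 * hi + b2 * hi**2) - (b1 * lo + b2 * lo**2)

for lo in (5, 25, 45):
    print(lo, "->", lo + 1, ":", round(delta_testscore(lo, lo + 1), 1))
# prints 3.4, 1.7, 0.0: the marginal effect of income declines with income
```

The loop reproduces the predicted-change arithmetic discussed in the lecture without refitting anything, since a difference of fitted values never involves the intercept.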
- The predicted change in TestScore for a change in income from $5K per capita to $6K per capita is:

  (6 × 3.85 - 6^2 × 0.042) - (5 × 3.85 - 5^2 × 0.042) ≈ 3.4

  ∆Income          ∆TestScore
  from 5 to 6      3.4
  from 25 to 26    1.7
  from 45 to 46    0.0

- The "effect" of a change in income is greater at low than at high income levels (i.e., a declining marginal effect of income).
- Caution! What is the effect of a change from $100K to $101K?

Polynomials in X [figure]

Polynomials in X

We just ran

  TestScorei = β0 + β1 Incomei + β2 Incomei^2 + β3 Incomei^3 + ui

- We can test the linear model against this cubic model:

  H0: β2 = β3 = 0  vs.  H1: β2 ≠ 0 and/or β3 ≠ 0

- The F-statistic of the test is 37.69 >> 4.61 (the 1% critical value).
- The hypothesis that the population regression is linear is rejected at the 1% significance level against the alternative that it is a polynomial of degree up to 3.

Logarithmic functions of Y and/or X

- Remember that, by Taylor expansion, f(x) ≈ f(a) + f'(a)(x - a).
- Applying this to f(x) = log(x): log(x) ≈ log(a) + (1/a)(x - a).
- Therefore, for a = 1: log(x) ≈ x - 1.
- It follows that, in the neighborhood of 1, log(x) can be approximated by x - 1.
- Consider Y + ∆Y ≈ Y (i.e., x = (Y + ∆Y)/Y ≈ 1).
- Then log(Y + ∆Y) - log(Y) = log((Y + ∆Y)/Y) ≈ ∆Y/Y... the percent change is a linear approximation of the log difference!
- Approximating the curve of log(x) with x - 1 gets worse and worse as we move away from x = 1. For instance:

  log(.7 + .5) - log(.7) = .54  ≠  .5/.7 = .71

[short digression] Logarithmic functions of Y and/or X

- So how should we think about log changes?

  log(.7 + .5) - log(.7) = .54 = 54 × 0.01 ≈ 54 × log(1.01)  ⇔  (.7 + .5)/.7 = 1.2/.7 ≈ 1.01^54

- 1.2 is, relative to .7, approximately equivalent to 54 different 1% increases compounded!
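These approximations are easy to verify numerically; the snippet below is pure arithmetic on the same numbers used above:

```python
import math

# log(x) ≈ x - 1 is accurate near x = 1 ...
print(math.log(1.01))                            # 0.00995..., close to 0.01

# ... but breaks down far from 1: Y = .7, dY = .5, so x = 1.2/0.7 is far from 1
log_diff = math.log(0.7 + 0.5) - math.log(0.7)   # exact log difference
pct_change = 0.5 / 0.7                           # naive percent change
print(round(log_diff, 2), round(pct_change, 2))  # 0.54 vs 0.71

# A log difference of 0.54 behaves like 54 compounded 1% increases
print(1.01 ** 54, 1.2 / 0.7)                     # both roughly 1.71
```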
- In economics, we often think in terms of compounding percent changes (e.g., compound interest rates, inflation, population growth, investment returns, price elasticities, etc.), i.e.,

  YT = Y0 × ∏t (1 + Rt)

- Then, it is easier to write

  log(YT) - log(Y0) = Σt log(1 + Rt) = Σt [log(Yt) - log(Yt-1)]

Logarithmic functions of Y and/or X

In the linear regression model

  Yi = β0 + β1 X1i + ui

a 1-unit increase in X1i was associated with a β1-unit increase in Yi. Now, depending on the hypothesis we derived from the theory, data can be log-transformed on the RHS, the LHS, or both:

  Yi = β0 + β1 log(X1i) + ui
  log(Yi) = β0 + β1 X1i + ui
  log(Yi) = β0 + β1 log(X1i) + ui

In each case, β1 has a different interpretation.

Linear-log population regression function

  Y = β0 + β1 log(X) + u
  ⇒ Y + ∆Y = β0 + β1 log(X + ∆X) + u
  ⇒ ∆Y = β1 [log(X + ∆X) - log(X)]

Because log(X + ∆X) - log(X) ≈ ∆X/X,

  ∆Y ≈ β1 (∆X/X)  ⇒  β1 = ∆Y / (∆X/X)

Because 100 × ∆X/X is the percentage change in X, a 1% change in X is associated with a β1/100 change in Y.

Log-linear population regression function

  log(Y) = β0 + β1 X + u
  ⇒ log(Y + ∆Y) = β0 + β1 (X + ∆X) + u
  ⇒ log(Y + ∆Y) - log(Y) = β1 ∆X

Because log(Y + ∆Y) - log(Y) ≈ ∆Y/Y,

  ∆Y/Y ≈ β1 ∆X  ⇒  β1 = (∆Y/Y) / ∆X

Because 100 × ∆Y/Y is the percentage change in Y, a change in X by 1 unit is associated with a 100 × β1 % change in Y. [β1 = semi-elasticity]

Log-log population regression function

  log(Y) = β0 + β1 log(X) + u
  ⇒ log(Y + ∆Y) = β0 + β1 log(X + ∆X) + u
  ⇒ log(Y + ∆Y) - log(Y) = β1 [log(X + ∆X) - log(X)]

Because log(Y + ∆Y) - log(Y) ≈ ∆Y/Y and log(X + ∆X) - log(X) ≈ ∆X/X,

  ∆Y/Y ≈ β1 (∆X/X)  ⇒  β1 = (∆Y/Y) / (∆X/X)

A 1% change in X is associated with a β1% change in Y. [β1 = elasticity]

(Linear-)Cubic vs. Linear-Log [figure]

Log-Linear vs. Log-Log [figure]

Interactions and heterogeneous effects

- In the California Schools example, at the beginning of the course, we were interested in the impact of the Share of Subsidized Meals on TestScores.
- We saw that the Share of Subsidized Meals was, on average, negatively associated with TestScores.
- We said that the impact of the Share of Subsidized Meals on TestScores could be confounded by local Educational Attainment.
- But we might be interested in the heterogeneous effects of the Share of Subsidized Meals at different levels of Educational Attainment.
- Is the effect of subsidizing meals still associated with lower test scores, even when local educational attainment is low?

Interactions between two dummies

Consider the following specification:

  Yi = β0 + β1 D1i + β2 D2i + β3 (D1i × D2i) + ui

where D1i and D2i are dummy variables.
- (Comparison category) D1i = 0 and D2i = 0 ⇒ E(Y | D1i = 0, D2i = 0) = β0
- When D1i = 1 but D2i = 0 ⇒ E(Y | D1i = 1, D2i = 0) = β0 + β1
- When D1i = 0 but D2i = 1 ⇒ E(Y | D1i = 0, D2i = 1) = β0 + β2
- Full model, when D1i = 1 and D2i = 1 ⇒ E(Y | D1i = 1, D2i = 1) = β0 + β1 + β2 + β3

β3 can be interpreted as the incremental effect of D1 (resp. D2) when D2 = 1 (resp. D1 = 1).

Interactions between two dummies [figure]

Interactions between a dummy and a continuous variable

How should we think about dummies in the presence of a continuous variable? First, consider the simple case without interaction:

  Yi = β0 + β1 X1i + β2 D1i + ui

where D1i is a dummy and X1i is a continuous variable. The regression line when D1 = 0 reads

  E(Y | D1i = 0, X1i) = β0 + β1 X1i

while the regression line when D1 = 1 reads

  E(Y | D1i = 1, X1i) = (β0 + β2) + β1 X1i

The intercepts are different but the slopes are the same!
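The "different intercepts, same slope" case can be sketched with simulated data (all variable names and coefficient values below are illustrative assumptions, not estimates from the lecture's dataset):

```python
import numpy as np

# Simulated data; true coefficients are made up for illustration.
rng = np.random.default_rng(0)
n = 1000
x1 = rng.uniform(0, 10, n)                       # continuous regressor X1
d1 = (rng.uniform(size=n) < 0.5).astype(float)   # dummy D1
y = 2.0 + 1.5 * x1 + 3.0 * d1 + rng.normal(0, 0.5, n)  # b0=2, b1=1.5, b2=3

# OLS on [1, X1, D1]: one common slope, two intercepts
X = np.column_stack([np.ones(n), x1, d1])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

print("line for D1=0: intercept", round(b0, 2), "slope", round(b1, 2))
print("line for D1=1: intercept", round(b0 + b2, 2), "slope", round(b1, 2))
```

Because the design matrix contains D1 but not D1 × X1, the two fitted lines are forced to be parallel; the dummy's coefficient is exactly the vertical gap between them.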
Interactions between a dummy and a continuous variable [figure]

Interactions between a dummy and a continuous variable

This framework can be extended to the interaction between a dummy and a continuous variable. Next, consider

  Yi = β0 + β1 X1i + β2 (D1i × X1i) + ui

where D1i is a dummy and X1i is a continuous variable. The regression line when D1 = 0 reads

  E(Y | D1i = 0, X1i) = β0 + β1 X1i

while the regression line when D1 = 1 reads

  E(Y | D1i = 1, X1i) = β0 + (β1 + β2) X1i

The intercept is the same, but the slopes are different!

Interactions between a dummy and a continuous variable [figure]

Interactions between a dummy and a continuous variable

Next, consider the fully interacted model:

  Yi = β0 + β1 D1i + β2 X1i + β3 (D1i × X1i) + ui

where D1i is a dummy and X1i is a continuous variable. The regression line when D1 = 0 reads

  E(Y | D1i = 0, X1i) = β0 + β2 X1i

while the regression line when D1 = 1 reads

  E(Y | D1i = 1, X1i) = (β0 + β1) + (β2 + β3) X1i

The intercepts and slopes both depend on the group defined by D1!

Interactions between a dummy and a continuous variable [figure]

Interactions between a dummy and a continuous variable

How can we discriminate between the last three models? Run the fully interacted model

  Yi = β0 + β1 D1i + β2 X1i + β3 (D1i × X1i) + ui

- Test whether the two lines are the same (in other words, does the interaction matter?): compute the F-statistic testing the joint hypothesis that β1 = β3 = 0 (Chow test).
- Test whether the two lines have the same slope: compute the t-statistic testing the simple hypothesis that β3 = 0.
- Test whether the two lines have the same intercept: compute the t-statistic testing the simple hypothesis that β1 = 0.

Interactions between a dummy and a continuous variable [figure]

Interactions between two continuous variables [figure]

Wrapping up: Linear vs. Non-Linear Models

In econometrics, we initially favor linear models for their simplicity:
- Linear models assume a constant relationship between variables (e.g., a one-unit increase in X leads to a fixed change in Y).
- However, economic relationships are often complex and dynamic, and this assumption might not hold in reality.
- Non-linear models, in contrast, allow for more flexible and realistic relationships. They can capture intricate economic behaviors that linear models cannot.

In practice, numerous economic phenomena exhibit non-linear characteristics:
- For example, consider luxury goods. The demand for such goods may not increase linearly with income. As income rises, the demand might accelerate.
- Using linear models in such cases can lead to biased and inaccurate results because they fail to capture these nuanced relationships.
- Recognizing non-linear patterns in data is essential for accurately modeling and understanding real-world economic dynamics.

The Importance of Interactions

Interactions in econometrics refer to how the effect of one variable depends on the level of another:
- Non-linear interactions can be critical in economic analysis. They enable us to explore how the relationship between two variables changes as one variable varies.
- For instance, think about the interaction between interest rates and GDP. The impact of interest rates on investment may vary at different levels of GDP. In times of economic growth, a change in interest rates might have a stronger or weaker effect on investment.
- By incorporating non-linear interactions, we gain a more nuanced understanding of the interplay between variables, making our models more realistic and insightful.

→ Choosing the right functional form may not only improve your predictive power, it can also have large policy implications!

Material

I - Textbooks:
- Introduction to Econometrics, 4th Edition, Global Edition, by Stock and Watson, Chapter 8.
- Introductory Econometrics: A Modern Approach, 5th Edition, by Jeffrey Wooldridge, Chapters 6-9.
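As a closing sketch, the Chow-test logic for the fully interacted model can be illustrated by comparing restricted and unrestricted sums of squared residuals on simulated data (the coefficient values are made up so that the two groups genuinely follow different lines; 4.61 is the 1% critical value of F(2, ∞) used earlier in the lecture):

```python
import numpy as np

# Simulated data; the dummy shifts both the intercept and the slope.
rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(0, 10, n)
d1 = (rng.uniform(size=n) < 0.5).astype(float)
y = 1.0 + 0.8 * d1 + 2.0 * x1 + 0.5 * d1 * x1 + rng.normal(0, 1.0, n)

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on the columns of X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return float(np.sum((y - X @ beta) ** 2))

ones = np.ones(n)
ssr_u = ssr(np.column_stack([ones, d1, x1, d1 * x1]), y)  # fully interacted model
ssr_r = ssr(np.column_stack([ones, x1]), y)               # restricted: beta1 = beta3 = 0

q, k = 2, 3   # q = restrictions tested; k = regressors in the unrestricted model
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - 1))
print("F =", round(F, 1), "reject 'same line' at 1%:", F > 4.61)  # clearly rejects
```

This is the homoskedasticity-only form of the F-statistic; in applied work one would typically use the heteroskedasticity-robust version reported by regression software.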
