Applied Econometrics: Probit and Logit Regression, Lecture Handout 11, Autumn 2022 (PDF)
Elias Oikarinen, University of Oulu, Oulu Business School, 2022
Summary
This lecture handout (Autumn 2022) covers regression with a binary dependent variable: the linear probability model, probit and logit regression, their estimation by maximum likelihood and the interpretation of marginal effects, with an application to racial discrimination in mortgage lending and pointers to material for independent study.
Full Transcript
Applied Econometrics
Probit and Logit Regression
Lecture handout 11, Autumn 2022
D.Sc. (Econ.) Elias Oikarinen
Professor (Associate) of Economics
University of Oulu, Oulu Business School

This Handout
• Binary dependent variable: the dependent variable is not continuous, but takes the value of either 0 or 1
• Probit regression
• (*) Logit regression, to be studied independently
• In practice, logit and probit are very similar
• (*) Details of "marginal effects", also to be studied independently
• POLL: Are you familiar with probit or logit regressions?
• Gujarati, pp. 152-165 (note: although Gujarati relates probit & logit to cross-section data, they work for time series too)
• Brooks, pp. 559-588
• EViews, UG II, Chapter 29

Multinomial Regression Models
• There are many occasions where we may have to choose among several discrete alternatives; the values that the dependent variable may take are limited to certain integers: the dependent variable can take several (discrete) values (e.g. 0, 1, 2, 3, 4), i.e. it comprises multiple categories [e.g. surveys, where people respond poor (0), satisfactory (1), good (2), very good (3), excellent (4)]
• Such models are called multinomial regression models (MRM)
• Not handled in the lectures or exercises (or the assignment)
• Gujarati, pp. 166-179; Brooks, pp. 559-579

Chapter 11: Regression with a Binary Dependent Variable
(Slides © Pearson Education Limited 2015)

Outline
1. The Linear Probability Model
2. Probit and Logit Regression
3. Estimation and Inference in Probit and Logit
4. Application to Racial Discrimination in Mortgage Lending

Binary Dependent Variables: What's Different?
So far the dependent variable (Y) has been continuous. What if Y is binary?
• Y = get into college, or not; X = high school grades, SAT scores, demographic variables
• Y = person smokes, or not; X = cigarette tax rate, income, demographic variables
• Y = mortgage application is accepted, or not; X = race, income, house characteristics, marital status

Example: Mortgage Denial and Race – The Boston Fed HMDA Dataset
• Individual applications for single-family mortgages made in 1990 in the greater Boston area
• 2380 observations, collected under the Home Mortgage Disclosure Act (HMDA)
Variables
• Dependent variable: is the mortgage denied or accepted?
• Independent variables: income, wealth, employment status; other loan and property characteristics; race of applicant

Binary Dependent Variables and the Linear Probability Model (SW Section 11.1)
A natural starting point is the linear regression model with a single regressor:
Yi = β0 + β1Xi + ui
But:
• What does β1 mean when Y is binary? Is β1 = ΔY/ΔX?
• What does the line β0 + β1X mean when Y is binary?
• What does the predicted value Ŷ mean when Y is binary? For example, what does Ŷ = 0.26 mean?

The linear probability model, ctd.
In the linear probability model, the predicted value of Y is interpreted as the predicted probability that Y = 1, and β1 is the change in that predicted probability for a unit change in X. Here's the math:
Linear probability model: Yi = β0 + β1Xi + ui
When Y is binary, E(Y|X) = 1×Pr(Y=1|X) + 0×Pr(Y=0|X) = Pr(Y=1|X)
Under LS assumption #1, E(ui|Xi) = 0, so
E(Yi|Xi) = E(β0 + β1Xi + ui|Xi) = β0 + β1Xi,
so Pr(Y=1|X) = β0 + β1Xi
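The handout works in EViews; as a cross-check, here is a minimal Python sketch of the same LPM estimation using statsmodels. The data are simulated stand-ins (the names p_i_ratio and deny mirror the handout, but the numbers are not the HMDA data):

    import numpy as np
    import statsmodels.api as sm

    # Simulated stand-ins for the HMDA variables (illustration, not the real data)
    rng = np.random.default_rng(0)
    n = 2380
    p_i_ratio = rng.uniform(0.0, 1.0, n)
    deny = (rng.uniform(size=n) < np.clip(-0.08 + 0.60 * p_i_ratio, 0.0, 1.0)).astype(int)

    # Linear probability model: OLS of the 0/1 outcome on the regressor,
    # with heteroskedasticity-robust (HC1) standard errors as in the slides
    X = sm.add_constant(p_i_ratio)
    lpm = sm.OLS(deny, X).fit(cov_type="HC1")
    print(lpm.params)                 # slope = change in Pr(deny=1) per unit of P/I ratio
    print(lpm.predict([[1.0, 0.3]]))  # predicted Pr(deny = 1 | P/I ratio = 0.3)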
When Y is binary, the linear regression model Yi = β0 + β1Xi + ui is called the linear probability model because Pr(Y=1|X) = β0 + β1Xi.
• The predicted value is a probability:
– E(Y|X=x) = Pr(Y=1|X=x) = probability that Y = 1 given x
– Ŷ = the predicted probability that Yi = 1, given X
• β1 = change in probability that Y = 1 for a unit change in x:
β1 = [Pr(Y=1|X = x+Δx) − Pr(Y=1|X = x)] / Δx

Example: linear probability model, HMDA data
Mortgage denial & ratio of debt payments to income (P/I ratio) in a subset of the HMDA data set (n = 127)
[Figure: scatterplot of deny (0/1) against the P/I ratio, with the fitted linear probability line]

Linear probability model: full HMDA data set
EViews least-squares output (dependent variable DENY; sample 1 2380, included observations 2380; Huber-White-Hinkley (HC1) heteroskedasticity-consistent standard errors and covariance):

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -0.079910     0.031967     -2.499785     0.0125
P_I_RATIO     0.603535     0.098483      6.128343     0.0000

R-squared               0.039738    Mean dependent var      0.119748
Adjusted R-squared      0.039334    S.D. dependent var      0.324735
S.E. of regression      0.318284    Akaike info criterion   0.549096
Sum squared resid       240.9028    Schwarz criterion       0.553948
Log likelihood         -651.4237    Hannan-Quinn criter.    0.550862
F-statistic             98.40628    Durbin-Watson stat      1.461146
Prob(F-statistic)       0.000000    Wald F-statistic        37.55659
Prob(Wald F-statistic)  0.000000

Linear probability model: full HMDA data set, ctd.
deny = -0.080 + 0.604 × P/I ratio   (n = 2380)
        (0.032)  (0.098)
• What is the predicted value for P/I ratio = 0.3?
Pr(deny = 1 | P/I ratio = 0.3) = -0.080 + 0.604×0.3 = 0.101
• Calculating "effects": increase the P/I ratio from 0.3 to 0.4:
Pr(deny = 1 | P/I ratio = 0.4) = -0.080 + 0.604×0.4 = 0.161
The effect on the probability of denial of an increase in the P/I ratio from 0.3 to 0.4 is to increase the probability by 0.060, that is, by 6.0 percentage points.
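A quick arithmetic check of these predictions, using only the reported coefficients:

    # Predicted LPM probabilities from the reported coefficients
    b0, b1 = -0.080, 0.604
    p_03 = b0 + b1 * 0.3            # 0.101
    p_04 = b0 + b1 * 0.4            # 0.161
    print(p_03, p_04, p_04 - p_03)  # the 0.3 -> 0.4 increase raises Pr(deny=1) by ~0.060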
Linear probability model: HMDA data, ctd.
Next include black as a regressor. EViews least-squares output (dependent variable DENY; sample 1 2380, included observations 2380; Huber-White-Hinkley (HC1) heteroskedasticity-consistent standard errors and covariance):

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -0.090514     0.028600     -3.164856     0.0016
P_I_RATIO     0.559195     0.088666      6.306734     0.0000
BLACK         0.177428     0.024946      7.112417     0.0000

R-squared               0.076003    Mean dependent var      0.119748
Adjusted R-squared      0.075226    S.D. dependent var      0.324735
S.E. of regression      0.312282    Akaike info criterion   0.511438
Sum squared resid       231.8047    Schwarz criterion       0.518717
Log likelihood         -605.6108    Hannan-Quinn criter.    0.514087
F-statistic             97.76019    Durbin-Watson stat      1.517180
Prob(F-statistic)       0.000000    Wald F-statistic        49.38650
Prob(Wald F-statistic)  0.000000

The estimated equation:
deny = -0.091 + 0.559 × P/I ratio + 0.177 × black
        (0.029)  (0.089)            (0.025)
Predicted probability of denial:
• for a black applicant with P/I ratio = 0.3:
Pr(deny = 1) = -0.091 + 0.559×0.3 + 0.177×1 = 0.254
• for a white applicant with P/I ratio = 0.3:
Pr(deny = 1) = -0.091 + 0.559×0.3 + 0.177×0 = 0.077
• difference = 0.177 = 17.7 percentage points
• The coefficient on black is statistically significant (t ≈ 7.1, so well beyond the 5% level)
• Still plenty of room for omitted variable bias…

The linear probability model: Summary
• The linear probability model models Pr(Y=1|X) as a linear function of X
• Advantages:
– simple to estimate and to interpret
– inference is the same as for multiple regression (heteroskedasticity-robust standard errors are needed)
• Disadvantages:
– The LPM says that the change in the predicted probability for a given change in X is the same for all values of X, but that doesn't make sense. Think about the HMDA example…
– Also, LPM predicted probabilities can be < 0 or > 1 (see the sketch below)!
• These disadvantages can be solved by using a nonlinear probability model: probit and logit regression
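The sketch below makes both disadvantages concrete, plugging a range of P/I values into the LPM equation above and, for contrast, into the probit equation estimated later in the handout:

    import numpy as np
    from scipy.stats import norm

    # Fitted coefficients as reported in the handout:
    #   LPM:    deny-hat = -0.080 + 0.604 * (P/I ratio)
    #   probit: Pr(deny=1) = Phi(-2.19 + 2.97 * (P/I ratio))
    lpm_p = lambda x: -0.080 + 0.604 * x
    probit_p = lambda x: norm.cdf(-2.19 + 2.97 * x)

    for pi in [0.0, 0.5, 1.0, 2.0]:
        print(pi, round(lpm_p(pi), 3), round(probit_p(pi), 3))
    # The LPM "probability" is -0.080 at P/I = 0 and 1.128 at P/I = 2.0,
    # outside [0, 1]; the probit probabilities stay inside [0, 1] throughout.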
Probit and Logit Regression (SW Section 11.2)
The problem with the linear probability model is that it models the probability of Y=1 as being linear:
Pr(Y = 1|X) = β0 + β1X
Instead, we want:
I. Pr(Y = 1|X) to be increasing in X for β1 > 0, and
II. 0 ≤ Pr(Y = 1|X) ≤ 1 for all X
This requires using a nonlinear functional form for the probability. How about an "S-curve"…
[Figure: S-shaped curve of Pr(Y = 1|X) against X, bounded between 0 and 1]
The probit model satisfies these conditions:
I. Pr(Y = 1|X) is increasing in X for β1 > 0, and
II. 0 ≤ Pr(Y = 1|X) ≤ 1 for all X

Probit regression models the probability that Y=1 using the cumulative standard normal distribution function, Φ(z), evaluated at z = β0 + β1X. The probit regression model is
Pr(Y = 1|X) = Φ(β0 + β1X)
where Φ is the cumulative normal distribution function and z = β0 + β1X is the "z-value" or "z-index" of the probit model.
Example: Suppose β0 = -2, β1 = 3, X = 0.4, so
Pr(Y = 1|X=0.4) = Φ(-2 + 3×0.4) = Φ(-0.8)
Pr(Y = 1|X=0.4) = area under the standard normal density to the left of z = -0.8, which is
Pr(z ≤ -0.8) = 0.2119

Probit regression, ctd.
Why use the cumulative normal probability distribution?
• The "S-shape" gives us what we want:
– Pr(Y = 1|X) is increasing in X for β1 > 0
– 0 ≤ Pr(Y = 1|X) ≤ 1 for all X
• Easy to use – the probabilities are tabulated in the cumulative normal tables (and are also easily computed using regression software)
• Relatively straightforward interpretation:
– β0 + β1X = z-value
– β̂0 + β̂1X is the predicted z-value, given X
– β1 is the change in the z-value for a unit change in X

EViews Example: HMDA data
Dependent Variable: DENY
Method: ML - Binary Probit (Newton-Raphson / Marquardt steps)
Sample: 1 2380; included observations: 2380
Convergence achieved after 6 iterations
Coefficient covariance computed using the Huber-White method

Variable     Coefficient   Std. Error   z-Statistic   Prob.
C            -2.194159     0.164941    -13.30265      0.0000
P_I_RATIO     2.967908     0.465224      6.379524     0.0000

McFadden R-squared      0.046203    Mean dependent var      0.119748
S.D. dependent var      0.324735    S.E. of regression      0.316430
Akaike info criterion   0.700666    Sum squared resid       238.1049
Schwarz criterion       0.705519    Log likelihood         -831.7923
Hannan-Quinn criter.    0.702432    Deviance                1663.585
Restr. deviance         1744.171    Restr. log likelihood  -872.0853
LR statistic            80.58593    Avg. log likelihood    -0.349493
Prob(LR statistic)      0.000000    Total obs                   2380
Obs with Dep=0              2095    Obs with Dep=1               285

Pr(deny = 1|P/I ratio) = Φ(-2.19 + 2.97 × P/I ratio)
                            (0.16)  (0.47)

EViews Example: HMDA data, ctd.
Pr(deny = 1|P/I ratio) = Φ(-2.19 + 2.97 × P/I ratio)
• Positive coefficient: does this make sense? (POLL)
• Standard errors have the usual interpretation
• Predicted probabilities:
Pr(deny = 1|P/I ratio = 0.3) = Φ(-2.19 + 2.97×0.3) = Φ(-1.30) = 0.097
• Effect of a change in the P/I ratio from 0.3 to 0.4:
Pr(deny = 1|P/I ratio = 0.4) = Φ(-2.19 + 2.97×0.4) = Φ(-1.00) = 0.159
• The predicted probability of denial rises from 0.097 to 0.159

Probit regression with multiple regressors
Pr(Y = 1|X1, X2) = Φ(β0 + β1X1 + β2X2)
• Φ is the cumulative normal distribution function
• z = β0 + β1X1 + β2X2 is the "z-value" or "z-index" of the probit model
• β1 is the effect on the z-score of a unit change in X1, holding X2 constant

EViews Example: HMDA data
Dependent Variable: DENY
Method: ML - Binary Probit (Newton-Raphson / Marquardt steps)
Sample: 1 2380; included observations: 2380
Convergence achieved after 6 iterations
Coefficient covariance computed using the Huber-White method

Variable     Coefficient   Std. Error   z-Statistic   Prob.
C            -2.258738     0.158788    -14.22489      0.0000
P_I_RATIO     2.741637     0.444081      6.173725     0.0000
BLACK         0.708158     0.083171      8.514527     0.0000

McFadden R-squared      0.085943    Mean dependent var      0.119748
S.D. dependent var      0.324735    S.E. of regression      0.310427
Akaike info criterion   0.672383    Sum squared resid       229.0596
Schwarz criterion       0.679662    Log likelihood         -797.1360
Hannan-Quinn criter.    0.675033    Deviance                1594.272
Restr. deviance         1744.171    Restr. log likelihood  -872.0853
LR statistic            149.8985    Avg. log likelihood    -0.334931
Prob(LR statistic)      0.000000    Total obs                   2380
Obs with Dep=0              2095    Obs with Dep=1               285

We'll go through the estimation details later…

EViews Example, ctd.: Predicted probit probabilities
Probability of denial = 1-@cnorm(-(C(1) + C(2)*p_i_ratio + C(3)*black))
EViews command:
scalar prob_deny = 1-@cnorm(-(C(1) + C(2)*0.3 + C(3)*0))
Predicted probability for p_i_ratio = 0.3, black = 0: 0.075425
NOTE:
• C(1) is the estimated intercept (-2.258738)
• C(2) is the coefficient on p_i_ratio (2.741637)
• C(3) is the coefficient on black (0.708158)
• scalar creates a new scalar which is the result of a calculation

EViews Example, ctd.
Pr(deny = 1|P/I, black) = Φ(-2.26 + 2.74 × P/I ratio + 0.71 × black)
                             (0.16)  (0.44)             (0.08)
• Is the coefficient on black statistically significant? (POLL)
• Estimated effect of race for P/I ratio = 0.3:
Pr(deny = 1|0.3, 1) = Φ(-2.26 + 2.74×0.3 + 0.71×1) = 0.233
Pr(deny = 1|0.3, 0) = Φ(-2.26 + 2.74×0.3 + 0.71×0) = 0.075
• Difference in rejection probabilities = 0.158 (15.8 percentage points)
• Still plenty of room for omitted variable bias!
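A Python analogue of the EViews ML probit estimation above, again on simulated data (the true coefficients are set near the reported estimates, so this illustrates the mechanics rather than replicating the HMDA results):

    import numpy as np
    from scipy.stats import norm
    import statsmodels.api as sm

    # Simulated stand-ins for the HMDA variables (illustration, not the real data)
    rng = np.random.default_rng(0)
    n = 2380
    p_i_ratio = rng.uniform(0.0, 1.0, n)
    black = (rng.uniform(size=n) < 0.14).astype(float)
    index = -2.26 + 2.74 * p_i_ratio + 0.71 * black   # true index, near the reported estimates
    deny = (rng.uniform(size=n) < norm.cdf(index)).astype(int)

    # ML probit estimation (the analogue of EViews' "ML - Binary Probit")
    X = sm.add_constant(np.column_stack([p_i_ratio, black]))
    fit = sm.Probit(deny, X).fit()
    print(fit.params)  # should come out close to (-2.26, 2.74, 0.71)

    # Predicted denial probabilities at P/I ratio = 0.3 for black = 1 and black = 0
    print(fit.predict(np.array([[1.0, 0.3, 1.0], [1.0, 0.3, 0.0]])))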
*Logit Regression
Logit regression models the probability of Y=1, given X, as the cumulative standard logistic distribution function, evaluated at z = β0 + β1X:
Pr(Y = 1|X) = F(β0 + β1X)
where F is the cumulative logistic distribution function:
F(β0 + β1X) = 1 / (1 + e^-(β0 + β1X))
Because logit and probit use different probability functions, the coefficients (β's) are different in logit and probit.

*Logit regression, ctd.
Example: β0 = -3, β1 = 2, X = 0.4, so β0 + β1X = -3 + 2×0.4 = -2.2, so
Pr(Y = 1|X=0.4) = 1/(1 + e^-(-2.2)) = 1/(1 + e^2.2) = 0.0998
Why bother with logit if we have probit?
• The main reason is historical: logit is computationally faster and easier, but that doesn't matter nowadays
• In practice, logit and probit are very similar – since empirical results typically don't hinge on the logit/probit choice, both tend to be used in practice

*EViews Example: HMDA data
Dependent Variable: DENY
Method: ML - Binary Logit (Newton-Raphson / Marquardt steps)
Sample: 1 2380; included observations: 2328
Convergence achieved after 7 iterations
Coefficient covariance computed using the Huber-White method

Variable     Coefficient   Std. Error   z-Statistic   Prob.
C            -4.127323     0.348375    -11.84735      0.0000
P_I_RATIO     5.373453     0.967129      5.556087     0.0000
BLACK         1.273425     0.146449      8.695329     0.0000

McFadden R-squared      0.088484    Mean dependent var      0.120704
S.D. dependent var      0.325854    S.E. of regression      0.311031
Akaike info criterion   0.674048    Sum squared resid       224.9206
Schwarz criterion       0.681461    Log likelihood         -781.5917
Hannan-Quinn criter.    0.676749    Deviance                1563.183
Restr. deviance         1714.927    Restr. log likelihood  -857.4635
LR statistic            151.7436    Avg. log likelihood    -0.335735
Prob(LR statistic)      0.000000    Total obs                   2328
Obs with Dep=0              2047    Obs with Dep=1               281

*EViews Example, ctd.: Predicted logit probabilities
Probability of denial = 1-@clogistic(-(C(1) + C(2)*p_i_ratio + C(3)*black))
EViews command:
scalar pr_deny_logit = 1-@clogistic(-(C(1) + C(2)*0.3 + C(3)*0))
Predicted probability for p_i_ratio = 0.3, black = 0:
1/(1+exp(-(c(1) + c(2)*0.3 + c(3)*0))) = 0.07479
NOTE: the corresponding probit predicted probability is 0.075425

*The predicted probabilities from the probit and logit models are very close in these HMDA regressions.
[Figure: probit and logit predicted probabilities of denial plotted against the P/I ratio; the two curves nearly coincide]
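A quick check of this comparison, plugging the reported HMDA coefficients into the two link functions:

    import numpy as np
    from scipy.stats import norm

    # Reported HMDA estimates (probit and logit; regressors: P/I ratio and black)
    def probit_p(pi, black):
        return norm.cdf(-2.258738 + 2.741637 * pi + 0.708158 * black)

    def logit_p(pi, black):
        z = -4.127323 + 5.373453 * pi + 1.273425 * black
        return 1.0 / (1.0 + np.exp(-z))

    print(probit_p(0.3, 0))  # ~0.0754, matching the @cnorm calculation above
    print(logit_p(0.3, 0))   # ~0.0748, matching the @clogistic calculation above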
Parameter Interpretation for Probit and Logit Models
• Standard errors and t-ratios are calculated automatically by the econometric software package used, and hypothesis tests can be conducted in the usual fashion. However, interpretation of the coefficients needs slight care.
• In the LPM, the slope coefficient measures the marginal effect of a unit change in the explanatory variable on the probability of the outcome, holding other variables constant.
• In the logit and probit models, the marginal effect of a unit change in the explanatory variable depends not only on the coefficient of that variable but also on the level of probability from which the change is measured. The latter depends on the values of all the explanatory variables in the model. (Compare the S-curve figure and the probit/logit comparison figure above.)
• For logit or probit models, the form of the function is not Pi = β1 + β2x2i + ui, for example, but rather Pi = F(β1 + β2x2i), where F represents the (non-linear) logistic or cumulative normal function.
• To obtain the required relationship between changes in x2i and Pi, we differentiate F with respect to x2i; this derivative is β2 f(β1 + β2x2i), where f is the corresponding (logistic or normal) density. So in fact, a 1-unit increase in x2i will cause a β2 f(β1 + β2x2i) increase in probability.
• Usually, these impacts of incremental changes in an explanatory variable are evaluated by setting each explanatory variable to its mean value. These estimates are sometimes known as the marginal effects.
• For logit/probit models, the marginal effect is given by the coefficient of the variable multiplied by the value of the logistic/normal density function evaluated at the X values for that individual.

*Marginal effects
• EViews allows you to compute either the fitted probability, P̂i = 1 − F(−x′iβ̂), or the fitted value of the index x′iβ̂. From the equation toolbar select Proc/Forecast (Fitted Probability/Index)…, and then click on the desired entry.
• You can use the fitted index in a variety of ways, for example to compute the marginal effects of the explanatory variables.
• Simply forecast the fitted index and save the results in a series, say xb. The auto-series @dnorm(-xb) or @dlogistic(-xb) may then be multiplied by the coefficients of interest to provide an estimate of the derivative of the expected value of yi with respect to the j-th variable in xi:
∂E(yi|xi; β)/∂xij = f(−x′iβ)βj

*EViews Example: HMDA data / Probit
1. Name the estimated probit equation (here: the specification with P/I ratio and black) eq02.
2. In the estimation result window, select Proc/Forecast, choose Index, and set Forecast name: deny_index. This creates a series with the value of the index x′iβ̂ for each person in the forecast sample.
3. Compute the mean of the series (put it into a scalar):
scalar meanxb_eq02 = @mean(deny_index)
4. Compute the marginal-effects weighting term (put it into a scalar):
scalar w_eq02 = @dnorm(-meanxb_eq02)
5. Compute the marginal effects for each coefficient (put them into a vector):
vector meffects_eq02 = w_eq02*eq02.@coefs

*EViews Example: HMDA data / Probit
• The marginal effect for P/I ratio is 0.50 – if the P/I ratio increases by 0.1 (e.g. from 0.3 to 0.4), the probability of denial increases on average by 0.05, i.e. by 5 percentage points
• The marginal effect for black is 0.13 – the probability of denial is on average 13 percentage points higher for black applicants
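The same calculation in Python: marginal effects at the means are the coefficients times the normal density evaluated at the mean index. The mean regressor values below are assumptions for illustration (chosen near the HMDA sample means), not figures from the handout:

    import numpy as np
    from scipy.stats import norm

    # Marginal effects at the means from the reported probit coefficients
    beta = np.array([-2.258738, 2.741637, 0.708158])  # const, P/I ratio, black (reported)
    xbar = np.array([1.0, 0.331, 0.142])              # assumed sample means (illustrative)
    w = norm.pdf(xbar @ beta)   # the @dnorm weighting term; the density is symmetric,
                                # so f(-xb) = f(xb)
    print(w * beta[1:])         # ~[0.50, 0.13], as on the slide

With a model estimated in statsmodels, results.get_margeff(at='mean') returns these marginal effects directly.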
*Parameter Estimation for Probit and Logit Models
• Given that both logit and probit are non-linear models, they cannot be estimated by OLS
• Maximum likelihood (ML) is used in practice
• The principle is that the parameters are chosen to jointly maximise a log-likelihood function (LLF)
• The form of this LLF depends on whether it is the logit or the probit model
• While t-test statistics are constructed in the usual way, the standard error formulae used following ML estimation are valid asymptotically only
• Consequently, it is common to use the critical values from a normal distribution rather than a t-distribution, with the implicit assumption that the sample size is sufficiently large
• For the logit model, assuming that each observation on yi is independent, the joint likelihood will be the product of all N marginal likelihoods
• Let L(θ | x2i, x3i, …, xki; i = 1, …, N) denote the likelihood function of the set of parameters (β1, β2, …, βk) given the data

*The Likelihood Function for Probit and Logit Models
• With Pi = F(x′iβ), the likelihood function is given by
L(β) = Π_{i=1}^{N} Pi^{yi} (1 − Pi)^{(1−yi)}
• It is computationally much simpler to maximise an additive function of a set of variables than a multiplicative function
• We thus take the natural logarithm of this equation, and so the log-likelihood function
LLF = Σ_{i=1}^{N} [yi ln(Pi) + (1 − yi) ln(1 − Pi)]
is maximised

Measures of Fit for Logit and Probit
The R² and adjusted R² don't make sense here. So, two other specialized measures are used:
1. The fraction correctly predicted = the fraction of Y's for which the predicted probability is > 50% when Yi = 1, or is < 50% when Yi = 0.
2. The pseudo-R² measures the improvement in the value of the log likelihood, relative to having no X's. The pseudo-R² simplifies to the R² in the linear model with normally distributed errors.
EViews goodness-of-fit measures: UG II, pp. 342-344

EViews Example, ctd.
View / Expectation-Prediction Evaluation; success if probability is greater than 0.12 (i.e. the unconditional probability of denial, given as the cutoff):

Expectation-Prediction Evaluation for Binary Specification
Equation: PROBIT_MULTIPLE
Success cutoff: C = 0.12

                     Estimated Equation           Constant Probability
                  Dep=0    Dep=1    Total       Dep=0    Dep=1    Total
P(Dep=1)<=C        1676      123     1799        2095      285     2380
P(Dep=1)>C          419      162      581           0        0        0
Total              2095      285     2380        2095      285     2380
Correct            1676      162     1838        2095        0     2095
% Correct         80.00    56.84    77.23      100.00     0.00    88.03
% Incorrect       20.00    43.16    22.77        0.00   100.00    11.97
Total Gain*      -20.00    56.84   -10.80
Percent Gain**       NA    56.84   -90.18

On the interpretation of this table, see EViews UG II, pp. 340-342. Overall, this could be considered a reasonable set of (in-sample) predictions, with 77.23% of the total predictions correct (80% of the acceptances correctly predicted as acceptances, and 56.8% of the denials correctly predicted as denials).
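To make the ML estimation and fit measures concrete, here is a self-contained sketch that maximises the logit log-likelihood numerically (as ML routines do internally) and then computes McFadden's pseudo-R² and the fraction correctly predicted; the data are simulated:

    import numpy as np
    from scipy.optimize import minimize

    # Simulated binary data (illustration only)
    rng = np.random.default_rng(1)
    n = 2380
    X = np.column_stack([np.ones(n), rng.uniform(0.0, 1.0, n)])
    p_true = 1.0 / (1.0 + np.exp(-(X @ np.array([-2.0, 3.0]))))
    y = (rng.uniform(size=n) < p_true).astype(float)

    def neg_llf(beta):
        # Logit LLF: sum of y*ln(P) + (1-y)*ln(1-P), with P = F(x'beta)
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        p = np.clip(p, 1e-12, 1.0 - 1e-12)  # guard against log(0)
        return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

    res = minimize(neg_llf, x0=np.zeros(X.shape[1]), method="BFGS")
    beta_hat = res.x

    # McFadden pseudo-R^2: 1 - LLF(model) / LLF(intercept only)
    ybar = y.mean()
    llf_model = -res.fun
    llf_null = n * (ybar * np.log(ybar) + (1.0 - ybar) * np.log(1.0 - ybar))
    pseudo_r2 = 1.0 - llf_model / llf_null

    # Fraction correctly predicted at the 50% cutoff
    p_hat = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
    frac_correct = np.mean((p_hat > 0.5) == (y == 1.0))
    print(beta_hat, pseudo_r2, frac_correct)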
*Application to the Boston HMDA Data (SW Section 11.4)
• Mortgages (home loans) are an essential part of buying a home.
• Is there differential access to home loans by race?
• If two otherwise identical individuals, one white and one black, applied for a home loan, is there a difference in the probability of denial?

*The HMDA Data Set
• Data on individual characteristics, property characteristics, and loan denial/acceptance
• The mortgage application process circa 1990-1991:
– Go to a bank or mortgage company
– Fill out an application (personal + financial info)
– Meet with the loan officer
• Then the loan officer decides – by law, in a race-blind way. Presumably, the bank wants to make profitable loans, and (if the incentives inside the bank or loan origination office are right – a big if during the mid-2000s housing bubble!) the loan officer doesn't want to originate defaults.

The Loan Officer's Decision
• The loan officer uses key financial variables:
– P/I ratio
– housing expense-to-income ratio
– loan-to-value ratio
– personal credit history
• The decision rule is nonlinear:
– medium loan-to-value ratio (between 80% and 95%)
– high loan-to-value ratio (> 95%; what happens in default?)
– credit score

Regression Specifications
Pr(deny = 1|black, other X's) = …
• linear probability model
• probit
The main problem with the regressions so far: potential omitted variable bias. The following variables (i) enter the loan officer's decision and (ii) are or could be correlated with race:
• wealth, type of employment
• credit history
• family status
Fortunately, the HMDA data set is very rich…

[Table 11.1: HMDA regression results (LPM, probit and logit specifications with the full set of controls); shown over several pages in the original slides]

Summary of Empirical Results
• Coefficients on the financial variables make sense.
• Black is statistically significant in all specifications (except the last one, which includes two interaction variables for black).
• Race-financial variable interactions aren't significant.
• Including the covariates sharply reduces the effect of race on the denial probability.
• LPM, probit and logit give similar estimates of the effect of race on the probability of denial.
• The estimated effects are large in a "real world" sense.

Remaining Threats to Internal and External Validity
Internal validity:
1. Omitted variable bias?
2. Wrong functional form?
3. Errors-in-variables bias?
4. Sample selection bias?
5. Simultaneous causality bias?
What do you think?
External validity: these data are from Boston in 1990-91. Do you think the results also apply today, where you live?

*Conclusion (SW Section 11.5)
• If Yi is binary, then E(Y|X) = Pr(Y=1|X)
• Three models:
– linear probability model (linear multiple regression)
– probit (cumulative standard normal distribution)
– logit (cumulative standard logistic distribution)
• LPM, probit and logit all produce predicted probabilities
• The effect of ΔX is the change in the conditional probability that Y=1; for logit and probit, this depends on the initial X
• Probit and logit are estimated via maximum likelihood
– Coefficients are normally distributed for large n
– Large-n hypothesis testing and confidence intervals are as usual