Questions and Answers
What does the validity of estimates in econometrics depend on?
- The number of observations in the dataset
- The underlying assumptions related to the estimator (correct)
- The complexity of the model used
- The presence of measurement errors in the data
Which assumption states that the expected value of the error term is zero?
- A4: Independence of Errors
- A3: Constant Variance
- A1: Linearity
- A2: Mean Zero (correct)
What does Assumption A3 concerning errors imply?
- The sum of errors must equal zero
- The variance of errors must be constant across observations (correct)
- Errors must be normally distributed
- Errors must be correlated with the independent variable
Which of the following is true about Assumption A5?
Assumption A6 is necessary for which aspect of econometric modeling?
What is the consequence if assumptions do not hold in the data-collection process?
In the Classical Linear Regression Model, what do α and β represent?
Which of the following reflects randomness and unobserved factors in the Classical Linear Regression Model?
What is the primary estimation required for conducting a Wald test?
What does a larger decrease in the likelihood function when imposing restrictions indicate?
What is the form of the test statistic for the likelihood ratio test?
In limited dependent variable models, what estimation is used when the dependent variable is binary?
When dealing with a count model, what type of regression can be used for estimation?
What characteristic defines a qualitative response model?
What is typically necessary for maximizing the likelihood function in MLE?
Which of the following correctly describes the conditions for a censored model?
What is the significance of taking the natural logarithm of the likelihood in maximum likelihood estimation?
Which statement accurately describes the role of θ̂ML in maximum likelihood estimation?
Which equation is crucial in identifying the maximum likelihood estimator for θ?
What form does the probability Pr(yi |xi , θ) take in the context of the example provided?
What does the likelihood function lead to when it is maximized in this context?
Which of the following properties is NOT associated with maximum likelihood estimators?
In the context of the provided equations, what does the εi term represent?
What role does the parameter σ² play in the likelihood equation?
What is a characteristic of an ordered QR model?
In which of the following situations is a binary model applicable?
What is a limitation of the Linear Probability Model (LPM)?
Which functional form models the probability of a binary outcome using a proper bounded approach?
Which of the following correctly describes a feature of the Probit model?
What does the Logit model estimate using the logistic function?
What is one significant limitation of using an Ordinary Least Squares (OLS) method with LPM?
How does a cumulative distribution function (CDF) function in binary models?
What is typically the format of the dependent variable in qualitative response models?
In the multinomial logit model, what is the significance of the parameters βj?
What does the functional form of F in the multinomial logit model describe?
Which of the following methods is considered more complex than the multinomial logit model?
In the context of multinomial logit, what can the log odds ratio be used to compare?
What type of response models analyze choices among ordered alternatives?
Which of the following is an example of a situation where qualitative response models would be applicable?
What does the marginal effect represent in a probit or logit model?
Which of the following is NOT a common way to report marginal effects?
In the latent variable framework, what does $y_i^*$ represent?
What is the correct interpretation of $y_i$ in the model $y_i = 1$ if $y_i^* > 0$ and $y_i = 0$ if $y_i^* ≤ 0$?
Which of the following is true about marginal effects in regression analysis?
In the context of probit/logit models, what does MLE stand for?
What is the purpose of the error term $ε_i$ in the model $y_i^* = x_i β + ε_i$?
Flashcards
Classical Linear Regression Model (CLRM)
A statistical model that describes the relationship between a dependent variable and one or more independent variables, assuming a linear form and a certain set of properties for the error term.
Assumptions of CLRM
A set of conditions that must be met for the OLS (Ordinary Least Squares) estimators to be unbiased, consistent, and efficient.
Linearity Assumption (A1)
The true relationship between the dependent variable (y) and the independent variable(s) (x) is linear, meaning it can be described by a straight line in a graph.
Zero Conditional Mean (A2)
Homoscedasticity (A3)
No Autocorrelation (A4)
Exogeneity (A5)
Normality Assumption (A6)
Maximum Likelihood Estimation (MLE)
Likelihood function (L(θ))
MLE estimate (θ̂ML)
Normal PDF
Sum of Squared Residuals
OLS
Properties of MLE
iid
CLRM
εi
Wald Test
Likelihood Ratio Test
LR Statistic
χ² Distribution
Limited Dependent Variable Models
Binary Model
Censored Model
Count Model
Qualitative Response Model
Binary Dependent Variable
Linear Probability Model (LPM)
Problem with LPM
Probit Model
Logit Model
Ordered Choice Models
Ordered Logit / Ordered Probit
Qualitative Response Models
Multinomial Logit
Multinomial Logit Probability (Pij)
Log Odds Ratio (Pij/Pi0)
Alternative Estimation Method
Ordered Response Models
Marginal Effect
Probit
Logit
Latent Variable
Marginal Effect Calculation
Marginal effect reported at sample mean
STATA commands
Probit/Logit Model
Study Notes
Course Information
- Course Title: Advanced Econometrics
- Course Code: ADEC-3070
- Instructor: Dr. Manini Ojha
- Semester: Fall, 2024
- JSGP Elective
Lecture Design
- Lectures aim to expose students to various models rather than focusing on specific models in detail.
- Consistent estimation is essential, and no estimator is guaranteed to be consistent: consistency holds only when the estimator's assumptions do.
- All estimators rely on assumptions, and the validity of the estimates depends on the validity of these assumptions.
- Assumptions can change due to data collection processes (e.g., measurement error, sample selection).
Recap: Ordinary Least Squares (OLS)
- Classical Linear Regression Model (CLRM) assumes a true relationship: $y_i = \alpha + \beta x_i + \varepsilon_i$, where $i = 1, \dots, N$.
- $\alpha$, $\beta$ are population parameters.
- $\hat{\alpha}$, $\hat{\beta}$ are parameter estimates.
- $\varepsilon_i$ is the idiosyncratic error (randomness, unobserved factors).
- Estimated residual = $\hat{\varepsilon}_i$.
- Exercise: State the assumptions!
Assumptions of CLRM
- (A1) Linearity: The relationship is linear in parameters.
- (A2) Mean Zero Error: $E[\varepsilon_i] = 0$. This implies $E[y_i \mid x_i] = \alpha + \beta x_i$.
- (A3) Homoscedasticity: $\mathrm{Var}(\varepsilon_i) = \sigma^2$. The variance is constant across observations.
- (A4) No Autocorrelation: $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$ (no covariance between errors).
- (A5) Exogeneity: $E[\varepsilon_i x_i] = 0$. $x$ is uncorrelated with the error.
- (A6) Normality: $\varepsilon_i \sim N(0, \sigma^2)$. Errors are normally distributed. (Not required for unbiasedness or consistency; needed for inference.)
OLS Estimation
- Given a random sample, OLS minimizes the sum of squared residuals.
- The solution implies formulas for $\hat{\alpha}$ and $\hat{\beta}$.
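The closed-form solution for the single-regressor case can be sketched as follows. This is a minimal illustration in plain Python, not code from the course: the function name `ols_simple` and the data are assumptions made for the example.

```python
def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # beta_hat: sum of cross-deviations of (x, y) over sum of squared deviations of x
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    # alpha_hat makes the fitted line pass through the point of means
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data generated exactly by y = 1 + 2x, so every residual is zero.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
alpha_hat, beta_hat = ols_simple(x, y)
residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]
ssr = sum(e ** 2 for e in residuals)
print(alpha_hat, beta_hat, ssr)
```

With noisy data the residuals would not be zero, but the same two formulas still give the line with the smallest sum of squared residuals.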
Properties of OLS
- $\hat{\alpha}$ and $\hat{\beta}$ are unbiased (finite sample property) and consistent (asymptotic property).
- $\hat{\alpha}$ and $\hat{\beta}$ are efficient (smallest variance of any linear, unbiased estimator).
Consistency (Asymptotics)
- Unbiasedness is not always attainable.
- Consistency is a fundamental requirement for any estimator (essential for reliable results).
Multiple Regression Model
- True relationship: $y_i = x_i\beta + \varepsilon_i$, where $K$ = number of independent variables.
- Stacking observations: $y = X\beta + \varepsilon$, where $y$ is $N \times 1$; $X$ is $N \times (K + 1)$; $\beta$ is $(K + 1) \times 1$; $\varepsilon$ is $N \times 1$; $X\beta$ is $N \times 1$.
- Assumptions:
- $E[x_{ik}\varepsilon_i] = 0$ for all $k$
- x's are linearly independent (no perfect multicollinearity)
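In stacked form the OLS solution has a compact expression. The formula below is the standard textbook result, added here as a supplement and not quoted from the lecture. It writes $X$ for the stacked $N \times (K + 1)$ regressor matrix.

```latex
\hat{\beta} = \arg\min_{b} \, (y - Xb)'(y - Xb) = (X'X)^{-1} X'y
```

The inverse exists only if the columns of $X$ are linearly independent, which is why perfect multicollinearity has to be ruled out.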
Maximum Likelihood Estimation (MLE)
- Alternative to OLS, especially useful for nonlinear models.
- Equivalent to OLS in the classical linear regression model.
- Likelihood function L(θ) captures the probability of observing the realized data.
Intuition of MLE
- Outcome variable depends on parameters (e.g. θ).
- Estimation aims to maximize the probability of observing the given data.
- MLE identifies parameter values that are most likely to have generated the observed data.
MLE Likelihood Function
- The likelihood function gives the total probability of observing realized data as a function of parameters.
- Joint density is the likelihood function.
- To maximize the likelihood function, a commonly used approach is to maximize the log-likelihood function.
Example CLRM and Probability
- $y_i = x_i\beta + \varepsilon_i$, and $\varepsilon_i \sim N(0, \sigma^2)$.
- The probability equation is given by the normal PDF.
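Written out, the normal PDF and the resulting log-likelihood take the form below. This is the standard derivation and assumes iid errors. The grouping $\theta = (\beta, \sigma^2)$ is notation chosen for this sketch.

```latex
\Pr(y_i \mid x_i, \theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(y_i - x_i\beta)^2}{2\sigma^2} \right), \qquad \theta = (\beta, \sigma^2)

\ln L(\theta) = \sum_{i=1}^{N} \ln \Pr(y_i \mid x_i, \theta)
             = -\frac{N}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - x_i\beta)^2
```

For any fixed $\sigma^2$, maximizing $\ln L$ over $\beta$ is the same as minimizing the sum of squared residuals, which is why MLE and OLS give the same $\hat{\beta}$ in the classical linear regression model.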
Properties of MLE
- Consistent ($\operatorname{plim}\, \hat{\theta}_{ML} = \theta$).
- Asymptotically normal
- Asymptotically efficient.
Hypothesis testing - MLE
- Wald tests are equivalent to F-tests in OLS and only involve the unrestricted model.
- Likelihood ratio (LR) tests involve estimating both the restricted and the unrestricted model. Intuition: imposing restrictions (e.g., dropping variables) cannot raise the maximized likelihood and typically lowers it; a larger decrease indicates that the restriction is likely invalid.
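The LR comparison can be illustrated with a small sketch. It uses the standard form of the statistic, LR = 2(ln L_U − ln L_R), which is asymptotically χ² with degrees of freedom equal to the number of restrictions. The log-likelihood values are hypothetical, and the p-value helper covers only the single-restriction case, where the χ²(1) tail probability has a closed form via `math.erfc`.

```python
import math


def lr_statistic(loglik_unrestricted, loglik_restricted):
    """LR = 2 * (ln L_U - ln L_R); non-negative when the models are nested."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_sf_one_df(stat):
    """Upper-tail probability of a chi-square(1) variable (one restriction only)."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical maximized log-likelihoods from two nested models.
loglik_u = -100.0   # unrestricted model
loglik_r = -103.0   # restricted model (one restriction imposed)

lr = lr_statistic(loglik_u, loglik_r)
p_value = chi2_sf_one_df(lr)
reject_at_5pct = lr > 3.841   # 5% critical value of chi-square(1)
print(lr, p_value, reject_at_5pct)
```

Here the drop in the log-likelihood is large enough that the single restriction would be rejected at the 5% level.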
Limited Dependent Variable Models (LDV)
- Models which have a dependent variable that is not continuous include:
- Binary models (e.g., {0, 1}): probit, logit, LPM (labor force participation, loan default).
- Censored models (e.g., [a, b]): censored regression, tobit (income, wealth).
- Count models (e.g., {0, 1, 2, ...}): Poisson, negative binomial (e.g., number of children).
- Qualitative response (QR) models and ordered QR/ordinal models (e.g., brand choice, mode of transportation, schooling level, bond ratings).
Linear Probability Model (LPM)
- Estimated by OLS; its predicted probabilities are not constrained to lie between 0 and 1.
Probit/Logit model
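- These are functional forms that use the cumulative distribution function (CDF) of a probability distribution: the standard normal (probit) or the logistic (logit).
- The coefficients $\beta$ are interpreted through marginal effects, i.e., the change in $\Pr(y_i = 1 \mid x_i)$ from a change in a regressor, and are not read directly as in a linear model.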
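A short sketch of the two CDFs and the associated marginal effects follows. It relies on the standard result that, for a continuous regressor, the marginal effect is the density evaluated at the index times the coefficient. The index value and coefficient are hypothetical numbers chosen for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF (probit link)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def logistic_cdf(z):
    """Logistic CDF (logit link)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """Logistic density, equal to F(z) * (1 - F(z))."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


def marginal_effect(pdf, index, beta_k):
    """d Pr(y = 1 | x) / d x_k = f(x * beta) * beta_k for a continuous regressor."""
    return pdf(index) * beta_k


index = 0.0    # hypothetical value of x_i * beta, e.g. evaluated at the sample mean
beta_k = 0.5   # hypothetical coefficient on regressor k

probit_me = marginal_effect(normal_pdf, index, beta_k)
logit_me = marginal_effect(logistic_pdf, index, beta_k)
print(normal_cdf(index), logistic_cdf(index), probit_me, logit_me)
```

Both CDFs stay strictly between 0 and 1 for any index value, which is the property the linear probability model lacks. The marginal effect depends on where the index is evaluated, which is why it is often reported at the sample mean.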
Latent variable framework
- Probit and logit models can be expressed using a latent variable framework.
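- The observed outcome is an indicator of an unobserved latent variable: with $y_i^* = x_i\beta + \varepsilon_i$, $y_i = 1$ if $y_i^* > 0$ and $y_i = 0$ if $y_i^* \le 0$. Related models use similar indicator functions to produce binary or categorical outcomes.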
STATA commands for LDV
- -probit-, -logit-, -dprobit-, -margin-, -mfx-
Censored Regression Models
- Applicable to situations where the dependent variable is censored (potentially from above and below)
- Examples: income/wealth top coding, age at first birth
- Latent framework approach
- Estimation via MLE
- Tobit model ($a_i = 0$, $b_i = \infty$)
Count Models
- Applicable to situations where the dependent variable is a non-negative integer count.
- Examples: number of children, number of patents held by a firm, number of doctor visits, etc.
Poisson Count Model Setup
- Models the expected count of events conditional on covariates. Estimated via MLE under the assumption that the count follows a Poisson distribution, which is fully characterized by its mean.
- Alternative models are negative binomial models.
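The Poisson likelihood can be sketched for the simplest case, an intercept-only model in which every observation shares the same mean. In that case the MLE of the mean is the sample average. The data below are hypothetical. With covariates, a common specification (an assumption here, not taken from the notes) lets the mean be exp(x_i β).

```python
import math


def poisson_log_pmf(y, lam):
    """ln Pr(Y = y) for a Poisson variable with mean lam > 0."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)


def poisson_loglik(counts, lam):
    """Log-likelihood of iid counts sharing a common mean lam."""
    return sum(poisson_log_pmf(y, lam) for y in counts)


counts = [0, 1, 1, 2, 3, 5]           # hypothetical count data
lam_hat = sum(counts) / len(counts)   # MLE of the mean in the intercept-only case

# The log-likelihood is lower at values on either side of the sample mean.
ll_at_mle = poisson_loglik(counts, lam_hat)
ll_below = poisson_loglik(counts, lam_hat - 0.5)
ll_above = poisson_loglik(counts, lam_hat + 0.5)
print(lam_hat, ll_at_mle, ll_below, ll_above)
```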
Zero-inflated Poisson Models (ZIP)
- Models count data with a mass at zero, where the factors that determine a zero versus a positive outcome may differ from those that determine the size of positive counts.
Application: Talley et al. (2005)
- Analyzing determinants of crew injuries in ship accidents (use of count data, mixture of discrete and continuous data over years).
Qualitative Response Model
- Applicable for analyzing choices among a set of alternatives that are unordered (e.g., choice of brand, mode of transport). Typical models are multinomial logit and multinomial probit.
- The dependent variable is typically coded with integers corresponding to alternative choices.
Multinomial Logit Setup
- Used to model the probability of a particular choice among a set of J+1 alternatives, in the presence of several covariates or other variables which may impact the choice. Uses an exponential functional structure.
Properties of Multinomial Logit Setup
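- The $\beta_j$ are choice-specific: each describes how the covariates shift the log odds of choice $j$ relative to the base choice, $\ln(P_{ij}/P_{i0})$. A coefficient is the change in these log odds from a one-unit change in a covariate, which for small values approximates the proportional change in the odds.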
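The exponential structure and the log-odds interpretation can be sketched as follows. The sketch assumes the usual normalization in which the base alternative 0 has a zero coefficient vector. The function name, covariates and coefficients are hypothetical.

```python
import math


def mnl_probabilities(x, betas):
    """Choice probabilities for alternatives 0..J.

    x     : list of covariates for one individual
    betas : J coefficient vectors for alternatives 1..J
            (alternative 0 is the base, with coefficients normalized to zero)
    """
    indices = [0.0] + [sum(b_k * x_k for b_k, x_k in zip(b, x)) for b in betas]
    shift = max(indices)   # subtracting the max leaves the ratios unchanged and avoids overflow
    weights = [math.exp(v - shift) for v in indices]
    total = sum(weights)
    return [w / total for w in weights]


x = [1.0, 2.0]                       # hypothetical: constant and one covariate
betas = [[0.5, -0.2], [-1.0, 0.3]]   # hypothetical coefficients for choices 1 and 2

probs = mnl_probabilities(x, betas)
log_odds_1 = math.log(probs[1] / probs[0])   # equals x * beta_1
log_odds_2 = math.log(probs[2] / probs[0])   # equals x * beta_2
print(probs, log_odds_1, log_odds_2)
```

The log odds relative to the base choice recover the linear index for each alternative, which is what makes the coefficients interpretable as effects on log odds.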
Alternative Estimation Methods
- Multinomial probit (more complex), Nested logit (sometimes used).
Ordered Response Models
- Applicable to analyses of choices with ordinal alternatives (e.g., labor force status, education levels, bond ratings).
- Data is typically coded as discrete integers; the ordering of values is relevant.
Latent Variable Framework for Ordered Models
- The model setup is similar to logit/probit models, using indicator functions with thresholds for each category.
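A minimal sketch of the threshold idea, using the ordered probit case: the latent index plus a standard normal error is compared against increasing cutpoints, and each category's probability is a difference of normal CDF values. The cutpoints and index value are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(index, cutpoints):
    """Pr(y = j) for j = 0..J, given the latent index x * beta and increasing cutpoints.

    Category j is observed when the latent variable falls between
    cutpoint j-1 and cutpoint j (with -inf and +inf at the two ends).
    """
    cum = [0.0] + [normal_cdf(c - index) for c in cutpoints] + [1.0]
    return [cum[j + 1] - cum[j] for j in range(len(cum) - 1)]


# Hypothetical example: three ordered categories, cutpoints at -1 and 1, index 0.
probs = ordered_probit_probs(0.0, [-1.0, 1.0])
print(probs)
```

Replacing `normal_cdf` with the logistic CDF would give the ordered logit version of the same calculation.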
Applications and Estimations
- Example applications are given for each model type.
- Data work and estimation are carried out using STATA commands (e.g., -probit-, -logit-, -dprobit-, -margin-, -mfx-, -tobit-, -cnreg-, -poisson-, -nbreg-, -zip-, -zinb-, -ivreg2-, -xtreg-, -fe-, -fd-, and -areg-).
Endogenous Sample Selection
- If outcome and selection variables are mutually dependent, then standard OLS is not suitable.
- Selection based on unobserved factors leads to bias.
Inverse Mills Ratio (IMR)
- Used to correct for endogenous sample selection
- IMR estimation involves a probit model (use of the latent variable approach) and regression.
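The IMR itself is a simple function of the probit index: the standard normal density divided by the standard normal CDF. The sketch below only computes that ratio. In the two-step correction (commonly called the Heckman two-step procedure) it would be evaluated at the fitted first-stage probit index and added as a regressor in the outcome equation. The function names are assumptions made for this example.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z):
    """phi(z) / Phi(z), evaluated at a probit index z."""
    return normal_pdf(z) / normal_cdf(z)


# Hypothetical probit index values: the ratio falls as selection becomes more likely.
imr_values = [inverse_mills_ratio(z) for z in (-1.0, 0.0, 1.0)]
print(imr_values)
```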
Endogenous Variables
- An endogenous variable is correlated with the error term in a model, which causes bias.
- Potential reasons for endogeneity include omitted variable bias, reverse causation and measurement error.
- Omitted variable bias arises when a relevant regressor is excluded from the model and is correlated with an included regressor.
Reverse Causation
- The outcome and the treatment variables themselves may reciprocally influence one another.
Measurement Error (ME)
- Data measurement error can lead to bias.
- If ME affects only the dependent variable, then consistent estimates may be obtained.
Classical Error-in-Variables (CEV) Model
- The model assumes that the measurement error μ and the regression error ε are independent of one another, and that μ and x are also independent of one another.
- If these assumptions do not hold, then OLS estimators will be subject to bias.
Instrumental Variables (IV)
- Used for correcting endogeneity.
- A valid instrument (z) correlates with the endogenous variable (x) but is uncorrelated with the error.
Two-Stage Least Squares (2SLS)
- A two-step process for obtaining IV estimates.
- The first stage regresses the endogenous regressor(s) on the instrument(s) and obtains predicted values.
- The second stage runs the original regression with the predicted value(s) in place of the endogenous variable(s), thereby using the instruments to address the endogeneity problem.
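The two stages can be sketched on simulated data with one endogenous regressor and one instrument. Everything in the data-generating process below (true slope of 2, sample size, seed, variable names) is an assumption made for illustration. The regressor `x` is endogenous because it shares the unobserved component `u` with the error.

```python
import random


def mean(v):
    return sum(v) / len(v)


def slope_intercept(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


rng = random.Random(0)   # fixed seed so the run is reproducible
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]   # instrument: moves x, unrelated to the error
u = [rng.gauss(0, 1) for _ in range(n)]   # unobserved factor entering both x and y
e = [rng.gauss(0, 1) for _ in range(n)]   # pure noise
x = [zi + ui for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui + ei for xi, ui, ei in zip(x, u, e)]   # true slope is 2

# Naive OLS: biased because x is correlated with the error (u + e).
_, beta_ols = slope_intercept(x, y)

# First stage: regress x on z and form predicted values.
a1, b1 = slope_intercept(z, x)
x_hat = [a1 + b1 * zi for zi in z]

# Second stage: regress y on the first-stage predictions.
_, beta_2sls = slope_intercept(x_hat, y)
print(beta_ols, beta_2sls)
```

In this design the OLS slope settles near 2.5 and the 2SLS slope near the true value of 2. Running the second stage by hand gives the right point estimate, but its reported standard errors would be wrong, which is one reason to use a dedicated IV routine in practice.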
Panel Data Models
- Models which utilize multiple observations for various groups (e.g., countries, individuals, firms) across different time periods.
- Models include pooled OLS, fixed effects, random effects, and difference-in-differences (DID).
Time Trends
- A time trend can be included in the model to account for systematic variation or tendencies over time.
Structural Breaks
- Models which explicitly factor in structural breaks (e.g., changes in the intercepts or slopes in the presence of sudden events like a war/policy change)
Time Specific Intercepts
- A model with time-specific intercepts allows the intercept to differ across time periods, alongside any time variable or treatment effect.
Difference-in-differences (DID)
- Used in policy analysis, especially when a policy is implemented for only a subset of groups. It compares the change in outcomes from before to after the policy between the groups that were affected and those that were not, in order to isolate the impact of the policy itself.
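In its simplest two-group, two-period form, the DID estimate is a difference of two before/after changes in group means. The sketch below uses hypothetical outcome values, and the function name is an assumption made for this example.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """(change in treated-group mean) minus (change in control-group mean)."""
    def mean(v):
        return sum(v) / len(v)

    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcomes before and after a policy that affects only the treated group.
treated_pre, treated_post = [10.0, 12.0], [15.0, 17.0]
control_pre, control_post = [8.0, 10.0], [10.0, 12.0]

did = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(did)
```

The control group's change stands in for what would have happened to the treated group without the policy, so the estimate is only as credible as that assumption.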
Summary
- This course material presents a comprehensive overview of advanced econometrics, covering various methodologies in the presence of several challenging considerations including issues with endogeneity and estimation in cases where data is limited.