Questions and Answers
What is the problem with including too many variables in a regression model?
- It can lead to overfitting and an inaccurate prediction of the dependent variable.
- It can lead to misspecification, where the model's form does not accurately represent the relationship between the variables.
- It can lead to underspecification, where important variables are omitted, causing omitted variable bias.
- It can lead to overspecification, where irrelevant variables are included, but it does not bias the coefficients. (correct)
What is the potential problem with including too few variables in a regression model?
- It can lead to misspecification, where the model's form does not accurately represent the relationship between the variables.
- It can lead to overspecification, where irrelevant variables are included, but it does not bias the coefficients.
- It can lead to overfitting, where the model is too closely fit to the data and may not generalize well to new data.
- It can lead to underspecification, where important variables are omitted, causing omitted variable bias. (correct)
What is omitted variable bias?
- The bias that occurs when the dependent variable is not measured accurately.
- The bias that occurs when the model is overspecified, including irrelevant variables.
- The bias that occurs when important variables are omitted from the regression model. (correct)
- The bias that occurs when the independent variables are not independent of each other.
When does omitted variable bias occur?
Which of the following is NOT a problem associated with omitted variable bias?
What does the adjusted R-squared statistic measure?
Which of the following is a naive approach to variable selection in regression analysis?
What is a 'kitchen sink' regression?
What is the consequence of multicollinearity in regression analysis?
Which of the following issues can lead to biased coefficients in regression analysis?
What is the first step in the recipe for conducting a regression analysis?
How does heteroscedasticity specifically affect regression analysis?
In regression analysis, what is an essential consideration regarding the data sample used?
What is the Gauss-Markov Theorem primarily concerned with?
Which rule of thumb is commonly used regarding observations in regression analysis?
Why is sample selection based on the independent variable generally not a problem in regression analysis?
What is a consequence of perfect collinearity in a regression model?
Which of the following is a method to test for multicollinearity in regression analysis?
Which statement about independent variables in a regression model is correct?
What is meant by the 'dummy trap' in regression analysis?
If a researcher surveys only those high school graduates who plan to attend university, what is the result of this sampling method?
In regression analysis, when is multicollinearity considered a problem?
What should be done if a variable causes perfect collinearity in a regression model?
What is a primary advantage of using Principal Component Analysis (PCA) in regression?
Which of the following is a disadvantage of Principal Component Analysis?
In what context is Principal Component Analysis commonly applied?
What does PCA primarily address when multiple variables are included in a regression model?
Why might researchers prefer to use PCA before regression analysis?
Which of the following is NOT a reason to use Principal Component Analysis?
What is often a result of applying PCA in data analysis?
What mathematical characteristic is significant when performing PCA?
What is autocorrelation?
Which method can be used for detecting autocorrelation?
In which situation may autocorrelation occur?
What is a consequence of failing to meet the OLS assumptions?
Which of the following is NOT an assumption outlined in the Gauss-Markov Theorem?
What is the purpose of the normality assumption in regression analysis?
How can one approximate normality in residuals if the sample size is large?
What is indicated when OLS is described as BLUE?
What is the primary concern regarding the selection of the sample in regression analysis?
Why is having a larger sample size beneficial in regression analysis?
What does the formula for degrees of freedom in regression output represent?
What is the suggested minimum number of observations per variable when constructing a regression model?
What is a significant drawback of selecting a sample based on the dependent variable?
How does an increase in degrees of freedom affect regression predictions?
What is the relationship between sample size and the inclusion of independent variables in regression analysis?
What is a common rule of thumb regarding the total number of observations in regression analysis?
Flashcards
Principal Component Regression
A regression technique using PCA to handle correlated variables.
Principal Component Analysis (PCA)
A mathematical method to transform correlated variables into uncorrelated variables.
Multicollinearity
The presence of high correlations among independent variables in regression.
Dimensionality Reduction
The process of decreasing the number of variables under consideration.
Advantages of PCA
Reduces complexity and handles multicollinearity in regression analysis.
Disadvantages of PCA
Lacks intuitive interpretation; driven by mathematics, not theory.
Applications of PCA
Commonly used to create new variables for regression analysis, like wealth indicators.
PCA-generated variables
Variables created through PCA that can be included in regressions for analysis.
Sampling Selection
Choosing participants based on specific criteria such as their education level (e.g., Abitur).
Independent Variable
A variable that is manipulated to observe its effect on a dependent variable.
Exogenous Sample Selection
Choosing a sample based on an independent variable; this generally does not bias the regression estimates.
Perfect Collinearity
When one independent variable perfectly predicts another, creating redundancy.
Variance Inflation Factor (VIF)
A measure used to detect multicollinearity in regression models.
Stratified Sampling
A method of sampling that divides the population into subgroups before selection.
Dummy Variable Trap
Occurs when dummy variables for all categories are included in a model, causing perfect multicollinearity.
Random Sample
A sample where each member of the population has an equal chance of being selected.
Sample Size
The number of observations in a statistical sample.
Degrees of Freedom
The number of independent values in a statistical calculation, calculated as n - 1 - k.
Gauss-Markov Theorem
States that the best linear unbiased estimator is obtained under certain conditions, including random sampling.
Rule of Thumb for Observations
A guideline of at least 10 observations per variable in regression analysis.
Critical Value of t
The value that the test statistic must exceed to reject the null hypothesis, which decreases with more degrees of freedom.
Non-linearity of parameters
Leads to biased coefficients and biased standard errors in regression models.
Biased sample
A non-random sample that leads to biased selection and non-representative results.
Endogeneity
A situation where an explanatory variable is correlated with the error term, resulting in biased coefficients.
Heteroscedasticity
Unequal variances in the error terms across observations leading to biased standard errors.
Autocorrelation
A condition where residuals are not independent, indicating a potential violation of regression assumptions.
Panel Analysis
A statistical method for analyzing repeated measurements of the same subjects over time.
Durbin-Watson Statistic
A test statistic used to detect the presence of autocorrelation in the residuals from a regression analysis.
BLUE
Best Linear Unbiased Estimator; under the Gauss-Markov assumptions, OLS is BLUE.
Normality Assumption
An additional assumption that the unobserved error in a regression is normally distributed, important for significance testing.
Significance Testing
A statistical method to determine if results are meaningful, typically using p-values derived from t and F statistics.
Countermeasure for Normality
Collecting a sufficiently large sample (>200) to ensure that the distribution of residuals approximates normality.
Overspecification
The inclusion of irrelevant variables in a regression model, leading to unnecessary complexity without biasing coefficients.
Underspecification
Leaving out important variables from a regression model, potentially leading to omitted variable bias.
Omitted Variable Bias
The bias resulting from excluding relevant variables from a model, altering estimates of the effect.
Adjusted R-squared
A statistical measure that indicates the proportion of variance explained by the independent variables, adjusted for the number of predictors in the model.
Stepwise Regression
A method of variable selection for regression models that involves adding or removing predictors based on statistical significance.
Forward Selection
A stepwise regression method that starts with no variables and adds significant ones sequentially.
Backward Selection
A stepwise regression technique that begins with all available variables and removes the least significant ones iteratively.
Theory-driven Variable Selection
The ideal method of choosing variables based on theoretical understanding rather than arbitrary methods.

Study Notes
Quantitative Methods in Empirical Economic Geography
- This is a lecture on linear regression models, part III
- Lecturer: Christian Hundt
- Slides presented by Christian Hundt and Kerstin Nolte
- Location: Institute of Economic and Cultural Geography, Leibniz University Hannover
OLS Assumptions: The Gauss-Markov Theorem
- OLS stands for Ordinary Least Squares
- OLS yields consistent estimators for parameters β₀, β₁, ..., βₙ
- This is only true under certain assumptions.
- A consistent estimator converges to the true value of the parameter as sample size increases.
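In standard textbook notation (not taken from the slides), consistency means convergence in probability to the true parameter as the sample size n grows:

```latex
\operatorname*{plim}_{n \to \infty} \hat{\beta}_j = \beta_j \quad \text{for each coefficient } \beta_j
```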
Assumptions in a Linear Regression Model
- Linear in parameters: The model is linear in its coefficients; the variables themselves may be transformed (e.g., by taking logarithms).
- Random Sample: The sample must be representative of the population from which it was drawn.
- No perfect collinearity: No independent variable may be an exact linear combination of the others; strong (but imperfect) correlations are permitted, although they can still cause problems.
- Exogeneity of the predictors: Predictor variables are not correlated with the error term.
- Homoscedasticity: Error terms have constant variance.
- No autocorrelation: Error terms are not correlated with each other.
Checking for Linearity
- Use a residuals vs. fits plot to check for linearity.
- The residuals should be randomly scattered around a horizontal line at y = 0.
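A minimal sketch of this diagnostic in Python, using statsmodels and matplotlib on simulated data (none of the variable names come from the lecture):

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Simulated example data, used only for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 200)

X = sm.add_constant(x)          # add the intercept term
model = sm.OLS(y, X).fit()

# Residuals vs. fitted values: points should scatter randomly around y = 0.
plt.scatter(model.fittedvalues, model.resid)
plt.axhline(0, color="grey", linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```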
Logarithmic Transformation
- If a model is not linear in parameters, transform the variables.
- Logarithmic transformation is a common method to transform non-linear models into linear ones.
- Example: Cobb-Douglas production function can be transformed into linear form by using logarithms.
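As a standard derivation, taking logarithms of the Cobb-Douglas production function yields a model that is linear in the parameters:

```latex
Y = A K^{\alpha} L^{\beta}
\quad \Longrightarrow \quad
\ln Y = \ln A + \alpha \ln K + \beta \ln L
```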
Random Sample of Size n
- If the sample is not random, it may be difficult to make inferences about the population.
- Collect data yourself.
- Use probability sampling methods.
- Probability sampling implies every member of the population has a known, nonzero chance of being selected for the sample.
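A minimal sketch of simple random sampling (the simplest probability sampling method) in Python; the population frame here is purely hypothetical:

```python
import numpy as np

# Hypothetical sampling frame with 10,000 unit identifiers.
population = np.arange(10_000)

# Simple random sampling without replacement:
# every unit has the same, known chance of being selected.
rng = np.random.default_rng(42)
sample = rng.choice(population, size=500, replace=False)
print(sample[:10])
```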
No Perfect Collinearity
- Avoid perfect correlation between independent variables.
- Use variance inflation factor (VIF) to measure the strength of the correlation between variables in the model.
- A high VIF value indicates that an independent variable is correlated with other independent variables.
- High VIF values suggest that your model may be problematic; consider removing variables or using PCA.
- Variables that are perfectly correlated should be removed from the model.
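A sketch of how VIFs can be computed with statsmodels, here on simulated data; a common rule of thumb treats values above roughly 5 to 10 as a warning sign:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Simulated predictors: x2 is deliberately built to correlate strongly with x1.
rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.3, size=300)
x3 = rng.normal(size=300)

X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))
for i, name in enumerate(X.columns):
    if name == "const":
        continue  # the intercept's VIF is not informative
    print(name, variance_inflation_factor(X.values, i))
```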
Exogeneity of Predictors
- If a predictor variable is correlated with the error term, that predictor is endogenous.
- Common causes include misspecification, omitted variables, and simultaneity.
Misspecification
- Could mean that the underlying data generating process is not correctly characterized or that some variables are not included in the regression.
- A variable missing from the regression biases the other parameter estimates when it affects the dependent variable and is correlated with at least one of the included independent variables.
Omitted Variable
- If an important variable that is correlated with one or more of the included independent variables is missing from the model, the coefficient estimates of those predictors will be biased.
Simultaneity
- Simultaneity arises when an explanatory variable both influences and is influenced by the dependent variable, so causality runs in both directions.
- To fix this, instrumental variables that are strongly correlated with the endogenous predictor but uncorrelated with the error term can be used.
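A minimal two-stage least squares (2SLS) sketch on simulated data, assuming a single endogenous regressor x and one instrument z. Note that the second-stage standard errors obtained this way are not valid; dedicated IV routines should be used in practice:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1000
z = rng.normal(size=n)                       # instrument: drives x, unrelated to u
u = rng.normal(size=n)                       # unobserved error term
x = 0.8 * z + 0.5 * u + rng.normal(size=n)   # endogenous: correlated with u
y = 1.0 + 2.0 * x + u

# Stage 1: regress the endogenous predictor on the instrument.
x_hat = sm.OLS(x, sm.add_constant(z)).fit().fittedvalues
# Stage 2: regress y on the fitted values from stage 1.
second_stage = sm.OLS(y, sm.add_constant(x_hat)).fit()
print(second_stage.params)  # the slope should be close to the true value 2.0
```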
Homoscedasticity
- Examine the plot of the model's residuals vs. predicted values.
- If the spread of the residuals changes systematically with the predicted values (e.g., a funnel shape), the error variance is not constant and the errors are heteroscedastic.
- Transforming the data—such as using logarithms—may fix heteroscedasticity. Calculating robust standard errors can also help.
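In statsmodels, heteroscedasticity-robust standard errors can be requested via the cov_type argument when fitting; a sketch on simulated heteroscedastic data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(1, 10, 500)
# The error variance grows with x: a textbook case of heteroscedasticity.
y = 1.0 + 0.5 * x + rng.normal(0, 0.3 * x)

X = sm.add_constant(x)
conventional = sm.OLS(y, X).fit()            # conventional standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")    # heteroscedasticity-robust SEs
print(conventional.bse)
print(robust.bse)
```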
No Autocorrelation
- Examine correlation plots among observations to determine if there is a trend.
- A violation of this assumption indicates that autocorrelation is present.
- This is true in the case of panel analysis and in the case of observations that are clustered (such as repeated measures from the same person or group).
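The Durbin-Watson statistic (see the flashcards above) provides a quick numeric check: values near 2 indicate no first-order autocorrelation. A sketch on simulated data with AR(1) errors:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)

# Build autocorrelated errors: each error carries over 70% of the previous one.
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + e

resid = sm.OLS(y, sm.add_constant(x)).fit().resid
print(durbin_watson(resid))  # well below 2 here, signalling positive autocorrelation
```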
Violations of the Assumptions
- Depending on which assumption is violated, the consequence is biased coefficients, biased standard errors, or both.
- Address each issue individually.
Model Diagnostics and Strategy
- After estimating the model, examine whether the assumptions hold using diagnostics.
- Appropriately address violations. If assumptions are violated, re-evaluate the model's parameters and variables.
- If necessary, transform the data or use an alternative estimator (beyond OLS).
Creating a Regression Equation
- Start with a theory-driven variable selection: focus on variables with a clear theoretical relationship with the dependent variable.
- Use alternative methods (such as stepwise regression) only if theory isn't clear or if you have a lot of variables.
- A trade-off exists between including many variables and keeping the model simple and reliable.
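Purely as an illustration of such a mechanical alternative, a naive forward-selection sketch driven by adjusted R-squared; all variable names are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def forward_select(y, X):
    """Greedy forward selection based on adjusted R-squared."""
    chosen, remaining = [], list(X.columns)
    best = -np.inf
    while remaining:
        # Score every candidate model that adds one more variable.
        scores = [
            (sm.OLS(y, sm.add_constant(X[chosen + [c]])).fit().rsquared_adj, c)
            for c in remaining
        ]
        top_score, top_col = max(scores)
        if top_score <= best:    # stop once no candidate improves the fit
            break
        best = top_score
        chosen.append(top_col)
        remaining.remove(top_col)
    return chosen

# Simulated example in which only x1 and x2 truly matter.
rng = np.random.default_rng(5)
X = pd.DataFrame(rng.normal(size=(400, 4)), columns=["x1", "x2", "x3", "x4"])
y = 1.0 + 2.0 * X["x1"] - 1.5 * X["x2"] + rng.normal(size=400)
print(forward_select(y, X))  # typically ['x1', 'x2']
```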
Principal Component Analysis (PCA)
- Use PCA to combine multiple highly correlated variables to streamline data analysis.
- Creates a small set of uncorrelated variables from highly correlated ones.
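A minimal principal component regression sketch with scikit-learn and statsmodels, again on simulated data:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
n = 500
factor = rng.normal(size=n)
# Three highly correlated predictors built from one common underlying factor.
X = np.column_stack([factor + rng.normal(scale=0.2, size=n) for _ in range(3)])
y = 1.0 + 2.0 * factor + rng.normal(size=n)

# Keep only the first principal component; it captures the shared variance.
pca = PCA(n_components=1)
component = pca.fit_transform(X)
print(pca.explained_variance_ratio_)

model = sm.OLS(y, sm.add_constant(component)).fit()
print(model.params)
```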
Further Reading
- Wooldridge, J. M. (2013). Introductory Econometrics: A Modern Approach (5th ed.).
- Online resources for further details