Podcast
Questions and Answers
What is another term for the 'dependent variable'?
What is another term for the 'dependent variable'?
In a regression model, 'x' and 'z' are typically considered features.
In a regression model, 'x' and 'z' are typically considered features.
True
What is the purpose of a prediction model?
What is the purpose of a prediction model?
To make good predictions for new observations, also known as out-of-sample data.
A polynomial regression of degree 8 is also known as an ______________ function.
A polynomial regression of degree 8 is also known as an ______________ function.
Signup and view all the answers
What is the problem with the octic curve in the plot?
What is the problem with the octic curve in the plot?
Signup and view all the answers
Match the following terms with their definitions:
Match the following terms with their definitions:
Signup and view all the answers
The term 'feature' is only used in machine learning.
The term 'feature' is only used in machine learning.
Signup and view all the answers
What is the problem with a model that is highly influenced by the random errors in the training data?
What is the problem with a model that is highly influenced by the random errors in the training data?
Signup and view all the answers
What is the main goal of a pure prediction problem?
What is the main goal of a pure prediction problem?
Signup and view all the answers
Linear regression is a powerful method for pure prediction problems.
Linear regression is a powerful method for pure prediction problems.
Signup and view all the answers
What is the name of the book that is recommended for starters in machine learning?
What is the name of the book that is recommended for starters in machine learning?
Signup and view all the answers
Machine learning is also known as ______________________ learning.
Machine learning is also known as ______________________ learning.
Signup and view all the answers
What is the purpose of splitting a sample into a training and test data set?
What is the purpose of splitting a sample into a training and test data set?
Signup and view all the answers
Econometrics and machine learning use the same expressions and terminology.
Econometrics and machine learning use the same expressions and terminology.
Signup and view all the answers
Match the following machine learning methods with their characteristics:
Match the following machine learning methods with their characteristics:
Signup and view all the answers
In a linear regression, the dependent variable is denoted by ______________________.
In a linear regression, the dependent variable is denoted by ______________________.
Signup and view all the answers
Study Notes
Prediction Problems and Machine Learning
- In linear regression, we study how to consistently estimate the coefficients (e.g., β) to understand how an explanatory variable is related to or causally affects a dependent variable.
- In contrast, in a pure prediction problem, we only want to find and estimate a model that allows us to make good predictions of the dependent variable for new observations.
Machine Learning Techniques
- Machine learning has substantially advanced techniques for pure prediction problems, which can often outperform linear regressions.
- Examples of powerful prediction methods include:
- Random forests
- Gradient boosted trees
- Lasso and ridge regression
- Deep neural networks
- Established procedures for prediction problems include:
- Splitting the sample into a training and test data set
- Using cross-validation for parameter tuning
Thesaurus: Machine Learning vs. Econometrics
- The machine learning literature uses different expressions than econometrics, including:
- Dependent variable = response
- Explanatory variable = regressor = predictor = feature
- Estimate a model = train a model
- Nominal variable = categorical variable = factor variable = non-numeric variable
- Dummy variable = one-hot encoded variable
Polynomial Regression for Prediction
- A prediction model should make good predictions for new observations (out-of-sample data).
- The trade-offs that affect out-of-sample prediction accuracy can be illustrated using a simulated data set with one explanatory variable.
- The true data generating process can be a polynomial of degree 8 of the explanatory variable plus some iid error term.
Example: Overfitting
- The octic curve can predict the training data points most closely, but it looks quite wiggly, which is a typical sign of overfitting.
- Overfitting means that the form of the curve is strongly influenced by the realization of the random errors in the training data set.
Studying That Suits You
Use AI to generate personalized quizzes and flashcards to suit your learning preferences.
Description
This quiz covers market analysis using econometrics and machine learning techniques, including linear regression and parameter tuning.