Questions and Answers
Which of the following is NOT considered a primary source of error in forming predictions?
- Bias
- Variance
- Noise
- Overfitting (correct)
A more complex model invariably leads to better performance on testing data.
False (B)
What term describes the irreducible aspect of data error that one can never eliminate?
noise
___________ refers to the error due to the model's inability to capture the true relationship in the data.
In the context of machine learning, what does 'variance' refer to?
Bias is independent of the model used and is solely determined by the data.
If a model exhibits 'high bias,' what characteristic does it likely possess?
The difference between the average of the estimated function and the true function is known as ________.
Which scenario exemplifies high bias in a model?
Models with high complexity generally exhibit low bias.
Explain how 'high variance' affects the stability of a model's predictions.
Over all possible size-N training sets, what I expect my fit to be is part of the _______ contribution.
What is the primary characteristic of a model with low variance?
Models with low complexity generally exhibit low variance.
In terms of model complexity, how does high variance typically manifest?
The squared deviation of $\hat{f_w}$ from its expected value $\overline{f_w}$ over different realizations of training data.
Which issue is more likely to be addressed by increasing the size of the training dataset?
Regularization techniques are primarily used to combat high bias.
What term describes the situation when a model performs well with training data, but poorly with new data?
The class of models that can't fit the data exhibits ____________, while models that could fit the data but are hard to fit exhibit ________________.
Match the following scenarios with whether they describe high bias or high variance:
In parallel universes where only Niantic knows f, what serves as training data to find f*?
In the context of estimators, f* is the true function known to everyone.
According to the Pokemon Case Study, if we average all the estimated functions f*, is it close to true function f?
In the bias-variance trade-off, E[f*] − f equals ______.
Which method can introduce bias, but may be effective in combating large variance?
More data can always solve an issue of large variance.
When there is an issue with high bias, is it an issue with overfitting or underfitting, and why?
When is underfitting diagnosed? When the model doesn't even fit the _____________.
Match the following diagnoses with their corresponding solution to combat this effect:
In the bias-variance tradeoff, what typically happens to bias as model complexity increases?
Having both low bias and low variance guarantees perfect model performance in real-world scenarios.
How does the amount of training data typically impact a model's variance?
A model with high variance is likely to perform well on the ______ dataset but poorly on the ______ dataset.
Which of the following strategies is MOST directly aimed at reducing overfitting?
A model with high bias is likely to capture noise in the data.
What are the effects of noise, and can it be combatted?
In the equation for Expected Prediction Error: EPE(x) = _________² + bias² + variance
Match each scenario with the appropriate action to take:
Which term describes when a model fits training data well, but poorly with new data?
Underfitting occurs when there is high variance in the data.
Flashcards
What is Bias?
Error in predictions due to oversimplified assumptions in the learning algorithm.
What is Variance?
Error in predictions due to the model's sensitivity to small fluctuations in the training data.
What is Noise?
Error due to randomness or inherent variability in the data that cannot be reduced by any model.
What is Irreducible Error?
What is Underfitting?
What is Overfitting?
What is Regularization?
Low complexity vs. Variance
High complexity vs. Bias
What is Bias of Estimator?
What is Variance of Estimator?
Bias of function estimator
Variance of Function Estimator
What is the formula for Expected Prediction Error?
What is y (the target)?
Study Notes
- In forming predictions, there are 3 sources of error, which are noise, bias, and variance. Noise is an aspect of the data that can never be eliminated.
- The average error on testing data can be attributed to "bias" and "variance". A more complex model does not always lead to better performance on testing data.
Data Inherently Noisy
- Data is inherently noisy, and this is represented by the equation yi = fw(true)(xi) + εi, where εi represents the noise term. The error from noise is called irreducible error and is an aspect of the data you can never beat.
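The irreducible-error claim can be checked numerically. Below is a minimal sketch assuming numpy; the sine function and noise level are arbitrary illustrative choices. Even a model that predicts fw(true) exactly still incurs a mean squared error equal to the noise variance.

```python
import numpy as np

rng = np.random.default_rng(0)

def f_true(x):
    # Hypothetical "true" function fw(true); any smooth function works here.
    return np.sin(2 * x)

sigma = 0.3                                          # std of the noise term eps_i
x = rng.uniform(0, 3, size=10_000)
y = f_true(x) + rng.normal(0, sigma, size=x.shape)   # yi = fw(true)(xi) + eps_i

# Even a perfect model (predicting fw(true) exactly) still pays the noise variance:
mse_perfect = np.mean((y - f_true(x)) ** 2)
print(mse_perfect)                                   # close to sigma**2 = 0.09
```

No model can push its test error below this floor, since the noise in each observation is independent of anything the model can learn from x.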
Bias Contribution
- Bias is inherent to the model. The bias of a model can be represented as Bias(x) = fw(true)(x) − f̄w(x). In this equation, fw(true) represents the true function and f̄w represents the average fit of the model.
- Bias results in error in predictions if a model is not flexible enough to capture the true function, fw(true). If you fit a constant function, like a horizontal line, to data with an upward trend, there will be a high bias. Low complexity in a model leads to high bias.
- The average estimated function is represented as f̄w(x), and the true function is represented as fw(x). The average estimated function is the function you expect to get over all possible size N training sets.
- Bias of a function estimator is bias(f̂w(xt)) = fw(xt) − f̄w(xt). This equation represents the difference between the true underlying function fw and the average prediction f̄w over different realizations of training data.
High Bias
- High bias occurs when the true function f(x) cannot be accurately modeled by the chosen model.
Variance Contribution
- Variance shows how much specific fits vary from the expected fit, f̄w. Low complexity leads to low variance.
Variance of High-Complexity Models
- Applying a high-order polynomial gives a large space of possible fits, which leaves more room for variance.
Variance of Function Estimator
- The variance of a function estimator is var(f̂w(xt)) = Etrain[(f̂w(train)(xt) − f̄w(xt))²]
- Variance is the squared deviation of f̂w from its expected value f̄w over different realizations of training data.
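Both definitions can be estimated by Monte Carlo: refit the same model on many independent training sets ("parallel universes") and look at the spread of the predictions at one test point. A sketch assuming numpy; the true function, noise level, and polynomial degrees are illustrative choices, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(1)

def f_true(x):
    return np.sin(2 * x)

def predictions_at(xt, degree, n_train=10, n_universes=500, sigma=0.3):
    """Fit a polynomial on many independent training sets; evaluate each fit at xt."""
    x = np.linspace(0, 3, n_train)        # fixed inputs; only the noise varies
    preds = np.empty(n_universes)
    for u in range(n_universes):
        y = f_true(x) + rng.normal(0, sigma, size=n_train)
        coeffs = np.polyfit(x, y, deg=degree)
        preds[u] = np.polyval(coeffs, xt)
    return preds

xt = 1.5
results = {}
for degree in (1, 5):
    preds = predictions_at(xt, degree)
    f_bar = preds.mean()                          # average fit f̄w(xt)
    bias = f_true(xt) - f_bar                     # fw(xt) − f̄w(xt)
    variance = np.mean((preds - f_bar) ** 2)      # Etrain[(f̂w(xt) − f̄w(xt))²]
    results[degree] = (bias, variance)
    print(degree, bias, variance)
```

Typically the degree-1 model shows the larger |bias| (it cannot bend to follow the sine) while the degree-5 model shows the larger variance (its individual fits scatter more around their average).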
High Variance
- High variance occurs when the fitted function is highly sensitive to the particular training set, so individual fits land far from the average fit f̄w.
Relationship of Bias and Variance
- With low bias and variance, the data points are close to each other and to the center of the target. As the bias increases, the distances between the predicted dot location and the targeted dot location increase. As the variance increases, the predicted dot location becomes more dispersed.
Pokemon Case study
- Niantic knows the true function f, but from training data we can find estimator f*.
Parallel Universes
- In all universes, 10 Pokémon are collected to find f*.
- In different universes, the same model is used but a different f* is obtained.
f* in 100 Universes
- It is possible to have different polynomial function orders.
Bias of Estimators
- Bias asks: if we average all the f*, is the average close to f? The average is written f̄ = E[f*].
- Bias is Bias(f*) = E[f*] - f
- Comparing large bias to small bias: as the bias increases, the average of the estimates moves further away from the target.
Variance of Estimators
- Variance(f*) = E[(f* – E(f*))²]
- Simpler models are less influenced by the sampled data.
Bias and Variance of Estimator
- Bias is the difference between this estimator's expected value and the true value of the parameter being estimated: Bias(f*, f) = Bias(f*) = E[f*] − f = E[f* - f]
- Variance is how far, on average, the collection of estimates is from the expected value of the estimates: Variance(f*, f) = Variance(f*) = E[(f* − E[f*])²]
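As a deliberately simple instance of these two estimator formulas, take f* to be the sample mean of n draws and the true value f to be the population mean. A sketch assuming numpy; the population parameters and sample sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

f = 5.0                                     # the true value being estimated
n, n_universes = 20, 10_000
# One row per "universe": an independent size-n sample in each.
samples = rng.normal(loc=f, scale=2.0, size=(n_universes, n))
f_star = samples.mean(axis=1)               # one estimate f* per universe

bias = f_star.mean() - f                    # Bias(f*) = E[f*] − f  (≈ 0: unbiased)
variance = np.mean((f_star - f_star.mean()) ** 2)   # E[(f* − E[f*])²]
print(bias, variance)                       # variance ≈ 2.0**2 / 20 = 0.2
```

The sample mean is an unbiased estimator, so its error is pure variance; a biased estimator would show a systematic offset in the first number instead.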
Bias v.s. Variance
- Bias and variance trade off against each other: as one is reduced, the other tends to grow. A balance can be obtained to get the proper model.
Bias v.s. Variance Graph
- The overall error can be broken down into errors from bias and variance.
- Underfitting occurs when there is a large bias with a small variance.
- Overfitting occurs when there is a small bias with a large variance.
Handling Bias
- Underfitting is diagnosed when the model cannot even fit the training examples, which indicates large bias.
- For bias, redesign the model. Add more features as input, and/or use a more complex model.
Handling Variance
- With large variance, collecting more data is very effective and does not increase the bias, though more data is not always practical to obtain.
- Regularization can be used, which may increase bias.
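The "more data" remedy can be illustrated by refitting a fixed-degree polynomial on many simulated training sets of different sizes: the variance of the fit shrinks as the training set grows, while the (already small) bias stays essentially unchanged. A sketch assuming numpy; all concrete numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def f_true(x):
    return np.sin(2 * x)

def bias_variance_at(xt, n_train, degree=5, n_universes=500, sigma=0.3):
    """Monte Carlo estimate of bias and variance of a polynomial fit at xt."""
    x = np.linspace(0, 3, n_train)
    preds = np.empty(n_universes)
    for u in range(n_universes):
        y = f_true(x) + rng.normal(0, sigma, size=n_train)
        preds[u] = np.polyval(np.polyfit(x, y, deg=degree), xt)
    f_bar = preds.mean()
    return f_true(xt) - f_bar, np.mean((preds - f_bar) ** 2)

bias_small, var_small = bias_variance_at(1.5, n_train=10)
bias_big, var_big = bias_variance_at(1.5, n_train=100)
print(var_small, var_big)    # variance drops sharply with more data
print(bias_small, bias_big)  # bias stays small in both cases
```

This is why more data targets the variance problem specifically: averaging over more observations stabilizes the fit, but it cannot help a model class that was too rigid to capture the true function in the first place.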
Summary
- Bias means the class of models can't fit the data.
- Variance means the class of models could fit the data, but it is hard to fit well from limited data.
Summary Cat Classification
- In the cat classification scenario, the proper balance yields a validation error of 1%.
Expected Prediction Error
- ED{(y − f*(x))²} represents the expected prediction error and can be expanded as ED{(f(x) + ε − f*(x))²}, where the expectation is over draws of a size-n sample D = {(x1, y1), ..., (xn, yn)}.
Bias and Variance of Estimator
- EPE(x) represents the expected prediction error and can be defined as the sum of noise², bias², and variance
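The decomposition can be verified numerically: estimate EPE(x) directly from fresh test observations and compare it with noise² + bias² + variance computed from the same simulated fits. A sketch assuming numpy; the degree-3 model and other constants are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)

def f_true(x):
    return np.sin(2 * x)

sigma, xt = 0.3, 1.5
n_universes = 2000
x = np.linspace(0, 3, 10)

preds = np.empty(n_universes)
sq_err = np.empty(n_universes)
for u in range(n_universes):
    y = f_true(x) + rng.normal(0, sigma, size=x.size)   # training set
    preds[u] = np.polyval(np.polyfit(x, y, deg=3), xt)  # fitted f*(xt)
    y_new = f_true(xt) + rng.normal(0, sigma)           # fresh test target at xt
    sq_err[u] = (y_new - preds[u]) ** 2

epe = sq_err.mean()                        # direct estimate of E[(y − f*(x))²]
bias = f_true(xt) - preds.mean()
variance = preds.var()
decomposition = sigma ** 2 + bias ** 2 + variance       # noise² + bias² + variance
print(epe, decomposition)                  # the two estimates nearly agree
```

The agreement is not a coincidence of this example: expanding E[(y − f*(x))²] with y = f(x) + ε and noting that the cross terms vanish in expectation yields exactly the three-part sum.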
- A Bias-Variance decomposition exists.