Machine Learning Model Evaluation and Improvement
30 Questions

Questions and Answers

What is the recommended percentage split for training and testing data?

  • 80% for training and 20% for testing
  • 70% for training and 30% for testing (correct)
  • 50% for training and 50% for testing
  • 60% for training and 40% for testing
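
A minimal sketch of the 70/30 split, assuming scikit-learn (the lesson does not name a library) and synthetic data:

```python
from sklearn.model_selection import train_test_split
import numpy as np

X = np.random.rand(100, 4)  # 100 observations, 4 features (synthetic)
y = np.random.rand(100)     # synthetic target

# test_size=0.3 reserves 30% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
```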

Why is it important to use new data when evaluating a model?

  • To train the model on a larger dataset
  • To increase the model's complexity
  • To prevent overfitting to the training set (correct)
  • To speed up the evaluation process

What is the purpose of a validation set in machine learning?

  • To train the model on additional data
  • To evaluate the model while building and tuning it (correct)
  • To measure the model's performance on the training data
  • To use as the final test set
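
One common way to carve out a validation set, sketched with scikit-learn and the 60/20/20 split this quiz uses later:

```python
from sklearn.model_selection import train_test_split
import numpy as np

X = np.random.rand(100, 4)  # synthetic features
y = np.random.rand(100)     # synthetic target

# First hold out 40% of the data...
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=42)
# ...then split that 40% evenly into validation and test sets (20% each overall)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=42)
```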

Why should the model not be trained on the entire dataset?

  • To prevent overfitting to the training set (correct)

What risk is associated with using the test set to select model parameters?

  • The risk of overfitting (correct)

What happens if the model is tuned based on performance only on the test data?

  • The model may overfit to the test set (correct)

Why is squared error commonly used in machine learning?

  • Because it reports an error regardless of whether the prediction was too high or too low (correct)
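
A tiny illustration of that symmetry:

```python
y_true = 10.0
print((12.0 - y_true) ** 2)  # too high by 2 -> squared error 4.0
print((8.0 - y_true) ** 2)   # too low by 2  -> squared error 4.0
```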

What does the R² coefficient represent in machine learning?

  • The proportion of variance in the outcome that the model can predict based on its features (correct)
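
A sketch using scikit-learn's `r2_score` on made-up values:

```python
from sklearn.metrics import r2_score

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.3, 6.9, 9.2]
# 1.0 means all variance is explained; 0.0 means no better than predicting the mean
print(r2_score(y_true, y_pred))
```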

What happens when a machine learning model has high bias?

  • It underfits the data and is prevented from learning the true trend (correct)

What is the purpose of validation curves in machine learning?

  • To diagnose whether a model is suffering from high bias or high variance (correct)
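
A sketch of computing a validation curve with scikit-learn; the `Ridge` estimator, the `alpha` range, and the synthetic data are illustrative choices, not from the lesson:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import validation_curve

X = np.random.rand(200, 5)                    # synthetic features
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])  # synthetic linear target
param_range = np.logspace(-3, 3, 7)           # candidate values of alpha

train_scores, valid_scores = validation_curve(
    Ridge(), X, y, param_name="alpha", param_range=param_range, cv=5
)
# High bias: both curves score poorly. High variance: training scores
# are much better than validation scores.
print(train_scores.mean(axis=1))
print(valid_scores.mean(axis=1))
```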

What does a gap between the training and validation error in learning curves indicate?

  • High variance (correct)
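
A sketch of computing learning curves and that gap, again with illustrative choices:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import learning_curve

X = np.random.rand(300, 5)                      # synthetic features
y = X.sum(axis=1) + 0.1 * np.random.randn(300)  # synthetic noisy target

sizes, train_scores, valid_scores = learning_curve(
    DecisionTreeRegressor(), X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5
)
# A large, persistent gap between the two curves points to high variance.
print(train_scores.mean(axis=1) - valid_scores.mean(axis=1))
```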

What is a common consequence of models with high variance?

  • Overfitting the data (correct)

How can models suffering from high bias be improved?

  • By adding additional features to the dataset (correct)

What is one common use of reducing a dataset into two dimensions when evaluating a classifier model?

  • To visualize the observations and decision boundary for performance evaluation (correct)
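
A sketch of one such two-dimensional reduction using PCA; the Iris dataset here is just an example:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)
X_2d = PCA(n_components=2).fit_transform(X)
print(X_2d.shape)  # (150, 2): ready for a scatter plot colored by class
```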

Which region in a validation curve indicates that a model is subject to high bias?

  • When both training and validation errors are high (correct)

What does underfitting refer to in machine learning?

  • The model is too limited to learn the true trend and performs poorly on new data (correct)

What is an appropriate approach for improving models that suffer from high variance?

  • Feeding more data during training (correct)

When will training on more data do very little to improve a model with high bias?

  • When the model underfits the data and pays little attention to it (correct)

What percentage of the data is typically used for training in a train/test/validation split?

  • 60% (correct)

Which metric is defined as the percentage of correct predictions for the test data?

  • Accuracy (correct)

What fraction is precision defined as?

  • True positives / (True positives + False positives) (correct)

In which scenario is recall important?

  • When developing a classification algorithm for disease prediction (correct)
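
A sketch of accuracy, precision, and recall on made-up binary labels, assuming scikit-learn:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up predictions

print(accuracy_score(y_true, y_pred))   # fraction of correct predictions
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN): key for disease screening
```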

What is the common approach for combining precision and recall metrics?

  • F-score (correct)
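
The F1 score, the usual F-score, is the harmonic mean of precision and recall; a minimal sketch:

```python
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
# F1 = 2 * (precision * recall) / (precision + recall)
print(f1_score(y_true, y_pred))
```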

Why do we have a different set of evaluation metrics for regression models compared to classification models?

  • Regression models predict within a continuous range, while classification models predict discrete classes (correct)

What does the explained variance metric represent?

  • The amount of variation in the original dataset that our model is able to explain (correct)
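
A sketch with scikit-learn's `explained_variance_score` on made-up values:

```python
from sklearn.metrics import explained_variance_score

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.3, 6.9, 9.2]
print(explained_variance_score(y_true, y_pred))
```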

What does mean squared error measure?

  • The average of the squared differences between the predicted output and the true output (correct)
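
A matching sketch with `mean_squared_error`:

```python
from sklearn.metrics import mean_squared_error

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.3, 6.9, 9.2]
print(mean_squared_error(y_true, y_pred))  # average of squared residuals
```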

Which metric compares the variance within the expected outcomes to the variance in the error of a regression model?

  • Explained variance (correct)

Which parameter allows us to control the tradeoff of importance between precision and recall?

  • Beta (correct)
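
A sketch of the beta tradeoff with scikit-learn's `fbeta_score` on the same made-up labels:

```python
from sklearn.metrics import fbeta_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(fbeta_score(y_true, y_pred, beta=2.0))  # beta > 1 favors recall
print(fbeta_score(y_true, y_pred, beta=0.5))  # beta < 1 favors precision
```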

What should be done before making splits in a train/test/validation scenario to ensure an accurate representation of the dataset?

  • Shuffle the data (correct)
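
A sketch; note that `shuffle=True` is already `train_test_split`'s default:

```python
from sklearn.model_selection import train_test_split
import numpy as np

X = np.random.rand(100, 4)  # synthetic features
y = np.random.rand(100)     # synthetic target

# shuffle=True (the default) randomizes row order before splitting
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=True, random_state=42
)
```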

Why are precision and recall useful in cases where classes aren't evenly distributed?

  • To ensure balanced predictions despite class imbalance (correct)
