Machine Learning Lecture 12: Ensemble Learning


10 Questions

What is a Random Forest?

Bagging with trees + random feature subsets

Random Forest can only be used for classification problems.

False

What is the main idea behind Stacking?

Combining multiple classification models via a meta-classifier.

The default criterion for sklearn.ensemble.RandomForestClassifier is _______________.

gini

What is the purpose of the meta-classifier in Stacking?

To combine the outputs of individual models

StackingCVClassifier uses pre-fitted classifiers.

False (by default, StackingCVClassifier fits the first-level classifiers itself within cross-validation rather than taking pre-fitted ones)

What is the difference between RandomForestClassifier and RandomForestRegressor?

One is for classification and the other is for regression.

In Random Forest, max_features is set to _______________ by default.

'sqrt'

Match the following ensemble learning techniques with their descriptions:

  • Random Forest = Combining multiple decision trees with random feature subsets
  • Stacking = Combining multiple models via a meta-classifier
  • Bagging = Bootstrapping and aggregating multiple models

Stacking can only be used with models that operate on the same feature subsets.

False

Study Notes

Ensemble Learning

  • Ensemble learning is a method of combining multiple base models to create a stronger predictive model.

Voting

  • Majority voting is a type of ensemble learning in which multiple independently trained classifiers are combined and the class predicted by most of them wins.
  • The error rate of the ensemble is computed using a binomial probability distribution: assuming independent classifiers with the same base error rate, the ensemble is wrong only when more than half of its members are wrong (see the sketch below).
  • The probability of a wrong ensemble prediction is therefore the sum of the binomial probabilities of every possible majority of classifiers making a wrong prediction.
  • Soft voting averages the classifiers' predicted class probabilities instead of counting hard votes; individual classifiers can additionally be assigned weights based on their performance.
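
To make the binomial error rate and soft voting concrete, here is a minimal sketch. It is not from the lecture: the synthetic dataset, the choice of base classifiers, and the voting weights are all illustrative, and it assumes scikit-learn's VotingClassifier.

```python
from math import comb

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def ensemble_error(n_classifiers, error):
    """P(ensemble is wrong): probability that more than half of
    n independent classifiers, each with the given error rate, err."""
    k_start = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k) * error**k * (1 - error) ** (n_classifiers - k)
        for k in range(k_start, n_classifiers + 1)
    )


# 11 classifiers at 25% error each -> ensemble error of roughly 3.4%
print(ensemble_error(11, 0.25))

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# voting="soft" averages predicted probabilities; weights are optional.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(max_depth=3)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
    weights=[2, 1, 1],
)
ensemble.fit(X_train, y_train)
print(ensemble.score(X_test, y_test))
```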

Bagging (Bootstrap Aggregating)

  • Bagging is an ensemble learning technique that combines multiple instances of the same base model, each trained on a different random subset of the training data.
  • Bootstrap sampling (drawing with replacement) is used to create these random subsets, and the models' predictions are then aggregated by voting or averaging.
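
A minimal sketch, not from the lecture, of bagging decision trees with scikit-learn's BaggingClassifier; the dataset and hyperparameters are illustrative, and the estimator= keyword assumes scikit-learn >= 1.2 (older versions call it base_estimator=).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # the base model to replicate
    n_estimators=100,   # number of bootstrap replicates
    max_samples=1.0,    # each bootstrap sample is as large as X
    bootstrap=True,     # sample with replacement
    random_state=0,
)
print(cross_val_score(bag, X, y, cv=5).mean())
```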

Boosting

  • Boosting is an ensemble learning technique that combines multiple weak learners sequentially to create a strong learner.
  • In general boosting, each subsequent model is trained with emphasis on the examples the previous models got wrong.
  • AdaBoost is a boosting variant that reweights the training dataset after each round, increasing the weights of misclassified examples so that the next weak learner focuses on them.
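
A minimal sketch, not from the lecture, of AdaBoost over decision stumps; the hyperparameters are illustrative, and as above the estimator= keyword assumes scikit-learn >= 1.2.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Each round reweights the training set so that examples the previous
# stump misclassified count for more in the next fit.
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # weak learner: a stump
    n_estimators=200,
    learning_rate=0.5,
    random_state=0,
)
print(cross_val_score(ada, X, y, cv=5).mean())
```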

Gradient Boosting

  • Gradient boosting is a boosting variant in which each new weak learner is fit to the residual errors (the negative gradient of the loss) of the current ensemble.
  • Trees are fit sequentially, each one improving on the errors of the previous trees.
  • The trees are combined additively, with each tree's contribution typically shrunk by a learning rate.
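
A minimal sketch, not from the lecture, using scikit-learn's GradientBoostingClassifier; the hyperparameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# Each new tree is fit to the residual errors of the current additive
# model; learning_rate shrinks every tree's contribution to the sum.
gbm = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    random_state=0,
)
print(cross_val_score(gbm, X, y, cv=5).mean())
```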

Random Forest

  • Random Forest is an ensemble learning technique that combines bagging with decision trees and random feature subsets.
  • Each tree is trained on a bootstrap sample of the training data, and each split considers only a random subset of the features.
  • Random Forest can be used for both classification (RandomForestClassifier) and regression (RandomForestRegressor) tasks.
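
A minimal sketch, not from the lecture, spelling out the scikit-learn defaults the quiz asks about (criterion='gini' and max_features='sqrt' are the classifier defaults in scikit-learn >= 1.1; the dataset is illustrative).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

rf = RandomForestClassifier(
    n_estimators=100,
    criterion="gini",     # default split criterion
    max_features="sqrt",  # default: sqrt(n_features) candidates per split
    random_state=0,
)
print(cross_val_score(rf, X, y, cv=5).mean())

# Same API for regression; the regressor predicts continuous targets
# (and defaults to max_features=1.0, i.e. all features).
reg = RandomForestRegressor(n_estimators=100, random_state=0)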

Stacking

  • Stacking is an ensemble learning technique that combines multiple classification models using a meta-classifier.
  • The individual classification models are trained on the complete training set, and then the meta-classifier is fitted based on the outputs of the individual models.
  • The meta-classifier can be trained on either the predicted class labels or the predicted probabilities of the individual models.
  • Cross-validated variants such as StackingCVClassifier instead fit the first-level classifiers within cross-validation and train the meta-classifier on out-of-fold predictions, which reduces overfitting.
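
A minimal sketch, not from the lecture, using scikit-learn's StackingClassifier (mlxtend's StackingCVClassifier is analogous); the base models and hyperparameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(max_depth=3)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(),  # the meta-classifier
    stack_method="predict_proba",  # feed probabilities, not labels, upward
    cv=5,  # meta-classifier is fit on out-of-fold predictions
)
print(cross_val_score(stack, X, y, cv=5).mean())
```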

This quiz covers the concepts of ensemble learning, including majority voting and error rates, as part of a machine learning course.
