Machine Learning Fundamentals

Questions and Answers

Which of the following tasks are examples of unsupervised learning?

Predicting the likelihood of a customer clicking 'like' on a product, given their demographic information.
Identifying social clusters within a community. (correct)
Sorting news articles into predefined categories based on their titles.
Predicting the exchange rate between USD and EUR based on historical data.

Which of the following is TRUE regarding Reinforcement Learning (RL)?

The primary goal of an RL policy is immediate reward, ignoring long-term consequences.
RL excels in creating self-improving game-playing agents, as shown in strategic games. (correct)
During training, an RL agent needs an external supervisor to explicitly guide its actions.
The actions taken by an RL agent have absolutely no influence on the environment.

Which of the following presents a regression task?

Segmenting customers into distinct groups based on purchasing patterns.
Predicting if a patient will develop disease X, given their genome sequence.
Categorizing customer reviews as positive, negative, or neutral.
Predicting the height of a tree (a real-valued quantity), given environmental conditions. (correct)

Which of the following is most clearly a classification task?

Determining whether a satellite image shows a forest or a desert. (B)

Consider a linear regression model $y = \beta_0 + \beta_1x_1 + \beta_2x_2$ fitted to a dataset with mean-squared error loss, giving $\beta_0 = 2$, $\beta_1 = 1$, and $\beta_2 = -1.5$. What is the predicted value of $y$ at the point $(x_1, x_2) = (0.5, -1.0)$?

4.0
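
The prediction is a direct substitution into the model equation. A minimal sketch using the coefficients and query point stated in the question (the function name `predict_linear` is just an illustrative choice):

```python
# Coefficients and query point as stated in the question.
beta0, beta1, beta2 = 2.0, 1.0, -1.5
x1, x2 = 0.5, -1.0

def predict_linear(x1, x2):
    """Evaluate y = beta0 + beta1*x1 + beta2*x2."""
    return beta0 + beta1 * x1 + beta2 * x2

y_hat = predict_linear(x1, x2)
print(y_hat)  # 2 + 0.5 + 1.5 = 4.0
```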

Consider a k-NN regression model with $k = 3$. Which of the following is the predicted $y$ value at the point $(x_1, x_2) = (1.0, 0.5)$, if the three nearest neighbors have $y$ values of 2.65, -2.05, and 1.95?

0.85 (D)
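
A k-NN regressor with uniform weights predicts the mean of the neighbors' target values. A small sketch with the three values from the question, assuming the neighbors have already been found and are weighted equally:

```python
# Target values of the three nearest neighbors, as given in the question.
neighbor_values = [2.65, -2.05, 1.95]

def knn_regress(values):
    """Uniform-weight k-NN regression: average the neighbors' targets."""
    return sum(values) / len(values)

knn_prediction = knn_regress(neighbor_values)
print(round(knn_prediction, 2))  # 0.85
```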

With a k-NN classifier with $k = 5$, you want to predict the class label at point $(x_1, x_2) = (1.0, 1.0)$. The five nearest neighbors have class labels: 0, 0, 1, 1, 2. What is the resulting classification?

1 (D)
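
A k-NN classifier counts votes among the neighbors' labels. With the labels listed in the question, classes 0 and 1 receive two votes each, so the outcome depends on a tie-breaking rule that the question does not state (for example, preferring the closest neighbor or weighting by distance, both of which are assumptions here). The sketch below only reports which classes share the top vote count:

```python
from collections import Counter

def top_voted(labels):
    """Return (labels tied for the most votes, that vote count)."""
    counts = Counter(labels)
    top = max(counts.values())
    tied = sorted(label for label, c in counts.items() if c == top)
    return tied, top

winners, votes = top_voted([0, 0, 1, 1, 2])
print(winners, votes)  # [0, 1] 2
```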

Regarding linear regression and k-NN regression models, which statement is correct?

k-NN regression requires the training instances during the prediction phase. (A)

Which of the following options correctly relates model bias and variance to overfitting/underfitting?

Low bias and high variance often signify overfitting. (D)

Model (i) is given by $y = \beta_0 + \beta_1x_1 + \beta_2x_2$ and Model (ii) as $y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1x_2 + \beta_4x_1^2 + \beta_5x_2^2$. Which of the following options is correct?

Model (ii) is expected to show higher variance than Model (i). (D)
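
One way to see why Model (ii) is the more flexible of the two: Model (i) is the special case of Model (ii) with $\beta_3 = \beta_4 = \beta_5 = 0$. The sketch below checks this on a few points; the coefficient values and test points are illustrative, not fitted to any dataset.

```python
def model_i(x1, x2, b):
    """Model (i): b = (b0, b1, b2)."""
    return b[0] + b[1] * x1 + b[2] * x2

def model_ii(x1, x2, b):
    """Model (ii): b = (b0, ..., b5), adding interaction and squared terms."""
    return (b[0] + b[1] * x1 + b[2] * x2
            + b[3] * x1 * x2 + b[4] * x1 ** 2 + b[5] * x2 ** 2)

coeffs_i = (2.0, 1.0, -1.5)             # illustrative values, not fitted
coeffs_ii = coeffs_i + (0.0, 0.0, 0.0)  # the same function expressed in (ii)

points = [(0.5, -1.0), (1.0, 0.5), (-2.0, 3.0)]
same = all(model_i(x1, x2, coeffs_i) == model_ii(x1, x2, coeffs_ii)
           for x1, x2 in points)
print(same)  # True
```

Because every function representable by (i) is also representable by (ii), the extra terms can only add capacity, which is what tends to raise variance on small training sets.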

Which of the following correctly describes feature scaling?

A method to normalize the range of independent variables or features of data. (D)
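
Two common forms of feature scaling are min-max scaling to the range [0, 1] and standardization to zero mean and unit variance. A minimal sketch on made-up data, assuming the feature is not constant (otherwise both functions would divide by zero):

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # made-up illustrative data
scaled = min_max_scale(heights_cm)
z_scores = standardize(heights_cm)
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```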

What is the primary purpose of cross-validation?

To estimate the performance of a model on unseen data. (C)
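
In k-fold cross-validation, the data is split into k folds and each fold serves once as the held-out validation set while the model trains on the rest. The sketch below only generates the index splits; it assumes no shuffling and leaves model fitting and scoring out. The helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(n_samples=6, k=3))
for train_idx, val_idx in folds:
    print(train_idx, val_idx)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Averaging the validation score over the folds gives the estimate of performance on unseen data.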

What is the role of the activation function in a neural network?

To introduce non-linearity into the model. (C)
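
Without an activation function, stacked linear layers collapse into a single linear map. The toy sketch below uses one-dimensional "layers" with hypothetical weights and ReLU as the example activation; it checks additivity, which a linear function satisfies and the ReLU network does not.

```python
def relu(z):
    """Rectified linear unit: pass positive inputs, zero out negative ones."""
    return max(0.0, z)

w1, w2 = 2.0, 3.0  # hypothetical weights for two one-dimensional layers

def two_linear_layers(x):
    # Equivalent to a single linear map with weight w1 * w2.
    return w2 * (w1 * x)

def with_relu(x):
    # The same two layers with a ReLU between them.
    return w2 * relu(w1 * x)

linear_sum = two_linear_layers(1.0) + two_linear_layers(-1.0)
relu_sum = with_relu(1.0) + with_relu(-1.0)
print(linear_sum)  # 0.0, matching f(1 + (-1)) = f(0) for a linear map
print(relu_sum)    # 6.0, so additivity fails and the map is non-linear
```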

What is the learning rate in the context of training neural networks?

The step size at each iteration while moving toward a minimum of a loss function. (A)
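
The effect of the step size is easy to see on a toy loss. The sketch below runs plain gradient descent on $L(w) = (w - 3)^2$, a made-up one-parameter loss with its minimum at $w = 3$; the two learning rates are illustrative choices.

```python
def gradient_descent(learning_rate, steps=50, w=0.0):
    """Minimize L(w) = (w - 3)^2 by repeated steps against the gradient."""
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)     # dL/dw
        w -= learning_rate * grad  # step size scales the update
    return w

w_small = gradient_descent(0.1)  # each step shrinks the error by a factor 0.8
w_large = gradient_descent(1.1)  # each step multiplies the error by -1.2

print(abs(w_small - 3.0) < 1e-3)  # True: converges close to the minimum
print(abs(w_large - 3.0) > 1e3)   # True: overshoots and diverges
```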

What is the purpose of regularization in machine learning?

To prevent overfitting. (B)
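
One common form of regularization is an L2 (ridge) penalty, which shrinks coefficients toward zero. As a sketch, for a one-feature model $y = wx$ without intercept, minimizing $\sum_i (y_i - wx_i)^2 + \lambda w^2$ gives $w = \sum_i x_i y_i / (\sum_i x_i^2 + \lambda)$. The data and the values of $\lambda$ below are made up for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form minimizer of sum((y - w*x)^2) + lam * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # exactly y = 2x

w_unregularized = ridge_slope(xs, ys, 0.0)
w_regularized = ridge_slope(xs, ys, 14.0)
print(w_unregularized)  # 2.0
print(w_regularized)    # 1.0, the penalty pulls the slope toward zero
```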

Which of the following is NOT a common distance metric used in k-NN algorithms?

Bayesian distance (B)
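
For contrast, distance metrics that are commonly used with k-NN include Euclidean, Manhattan, and the Minkowski family that generalizes both. A minimal sketch with illustrative points:

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    return minkowski(a, b, 2)  # p = 2

def manhattan(a, b):
    return minkowski(a, b, 1)  # p = 1

a, b = (0.0, 0.0), (3.0, 4.0)
d_euclid = euclidean(a, b)
d_manhattan = manhattan(a, b)
print(d_euclid, d_manhattan)  # 5.0 7.0
```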

What is the primary goal of anomaly detection?

To identify rare or unusual data points. (C)
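
A simple baseline for anomaly detection flags points that lie far from the mean in units of standard deviation. The sketch below is one such z-score rule; the threshold of 2.0 and the readings are assumptions chosen for illustration, and the rule presumes the data is not constant.

```python
from statistics import mean, pstdev

def z_score_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

readings = [10, 11, 9, 10, 12, 10, 50]  # made-up data with one unusual value
outliers = z_score_outliers(readings)
print(outliers)  # [50]
```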

Which of the following statements about decision trees is true?

Decision trees can be used for both classification and regression tasks. (C)

What is the purpose of a confusion matrix in classification?

To evaluate the accuracy of a classification model. (C)
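
A confusion matrix counts how often each true class is predicted as each class, and accuracy follows from its diagonal. A small sketch for a binary problem with made-up labels, where 1 is treated as the positive class:

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(y_true, y_pred))

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up predictions

cm = confusion_counts(y_true, y_pred)
tp, fn = cm[(1, 1)], cm[(1, 0)]  # true positives, false negatives
fp, tn = cm[(0, 1)], cm[(0, 0)]  # false positives, true negatives
accuracy = (tp + tn) / len(y_true)
print(tp, fn, fp, tn, accuracy)  # 3 1 1 3 0.75
```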

Which of the following describes transfer learning?

Using a model trained on one task as a starting point for another task. (B)

Flashcards

Unsupervised learning

A learning method where the system learns patterns from unlabeled data.

Identifying close-knit communities

Finding clusters or groups within a dataset without predefined labels.

Generate artificial human faces

Creating new, realistic images from a dataset using generative models.

Reinforcement Learning (RL)

An ML approach where the agent learns to make decisions by maximizing cumulative reward in an environment.

RL agents playing against themselves

An RL agent can improve by competing against a copy of itself (self-play).

Regression task

Predicting a continuous numerical output.

Predicting COVID cases and football goals

Predicting numerical values such as the number of new COVID cases or football goals.

Classification task

Assigning an input to a category or class.

Customer repays loan

Deciding group membership based on input features.

k-Nearest Neighbors (k-NN)

An algorithm that classifies a new data point by the majority class among its k nearest neighbors in the feature space.

Overfitting

A condition in which a model performs well on training data but poorly on unseen data.

Variance in ML

A measure of how much a model's predictions vary for different training sets.

Bias in ML

The difference between the expected prediction of the model and the correct value it is trying to predict.

Overfitting sign

Low bias combined with high variance.

Higher variance

Models with greater complexity, like (ii), tend to have higher variance and overfit more easily.

Study Notes

Unsupervised learning problems include:
Identifying close-knit communities in a social network
Learning to generate artificial human faces using faces from a facial recognition dataset
For Reinforcement Learning (RL), the following are true:
RL agents used for playing turn-based games like chess can be trained by playing the agent against itself (self play)
RL can be used in an autonomous driving system
Regression tasks include:
Predicting the number of new COVID cases in a given time period
Predicting the total number of goals a given football team scores in a year
Classification tasks include:
Predicting whether or not a customer will repay a loan based on their credit history
Predicting if a house will be standing 50 years after it is constructed
For a linear regression model fitted with mean-squared error loss, with β0 = 2, β1 = 1, and β2 = -1.5, the predicted value of y at the point (x1,x2) = (0.5, -1.0) is 4.0
Using a k-nearest neighbor (k-NN) regression model with k = 3 and Euclidean distance, where the three nearest neighbors have y values 2.65, -2.05, and 1.95, the predicted value of y at (x1,x2) = (1.0, 0.5) is 0.85
Using a k-NN classifier with k = 5 and Euclidean distance, the class label at the point (x1,x2) = (1.0, 1.0) is 1
Regarding linear regression and k-NN regression models:
A k-NN regressor requires the training data points during inference
A k-NN regressor with a higher value of k is less prone to overfitting
Regarding bias and variance:
$\mathrm{Bias} = E[\hat{f}(x)] - f(x)$; $\mathrm{Variance} = E[(E[\hat{f}(x)] - \hat{f}(x))^2]$
Low bias and high variance is a sign of overfitting
Given two regression models:
(i) y=β0+β1x1+β2x2
(ii) y=β0+β1x1+β2x2+β3x1x2+β4x1^2+β5x2^2
On a given training dataset, the least-squares mean-squared error of (ii) is always less than or equal to that of (i), because (i) is a special case of (ii)
(ii) is likely to have a higher variance than (i)
If (i) overfits the data, then (ii) will definitely overfit
If (ii) underfits the data, then (i) will definitely underfit
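
The bias and variance definitions in these notes combine into the standard decomposition of expected squared error. The sketch below assumes observations of the form $y = f(x) + \varepsilon$, where the noise $\varepsilon$ has zero mean, variance $\sigma^2$, and is independent of the fitted model; this noise model and the symbols $\varepsilon$ and $\sigma^2$ are assumptions not stated in the notes. Expectations are taken over training sets and noise.

```latex
E\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(E[\hat{f}(x)] - f(x)\big)^2}_{\mathrm{Bias}^2}
  + \underbrace{E\big[(E[\hat{f}(x)] - \hat{f}(x))^2\big]}_{\mathrm{Variance}}
  + \sigma^2
```

An overfitting model typically has a small first term and a large second term, while an underfitting model shows the reverse.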
