Machine Learning Fundamentals

Questions and Answers

Which of the following tasks are examples of unsupervised learning?

  • Predicting the likelihood of a customer clicking 'like' on a product, given their demographic information.
  • Identifying social clusters within a community. (correct)
  • Sorting news articles into predefined categories based on their titles.
  • Predicting the exchange rate between USD and EURO based on historical data.

Which of the following is TRUE regarding Reinforcement Learning (RL)?

  • The primary goal of an RL policy is immediate reward, ignoring long-term consequences.
  • RL excels in creating self-improving game-playing agents, as shown in strategic games. (correct)
  • During training, an RL agent needs an external supervisor to explicitly guide its actions.
  • The actions taken by an RL agent have absolutely no influence on the environment.

Which of the following presents a regression task?

  • Segmenting customers into distinct groups based on purchasing patterns.
  • Predicting if a patient will develop disease X, given their genome sequence.
  • Categorizing customer reviews as positive, negative, or neutral.
  • Predicting the float value of the height of a tree, given environmental conditions. (correct)

Which of the following is most clearly a classification task?

  • Determining whether a satellite image shows a forest or a desert. (correct)

Consider a dataset used to fit a linear regression model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ with mean-squared error loss. Given $\beta_0 = 2$, $\beta_1 = 1$, and $\beta_2 = -1.5$, what is the predicted value of $y$ at the point $(x_1, x_2) = (0.5, -1.0)$?

  • 4.0, since $2 + (1)(0.5) + (-1.5)(-1.0) = 4.0$ (correct)
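
The arithmetic behind this question can be checked with a few lines of Python. This is a minimal sketch, not part of the original lesson: the helper name `predict_linear` is made up for illustration, and the coefficients and query point are the ones stated in the question.

```python
def predict_linear(beta, x):
    """Return beta[0] + sum(beta[i + 1] * x[i]) for a linear model."""
    intercept, *weights = beta
    return intercept + sum(w * xi for w, xi in zip(weights, x))

# Coefficients and query point taken from the question text.
beta = [2.0, 1.0, -1.5]   # beta_0, beta_1, beta_2
x = [0.5, -1.0]           # x_1, x_2

y_hat = predict_linear(beta, x)
print(y_hat)  # 2 + 1*0.5 + (-1.5)*(-1.0) = 4.0
```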

Consider a k-NN regression model with $k = 3$. What is the predicted $y$ value at the point $(x_1, x_2) = (1.0, 0.5)$ if the three nearest neighbors have $y$ values of 2.65, -2.05, and 1.95?

  • 0.85 (correct)
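
A k-NN regressor with uniform weights predicts the mean of the neighbors' target values. The sketch below assumes uniform weighting (distance-weighted variants exist) and uses a hypothetical helper name, `knn_regress`; the three target values come from the question.

```python
def knn_regress(neighbor_targets):
    """Uniform-weight k-NN regression: average the neighbors' y values."""
    return sum(neighbor_targets) / len(neighbor_targets)

# y values of the three nearest neighbors, as given in the question.
y_hat = knn_regress([2.65, -2.05, 1.95])
print(round(y_hat, 2))  # 0.85
```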

With a k-NN classifier with $k = 5$, you want to predict the class label at the point $(x_1, x_2) = (1.0, 1.0)$. The five nearest neighbors have class labels 0, 0, 1, 1, 2. What is the resulting classification?

  • 1 (correct), assuming the tie between classes 0 and 1 (two votes each) is broken in favor of class 1
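
Majority voting over these five labels produces a tie, which is easy to see by counting. The sketch below is illustrative only: `knn_vote` is a made-up helper, and it reports every label that shares the top count instead of picking one, since the tie-breaking rule is not stated in the question.

```python
from collections import Counter

def knn_vote(neighbor_labels):
    """Return (top_count, sorted labels that share the top count)."""
    counts = Counter(neighbor_labels)
    top = max(counts.values())
    return top, sorted(label for label, c in counts.items() if c == top)

# Labels of the five nearest neighbors, as given in the question.
top_count, tied = knn_vote([0, 0, 1, 1, 2])
print(top_count, tied)  # 2 [0, 1]
```

Common ways to resolve such a tie include preferring the class of the single nearest neighbor, weighting votes by inverse distance, or choosing an odd `k` relative to the number of classes.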

Regarding linear regression and k-NN regression models, which statement is correct?

  • k-NN regression requires the training instances during the prediction phase. (correct)

Which of the following options correctly relates model bias and variance to overfitting/underfitting?

  • Low bias and high variance often signify overfitting. (correct)

Model (i) is given by $y = \beta_0 + \beta_1x_1 + \beta_2x_2$ and Model (ii) as $y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1x_2 + \beta_4x_1^2 + \beta_5x_2^2$. Which of the following options is correct?

  • Model (ii) is expected to show higher variance than Model (i). (correct)

Which of the following correctly describes feature scaling?

  • A method to normalize the range of independent variables or features of data. (correct)
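
Two common forms of feature scaling are min-max scaling and standardization (z-scores). The sketch below uses only the standard library; the function names and the `heights_cm` data are invented for illustration, and both functions assume the feature is not constant (otherwise they would divide by zero).

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to zero mean and unit (population) std dev."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # made-up example feature
scaled = min_max_scale(heights_cm)
zscores = standardize(heights_cm)
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Scaling matters most for distance-based methods such as k-NN, where a feature with a large numeric range would otherwise dominate the distance.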

What is the primary purpose of cross-validation?

  • To estimate the performance of a model on unseen data. (correct)
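
A minimal sketch of k-fold cross-validation follows. It is not a library API: `k_fold_indices` and `cross_val_mse` are hypothetical helpers, the "model" is simply the mean of the training targets, and the folds are contiguous with no shuffling (in practice the data is usually shuffled first).

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def cross_val_mse(y, k):
    """Average validation MSE of a predict-the-training-mean baseline."""
    scores = []
    for train, val in k_fold_indices(len(y), k):
        mean_train = sum(y[i] for i in train) / len(train)  # "fit" on train only
        scores.append(sum((y[i] - mean_train) ** 2 for i in val) / len(val))
    return sum(scores) / len(scores)

folds = list(k_fold_indices(10, 3))
for train, val in folds:
    print(len(train), val)

targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up targets
cv_score = cross_val_mse(targets, 3)
```

Each sample is used for validation exactly once, and the model never sees its validation fold during fitting, which is what makes the averaged score an estimate of performance on unseen data.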

What is the role of the activation function in a neural network?

  • To introduce non-linearity into the model. (correct)
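
Without an activation function, stacked linear layers collapse into a single linear map. The scalar sketch below illustrates this with made-up weights; `two_layer` and its parameter values are assumptions chosen only for the demonstration.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)

def two_layer(x, activation=None):
    """Two scalar 'layers' with hypothetical weights, optional activation."""
    w1, b1, w2, b2 = 2.0, -1.0, 3.0, 0.5
    h = w1 * x + b1
    if activation is not None:
        h = activation(h)
    return w2 * h + b2

# No activation: identical to the single linear map 6x - 2.5.
linear_out = [two_layer(x) for x in (-1.0, 0.0, 1.0)]
# With ReLU: the output is flat for x <= 0.5, so it is no longer linear.
relu_out = [two_layer(x, relu) for x in (-1.0, 0.0, 1.0)]
print(linear_out)  # [-8.5, -2.5, 3.5]
print(relu_out)    # [0.5, 0.5, 3.5]
```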

What is the learning rate in the context of training neural networks?

  • The step size at each iteration while moving toward a minimum of a loss function. (correct)
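
The effect of the step size can be seen on a one-dimensional toy loss. The sketch below assumes the loss $f(w) = (w - 3)^2$, which is not from the lesson; the function name and the two learning rates are likewise chosen only for illustration.

```python
def gradient_descent(lr, steps=50, w0=0.0):
    """Minimize f(w) = (w - 3)^2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)   # derivative of (w - 3)^2
        w -= lr * grad           # step size is controlled by lr
    return w

w_small = gradient_descent(lr=0.1)  # error shrinks by a factor 0.8 per step
w_large = gradient_descent(lr=1.1)  # error grows by a factor 1.2 per step
print(abs(w_small - 3.0) < 1e-3, abs(w_large - 3.0) > 1e3)  # True True
```

For this loss each update multiplies the distance to the minimum by $|1 - 2\,\mathrm{lr}|$, so any learning rate above 1 makes the iterates diverge.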

What is the purpose of regularization in machine learning?

  • To prevent overfitting. (correct)
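
One common regularizer is an L2 (ridge) penalty, which shrinks coefficients toward zero. The sketch below assumes a one-feature model with no intercept, where minimizing $\sum_i (y_i - w x_i)^2 + \lambda w^2$ gives $w = \sum_i x_i y_i / (\sum_i x_i^2 + \lambda)$; the helper name and data are invented.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge solution for y ~ w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # made-up data lying exactly on y = 2x

w_ols = ridge_slope(xs, ys, lam=0.0)     # no penalty: 28 / 14
w_ridge = ridge_slope(xs, ys, lam=14.0)  # penalty shrinks the slope: 28 / 28
print(w_ols, w_ridge)  # 2.0 1.0
```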

Which of the following is NOT a common distance metric used in k-NN algorithms?

  • Bayesian distance (correct)
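
For contrast, the distance metrics that are commonly used with k-NN (Euclidean, Manhattan, and their generalization, Minkowski) can be sketched in a few lines. The function names and the two sample points are illustrative assumptions.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    return minkowski(a, b, 2)   # p = 2

def manhattan(a, b):
    return minkowski(a, b, 1)   # p = 1

a, b = (0.0, 0.0), (3.0, 4.0)
d_euclid = euclidean(a, b)
d_manhat = manhattan(a, b)
print(d_euclid, d_manhat)  # 5.0 7.0
```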

What is the primary goal of anomaly detection?

  • To identify rare or unusual data points. (correct)
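
One simple way to flag unusual points is a z-score rule: mark values that lie more than some number of standard deviations from the mean. This is only one of many anomaly-detection approaches; the helper name, the threshold of 2.0, and the `readings` data are all assumptions made for this sketch.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=2.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

readings = [10, 11, 9, 10, 12, 10, 50]  # made-up data with one extreme value
outliers = zscore_outliers(readings)
print(outliers)  # [50]
```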

Which of the following statements about decision trees is true?

  • Decision trees can be used for both classification and regression tasks. (correct)

What is the purpose of a confusion matrix in classification?

  • To evaluate the accuracy of a classification model. (correct)
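
A confusion matrix counts how often each true class is predicted as each class. In the sketch below, rows are true labels and columns are predicted labels (a convention, not the only one); the helper and the label lists are invented for illustration.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows index the true label, columns index the predicted label."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

y_true = [0, 0, 1, 1, 1, 2]   # made-up ground truth
y_pred = [0, 1, 1, 1, 0, 2]   # made-up predictions

cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
accuracy = sum(cm[i][i] for i in range(3)) / len(y_true)  # diagonal / total
print(cm)  # [[1, 1, 0], [1, 2, 0], [0, 0, 1]]
```

Accuracy is the sum of the diagonal divided by the total count; the off-diagonal cells show which classes are confused with which, something accuracy alone does not reveal.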

Which of the following describes transfer learning?

  • Using a model trained on one task as a starting point for another task. (correct)

Flashcards

Unsupervised learning

A learning method where the system learns patterns from unlabeled data.

Identifying close-knit communities

Finding clusters or groups within a dataset without predefined labels.

Generate artificial human faces

Creating new, realistic images from a dataset using generative models.

Reinforcement Learning (RL)

An ML approach where the agent learns to make decisions by maximizing cumulative reward in an environment.

RL agents playing against themselves

An RL agent can improve by playing against a copy of itself (self-play).

Regression task

Predicting a continuous numerical output.

Predicting COVID cases and football goals

Regression examples: predicting numerical values such as the number of new COVID cases or the number of goals a football team scores.

Classification task

Assigning an input to a category or class.

Customer repays loan

A classification example: deciding which group (repays or does not repay) a customer belongs to based on input features such as credit history.

k-Nearest Neighbors (k-NN)

An algorithm that classifies a new data point by the majority class of its k nearest neighbors in the feature space; for regression, it averages the neighbors' target values.

Overfitting

A condition in which a model performs well on training data but poorly on unseen data.

Variance in ML

A measure of how much a model's predictions vary for different training sets.

Bias in ML

The difference between the expected prediction of our model and the correct value which we are trying to predict.

Overfitting sign

Low bias but high variance

Higher variance

Models with greater complexity, like (ii), tend to have higher variance, and overfit easily.

Study Notes

  • Unsupervised learning problems include:
      • Identifying close-knit communities in a social network
      • Learning to generate artificial human faces using faces from a facial recognition dataset
  • For Reinforcement Learning (RL), the following are true:
      • RL agents for turn-based games like chess can be trained by playing the agent against itself (self-play)
      • RL can be used in an autonomous driving system
  • Regression tasks include:
      • Predicting the number of new COVID cases in a given time period
      • Predicting the total number of goals a given football team scores in a year
  • Classification tasks include:
      • Predicting whether or not a customer will repay a loan based on their credit history
      • Predicting if a house will be standing 50 years after it is constructed
  • For the lesson's example dataset (not reproduced here):
      • A linear regression model fit with mean-squared error loss predicts $y = 4.05$ at the point $(x_1, x_2) = (0.5, -1.0)$
      • A k-nearest neighbor (k-NN) regression model with $k = 3$ and Euclidean distance predicts $y = 1.733$ at $(x_1, x_2) = (1.0, 0.5)$
      • A k-NN classifier with $k = 5$ and Euclidean distance assigns class label 1 at the point $(x_1, x_2) = (1.0, 1.0)$
  • Regarding linear regression and k-NN regression models:
      • A k-NN regressor requires the training data points during inference
      • A k-NN regressor with a higher value of k is less prone to overfitting
  • Regarding bias and variance:
      • $\text{Bias} = \mathbb{E}[\hat{f}(x)] - f(x)$; $\text{Variance} = \mathbb{E}\left[\left(\mathbb{E}[\hat{f}(x)] - \hat{f}(x)\right)^2\right]$
      • Low bias and high variance is a sign of overfitting
  • Given two regression models:
      • (i) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
      • (ii) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
      • On a given training dataset, when both are fit by least squares, the mean-squared error of (ii) is always less than or equal to that of (i), because (ii) contains every term of (i)
      • (ii) is likely to have a higher variance than (i)
      • If (i) overfits the data, then (ii) is expected to overfit as well
      • If (ii) underfits the data, then (i) is expected to underfit as well
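
Because model (ii) contains every term of model (i), a least-squares fit of (ii) can never have a larger training MSE than a fit of (i). The sketch below checks this numerically on synthetic data. Everything in it is an assumption made for illustration: the helper names, the data-generating formula, the noise level, and the use of the normal equations solved by Gaussian elimination (a numerical library would normally be used instead).

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_mse(features, y):
    """Least-squares fit via the normal equations; return the training MSE."""
    d = len(features[0])
    XtX = [[sum(f[i] * f[j] for f in features) for j in range(d)] for i in range(d)]
    Xty = [sum(f[i] * yi for f, yi in zip(features, y)) for i in range(d)]
    beta = solve(XtX, Xty)
    preds = [sum(b * fi for b, fi in zip(beta, f)) for f in features]
    return sum((p - yi) ** 2 for p, yi in zip(preds, y)) / len(y)

random.seed(0)
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(30)]
# Synthetic targets with an interaction term plus Gaussian noise.
y = [1.0 + x1 - 2.0 * x2 + 0.5 * x1 * x2 + random.gauss(0, 0.1) for x1, x2 in points]

feats_i = [[1.0, x1, x2] for x1, x2 in points]                             # model (i)
feats_ii = [[1.0, x1, x2, x1 * x2, x1 ** 2, x2 ** 2] for x1, x2 in points]  # model (ii)

mse_i, mse_ii = fit_mse(feats_i, y), fit_mse(feats_ii, y)
print(mse_i, mse_ii, mse_ii <= mse_i)
```

A lower training error for (ii) says nothing about its error on unseen data; the extra flexibility is exactly what gives (ii) the higher variance.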
