Machine Learning Practice Questions

Summary

This document contains practice questions on introductory machine learning topics: unsupervised learning, reinforcement learning (RL), regression and classification tasks, linear regression, and k-nearest neighbour (k-NN) models. The questions cover identifying the type of learning problem, predicting outcomes from a dataset, and comparing the properties of different models, such as bias and variance.

Full Transcript

Week 1

Which of the following is/are unsupervised learning problem(s)?
- Sorting a set of news articles into four categories based on their titles
- Forecasting the stock price of a given company based on historical data
- Predicting the type of interaction (positive/negative) between a new drug and a set of human proteins
- Identifying close-knit communities of people in a social network
- Learning to generate artificial human faces using the faces from a facial recognition dataset
Yes, the answer is correct. Score: 1
Accepted Answers:
- Identifying close-knit communities of people in a social network
- Learning to generate artificial human faces using the faces from a facial recognition dataset

1 point
Which of the following statement(s) about Reinforcement Learning (RL) is/are true?
- While learning a policy, the goal is to maximize the reward for the current time step.
- During training, the agent is explicitly provided the optimal action to be taken in each state.
- The actions taken by an agent do not affect the environment in any way.
- RL agents used for playing turn-based games like chess can be trained by playing the agent against itself (self-play).
- RL can be used in an autonomous driving system.
Yes, the answer is correct. Score: 1
Accepted Answers:
- RL agents used for playing turn-based games like chess can be trained by playing the agent against itself (self-play).
- RL can be used in an autonomous driving system.

1 point
Which of the following is/are regression task(s)?
- Predicting whether an email is spam or not spam
- Predicting the number of new COVID cases in a given time period
- Predicting the total number of goals a given football team scores in a year
- Identifying the language used in a given text document
Yes, the answer is correct. Score: 1
Accepted Answers:
- Predicting the number of new COVID cases in a given time period
- Predicting the total number of goals a given football team scores in a year

1 point
Which of the following is/are classification task(s)?
- Predicting whether or not a customer will repay a loan based on their credit history
- Forecasting the weather (temperature, humidity, rainfall etc.) at a given place for the following 24 hours
- Predict the price of a house 10 years after it is constructed.
- Predict if a house will be standing 50 years after it is constructed.
Yes, the answer is correct. Score: 1
Accepted Answers:
- Predicting whether or not a customer will repay a loan based on their credit history
- Predict if a house will be standing 50 years after it is constructed.

Consider the following dataset. Fit a linear regression model of the form $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ using the mean-squared error loss. Using this model, the predicted value of $y$ at the point $(x_1, x_2) = (0.5, -1.0)$ is
- 4.05
- 2.05
- −1.95
- −3.95
Yes, the answer is correct. Score: 1
Accepted Answers:
- 4.05

Consider the following dataset. Using a k-nearest neighbour (k-NN) regression model with $k = 3$, predict the value of $y$ at $(x_1, x_2) = (1.0, 0.5)$. Use the Euclidean distance to find the nearest neighbours.
- −1.766
- −1.166
- 1.133
- 1.733
Yes, the answer is correct. Score: 1
Accepted Answers:
- 1.733

1 point
Consider the following dataset with three classes: 0, 1 and 2. $x_1$ and $x_2$ are the independent variables whereas $y$ is the class label. Using a k-NN classifier with $k = 5$, predict the class label at the point $(x_1, x_2) = (1.0, 1.0)$. Use the Euclidean distance to find the nearest neighbours.
- 0
- 1
- 2
- Cannot be predicted
Yes, the answer is correct. Score: 1
Accepted Answers:
- 1

1 point
Consider the following statements regarding linear regression and k-NN regression models. Select the true statements.
- A linear regressor requires the training data points during inference.
- A k-NN regressor requires the training data points during inference.
- A k-NN regressor with a higher value of k is less prone to overfitting.
- A linear regressor partitions the input space into multiple regions such that the prediction over a given region is constant.
Yes, the answer is correct. Score: 1
Accepted Answers:
- A k-NN regressor requires the training data points during inference.
- A k-NN regressor with a higher value of k is less prone to overfitting.

Which of the following statement(s) regarding bias and variance is/are correct?
- $\mathrm{Bias} = E[(E[\hat{f}(x)] - \hat{f}(x))^2]$; $\mathrm{Variance} = E[(\hat{f}(x) - f(x))^2]$
- $\mathrm{Bias} = E[\hat{f}(x)] - f(x)$; $\mathrm{Variance} = E[(E[\hat{f}(x)] - \hat{f}(x))^2]$
- Low bias and high variance is a sign of overfitting
- Low variance and high bias is a sign of overfitting
- Low variance and high bias is a sign of underfitting
Partially Correct. Score: 0.67
Accepted Answers:
- $\mathrm{Bias} = E[\hat{f}(x)] - f(x)$; $\mathrm{Variance} = E[(E[\hat{f}(x)] - \hat{f}(x))^2]$
- Low bias and high variance is a sign of overfitting
- Low variance and high bias is a sign of underfitting

1 point
Suppose that we train two kinds of regression models corresponding to the following equations.
(i). $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
(ii). $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
Which of the following statement(s) is/are correct?
- On a given training dataset, the mean-squared error of (i) is always less than or equal to that of (ii).
- (i) is likely to have a higher variance than (ii).
- (ii) is likely to have a higher variance than (i).
- If (i) overfits the data, then (ii) will definitely overfit.
- If (ii) underfits the data, then (i) will definitely underfit.
Partially Correct. Score: 0.66
Accepted Answers:
- (ii) is likely to have a higher variance than (i).
- If (i) overfits the data, then (ii) will definitely overfit.
- If (ii) underfits the data, then (i) will definitely underfit.
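
The comparison of models (i) and (ii) rests on (i) being a special case of (ii): setting the three extra coefficients to zero recovers (i), so a least-squares fit of (ii) cannot do worse on the training data. The sketch below illustrates this with NumPy. The data are synthetic and purely hypothetical (the quiz dataset is not reproduced here), and the helper names are illustrative only.

```python
import numpy as np

def design_linear(X):
    """Design matrix for model (i): columns 1, x1, x2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2])

def design_quadratic(X):
    """Design matrix for model (ii): adds x1*x2, x1^2, x2^2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])

def training_mse(A, y):
    """Fit by ordinary least squares and return the MSE on the same data."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    return float(np.mean(residuals**2))

# Hypothetical synthetic data: a noisy linear target in two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=30)

mse_i = training_mse(design_linear(X), y)
mse_ii = training_mse(design_quadratic(X), y)
print(f"training MSE (i):  {mse_i:.4f}")
print(f"training MSE (ii): {mse_ii:.4f}")
```

The training MSE of (ii) is never larger than that of (i), which is why the first option of that question is not among the accepted answers. A lower training error says nothing about error on unseen data.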
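
The accepted definitions of bias and variance above can be estimated by simulation: refit a model on many noisy training sets and look at the spread of its predictions at one point. The sketch below is a rough illustration under stated assumptions, not part of the quiz: the true function sin(πx), the noise level, the fixed 20-point grid, and the polynomial degrees 1 and 7 are all hypothetical choices.

```python
import numpy as np

def true_f(x):
    """Assumed ground-truth function f(x) for this illustration."""
    return np.sin(np.pi * x)

def bias_and_variance(degree, x0, n_repeats=500, noise_sd=0.3, seed=0):
    """Estimate Bias = E[f_hat(x0)] - f(x0) and
    Variance = E[(E[f_hat(x0)] - f_hat(x0))^2] for a polynomial fit."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(-1.0, 1.0, 20)  # fixed inputs; only the noise is resampled
    preds = np.empty(n_repeats)
    for r in range(n_repeats):
        y_train = true_f(x_train) + rng.normal(scale=noise_sd, size=x_train.size)
        coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
        preds[r] = np.polyval(coeffs, x0)
    bias = preds.mean() - true_f(x0)
    variance = np.mean((preds.mean() - preds) ** 2)
    return float(bias), float(variance)

bias_low, var_low = bias_and_variance(degree=1, x0=0.9)
bias_high, var_high = bias_and_variance(degree=7, x0=0.9)
print(f"degree 1: bias={bias_low:+.3f}, variance={var_low:.4f}")
print(f"degree 7: bias={bias_high:+.3f}, variance={var_high:.4f}")
```

Under these assumptions the straight-line fit should show the larger bias in magnitude (the underfitting pattern) and the degree-7 fit the larger variance (the overfitting pattern), which matches the accepted answers.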
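
The k-NN questions above follow one procedure: compute the Euclidean distance from the query to every training point, keep the k closest, then average their targets (regression) or take a majority vote over their labels (classification). A minimal sketch follows. The training points are hypothetical and are not the quiz datasets, and breaking vote ties by first occurrence is an assumption of this sketch.

```python
from collections import Counter

import numpy as np

def nearest_indices(X_train, query, k):
    """Indices of the k training points closest to query (Euclidean distance)."""
    dists = np.linalg.norm(X_train - query, axis=1)
    return np.argsort(dists, kind="stable")[:k]

def knn_regress(X_train, y_train, query, k):
    """Predict the mean target of the k nearest neighbours."""
    idx = nearest_indices(X_train, query, k)
    return float(np.mean(y_train[idx]))

def knn_classify(X_train, labels, query, k):
    """Predict the most common label among the k nearest neighbours.
    Ties go to the label seen first among the neighbours (an assumption)."""
    idx = nearest_indices(X_train, query, k)
    votes = Counter(labels[idx].tolist())
    return votes.most_common(1)[0][0]

# Hypothetical training data, for illustration only.
X_train = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [3, 3], [4, 4]], dtype=float)
y_train = np.array([1.0, 2.0, 3.0, 4.0, 10.0, 12.0])
labels = np.array([0, 0, 1, 1, 2, 2])
query = np.array([0.9, 0.2])

pred_value = knn_regress(X_train, y_train, query, k=3)   # mean of 2.0, 4.0, 1.0
pred_class = knn_classify(X_train, labels, query, k=3)   # votes: 0, 1, 0
print(pred_value, pred_class)
```

Both functions take `X_train` as an argument at prediction time, which is the sense in which a k-NN model requires the training data points during inference. A fitted linear regressor needs only its coefficients.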