Introduction to Machine Learning Concepts

Questions and Answers

Which of the following statements about the running time of k-means clustering is true?

  • Increasing k will always decrease running time.
  • The step that updates the cluster means runs in at most O(nk) time.
  • The k-means algorithm cannot handle more than two dimensions.
  • Each sample point contributes to one cluster in O(d) time. (correct)

Increasing the number of clusters, k, always increases the running time of the k-means algorithm.

True (A)

What is the maximum time complexity for the step that updates the cluster assignments in k-means clustering?

O(nkd)

Each iteration of the k-means algorithm runs in at most ______ time.

O(nkd)

Match each statement about k-means clustering with the correct truth value (True or False):

  • The step that updates cluster means runs in O(nd) time = True
  • k = n will lead to termination after the first iteration = False
  • Distance calculations from n points to k centroids are inexpensive = False
  • The algorithm may require exponentially many steps to converge = True

Define the term 'training set' in your own words.

A set of input-output pair examples used to train a machine learning model.

Which type of learning involves training data with labeled outputs?

Supervised Learning (A)

In unsupervised learning, the goal is to predict specific labels for the data.

False (B)

Machine learning algorithms improve their performance __________.

over time with experience

What is the likely approach to improve training accuracy from only 50%?

Both A and B (A)

What type of learning is used to identify different groups for targeted treatments from unlabeled medical data?

Unsupervised Learning (A)

Match the following terms with their descriptions:

  • Supervised Learning = Training data with labeled outputs
  • Unsupervised Learning = Finding patterns in unlabeled data
  • Reinforcement Learning = Learning through rewards and penalties
  • Hypothesis = Function mapping inputs to outputs

Machine learning algorithms build a model based on sample data, known as __________.

training data

In Figure 1, subplot I, how are bias and variance related to the true model?

High bias, Low variance (A)

K-fold cross-validation can be used for hyperparameter tuning.

True (A)

What is the primary reason for using stochastic gradient descent instead of gradient descent?

To speed up per-iteration computation

The cross-validation error is a better estimate of the true error than the ______ error.

training

Which of the following does not increase the complexity of a neural network?

Reducing the learning rate (C)

Solving the k-means objective is a supervised learning problem.

False (B)

In Figure 1, subplot IV, how are bias and variance related to the true model?

Low bias, Low variance (B)

State one reason why ReLUs may be preferred over sigmoids as activation functions.

The forward and backward passes are computationally cheaper with ReLUs than with sigmoids.

Which of the following enables computers to learn from data and improve themselves without explicit programming?

Machine Learning (B)

Linear Regression is based on supervised learning.

True (A)

What does RMSE stand for in Linear Regression?

Root Mean Squared Error

Regression models a target prediction value based on _____ variables.

independent

What does the correlation coefficient measure?

The strength of the relationship between the x and y variables (D)

If a linear regression model has zero training error, the test error will also be zero.

False (B)

What is the average squared difference between classifier predicted output and actual output called?

Mean squared error (D)

If {v1, v2, ··· , vn} and {w1, w2, ··· , wn} are linearly independent, then {v1 + w1, v2 + w2, ··· , vn + wn} are _____ independent.

not necessarily linearly (counterexample: if wi = -vi for every i, each sum is the zero vector, so the resulting set is linearly dependent)

What characteristic makes the cost function of a ReLU-based neural network convex?

The nature of the activation function (A)

ReLU functions are more susceptible to the vanishing gradient problem compared to sigmoid functions.

False (B)

What is the main issue caused by the vanishing gradient problem in training neural networks?

Slow training

The cost function of a neural network trained with the squared-error loss is defined everywhere in weight space when using the ______ activation function.

ReLU

Which of the following steps is NOT part of training a neural network's weights with backpropagation?

Computing derivatives of a cost function with respect to input features (D)

Increasing the number of layers in a sigmoid-based neural network will alleviate the vanishing gradient problem.

False (B)

What specific type of activation function could be used to combat the vanishing gradient problem?

ReLU

Match the following statements about neural network training to their validity:

  • Weights depend on each other = False
  • We need gradients for weight updates = True
  • Intermediate results are needed for gradients = True
  • Derivatives of inputs are useful = False

Study Notes

Machine Learning Definitions

  • Training set: A dataset of input-output pairs used to fit a machine learning model, which produces a predictor (hypothesis).
  • Hypothesis: A function learned from training data, which predicts an output based on an input.

Supervised, Unsupervised, and Reinforcement Learning

  • Supervised learning: Uses labelled data (input-output pairs) to train a model to predict the output for new input data.
  • Unsupervised learning: Uses unlabelled data to find patterns or clusters in the data. The model learns to identify structures without specific output guidance.
  • Reinforcement learning: The learning system interacts with an environment and receives feedback, in the form of rewards or penalties, to learn the optimal behavior or strategy to maximize rewards.

Key Concepts

  • Linear regression: A supervised learning algorithm used to predict a continuous target variable based on one or more independent variables.
  • Logistic regression: A supervised learning algorithm used for classification tasks, predicting the probability of an input belonging to a particular category.
  • Neural network: A type of machine learning model inspired by the structure of the human brain, consisting of interconnected nodes (neurons) organized in layers.
  • Backpropagation: An algorithm used for training neural networks by adjusting weights to reduce error. It propagates the error backwards from the output layer to the input layer.
  • Gradient descent: An optimization algorithm used to find the minimum of a function by iteratively moving in the direction of the negative gradient.
  • Clustering: An unsupervised learning technique that groups data points into clusters based on the similarity of their characteristics.
  • Centroid: The center or average of a cluster, used as a representative point for the cluster in k-means clustering.
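The gradient descent and linear regression entries above can be illustrated together. The following is a minimal sketch, assuming a mean squared error cost and a fixed learning rate; the function name `gradient_descent`, the toy data, and the values of `lr` and `steps` are illustrative choices, not taken from the lesson.

```python
import numpy as np

def gradient_descent(X, y, lr=0.5, steps=500):
    """Batch gradient descent on mean squared error for a linear model X @ w."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - y                  # prediction error on all n points
        grad = (2.0 / n) * (X.T @ residual)   # gradient of the MSE with respect to w
        w -= lr * grad                        # move in the negative gradient direction
    return w

# Hypothetical toy data: y = 1 + 2x exactly, with a column of ones for the intercept.
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
w = gradient_descent(X, y)  # w should approach [1, 2] on this noise-free data
```

A stochastic variant would compute `grad` from a single row (or a small batch) of `X` per step instead of all `n` rows, which makes each step cheaper but noisier.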

Bias and Variance

  • Bias: A model's tendency to consistently under- or over-predict values. High bias implies the model is too simple and cannot capture the underlying relationship effectively.
  • Variance: A model's sensitivity to changes in the training data. High variance implies the model is too complex, and may overfit the training data, leading to poor generalization to new data.
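One way to see the trade-off is to fit models of increasing complexity to the same noisy data. The sketch below is only an illustration under assumed settings: the sine target, the noise level, the sample sizes, and the polynomial degrees are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Assumed ground-truth relationship, used only for this illustration.
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0.0, 1.0, 15)
y_train = true_fn(x_train) + rng.normal(0.0, 0.2, size=x_train.shape)
x_test = np.linspace(0.02, 0.98, 50)
y_test = true_fn(x_test) + rng.normal(0.0, 0.2, size=x_test.shape)

def mse_for_degree(degree):
    """Fit a polynomial of the given degree and return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return float(train_mse), float(test_mse)

# Degree 1 is too simple for a sine curve (high bias); degree 9 has enough
# freedom to chase the noise in 15 points (high variance).
errors = {d: mse_for_degree(d) for d in (1, 3, 9)}
for d, (train_mse, test_mse) in errors.items():
    print(f"degree {d}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

Training error can only go down as the degree grows, because each lower-degree polynomial is a special case of the higher-degree one. The test error is the quantity that reveals overfitting.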

Key Terms

  • RMSE (Root Mean Squared Error): The square root of the average squared difference between the predicted values and the actual values, used to evaluate the performance of regression models.
  • Correlation coefficient: A statistical measure that quantifies the strength and direction of the linear relationship between two variables.
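Both quantities are short to compute directly. The sketch below assumes the Pearson definition of the correlation coefficient; the helper names `rmse` and `pearson_r` are illustrative.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

# Hypothetical values: every prediction is off by exactly 2, so the RMSE is 2.
print(rmse([0, 0, 0, 0], [2, 2, 2, 2]))
# A perfectly increasing linear relationship has correlation close to +1.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

The sign of the coefficient gives the direction of the linear relationship and its magnitude gives the strength.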

Neural Network Concepts

  • ReLU (Rectified Linear Unit): A type of activation function used in neural networks. It outputs the input directly if it is positive and zero otherwise, that is, f(x) = max(0, x).
  • Vanishing gradient problem: A common problem in deep neural networks with sigmoid activation functions, where gradients become very small as they are propagated backward through the layers, slowing down training.
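The difference between the two activations shows up in their derivatives. The sketch below is a simplification: it looks only at the activation-derivative factor that backpropagation multiplies in at each layer and ignores the weights. The depth of 10 is an arbitrary illustrative choice.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # Derivative is 1 for positive inputs and 0 otherwise.
    return (np.asarray(z) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25, reached at z = 0

depth = 10
# Best case for the sigmoid: every layer sits at z = 0, where the derivative peaks.
sigmoid_factor = float(sigmoid_grad(0.0) ** depth)
# An active ReLU unit passes the gradient through unchanged.
relu_factor = float(relu_grad(1.0) ** depth)
print(sigmoid_factor, relu_factor)
```

Even in the sigmoid's best case the factor shrinks geometrically with depth, which is why adding more sigmoid layers makes the problem worse rather than better.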

Cross-validation

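  • K-fold cross-validation: A technique used to evaluate the performance of a machine learning model by splitting the data into k folds, training the model on k-1 folds, and validating on the remaining fold. This is repeated k times so that each fold serves once as the validation set, and the k validation errors are averaged.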
  • Stochastic Gradient Descent (SGD): An optimization algorithm used for training machine learning models, which updates the weights using a single data point (or a small batch of data) at a time, instead of the entire dataset. Each update is much cheaper to compute than a full-dataset gradient, but the gradient estimates are noisier, so the training loss fluctuates more from step to step.
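The k-fold procedure can be sketched in a few lines. This is a minimal illustration, assuming a least-squares linear model as the thing being evaluated; the helper names `k_fold_indices` and `cross_val_mse` are hypothetical, not from any particular library.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def cross_val_mse(X, y, k=5):
    """Average validation MSE of a least-squares fit over k folds."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        # Fit on k-1 folds only, then score on the held-out fold.
        w, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
        fold_errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(fold_errors))
```

For hyperparameter tuning, one would compute this cross-validation error for each candidate setting and keep the setting with the lowest value.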

Important Notes

  • Overfitting occurs when a model performs well on the training data but poorly on unseen data.
  • Increasing the complexity of a neural network by adding layers, increasing hidden layer size, or reducing regularization strength can lead to overfitting.
  • K-means clustering is an unsupervised learning problem whose goal is to partition data points into clusters.
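The running-time answers in the quiz section (O(nkd) for the assignment step, O(nd) for the mean update) can be read off a direct implementation. The sketch below assumes the standard Lloyd iteration with random initial centroids drawn from the data; the function name, `max_iters`, and `seed` are illustrative choices.

```python
import numpy as np

def k_means(X, k, max_iters=100, seed=0):
    """Lloyd's algorithm sketch: alternate assignments and mean updates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed initialization: k distinct data points chosen at random.
    centroids = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.full(n, -1)
    for _ in range(max_iters):
        # Assignment step: n * k distances, each costing O(d), so O(nkd).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing, so the means will too
        labels = new_labels
        # Update step: each point contributes to exactly one mean in O(d), so O(nd).
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, labels
```

No labels are passed in anywhere, which is what makes this an unsupervised method.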
