Introduction to Machine Learning Concepts

Questions and Answers

Which of the following statements about the running time of k-means clustering is true?

  • Increasing k will always decrease running time.
  • The step that updates the cluster means runs in at most O(nk) time.
  • The k-means algorithm cannot handle more than two dimensions.
  • Each sample point contributes to one cluster mean in O(d) time. (correct)
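
    Increasing the number of clusters, k, always increases the running time of the k-means algorithm.

    False (each iteration costs more as k grows, but the number of iterations needed to converge can fall, so the total running time need not increase)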

    What is the maximum time complexity for the step that updates the cluster assignments in k-means clustering?

    O(nkd)

    Each iteration of the k-means algorithm runs in at most ______ time.

    O(nkd)

    Match each statement about k-means clustering with the correct truth value (True or False):

    • The step that updates cluster means runs in O(nd) time = True
    • k = n will lead to termination after the first iteration = False
    • Distance calculations from n points to k centroids are inexpensive = False
    • The algorithm may require exponentially many steps to converge = True

    Define the term 'training set' in your own words.

    A set of input-output pair examples used to train a machine learning model.

    Which type of learning involves training data with labeled outputs?

    Supervised Learning

    In unsupervised learning, the goal is to predict specific labels for the data.

    False

    Machine learning algorithms improve their performance __________.

    over time with experience

    What is the likely approach to improve training accuracy from only 50%?

    Both A and B

    What type of learning is used to identify different groups for targeted treatments from unlabeled medical data?

    Unsupervised Learning

    Match the following terms with their descriptions:

    • Supervised Learning = Training data with labeled outputs
    • Unsupervised Learning = Finding patterns in unlabeled data
    • Reinforcement Learning = Learning through rewards and penalties
    • Hypothesis = Function mapping inputs to outputs

    Machine learning algorithms build a model based on sample data, known as __________.

    training data

    In Figure 1, subplot I, how are bias and variance related to the true model?

    High bias, Low variance

    K-fold cross-validation can be used for hyperparameter tuning.

    True

    What is the primary reason for using stochastic gradient descent instead of gradient descent?

    To speed up per-iteration computation

    The cross-validation error is a better estimate of the true error than the ______ error.

    training

    Which of the following does not increase the complexity of a neural network?

    Reducing the learning rate

    Solving the k-means objective is a supervised learning problem.

    False

    In Figure 1, subplot IV, how are bias and variance related to the true model?

    Low bias, Low variance

    State one reason why ReLUs may be preferred over sigmoids as activation functions.

    The forward and backward passes are computationally cheaper with ReLUs than with sigmoid.

    Which of the following enables computers to learn from data and improve themselves without explicit programming?

    Machine Learning

    Linear Regression is based on supervised learning.

    True

    What does RMSE stand for in Linear Regression?

    Root Mean Squared Error

    Regression models a target prediction value based on _____ variables.

    independent

    What does the correlation coefficient measure?

    The strength of the relationship between the x and y variables

    If a linear regression model has zero training error, the test error will also be zero.

    False

    What is the average squared difference between classifier predicted output and actual output called?

    Mean squared error

    If {v1, v2, ··· , vn} and {w1, w2, ··· , wn} are linearly independent, then {v1 + w1, v2 + w2, ··· , vn + wn} are _____ independent.

    not necessarily linearly (for example, taking wi = -vi makes every sum the zero vector)

    Is the cost function of a ReLU-based neural network convex in its weights?

    No. With hidden layers the cost function is generally non-convex, whichever activation function is used.

    ReLU functions are more susceptible to the vanishing gradient problem compared to sigmoid functions.

    False

    What is the main issue caused by the vanishing gradient problem in training neural networks?

    Slow training

    The cost function of a neural network trained with the squared-error loss is defined everywhere in weight space when using the ______ activation function.

    ReLU

    Which of the following steps is NOT part of training a neural network's weights with backpropagation?

    Computing derivatives of a cost function with respect to input features

    Increasing the number of layers in a sigmoid-based neural network will alleviate the vanishing gradient problem.

    False

    What specific type of activation function could be used to combat the vanishing gradient problem?

    ReLU

    Match the following statements about neural network training to their validity:

    • Weights depend on each other = False
    • We need gradients for weight updates = True
    • Intermediate results are needed for gradients = True
    • Derivatives of inputs are useful = False

    Study Notes

    Machine Learning Definitions

    • Training set: A dataset of input-output pairs given to a learning algorithm to produce a predictive model, or hypothesis.
    • Hypothesis: A function learned from training data, which predicts an output based on an input.

    Supervised, Unsupervised, and Reinforcement Learning

    • Supervised learning: Uses labelled data (input-output pairs) to train a model to predict the output for new input data.
    • Unsupervised learning: Uses unlabelled data to find patterns or clusters in the data. The model learns to identify structures without specific output guidance.
    • Reinforcement learning: The learning system interacts with an environment and receives feedback, in the form of rewards or penalties, to learn the optimal behavior or strategy to maximize rewards.

    Key Concepts

    • Linear regression: A supervised learning algorithm used to predict a continuous target variable based on one or more independent variables.
    • Logistic regression: A supervised learning algorithm used for classification tasks, predicting the probability of an input belonging to a particular category.
    • Neural network: A type of machine learning model inspired by the structure of the human brain, consisting of interconnected nodes (neurons) organized in layers.
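    • Backpropagation: An algorithm for training neural networks that computes the gradient of the cost with respect to every weight by propagating the error backwards from the output layer towards the input layer. The weights are then adjusted, for example by gradient descent, to reduce the error.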
    • Gradient descent: An optimization algorithm used to find the minimum of a function by iteratively moving in the direction of the negative gradient.
    • Clustering: An unsupervised learning technique used to group data points into clusters based on the similarity of their characteristics.
    • Centroid: The center or average of a cluster, used as a representative point for the cluster in k-means clustering.
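The linear regression and gradient descent entries above can be combined into one small example. The following is a minimal sketch, assuming NumPy, a mean squared error cost, and noise-free toy data; the function name `fit_linear_regression`, the learning rate, and the step count are illustrative choices, not taken from the lesson.

```python
import numpy as np

def fit_linear_regression(X, y, learning_rate=0.5, n_steps=1000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_steps):
        residual = X @ w + b - y                     # prediction error per sample
        grad_w = 2.0 * (X.T @ residual) / n_samples  # d(MSE)/dw
        grad_b = 2.0 * residual.mean()               # d(MSE)/db
        w -= learning_rate * grad_w                  # move against the gradient
        b -= learning_rate * grad_b
    return w, b

# Hypothetical toy data generated from y = 3x + 1 with no noise.
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = 3.0 * X[:, 0] + 1.0
w, b = fit_linear_regression(X, y)
```

On this toy data the learned slope and intercept should approach 3 and 1. A learning rate that is too large for the data would make the updates diverge instead.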

    Bias and Variance

    • Bias: A model's tendency to consistently under- or over-predict values. High bias implies the model is too simple and cannot capture the underlying relationship effectively.
    • Variance: A model's sensitivity to changes in the training data. High variance implies the model is too complex, and may overfit the training data, leading to poor generalization to new data.
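The two definitions above are often summarised by the bias-variance decomposition of expected squared error. The notation here is an assumption, not from the lesson: data are generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the model learned from a random training set, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A model that is too simple tends to make the first term large, and a model that is too complex tends to make the second term large.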

    Key Terms

    • RMSE (Root Mean Squared Error): The square root of the average squared difference between the predicted values and the actual values, used to evaluate the performance of regression models. It is expressed in the same units as the target variable.
    • Correlation coefficient: A statistical measure that quantifies the strength and direction of the linear relationship between two variables.
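Both quantities above are short formulas. The sketch below assumes NumPy and takes "correlation coefficient" to mean the Pearson coefficient; the helper names `rmse` and `pearson_r` and the sample values are made up for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pearson_r(x, y):
    """Pearson correlation: strength and direction of a linear relationship."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # centre both variables
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

# Hypothetical values: three exact predictions and one that is off by 2.
error = rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
# y is exactly 2 * x, a perfect positive linear relationship.
r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Here `error` is sqrt(4 / 4) = 1.0 and `r` is 1.0. NumPy's `np.corrcoef` computes the same coefficient.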

    Neural Network Concepts

    • ReLU (Rectified Linear Unit): An activation function used in neural networks, f(x) = max(0, x). It outputs the input directly if it is positive and zero otherwise.
    • Vanishing gradient problem: A common problem in deep neural networks with sigmoid activation functions, where gradients become very small as the error is propagated backwards through the layers, slowing down training of the earlier layers.
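A rough numerical illustration of the two entries above: the sigmoid derivative is never larger than 0.25, so a product of many such factors shrinks quickly, while the ReLU derivative is exactly 1 for positive inputs. This sketch assumes NumPy and deliberately ignores the weight matrices that also appear in a real backward pass; the depth of 10 and the pre-activation value 2.0 are arbitrary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # at most 0.25, reached at z = 0

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # 1 for positive inputs, 0 otherwise (the value at exactly 0 is a convention)
    return (np.asarray(z) > 0).astype(float)

n_layers = 10  # hypothetical depth
z = 2.0        # hypothetical pre-activation, assumed identical at every layer

# Product of activation derivatives along one path through the network.
sigmoid_chain = float(sigmoid_grad(z) ** n_layers)
relu_chain = float(relu_grad(z) ** n_layers)
```

With these values `sigmoid_chain` falls below one millionth while `relu_chain` stays at 1.0. Units with negative pre-activation would contribute a factor of 0 under ReLU.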

    Cross-validation and Stochastic Gradient Descent

    • K-fold cross-validation: A technique used to evaluate the performance of a machine learning model by splitting the data into k folds, training the model on k-1 folds and validating on the remaining fold. This is repeated so that each fold serves as the validation set once, and the k validation errors are averaged.
    • Stochastic Gradient Descent (SGD): An optimization algorithm used for training machine learning models, which updates the weights using a single data point (or a small batch of data) at a time, instead of the entire dataset. Each update is much cheaper than a full-dataset gradient step, but the updates are noisier.
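The k-fold procedure, and its use for hyperparameter tuning mentioned in the quiz, can be sketched as follows. This assumes NumPy and uses closed-form ridge regression as the model being tuned, which is a stand-in chosen for brevity rather than something the lesson specifies. The names `k_fold_indices`, `ridge_fit`, and `cv_mse`, the synthetic data, and the candidate values are all hypothetical.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def ridge_fit(X, y, lam):
    """Closed-form ridge regression weights (no intercept term)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def cv_mse(X, y, lam, k=5):
    """Average validation mean squared error over the k folds."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        w = ridge_fit(X[train_idx], y[train_idx], lam)
        fold_errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(fold_errors))

# Hypothetical synthetic data: a linear target plus a little noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

# Pick the regularization strength with the lowest cross-validation error.
candidates = [0.01, 1.0, 100.0]
scores = {lam: cv_mse(X, y, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
```

Every candidate is scored on the same folds because the seed is fixed, so the comparison between hyperparameter values is not affected by a different random split.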

    Important Notes

    • Overfitting occurs when a model performs well on the training data but poorly on unseen data.
    • Increasing the complexity of a neural network by adding layers, increasing hidden layer size, or reducing regularization strength can lead to overfitting.
    • K-means clustering is an unsupervised learning method whose goal is to partition the data points into k clusters.
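The running-time answers in the quiz map onto the two steps of a k-means iteration. Below is a minimal sketch, assuming NumPy, squared Euclidean distance, and centroids initialised at randomly chosen data points. The function names and the toy data are illustrative and not part of the lesson.

```python
import numpy as np

def assign_clusters(X, centroids):
    """Assignment step: n * k distances of d coordinates each, O(nkd)."""
    diffs = X[:, None, :] - centroids[None, :, :]  # shape (n, k, d)
    sq_dists = np.sum(diffs ** 2, axis=2)          # shape (n, k)
    return np.argmin(sq_dists, axis=1)

def update_centroids(X, labels, centroids):
    """Update step: each point is added to one cluster sum in O(d), O(nd) overall."""
    k = len(centroids)
    sums = np.zeros_like(centroids)
    np.add.at(sums, labels, X)                     # accumulate points per cluster
    counts = np.bincount(labels, minlength=k)
    new_centroids = centroids.copy()               # empty clusters keep their centroid
    nonempty = counts > 0
    new_centroids[nonempty] = sums[nonempty] / counts[nonempty, None]
    return new_centroids

def k_means(X, k, max_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = assign_clusters(X, centroids)
    for _ in range(max_iters):
        centroids = update_centroids(X, labels, centroids)
        new_labels = assign_clusters(X, centroids)
        if np.array_equal(new_labels, labels):     # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels

# Hypothetical toy data: two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
               rng.normal(5.0, 0.1, size=(20, 2))])
centroids, labels = k_means(X, k=2)
```

The per-iteration cost is dominated by the assignment step. The number of iterations is not bounded by these per-step costs, which is why `max_iters` is included as a safeguard.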

    Description

    This quiz covers essential definitions and core concepts in machine learning, including training sets, hypotheses, and various learning paradigms such as supervised, unsupervised, and reinforcement learning. You'll also learn about key algorithms like linear regression and their applications.