ELE888 - Midterm (Theory)

Questions and Answers

What is the significance of the center of the ellipses in the context of a cost function?

Which of the following is a characteristic of using normal equations to minimize a cost function?

What potential issue can arise from using a learning rate that is too large in gradient descent?
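
As a minimal sketch of the learning-rate behaviour that this question and the next one refer to, the following runs plain gradient descent on the toy cost f(x) = x^2. The cost function, the starting point, the step count, and the three learning rates are assumptions chosen for illustration; they are not taken from the course material.

```python
def gradient_descent(learning_rate, steps=20, x0=1.0):
    """Minimize the toy cost f(x) = x**2, whose gradient is 2*x."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * 2 * x  # standard update: x <- x - lr * gradient
    return x


# Hypothetical learning rates, picked only to show three regimes.
small_lr_result = gradient_descent(0.01)    # very small steps
moderate_lr_result = gradient_descent(0.1)  # reasonable steps
large_lr_result = gradient_descent(1.1)     # steps that overshoot the minimum

print("small:", small_lr_result)
print("moderate:", moderate_lr_result)
print("large:", large_lr_result)
```

With these assumed settings, the small rate is still far from the minimum at x = 0 after 20 updates, the moderate rate gets close to it, and the large rate overshoots on every step so that |x| grows instead of shrinking.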

How does a small learning rate affect the training process in gradient descent?

You are working with a dataset that has a relatively small number of data points. Which method of minimizing the cost function would be most appropriate?

Which of the following scenarios is best addressed using unsupervised learning?

A machine learning model is trained to predict whether a customer will click on an advertisement (yes/no). What type of supervised learning is being used?

In reinforcement learning, what is the primary goal of an agent?

Which of the following machine learning approaches would be most suitable for building a system that recommends movies to users based on the viewing history of similar users?

Why might dimensionality reduction be a useful step in building a machine learning model?

How does the cost function contribute to improving a model's predictive capability?

What is the primary purpose of cross-validation in model training?

What distinguishes Ridge Regression from standard linear regression?

In Lasso Regression, what is the significance of reducing some coefficients to zero?

How does increasing the value of alpha in Ridge Regression affect the model?

What is the key difference between overfitting and underfitting in machine learning models?

In linear regression, what role does the regression line play in prediction?

What type of relationship is typically visualized using contour plots in the context of machine learning?

In the context of training a machine learning model, what is a potential drawback of using a very small batch size?

Which of the following is the primary purpose of normalizing features before applying gradient descent?
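
To make the idea of feature normalization concrete, here is a small sketch of z-score standardization using only the Python standard library. The feature names and values are hypothetical, and z-scoring is only one common normalization scheme; the course may have a different one in mind (for example min-max scaling).

```python
import statistics


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical features on very different scales.
sizes_sqft = [850.0, 1200.0, 1500.0, 2100.0, 3000.0]
bedrooms = [1.0, 2.0, 3.0, 3.0, 5.0]

scaled_sizes = standardize(sizes_sqft)
scaled_bedrooms = standardize(bedrooms)

print(scaled_sizes)
print(scaled_bedrooms)
```

After scaling, both features have comparable ranges, so no single feature dominates the gradient simply because of its units.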

What distinguishes Stochastic Gradient Descent (SGD) from Batch Gradient Descent?

In the context of machine learning, what does an 'epoch' represent?

If a learning curve plateaus during the training of a machine learning model, what does this typically indicate?

Which of the following best describes 'model hyperparameters' in machine learning?

What is the main purpose of Logistic Regression?

How does increasing the batch size typically affect the stability and memory requirements of the training process?

Why is feature normalization important in machine learning?

Which of the following is most likely to cause numerical instability?

A model shows high accuracy on the training set but performs poorly on the validation set. Which of the following is the most likely cause?

Which of the following strategies is least likely to help reduce overfitting?

Which regularization technique is most likely to perform feature selection by setting some feature weights exactly to zero?
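
The difference between an L1 and an L2 penalty can be sketched with their one-dimensional shrinkage rules. The closed forms below assume a simplified setting (orthonormal features and a particular scaling of the penalty term), and the weights and penalty value are hypothetical; this is an illustration of the shrinkage behaviour, not a full regression solver.

```python
def l1_shrink(weight, penalty):
    """Soft-thresholding: the per-weight update induced by an L1 penalty."""
    magnitude = max(abs(weight) - penalty, 0.0)
    if magnitude == 0.0:
        return 0.0  # weights smaller than the penalty are set exactly to zero
    return magnitude if weight > 0 else -magnitude


def l2_shrink(weight, penalty):
    """Proportional shrinkage: the per-weight update induced by an L2 penalty."""
    return weight / (1.0 + penalty)


# Hypothetical unregularized weights and penalty strength.
weights = [3.0, -0.4, 0.2, -2.5]
penalty = 0.5

l1_weights = [l1_shrink(w, penalty) for w in weights]
l2_weights = [l2_shrink(w, penalty) for w in weights]

print("L1:", l1_weights)
print("L2:", l2_weights)
```

Under these assumptions, the L1 rule zeroes out the two small weights while the L2 rule makes every weight smaller without making any of them exactly zero.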

What is true about an overfitted model?

How does increasing the amount of training data help to reduce overfitting?

What is the primary purpose of cross-validation in machine learning?

In a fraud detection model with skewed learning, 99% of transactions are legitimate. Which of the following strategies would be MOST effective in addressing the challenges posed by this skewed dataset?

A medical diagnosis model is trained on a dataset with 98% healthy patients and 2% having a rare disease. If the model consistently predicts 'healthy,' which evaluation metric would be MOST informative in assessing its performance?
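
The scenario in this question can be worked through numerically. The sketch below assumes a hypothetical set of 1,000 patients matching the stated 98% / 2% split and a model that always predicts 'healthy'; the patient count and the variable names are assumptions made for illustration.

```python
# Hypothetical data: 1 = has the rare disease, 0 = healthy.
labels = [1] * 20 + [0] * 980   # 2% diseased, 98% healthy
predictions = [0] * 1000        # the model always predicts 'healthy'

correct = sum(1 for y, p in zip(labels, predictions) if y == p)
true_positives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
false_negatives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)

accuracy = correct / len(labels)
recall = true_positives / (true_positives + false_negatives)

print("accuracy:", accuracy)
print("recall:", recall)
```

Accuracy looks high here only because the healthy class dominates, while recall on the disease class shows that not a single sick patient is detected.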

Which of the following is a key similarity between logistic regression and neural networks?

How does the range of output values differ between the Sigmoid and Tanh activation functions, and what is the implication of this difference?
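
A quick way to inspect the two output ranges is to evaluate both functions on a few inputs. The sample inputs below are arbitrary, and the helper name `sigmoid` is an assumption of this sketch; `math.tanh` comes from the Python standard library.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Arbitrary sample inputs, symmetric around zero.
inputs = [-5.0, -1.0, 0.0, 1.0, 5.0]

sigmoid_outputs = [sigmoid(x) for x in inputs]
tanh_outputs = [math.tanh(x) for x in inputs]

for x, s, t in zip(inputs, sigmoid_outputs, tanh_outputs):
    print(f"x={x:+.1f}  sigmoid={s:.4f}  tanh={t:+.4f}")
```

Every sigmoid output falls strictly between 0 and 1 and sits at 0.5 for an input of zero, whereas every tanh output falls strictly between -1 and 1 and sits at 0 for an input of zero.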

Zero-centering is a desirable property for activation functions because it helps prevent large positive or negative values from accumulating in deeper layers. Which activation function is zero-centered?

Both Sigmoid and Tanh activation functions suffer from the vanishing gradient problem. Under what conditions does this problem typically occur?
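
The vanishing gradient behaviour can be explored by evaluating the derivatives directly. This sketch uses the standard identities sigmoid'(x) = s(1 - s) and tanh'(x) = 1 - tanh(x)^2; the function names, the sample inputs, and the ten-layer depth are assumptions chosen for illustration.

```python
import math


def sigmoid_grad(x):
    """Derivative of the logistic sigmoid: s * (1 - s)."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)**2."""
    return 1.0 - math.tanh(x) ** 2


# Gradients near zero versus in the saturated region (hypothetical inputs).
for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x={x:4.1f}  sigmoid'={sigmoid_grad(x):.2e}  tanh'={tanh_grad(x):.2e}")

# Backpropagation multiplies one such factor per layer. Even the largest
# possible sigmoid gradient shrinks quickly over 10 hypothetical layers.
best_case_product = sigmoid_grad(0.0) ** 10
print("best-case product over 10 sigmoid layers:", best_case_product)
```

Both derivatives are largest at an input of zero and become very small as the magnitude of the input grows, and multiplying many such factors across layers drives the overall gradient toward zero.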

In the context of skewed learning, which of the following is the MOST critical consideration when evaluating model performance?

In what way do both Logistic Regression and Neural Networks utilize a weighted sum of inputs?

Flashcards

Machine Learning

Learning from data patterns instead of explicit programming.

Classification

A type of supervised learning that predicts categories or classes.