Machine Learning: Core Concepts

Questions and Answers

Which of the following is the primary goal of supervised learning?

  • Generating new data points similar to the training data
  • Learning a mapping from input to output using labeled data (correct)
  • Reducing the dimensionality of the data
  • Discovering hidden patterns in unlabeled data

Unsupervised learning algorithms require labeled data to train.

False

What type of machine learning is used when the data is unlabeled and the algorithm must discover hidden structures?

Unsupervised learning

In machine learning, the process of optimizing model parameters to best fit the training data is known as ______.

training

Match the following machine learning tasks with their corresponding types:

  • Classification: Assigning data points to predefined categories
  • Regression: Predicting a continuous value
  • Clustering: Grouping similar data points together
  • Dimensionality Reduction: Reducing the number of features in the data

Which evaluation metric is most suitable for assessing the performance of a classification model balancing precision and recall?

F1-score
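
As a minimal sketch of how the F1-score balances precision and recall, the snippet below computes all three from a pair of label lists. The function name `precision_recall_f1` and the example labels are illustrative assumptions, not part of the quiz.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Because F1 is a harmonic mean, it drops sharply when either precision or recall is low, which is why it is preferred over accuracy when the two need to be balanced.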

Overfitting occurs when a model performs well on the training data but poorly on unseen data.

True

What is the purpose of cross-validation in machine learning?

To estimate the performance of a model on unseen data
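
One common form of cross-validation is k-fold: the data is split into k parts, and each part takes a turn as the held-out validation set. The sketch below assumes a deliberately trivial "model" (predicting the training mean) so that only the splitting logic matters; `k_fold_indices` and `cross_val_mse` are hypothetical helper names.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def cross_val_mse(values, k):
    """Average validation MSE of a mean predictor across k folds."""
    scores = []
    for train, val in k_fold_indices(len(values), k):
        mean = sum(values[i] for i in train) / len(train)  # "fit" on the training fold
        mse = sum((values[i] - mean) ** 2 for i in val) / len(val)  # score on held-out fold
        scores.append(mse)
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, 3))
print(folds[0])
print(cross_val_mse([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3))
```

In practice the data is usually shuffled before splitting; that step is omitted here to keep the output deterministic.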

The technique used to prevent overfitting by adding a penalty term to the loss function is called ______.

regularization

Match the following regularization techniques with their corresponding effects:

  • L1 Regularization: Feature selection by setting some feature weights to zero
  • L2 Regularization: Reducing the magnitude of all feature weights
  • Dropout: Randomly disabling neurons during training
  • Early Stopping: Stopping the training process when performance on the validation set degrades
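
The "penalty term added to the loss function" can be sketched for a linear model as follows. The data, the weights, and the strength `lam` are made-up values, and the function names are assumptions for illustration only.

```python
def mse(weights, X, y):
    """Mean squared error of a linear model without an intercept."""
    preds = [sum(w * x for w, x in zip(weights, row)) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def l1_penalty(weights, lam):
    # L1 penalizes absolute values, which tends to push some weights to exactly zero.
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    # L2 penalizes squared values, which shrinks all weights toward zero.
    return lam * sum(w * w for w in weights)


def regularized_loss(weights, X, y, lam, kind="l2"):
    penalty = l1_penalty(weights, lam) if kind == "l1" else l2_penalty(weights, lam)
    return mse(weights, X, y) + penalty


X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
weights = [1.0, 2.0]  # fits the toy data exactly, so the data term is zero

l1_loss = regularized_loss(weights, X, y, lam=0.1, kind="l1")
l2_loss = regularized_loss(weights, X, y, lam=0.1, kind="l2")
print(l1_loss, l2_loss)
```

Even with a perfect fit, both losses are positive: the optimizer is rewarded for smaller weights, not only for lower training error.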

In the context of machine learning, what does the term 'bias' refer to?

A model's tendency to consistently make errors in the same direction

A high-variance model is likely to underfit the training data.

False

What is the bias-variance tradeoff in machine learning?

Balancing model complexity so that the combined error from bias and variance is as low as possible

The process of transforming features to a similar scale is called ______.

feature scaling

Match the following feature scaling techniques with their corresponding formulas:

  • Min-Max Scaling: $(x - \min(x)) / (\max(x) - \min(x))$
  • Standardization: $(x - \operatorname{mean}(x)) / \operatorname{std}(x)$
  • Robust Scaling: $(x - \operatorname{median}(x)) / \operatorname{IQR}(x)$
  • Power Transformer: Applies a power transform to each feature to make the data more Gaussian-like
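
The first three formulas above can be written directly with the standard library. This is a sketch under two assumptions: the feature is not constant (otherwise the denominators are zero), and quartiles follow the "inclusive" convention of `statistics.quantiles`, which is only one of several ways to define the IQR.

```python
import statistics


def min_max_scale(xs):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def standardize(xs):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]


def robust_scale(xs):
    """Center on the median and scale by the interquartile range."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in xs]


feature = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = min_max_scale(feature)
standardized = standardize(feature)
robust = robust_scale(feature)
print(scaled)
print(standardized)
print(robust)
```

Robust scaling uses the median and IQR, so a single extreme outlier affects it far less than it affects min-max scaling or standardization.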

Which of the following algorithms is commonly used for dimensionality reduction?

Principal Component Analysis (PCA)

Dimensionality reduction always improves the performance of a machine learning model.

False

What is feature engineering?

Creating new features from existing ones so that a model can learn from the data more effectively

The process of selecting a subset of relevant features from the original set is known as ______.

feature selection

Match the following feature selection methods with their corresponding descriptions:

  • Filter Methods: Select features based on statistical measures like correlation
  • Wrapper Methods: Evaluate feature subsets by training and testing a model
  • Embedded Methods: Feature selection is part of the model training process

Which of the following is a common algorithm used for clustering data points?

K-Means

In K-Means clustering, the number of clusters 'k' must always be determined automatically by the algorithm.

False

What heuristic can be used to estimate a suitable number of clusters 'k' in K-Means clustering?

Elbow method

The goal of K-Means clustering is to minimize the ______ within each cluster.

variance
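
A small sketch of K-Means makes the "minimize within-cluster variance" goal concrete. It assumes the caller supplies `k` through the initial centroids (the algorithm does not choose it) and uses made-up 2D points; the returned `inertia` is the within-cluster sum of squared distances, the quantity usually plotted against `k` in the elbow method.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_centroid(point, centroids):
    return min(range(len(centroids)), key=lambda j: squared_distance(point, centroids[j]))


def k_means(points, centroids, n_iters=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(n_iters):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [nearest_centroid(p, centroids) for p in points]
    inertia = sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))
    return centroids, labels, inertia


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels, inertia = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids, labels, inertia)
```

Real implementations add random or k-means++ initialization and a convergence check; both are left out here to keep the example deterministic.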

Match each of the following distances to their formulas in 2D space:

  • Euclidean Distance: $\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
  • Manhattan Distance: $|x_2 - x_1| + |y_2 - y_1|$
  • Minkowski Distance: $(|x_2 - x_1|^p + |y_2 - y_1|^p)^{1/p}$
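
The three distances are closely related: Minkowski distance with p = 1 is Manhattan distance, and with p = 2 it is Euclidean distance. The sketch below uses illustrative function names and works for points of any dimension, not only 2D.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def manhattan(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def minkowski(a, b, p):
    # Generalizes both distances above: p=1 gives Manhattan, p=2 gives Euclidean.
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1 / p)


a, b = (0.0, 0.0), (3.0, 4.0)
print(euclidean(a, b))     # 5.0
print(manhattan(a, b))     # 7.0
print(minkowski(a, b, 2))
```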

Which of the following techniques is commonly used to handle imbalanced datasets?

Oversampling the minority class

Undersampling the majority class discards training data and can therefore lose important information.

True

Besides oversampling and undersampling, name another popular technique for handling imbalanced datasets.

Cost-sensitive learning

The metric used to evaluate performance on imbalanced datasets that considers both precision and recall is the ______ score.

F1

Match the following sampling techniques with their corresponding descriptions:

  • Random Oversampling: Duplicates instances from the minority class randomly
  • SMOTE (Synthetic Minority Oversampling Technique): Generates synthetic instances by interpolating between existing minority class instances
  • Random Undersampling: Removes instances from the majority class randomly
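
The three sampling techniques above can be sketched in a few lines. Note the simplification: real SMOTE interpolates between a minority point and one of its k nearest minority neighbors, whereas `smote_like` below (a hypothetical name) interpolates between two randomly chosen minority points. The data is made up.

```python
import random


def random_oversample(minority, target_size, rng):
    """Duplicate randomly chosen minority instances until target_size is reached."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra


def random_undersample(majority, target_size, rng):
    """Keep a random subset of the majority class, without replacement."""
    return rng.sample(majority, target_size)


def smote_like(minority, n_new, rng):
    """Create synthetic points on line segments between pairs of minority points."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()  # position along the segment from a to b
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic


rng = random.Random(0)  # seeded so the run is repeatable
minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]
majority = [(float(i), float(i) + 5.0) for i in range(10)]

oversampled = random_oversample(minority, len(majority), rng)
undersampled = random_undersample(majority, len(minority), rng)
synthetic = smote_like(minority, 4, rng)
print(len(oversampled), len(undersampled), synthetic)
```

Resampling should be applied to the training split only; applying it before the train/test split leaks duplicated or synthetic points into the evaluation data.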

What is the purpose of hyperparameter tuning in machine learning?

To select the best set of hyperparameters that maximize model performance

Grid search is a hyperparameter tuning technique that evaluates every combination of the hyperparameter values specified in the grid.

True

Which hyperparameter tuning technique randomly samples combinations of hyperparameters from a defined range of values?

Random search

Bayesian optimization uses ______ to model the objective function and guide the search for the optimal hyperparameters.

probabilistic models

Match the hyperparameter tuning techniques with their corresponding descriptions:

  • Grid Search: Exhaustively searches all combinations of hyperparameters in a grid
  • Random Search: Randomly samples combinations of hyperparameters from a distribution
  • Bayesian Optimization: Uses probabilistic models to guide the search for optimal hyperparameters
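
Grid search and random search differ only in how candidate combinations are generated, as the sketch below tries to show. The `objective` function is a hypothetical stand-in for a validation score (a real one would train and evaluate a model), and the hyperparameter names and ranges are assumptions. Bayesian optimization needs a surrogate model and is not sketched here.

```python
import itertools
import random


def objective(learning_rate, depth):
    """Hypothetical validation score, highest at learning_rate=0.1 and depth=4."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2


def grid_search(grid):
    """Score every combination of the listed values and return the best one."""
    names = list(grid)
    combos = itertools.product(*(grid[name] for name in names))
    best = max(combos, key=lambda combo: objective(**dict(zip(names, combo))))
    return dict(zip(names, best))


def random_search(n_trials, rng):
    """Score n_trials randomly sampled combinations and return the best one."""
    candidates = [
        {
            "learning_rate": 10 ** rng.uniform(-3, 0),  # log-uniform over [0.001, 1]
            "depth": rng.randint(1, 8),
        }
        for _ in range(n_trials)
    ]
    return max(candidates, key=lambda params: objective(**params))


grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 6]}
best_grid = grid_search(grid)
best_random = random_search(20, random.Random(0))
print(best_grid)
print(best_random)
```

Grid search costs grow multiplicatively with each added hyperparameter (here 3 × 3 = 9 evaluations), while random search keeps a fixed budget regardless of how many hyperparameters are tuned.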

Which of the following machine learning techniques is designed for sequential data such as time series?

Recurrent Neural Network (RNN)

In time series analysis, stationarity implies that the statistical properties of the series do not change over time.

True

Name a technique used to make a time series stationary by removing trends and seasonality.

Differencing
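
Differencing replaces each value with its change from an earlier value. A lag of 1 removes a linear trend, and a lag equal to the season length removes a repeating seasonal pattern. The series below are made up for illustration, and `difference` is an assumed helper name.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


trend = [2 * t + 5 for t in range(8)]            # linear trend with slope 2
seasonal = [10, 20, 30, 10, 20, 30, 10, 20, 30]  # pattern repeating every 3 steps

detrended = difference(trend)                    # constant: the slope
deseasonalized = difference(seasonal, lag=3)     # constant: zero
print(detrended)
print(deseasonalized)
```

The differenced series is shorter than the original by `lag` values, which is the price paid for removing the trend or seasonal component.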

ARIMA models utilize ______, Integrated, and Moving Average components to forecast future values in a time series.

Autoregressive

Match the components of the ARIMA model with their corresponding descriptions:

  • Autoregressive (AR): Uses past values to predict future values
  • Integrated (I): Applies differencing to make the time series stationary
  • Moving Average (MA): Uses past forecast errors to predict future values

Insanely difficult: Which of the listed activation functions is known for addressing the vanishing gradient problem in deep neural networks?

ReLU
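
A rough way to see why ReLU helps: backpropagation multiplies one activation derivative per layer. The sigmoid derivative is at most 0.25, so the product shrinks quickly with depth, while the ReLU derivative is exactly 1 for positive inputs. The sketch below is a simplification that ignores weights and uses the same pre-activation value at every layer.

```python
import math


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)  # never exceeds 0.25, reached at x = 0


def relu_grad(x):
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation, n_layers):
    """Product of n_layers identical activation derivatives (weights ignored)."""
    return grad_fn(pre_activation) ** n_layers


sigmoid_chain = chained_gradient(sigmoid_grad, 0.0, 20)  # 0.25 ** 20, about 9e-13
relu_chain = chained_gradient(relu_grad, 1.0, 20)        # stays at 1.0
print(sigmoid_chain, relu_chain)
```

ReLU does not remove the problem entirely: units whose inputs stay negative receive a zero gradient, which is the motivation for variants such as Leaky ReLU.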

Insanely difficult: Explain the concept of 'transfer learning' in machine learning and provide a specific example of its application.

Transfer learning involves using knowledge gained from solving one problem and applying it to a different but related problem. For example, a model pre-trained on ImageNet for image classification can be fine-tuned for medical image analysis, improving performance when labeled data is limited.

Flashcards

Database Schema

The structure that defines how data is organized in a relational database, including its tables, fields, and relationships.

SQL

A language used for managing and manipulating databases.

Record

A single row in a database table.

Primary Key

A field or set of fields that uniquely identifies each record in a table, helping to ensure data integrity.

Foreign Key

A field in one table that refers to the primary key in another table.

Transaction

A unit of work performed in a database.

Data Security

Ensuring that only authorized users can access and modify data.

Encryption

A method of converting data into an unreadable format that can be reversed only with the correct key.

Data Integrity

A set of rules ensuring data accuracy in a database.

Data Analysis

The process of examining data to find useful information.

Data Mining

Tools and techniques to extract insights from large datasets.

Big Data

A dataset so large and complex that it becomes difficult to process with traditional tools.

Data Visualization

Visual representations of data.

Document Management System (DMS)

A system that helps to store, categorize, locate, and retrieve documents.

Data Backup

Backing up data to ensure it can be recovered if lost.

Data Recovery

Returning data to its original state after corruption or loss.
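
Several of the database flashcards above (schema, SQL, record, primary key, foreign key, transaction) can be tied together in one small sketch using Python's built-in `sqlite3` module. The `authors` and `books` tables and their contents are hypothetical examples, not something the flashcards define.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Schema: two tables, each with a primary key; books.author_id is a foreign key.
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    )"""
)

# Transaction 1: both records are committed together when the block exits cleanly.
with conn:
    conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada')")
    conn.execute("INSERT INTO books (title, author_id) VALUES ('Notes', 1)")

# Transaction 2: the second insert references a missing author, so the whole
# unit of work is rolled back, including the valid first insert.
try:
    with conn:
        conn.execute("INSERT INTO books (title, author_id) VALUES ('Valid', 1)")
        conn.execute("INSERT INTO books (title, author_id) VALUES ('Orphan', 99)")
except sqlite3.IntegrityError:
    pass

book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(book_count)
```

The rollback in the second block is what makes a transaction a single unit of work: either every statement in it takes effect or none does.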
