Introduction to ML for Business Applications

Questions and Answers

What is Machine Learning?

Algorithmic decisions or predictions that are based on data

What is the main difference between Machine Learning and Artificial Intelligence?

Machine Learning is a broader concept that encompasses Artificial Intelligence, as it deals with the creation of machines that can mimic human cognitive functions.
Machine Learning is a subset of Artificial Intelligence that focuses on building algorithms that can learn from data to perform specific tasks. (correct)
Machine Learning focuses on analyzing data to make predictions, while Artificial Intelligence is concerned with creating machines that can reason and think independently.
Machine Learning and Artificial Intelligence are essentially the same thing, with no real distinction between them.

Deep Learning is a type of machine learning that relies on artificial neural networks to perform complex tasks.

True (A)

What are some of the major applications of Machine Learning in the business context?

Fraud detection, recommendations, chatbots, image generation, customer segmentation, and demand/load prediction

Which of the following is NOT a basic machine learning problem?

Optimization (C)

Which type of machine learning problem deals with predicting a continuous output based on historical data?

Regression (C)

Which type of machine learning problem deals with identifying groups or patterns in unlabeled data?

Clustering (B)

What are the three types of features used in Machine Learning?

Categorical, ordinal, and numerical

Categorical features have a natural ordering, allowing for comparisons between different categories.

False (B)

Ordinal features are typically encoded as numbers to represent their ordering, and comparisons between these numbers (<, >, =) are meaningful.

True (A)
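The difference between categorical and ordinal features can be sketched in code. This is a minimal illustration, not a prescribed method: the category names and the helper functions `encode_ordinal` and `encode_one_hot` are hypothetical, chosen only to show that ordinal codes preserve order while one-hot codes imply none.

```python
# Hypothetical example values; the category names are illustrative only.
SIZE_ORDER = ["small", "medium", "large"]   # ordinal: the order is meaningful
COLORS = ["red", "green", "blue"]           # categorical: no natural order


def encode_ordinal(value, order=SIZE_ORDER):
    """Map an ordinal category to its rank so that <, >, = are meaningful."""
    return order.index(value)


def encode_one_hot(value, categories=COLORS):
    """Map a categorical value to a 0/1 indicator vector with no implied order."""
    return [1 if value == c else 0 for c in categories]


print(encode_ordinal("small") < encode_ordinal("large"))  # True
print(encode_one_hot("green"))                            # [0, 1, 0]
```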

Normalizing numerical features helps ensure that all features have similar scales, preventing one feature from dominating during training and improving the model's performance.

True (A)
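As a rough sketch of what normalization does, the two common variants (rescaling to [0, 1] and standardizing to zero mean and unit variance) might look like this. The feature values and function names are assumptions made for illustration.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit variance (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical features on very different scales.
incomes = [20_000, 35_000, 50_000, 120_000]
ages = [22, 35, 41, 58]

scaled_incomes = min_max_scale(incomes)
scaled_ages = min_max_scale(ages)
z_incomes = standardize(incomes)
```

After scaling, both features lie in the same range, so neither dominates a distance or gradient computation simply because of its units.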

Which type of machine learning problem is particularly relevant when dealing with predicting whether a patient has a particular disease based on their characteristics?

Classification (A)

Which type of machine learning algorithm is often used in recommendation systems to suggest items that similar users have liked?

K-Nearest Neighbors (C)
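A toy sketch of the nearest-neighbor idea behind such recommendations follows. The users, items, ratings, and the `recommend` helper are all hypothetical, and Euclidean distance is just one possible similarity measure; real recommender systems are considerably more involved.

```python
import math

# Hypothetical ratings: 1 = liked, 0 = not liked or not seen.
ITEMS = ["book", "film", "game", "album"]
RATINGS = {
    "ana":   {"book": 1, "film": 1, "game": 0, "album": 0},
    "ben":   {"book": 1, "film": 1, "game": 1, "album": 0},
    "chloe": {"book": 0, "film": 0, "game": 1, "album": 1},
    "dev":   {"book": 1, "film": 0, "game": 1, "album": 0},
}


def distance(u, v):
    """Euclidean distance between two users' rating vectors."""
    return math.sqrt(sum((RATINGS[u][i] - RATINGS[v][i]) ** 2 for i in ITEMS))


def recommend(user, k=2):
    """Suggest items liked by the k most similar users that `user` has not liked."""
    others = [o for o in RATINGS if o != user]
    neighbors = sorted(others, key=lambda o: distance(user, o))[:k]
    liked = {i for n in neighbors for i in ITEMS if RATINGS[n][i] == 1}
    return sorted(i for i in liked if RATINGS[user][i] == 0)


print(recommend("ana"))  # ['game']
```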

Which type of machine learning problem is most relevant in identifying communities or groups of users with similar interests or connections within a social network?

Clustering (C)

Which type of machine learning algorithm is particularly effective in reducing the dimensionality of high-dimensional data for pattern recognition?

Self-Organizing Maps (SOM) (B)

Overfitting occurs when a model is too complex and learns the training data too well, resulting in poor performance on new data.

True (A)

What are some ways to mitigate overfitting in machine learning models?

Model simplification, early stopping, regularization, pruning, and ensemble methods

What is the purpose of a validation set in machine learning?

To tune hyperparameters and detect overfitting by evaluating the model on data that was not used to fit it.

What is the purpose of a test set in machine learning?

To provide an unbiased evaluation of the model's ability to generalize to new data after training is complete.
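One way to picture the three roles is a simple random split. This is a sketch under assumed defaults: the 60/20/20 proportions, the fixed seed, and the function name `train_val_test_split` are illustrative choices, not values taken from the lesson.

```python
import random


def train_val_test_split(items, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the data once, then cut it into train, validation, and test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_test = int(len(items) * test_fraction)
    n_val = int(len(items) * val_fraction)
    test = items[:n_test]                   # touched only once, after training
    val = items[n_test:n_test + n_val]      # used to tune and compare models
    train = items[n_test + n_val:]          # used to fit the model
    return train, val, test


train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```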

What is the main idea behind k-fold cross-validation, and why is it important in Machine Learning?

K-fold cross-validation evaluates a machine learning model by splitting the dataset into k folds. The model is trained and tested k times, each time holding out a different fold as the test set and training on the remaining k - 1 folds. Averaging the k scores gives a more robust and reliable estimate of the model's performance than a single split and makes overfitting easier to detect.
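The fold bookkeeping can be sketched as follows. This is a minimal version that assumes the data has already been shuffled; `k_fold_indices` and `cross_validate` are hypothetical helper names, and `fit_and_score` stands for any user-supplied function that trains on the train indices and returns a score on the test indices.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(n_samples, k, fit_and_score):
    """Average the score returned by fit_and_score over all k folds."""
    scores = [fit_and_score(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, 3))
print([len(test) for _, test in folds])  # [4, 3, 3]
```

Every sample appears in exactly one test fold, so each data point is used for evaluation once and for training k - 1 times.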

When dealing with unbalanced classes, accuracy is a robust metric for evaluating the performance of a classifier.

False (B)

Which of the following metrics is best suited for evaluating the performance of a model in cases where the costs of false positives and false negatives are unequal or when the classes are unbalanced?

F1-Score (A)

Outliers are data points that have unusual values compared to other data points and can significantly affect a model's performance.

True (A)

What is 'Survivor Bias'?

A type of cognitive bias that occurs when we focus on observations that survive a process, overlooking those that did not survive. (B)

Feature Engineering involves transforming raw data into a set of meaningful and informative features that can be used as input for machine learning models.

True (A)

What is the main goal of Feature Engineering?

To improve the predictive performance of machine learning models by creating features that are meaningful and informative for the task.

What is the main idea behind machine learning?

Algorithmic decisions or predictions that are based on data.

Define artificial intelligence (AI) and its relationship with machine learning (ML).

Artificial intelligence (AI) is an umbrella term for computer software that mimics human cognition to perform complex tasks and learn from them. Machine learning is a subset of AI that focuses on building algorithms that learn from data to perform specific tasks.

What is the term used for the set of data used to train the model in machine learning?

Training set (A)

Which of the following is NOT a core ML problem?

Deep Learning (A)

What type of ML is used to predict continuous-valued outputs based on historical data?

Regression (B)

Which of the following is NOT a common application of ML in a business context?

Sentiment Analysis (A)

What is the difference between descriptive analytics and predictive analytics?

Descriptive analytics focuses on analyzing historical data to uncover trends and patterns, while predictive analytics utilizes statistical models to forecast future outcomes based on past data.

Define the term 'feature' and 'target variable' in ML.

A feature represents an independent variable or characteristic of a data point, while the target variable is the dependent variable we aim to predict.

Which type of feature represents instances that fall into one category of a set of categories?

Categorical (A)

Which type of feature has a natural ordering among its categories?

Ordinal (B)

Why is normalization often applied to target variables in ML?

Normalization rescales target variables to a standard scale, typically zero mean and unit variance or the range [0, 1].

What is 'overfitting' in ML?

Overfitting occurs when a model becomes too closely tailored to the training data and performs poorly on unseen data.

Which of the following is NOT a solution to mitigate overfitting?

Feature Engineering (D)

What is the main role of the 'validation set' in ML?

It is used during training to tune hyperparameters and to help detect and prevent overfitting.

What is the main difference between accuracy and precision in ML?

Accuracy measures the overall percentage of correct predictions, while precision focuses on the proportion of correct positive predictions out of all positive predictions.

Which metric is used to measure the proportion of actual positive instances correctly identified as positive?

Recall (C)
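The metrics discussed above can be computed from the four confusion-matrix counts. The sketch below uses a made-up, heavily unbalanced dataset (5 positives, 95 negatives) and a hypothetical helper `classification_metrics` to show why accuracy alone can mislead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct positives / predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # correct positives / actual positives
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical unbalanced data: a model that always predicts the majority class.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
always_negative = classification_metrics(y_true, y_pred)
print(always_negative["accuracy"], always_negative["recall"])  # 0.95 0.0
```

The always-negative model scores 95% accuracy while finding none of the positive cases, which is what recall and the F1-score expose.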

What are 'outliers' in ML data?

Outliers are data points that deviate significantly from the typical patterns observed in a dataset.
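One common rule of thumb for flagging such points, not taken from the lesson itself, is the 1.5 x IQR rule: a value is flagged if it lies more than 1.5 interquartile ranges below the first quartile or above the third. A minimal sketch with made-up data and a hypothetical `find_outliers` helper:

```python
import statistics


def find_outliers(values, factor=1.5):
    """Return values outside [Q1 - factor*IQR, Q3 + factor*IQR] (assumed rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical daily order counts with one unusual day.
orders = [10, 12, 11, 13, 12, 11, 95]
print(find_outliers(orders))  # [95]
```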

Which of the following is NOT a common method to handle outliers in ML data?

Ignore outliers (D)

Define 'Survivor Bias' in ML.

Survivor bias occurs when we focus on observations that survive a process while overlooking those that did not survive, as they are no longer visible. This can lead to misleading conclusions.

How are 'features' used in machine learning?

Features are used to represent data points in a way that is meaningful for the ML algorithm to learn patterns. They can be categorical, ordinal, or numerical.

Explain the main difference in the independence assumption between Naive Bayes and Bayesian networks.

Naive Bayes assumes conditional independence between features, while Bayesian networks explicitly model conditional dependencies among features.

The chain rule of probability allows us to express a joint distribution as a product of local distributions in Bayesian networks.

True (A)
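A small sketch of this factorization, assuming a hypothetical three-node chain Cloudy -> Rain -> WetGrass with made-up probability tables: the joint probability is the product of each node's local distribution given its parent.

```python
from itertools import product

# Hypothetical conditional probability tables (values chosen for illustration).
P_CLOUDY = {True: 0.5, False: 0.5}
P_RAIN_GIVEN_CLOUDY = {            # P(rain | cloudy), indexed [cloudy][rain]
    True: {True: 0.8, False: 0.2},
    False: {True: 0.1, False: 0.9},
}
P_WET_GIVEN_RAIN = {               # P(wet | rain), indexed [rain][wet]
    True: {True: 0.9, False: 0.1},
    False: {True: 0.2, False: 0.8},
}


def joint(cloudy, rain, wet):
    """P(cloudy, rain, wet) = P(cloudy) * P(rain | cloudy) * P(wet | rain)."""
    return P_CLOUDY[cloudy] * P_RAIN_GIVEN_CLOUDY[cloudy][rain] * P_WET_GIVEN_RAIN[rain][wet]


# A valid joint distribution sums to 1 over all assignments.
total = sum(joint(c, r, w) for c, r, w in product([True, False], repeat=3))
print(round(joint(True, True, True), 2), round(total, 6))  # 0.36 1.0
```

The three small tables hold 1 + 2 + 2 = 5 free parameters, whereas an unstructured joint table over three binary variables would need 7.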

What are the key components of Bayesian networks?

They consist of nodes representing random variables or features, edges representing conditional dependencies between variables, and conditional probability tables quantifying the relationships between connected nodes.

What are the key benefits of Bayesian networks over Naive Bayes models?

They are less restrictive and more flexible, capable of handling complex relationships and dependencies between features, and they can incorporate prior knowledge about variable dependencies.

What is the primary challenge associated with learning Bayesian networks?

The process of learning Bayesian networks is computationally complex, which makes it a challenging task and a subject of ongoing research.

Which of the following is NOT a key takeaway from the summary of the lecture on Naïve Bayes and Bayesian networks?

The concept of a feature vector in machine learning was introduced. (C)

What is the main difference between Naive Bayes and Bayesian networks in terms of handling dependencies?

Naive Bayes assumes that features are conditionally independent of one another, while Bayesian networks allow for complex dependencies between features.

Which of the following ML families is best suited for dealing with complex relationships and dependencies between features?

Bayesian Networks (B)

Explain the role of data preparation in machine learning.

Data preparation is crucial for ensuring that the data is processed and transformed into a format that is usable and suitable for the ML algorithms to learn patterns and make predictions.

Flashcards

What is machine learning?

Algorithmic decisions or predictions based on data, involving training on historical data and applying the learned model to new data.

What is machine learning (ML)?

A subfield of AI that uses trained algorithms on data to create adaptable models for specific tasks.

What is the "AI Winter"?

A period of reduced funding and interest in AI research due to unmet expectations and limited computing power.

What is supervised learning?

Machine learning where algorithms learn from labeled data, meaning each data point has a known outcome associated with it.