Introduction to ML for Business Applications

Questions and Answers

What is Machine Learning?

Algorithmic decisions or predictions that are based on data

What is the main difference between Machine Learning and Artificial Intelligence?

  • Machine Learning is a broader concept that encompasses Artificial Intelligence, as it deals with the creation of machines that can mimic human cognitive functions.
  • Machine Learning is a subset of Artificial Intelligence that focuses on building algorithms that can learn from data to perform specific tasks. (correct)
  • Machine Learning focuses on analyzing data to make predictions, while Artificial Intelligence is concerned with creating machines that can reason and think independently.
  • Machine Learning and Artificial Intelligence are essentially the same thing, with no real distinction between them.

Deep Learning is a type of machine learning that relies on artificial neural networks to perform complex tasks.

True (A)

What are some of the major applications of Machine Learning in the business context?

Fraud detection, recommendations, chatbots, image generation, customer segmentation, and demand/load prediction

Which of the following is NOT a basic machine learning problem?

Optimization (C)

Which type of machine learning problem deals with predicting a continuous output based on historical data?

Regression (C)

Which type of machine learning problem deals with identifying groups or patterns in unlabeled data?

Clustering (B)

What are the three types of features used in Machine Learning?

Categorical, ordinal, and numerical

Categorical features have a natural ordering, allowing for comparisons between different categories.

False (B)

Ordinal features are typically encoded as numbers to represent their ordering, and comparisons between these numbers (<, >, =) are meaningful.

True (A)

Normalizing numerical features helps ensure that all features have similar scales, preventing one feature from dominating during training and improving the model's performance.

True (A)

Which type of machine learning problem is particularly relevant when dealing with predicting whether a patient has a particular disease based on their characteristics?

Classification (A)

Which type of machine learning algorithm is often used in recommendation systems to suggest items that similar users have liked?

K-Nearest Neighbors (C)

Which type of machine learning problem is most relevant in identifying communities or groups of users with similar interests or connections within a social network?

Clustering (C)

Which type of machine learning algorithm is particularly effective in reducing the dimensionality of high-dimensional data for pattern recognition?

Self-Organizing Maps (SOM) (B)

Overfitting occurs when a model is too complex and learns the training data too well, resulting in poor performance on new data.

True (A)

What are some ways to mitigate overfitting in machine learning models?

Model simplification, early stopping, regularization, pruning, and ensemble methods

What is the purpose of a validation set in machine learning?

To tune model parameters and prevent overfitting by evaluating the model's performance on unseen data.

What is the purpose of a test set in machine learning?

To provide an unbiased evaluation of the model's ability to generalize to new data after training is complete.

What is the main idea behind k-fold cross-validation, and why is it important in Machine Learning?

K-fold cross-validation evaluates a machine learning model by splitting the dataset into k folds. The model is trained and tested k times, each time holding out a different fold as the test set and training on the remaining k - 1 folds. Averaging the k results gives a more robust and reliable estimate of the model's performance than a single split and makes overfitting easier to detect.

When dealing with unbalanced classes, accuracy is a robust metric for evaluating the performance of a classifier.

False (B)

Which of the following metrics is best suited for evaluating the performance of a model in cases where the costs of false positives and false negatives are unequal or when the classes are unbalanced?

F1-Score (A)

Outliers are data points that have unusual values compared to other data points and can significantly affect a model's performance.

True (A)

What is 'Survivor Bias'?

A type of cognitive bias that occurs when we focus on observations that survive a process, overlooking those that did not survive. (B)

Feature Engineering involves transforming raw data into a set of meaningful and informative features that can be used as input for machine learning models.

True (A)

What is the main goal of Feature Engineering?

To improve the predictive performance of machine learning models by creating features that are meaningful and informative for the task.

What is the main idea behind machine learning?

Algorithmic decisions or predictions that are based on data.

Define artificial intelligence (AI) and its relationship with machine learning (ML).

Artificial intelligence (AI) is an umbrella term for computer software that mimics human cognition to perform complex tasks and learn from them. Machine learning (ML) is a subfield of AI that uses algorithms trained on data to create adaptable models for specific tasks.

What is the term used for the set of data used to train the model in machine learning?

Training set (A)

Which of the following is NOT a core ML problem?

Deep Learning (A)

What type of ML is used to predict continuous-valued outputs based on historical data?

Regression (B)

Which of the following is NOT a common application of ML in a business context?

Sentiment Analysis (A)

What is the difference between descriptive analytics and predictive analytics?

Descriptive analytics focuses on analyzing historical data to uncover trends and patterns, while predictive analytics utilizes statistical models to forecast future outcomes based on past data.

Define the terms 'feature' and 'target variable' in ML.

A feature represents an independent variable or characteristic of a data point, while the target variable is the dependent variable we aim to predict.

Which type of feature represents instances that fall into one category of a set of categories?

Categorical (A)

Which type of feature has a natural ordering among its categories?

Ordinal (B)

Why is normalization often applied to target variables in ML?

Normalization rescales a variable to a common scale, typically zero mean and unit variance or the range [0, 1], so that differences in scale do not dominate training.

What is 'overfitting' in ML?

Overfitting occurs when a model becomes too closely tailored to the training data and performs poorly on unseen data.

Which of the following is NOT a solution to mitigate overfitting?

Feature Engineering (D)

What is the main role of the 'validation set' in ML?

It is used during training to tune model parameters and help prevent overfitting.

What is the main difference between accuracy and precision in ML?

Accuracy measures the overall percentage of correct predictions, while precision focuses on the proportion of correct positive predictions out of all positive predictions.

Which metric is used to measure the proportion of actual positive instances correctly identified as positive?

Recall (C)

What are 'outliers' in ML data?

Outliers are data points that deviate significantly from the typical patterns observed in a dataset.

Which of the following is NOT a common method to handle outliers in ML data?

Ignore outliers (D)

Define 'Survivor Bias' in ML.

Survivor bias occurs when we focus on observations that survive a process while overlooking those that did not survive, as they are no longer visible. This can lead to misleading conclusions.

How are 'features' used in machine learning?

Features are used to represent data points in a way that is meaningful for the ML algorithm to learn patterns. They can be categorical, ordinal, or numerical.

Explain the main difference in the independence assumption between Naive Bayes and Bayesian networks.

Naive Bayes assumes conditional independence between features, while Bayesian networks explicitly model conditional dependencies among features.

The chain rule of probability allows us to express a joint distribution as a product of local distributions in Bayesian networks.

True (A)

What are the key components of Bayesian networks?

They consist of nodes representing random variables or features, edges representing conditional dependencies between variables, and conditional probability tables quantifying the relationships between connected nodes.

What are the key benefits of Bayesian networks over Naive Bayes models?

They are less restrictive and more flexible, capable of handling complex relationships and dependencies between features, and they can incorporate prior knowledge about variable dependencies.

What is the primary challenge associated with learning Bayesian networks?

The process of learning Bayesian networks is computationally complex, which makes it a challenging task and a subject of ongoing research.

Which of the following is NOT a key takeaway from the summary of the lecture on Naïve Bayes and Bayesian networks?

The concept of a feature vector in machine learning was introduced. (C)

What is the main difference between Naive Bayes and Bayesian networks in terms of handling dependencies?

Naive Bayes assumes that features are conditionally independent of each other given the class, while Bayesian networks allow for complex dependencies between features.

Which of the following ML families is best suited for dealing with complex relationships and dependencies between features?

Bayesian Networks (B)

Explain the role of data preparation in machine learning.

Data preparation is crucial for ensuring that the data is processed and transformed into a format that is usable and suitable for the ML algorithms to learn patterns and make predictions.

Flashcards

What is machine learning?

Algorithmic decisions or predictions based on data, involving training on historical data and applying the learned model to new data.

What is machine learning (ML)?

A subfield of AI that uses trained algorithms on data to create adaptable models for specific tasks.

What is the "AI Winter"?

A period of reduced funding and interest in AI research due to unmet expectations and limited computing power.

What is supervised learning?

Machine learning where algorithms learn from labeled data, meaning each data point has a known outcome associated with it.

What is classification?

A type of supervised learning where the model predicts a categorical outcome, such as classifying an email as spam or not spam.

What is regression?

A type of supervised learning where the model predicts a continuous numerical outcome, such as predicting the price of a house based on its features.

What is unsupervised learning?

Machine learning where algorithms learn from unlabeled data, meaning the data doesn't have pre-defined outcomes.

What is clustering?

A type of unsupervised learning where the model groups data points into clusters based on their similarity.

What is feature engineering?

The process of selecting, creating, and transforming features to improve the performance of machine learning algorithms.

What is overfitting?

A situation where a machine learning model performs well on the training data but poorly on unseen data, indicating that it has learned the training data too well and cannot generalize to new examples.

What is underfitting?

A situation where a machine learning model performs poorly on both training and test data, indicating that it is too simple and cannot capture the underlying patterns in the data.

What is the training set?

A subset of data used to train the machine learning model.

What is the validation set?

A subset of data used to tune the parameters of the machine learning model during training and prevent overfitting.

What is the test set?

A subset of data used to evaluate the performance of the trained machine learning model on unseen data.

What is cross-validation?

A technique to evaluate a model's performance by splitting the dataset into multiple training and test sets.

What is a false negative?

A type of error that occurs when a model incorrectly predicts a positive instance as a negative instance.

What is a false positive?

A type of error that occurs when a model incorrectly predicts a negative instance as a positive instance.

What is precision?

The percentage of true positive predictions among all predictions where the model predicted a positive outcome.

What is recall?

The percentage of true positive predictions among all actual positive instances.

What is the F1-score?

A single metric that combines precision and recall, providing a balanced measure of model performance.

What are outliers?

Data points that are significantly different from other data points in the same class, potentially affecting the performance of machine learning models.

What is survivor bias?

A cognitive bias where we focus on surviving examples while overlooking those that didn't survive, potentially leading to incorrect conclusions.

What is model evaluation?

The process of assessing the performance of a trained machine learning model to understand its accuracy, strengths, and weaknesses.

What is binary classification?

A type of classification where there are only two possible categories or classes to predict.

What is multi-class classification?

A type of classification where there are more than two possible categories or classes to predict.

What is a linear model?

A type of statistical model that predicts a numerical outcome by finding a linear relationship between the input features and the target variable.

What is normalization?

The process of changing data values to a common scale, ensuring that features with different ranges don't dominate the learning process.

What is the y-axis intercept?

The value of the target variable predicted when the input feature is zero, i.e., the point where the line intersects the vertical axis.

What is the slope?

A measure of the steepness of the line in a linear model, representing the rate of change of the target variable with respect to the input feature.

What is the residual error?

The difference between the actual value of the target variable and the predicted value, representing the model's error in making predictions.
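
The intercept, slope, and residual error can be made concrete with a one-feature least-squares fit. The following is a minimal sketch, not the lesson's own material: the function name `fit_line` and the toy data are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data (not from the lesson)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 10.0]

intercept, slope = fit_line(xs, ys)

# Residual = actual value minus predicted value, one per data point.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

For this data the fit gives a slope of 2.3 and an intercept of 0.5, and the residuals of a least-squares line with an intercept sum to zero.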

Probabilistic Models

Models that use probability distributions to predict outcomes and quantify uncertainty by leveraging the principles of probability theory.

Probability

A measure of how likely an event is to occur.

Conditional Probability

The likelihood of an event happening given that another event has already occurred.

Independent Events

Two events that don't influence each other. Knowing the outcome of one event doesn't change the probability of the other event.

Dependent Events

Two events that do influence each other. Knowing the outcome of one event does change the probability of the other event.

Bayes' Theorem

A formula that updates the probability of a hypothesis based on new evidence. It takes into account both the initial probability (prior) and how likely the new evidence is.
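
A small worked example may help show how the prior and the evidence combine. This is a sketch under assumed numbers: the function name `posterior`, its parameter names, and the fraud-alert figures are hypothetical and do not come from the lesson.

```python
def posterior(prior, likelihood, false_alarm_rate):
    """Return P(H | E) for a binary hypothesis H using Bayes' theorem.

    prior            = P(H)
    likelihood       = P(E | H)
    false_alarm_rate = P(E | not H)
    """
    # Total probability of the evidence, P(E)
    evidence = likelihood * prior + false_alarm_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of transactions are fraudulent; an alert fires
# for 90% of fraudulent and 5% of legitimate transactions.
p_fraud_given_alert = posterior(prior=0.01, likelihood=0.90, false_alarm_rate=0.05)
```

With these numbers the posterior is roughly 0.15: even after an alert, most flagged transactions are legitimate because fraud is rare to begin with.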

Naïve Bayes Classifier

A probabilistic classifier based on Bayes' Theorem that assumes all features are equally important and conditionally independent of each other given the class.

Independence Assumption

The assumption that all features in a Naïve Bayes model are independent of each other given the class, meaning that once the class is known, the value of one feature tells you nothing about the value of another feature.

Laplace Smoothing

A way to handle cases where a feature has never been observed with a particular class. It adds a small value to all counts to avoid zero probabilities.
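
The effect of smoothing on a zero count can be sketched in a few lines. The function name `smoothed_likelihood`, the parameter `alpha`, and the counts are assumptions made for this illustration; the lesson only states that a small value is added to all counts.

```python
def smoothed_likelihood(count_value_and_class, count_class, n_values, alpha=1.0):
    """Estimate P(feature = value | class) with additive (Laplace) smoothing.

    count_value_and_class: training rows of this class with this feature value
    count_class:           training rows of this class
    n_values:              number of distinct values the feature can take
    alpha:                 pseudo-count added to every value (alpha = 0 means no smoothing)
    """
    return (count_value_and_class + alpha) / (count_class + alpha * n_values)


# Hypothetical counts: 10 spam emails in training, none contains a given word.
# The feature is binary (word present / absent), so n_values = 2.
raw = smoothed_likelihood(0, 10, n_values=2, alpha=0.0)  # zero probability
smooth = smoothed_likelihood(0, 10, n_values=2)          # (0 + 1) / (10 + 2)
```

Without smoothing, the single zero would wipe out the entire product of likelihoods in Naïve Bayes; with smoothing, the estimate is small but non-zero, and the estimates over all feature values still sum to one.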

Bayesian Belief Network

A graphical model that represents probabilistic relationships between features using a directed acyclic graph (DAG). It captures relationships between features that are not necessarily independent.
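
In standard notation, the graph structure determines how the joint distribution factorizes. The symbols used here are assumptions for illustration and do not come from the lesson: X_i for the variables, Pa(X_i) for the parents of X_i in the DAG, and C for the class variable.

```latex
% Chain rule of probability (valid for any joint distribution):
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, \dots, X_{i-1})

% Bayesian network: each variable depends only on its parents in the DAG,
% so each factor corresponds to one conditional probability table:
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

% Naive Bayes as the special case where the class C is the only parent of every feature:
P(C, X_1, \dots, X_K) = P(C) \prod_{k=1}^{K} P(X_k \mid C)
```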

Directed Acyclic Graph (DAG)

A directed graph where there are no cycles. The graph can be traversed in one direction without looping back to a previously visited node.

Posterior Probability

The updated probability of a hypothesis (for example, a class) after the evidence has been taken into account, as given by Bayes' Theorem.

Conditional Probability Table (CPT)

A probability table that shows the probability of an event (dependent variable) for every possible combination of its determining variables.

Zero-frequency Problem

A situation in Naïve Bayes when a particular feature has never been observed with a particular class, leading to zero probability. Laplace smoothing is used to handle this.

Conditional Independence Assumption for Missing Values

A method used to handle missing values in a Naïve Bayes model by computing the likelihood based on known features and ignoring missing ones.

Assumed Independence in Naïve Bayes

A situation where a Naïve Bayes model incorrectly assumes that all features are independent, leading to potential inaccuracies in predictions.

Feature Modelling

A process used to create a new set of features that are more informative and relevant to the task. This can help to improve the performance of Naïve Bayes models.

Generalization

The ability of a model to perform well on new, unseen data. A model that overfits is likely to perform poorly on unseen data because it has fitted the training data too closely, including its noise.

Precision

A measure of how many true positive predictions are made out of all predictions where the model predicted a positive outcome.

Recall

A measure of how many true positive predictions are made out of all actual positive examples.

F1 Score

A single metric that combines precision and recall to provide a balanced measure of a model's performance.

Cross-validation

A method used to estimate the model's performance by splitting the dataset into multiple training and test sets.

Sample Space

The set of all possible values that a variable can take on.

Relative Frequency

The number of times an event is observed divided by the total number of observations; it serves as an empirical estimate of the event's probability.

Standard Deviation

A measure of the variability or spread of data around the mean.

Linear Model

A statistical model that assumes a linear relationship between the input features and the target variable. It's often used for regression tasks.

Study Notes

Course Information

  • Course title: Machine Learning for Business Applications
  • Lecture: Introduction to ML for BA – Lecture A.0
  • Instructor: Prof. Dr. Maximilian Schiffer
  • Department: Professorship of Business Analytics & Intelligent Systems, TUM School of Management
  • Institute: Munich Data Science Institute
  • Semester: Winter 2024/25

Agenda

  • Motivation
  • Basics of Machine Learning
  • Essentials & Training Strategies

What is Machine Learning?

  • Machine Learning = algorithmic decisions or predictions based on data
  • Training phase: based on historic data
  • Application/Inference phase: based on new data

Artificial Intelligence & Machine Learning

  • Artificial Intelligence (AI): umbrella term for computer software mimicking human cognition
  • Machine Learning (ML): a subfield of AI using algorithms trained on data to create adaptable models performing specific tasks

Introduction to ML - History

  • 1940-1950: Early Days (Boolean circuit model of brain, Turing's "Computing Machinery and Intelligence")
  • 1950-1970: Excitement (Early AI programs, Dartmouth meeting, algorithms for logical reasoning)
  • 1970-1990: Knowledge-based approaches (AI winter, expert systems)
  • 2000-2020: High Performance Computing (Big Data & Deep Learning)

Introduction to ML - Overview

  • Supervised Learning: Classification, Regression
  • Unsupervised Learning: Clustering
  • Reinforcement Learning

Machine Learning in the Business Context

  • Fraud Detection
  • Recommendations
  • Chatbots
  • Image Generation
  • Customer Segmentation
  • Image Recognition
  • Demand/Load Prediction
  • Predictive Maintenance
  • Predictive Supply Chain Management
  • Personalized Marketing Campaigns

Scope of the Course

  • Introduction to Machine Learning for Business Applications
  • Naive Bayes & Bayesian Networks
  • Decision Trees
  • Clustering
  • Regression
  • Neural Networks
  • Data Preparation, Generalization & Evaluation
  • Recap & Exam Preparation

From Data to Information

  • Data Consolidation
  • Selection and Preprocessing
  • Prediction
  • Interpretation & Evaluation

Focus of this Course

  • Descriptive Analytics (Analysis of historical data)
  • Predictive Analytics (Use statistical models to forecast)
  • Prescriptive Analytics (Recommend actions to optimize)

Datasets: Features and Target Variables

  • Dataset D = {(xᵢ, yᵢ)}, i = 1, …, N (N observations)
  • xᵢ: K-dimensional feature vector (independent variable)
  • yᵢ: respective target variable (dependent variable)

Feature Types

  • Categorical Features
  • Ordinal Features
  • Numerical Features

Excursion: Normalization

  • Rescaling numerical features to similar scales so that no single feature dominates training
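
Two common ways to rescale a numerical feature are min-max scaling to [0, 1] and standardization to zero mean and unit variance. The sketch below uses only the Python standard library; the function names and the loan amounts are assumptions for illustration.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1] (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes the values are not all identical
    return [(v - mean) / std for v in values]


loan_amounts = [1000.0, 5000.0, 9000.0]  # hypothetical values
scaled = min_max_scale(loan_amounts)     # [0.0, 0.5, 1.0]
z_scores = standardize(loan_amounts)
```

In practice the scaling parameters (minimum, maximum, mean, standard deviation) would be computed on the training set only and then reused for validation and test data.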

Credit Scoring - Features and Target Variables

  • Numerical Features (Loan Amount, Disposable Income)
  • Ordinal Features (Savings, Employment)
  • Categorical Features (Purpose of loan, housing)
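
One way to turn such mixed features into a single numeric feature vector is sketched below. The category levels, the function names, and the choice of one-hot encoding for categorical features are assumptions for illustration; the lesson names the features but not their values or an encoding scheme.

```python
# Hypothetical levels; ordered from lowest to highest for the ordinal feature.
SAVINGS_LEVELS = ["low", "medium", "high"]    # ordinal: order matters
PURPOSES = ["car", "education", "furniture"]  # categorical: no order


def encode_ordinal(value, levels):
    """Map an ordinal value to its rank so that <, >, = stay meaningful."""
    return levels.index(value)


def encode_one_hot(value, categories):
    """Map a categorical value to a 0/1 indicator vector with no implied order."""
    return [1 if value == c else 0 for c in categories]


def encode_applicant(loan_amount, savings, purpose):
    """Combine a numerical, an ordinal, and a categorical feature into one vector."""
    return (
        [loan_amount, encode_ordinal(savings, SAVINGS_LEVELS)]
        + encode_one_hot(purpose, PURPOSES)
    )


x = encode_applicant(5000.0, "medium", "education")
# x is [5000.0, 1, 0, 1, 0]
```

Encoding the loan purpose as a single integer instead would suggest an ordering between purposes that does not exist, which is why an indicator vector is used here.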

Three basic ML Problems

  • Classification (Predicting categories from existing data)
  • Regression (Predicting continuous values from historic data)
  • Clustering (Finding patterns in data without associated labels)

Classification - Example

  • Creating statistical models to determine label for new observations

Regression - Example

  • Creating statistical models that allow predictions for numerical labels based on existing data

Clustering - Example

  • Grouping data into clusters

Classification - Notation

  • A mapping y = f(x) from input vectors x to outputs y ∈ {1, …, C}, where C is the number of classes (binary classification: C = 2)

Regression - Notation

  • A mapping y = f(x) where the output y is a continuous value

Clustering - Notation

  • Data set D = {xᵢ}, i = 1, …, N; a set of K clusters; zᵢ ∈ {1, …, K} is the cluster assigned to xᵢ
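
To illustrate how cluster assignments can be computed, here is a minimal one-dimensional k-means sketch (Lloyd's algorithm). Note that k-means is a technique chosen for this example and is not named in the notes above; the function name, the data, and the starting centers are assumptions. The code uses 0-based cluster labels rather than 1, …, K.

```python
def k_means_1d(points, centers, n_iter=10):
    """Lloyd's algorithm in one dimension; returns (assignments, final centers)."""
    centers = list(centers)
    assignments = []
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        assignments = [
            min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        for k in range(len(centers)):
            members = [p for p, z in zip(points, assignments) if z == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return assignments, centers


points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]  # hypothetical data, K = 2
z, centers = k_means_1d(points, centers=[0.0, 5.0])
# z is [0, 0, 0, 1, 1, 1] and centers is [1.5, 10.0]
```

No labels are used anywhere: the grouping emerges purely from distances between the data points.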

Feature Engineering - Example

  • Describing states using a vector of features (properties), example: Distance to closest ghost

Feature Engineering

  • Transforming raw data into meaningful features to improve ML algorithm performance

Overfitting

  • Model performs well on training data but poorly on new data
  • Model is fitted too closely to the training data and generalizes poorly
  • Mitigations: model simplification, regularization, early stopping, pruning, ensemble methods

Training, Validation, and Testing

  • Training Set: data used to train the model
  • Validation Set: separate subset for tuning parameters and helping to prevent overfitting
  • Test Set: for assessing model performance on new data
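
A three-way split can be sketched with the standard library alone. The function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions for illustration; the lesson does not prescribe specific fractions.

```python
import random


def split_dataset(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows and cut them into disjoint training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_test = int(len(rows) * test_fraction)
    n_val = int(len(rows) * val_fraction)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


# Ten hypothetical rows, identified here simply by their index.
train, val, test = split_dataset(range(10))
```

The test set is set aside and used only once, after all tuning on the validation set is finished, so that its score remains an unbiased estimate.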

Excursion: Cross Validation

  • Technique used for evaluating model performance with multiple training and test sets
  • Usually 10-fold cross validation (Data split into 10 subsets)

Accuracy and Un-balanced Classes

  • Accuracy (or error rate) is not a good measure when classes are imbalanced
  • Better metrics include precision, recall, and F1-score
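
The following sketch shows why accuracy can mislead on imbalanced data. The function name, the 95/5 class split, and the "always predict negative" classifier are assumptions made for this illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when there are no positive predictions/instances.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
always_negative = [0] * 100  # a classifier that never predicts the positive class

accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
scores = precision_recall_f1(y_true, always_negative)
# accuracy is 0.95, yet precision, recall, and F1 are all 0.0
```

The classifier looks excellent by accuracy while never finding a single positive case, which precision, recall, and F1 expose immediately.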

Outliers in the Data

  • Isolated instances that are unlike other instances
  • Dealing with outliers is very important
  • Methods of handling outliers: Removal, identification and fixing
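
Identification can be done in many ways; one widely used heuristic is the interquartile-range (IQR) rule, sketched below. The IQR rule, the threshold k = 1.5, and the sample data are assumptions for this illustration and are not taken from the lesson.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one suspicious value
data = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(data)                  # [95]
cleaned = [v for v in data if v not in outliers]
```

Whether a flagged value is removed or corrected depends on whether it is a recording error or a genuine but rare observation.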

Survivor Bias

  • Cognitive bias: focusing on observations that survived a process while overlooking those that did not

Recap Introduction

  • Key takeaways: machine learning, its historical development, and its types
  • Classification, clustering and regression problems and their implementations
  • Next topic: basic probabilities, conditional probabilities, and classification of new observations

Description

This quiz covers the foundational concepts of Machine Learning (ML) and its applications in business. Participants will explore motivation, training strategies, and the relationship between ML and Artificial Intelligence. Get ready to deepen your understanding of these essential topics for today's data-driven environment.
