Intro to Machine Learning

Questions and Answers

Who coined the term 'Machine Learning'?

  • John McCarthy
  • Arthur Samuel (correct)
  • Geoffrey Hinton
  • Alan Turing

A 'learner' is another term for a machine learning program.

True (A)

Name the four basic components of the learning process.

Data storage, abstraction, generalization, and evaluation

The process of fitting a model to a dataset is known as ______.

training

Match the learning type with its description:

  • Supervised learning = Learning a function that maps an input to an output based on example input-output pairs.
  • Unsupervised learning = Drawing inferences from datasets consisting of input data without labeled responses.
  • Reinforcement learning = Getting an agent to act in the world so as to maximize its rewards.

Which of the following is a key aspect of 'generalization' in machine learning?

Turning knowledge into a form usable for future tasks (D)

Data mining involves applying machine learning methods to small databases.

False (B)

What is a 'feature' in the context of machine learning datasets?

A recorded property or characteristic of examples

A categorical feature is also known as a ______ feature.

nominal

Which type of data represents categories falling in an ordered list?

Ordinal data (B)

Association rule learning is primarily used for prediction in machine learning.

False (B)

In association rule learning, what does the conditional probability P(Y|X) represent?

The likelihood of buying Y given that X is already bought

In classification, a rule or function used to assign labels to new observations is called a ______.

discriminant

Which machine learning algorithm is commonly used for classification?

k-NN algorithm (B)

A multi-class classification problem involves classifying examples into only two classes.

False (B)

Define a regression problem in machine learning.

Predicting the value of a numeric variable based on observed values

In regression, the mathematical relation between input and output variables is called the ______ function.

regression

Which regression model assumes a linear relationship between one independent variable and the dependent variable?

Simple linear regression (A)

In supervised learning, the training data consists of input data without labeled responses.

False (B)

What is the key characteristic of unsupervised learning?

Data without labeled responses

Discovering hidden patterns or groupings in data is primarily done using ______ analysis in unsupervised learning.

cluster

Which type of learning involves an agent learning to maximize rewards through trial and error?

Reinforcement learning (B)

Reinforcement learning relies on learning from a knowledgeable expert who provides examples.

False (B)

Give an example of how machine learning is applied in the retail sector.

Studying customer behavior

In machine learning, the smallest entity with measured properties of interest for a study is called a ______ of observation.

unit

Which of the following is an example of a 'numeric' feature?

Year (B)

The Apriori algorithm is used for classification problems.

False (B)

What is the purpose of 'evaluation' in the machine learning process?

To measure the utility of the learned knowledge

The process of extracting knowledge from stored data, involving creating general concepts, is known as ______.

abstraction

In the context of spam e-mail identification, what would be considered a 'feature'?

Words used in the messages (B)

Logistic regression is used when the dependent variable is continuous.

False (B)

What is the goal of supervised learning?

To learn a function that maps inputs to outputs based on example input-output pairs

In reinforcement learning, the program must discover which actions yield the most ______ by trying them.

reward

Which industry uses machine learning for network optimization and maximizing the quality of service?

Telecommunications (A)

An 'example' in machine learning always refers to a negative instance.

False (B)

Provide an example of how classification rules can be used for 'compression'.

By fitting a rule to the data, we get an explanation that is simpler than the data

In regression analysis, optimizing the parameters to minimize the approximation error is done by the machine learning ______.

algorithm

What distinguishes reinforcement learning from supervised learning?

Involves maximizing rewards through trial and error (C)

Cluster analysis is a supervised learning method.

False (B)

What is a 'training set' in the context of classification?

A set of data containing observations whose category membership is known

In machine learning, optimizing a performance criterion using example data is a key aspect of the definition of ______.

learning

Flashcards

Machine Learning

The field giving computers the ability to learn without explicit programming.

Machine Learning (Another Definition)

Optimizing a performance criterion using example data or past experience.

Model (in Machine Learning)

A mathematical expression, structure, or set of rules used to represent a real-world process.

Learning (Definition)

A program learns from experience E with respect to task T if its performance on T, as measured by P, improves with E.

Data Storage

Facilities for storing and retrieving large amounts of data.

Abstraction

Extracting knowledge and creating general concepts from stored data.

Generalization

Turning knowledge about stored data into a form used for future actions.

Evaluation

Giving feedback to measure the utility of the learned knowledge.

Data Mining

Applying machine learning methods to large databases to build simple, valuable models.

Unit of Observation

The smallest entity with measured properties of interest in a study.

Example (Instance)

An instance of the unit of observation for which properties have been recorded.

Feature (Attribute)

A recorded property or characteristic of examples.

Numeric Data

A characteristic measured in numbers.

Categorical (Nominal) Data

An attribute with a limited number of possible values based on a qualitative property.

Ordinal Data

A nominal variable with categories falling in an ordered list.

Association Rule Learning

Discovering interesting relations between variables in large databases.

Classification

Identifying which category a new observation belongs to, based on a training set.

Discriminant

A rule or function used to assign labels to new observations in a classification problem.

Regression

Predicting the value of a numeric variable based on observed values.

Regression Function

A mathematical relation between input variables and an output variable in regression.

Supervised Learning

Learning a function that maps an input to an output based on example input-output pairs.

Unsupervised Learning

Drawing inferences from datasets consisting of input data without labeled responses.

Reinforcement Learning

Getting an agent to act in the world to maximize its rewards.

Study Notes

  • Machine learning empowers computers to learn from data without explicit programming.

Machine Learning Definitions

  • Machine learning involves programming computers to optimize performance using data or experience.
  • A computer program learns when its performance improves automatically through experience.
  • A model can be a mathematical expression, equation, graph, rule set, or any structure used for prediction, description, or knowledge extraction from data.

The Essence of Learning in Machine Learning

  • A program learns if its performance (P) on a task (T) improves with experience (E).
  • Example: Handwriting recognition involves recognizing words in images (T), measured by correct classification rate (P), using a dataset of handwritten words (E).
  • Example: Robot driving involves navigating highways (T), measured by distance traveled before an error (P), using image and steering data recorded from a human driver (E).
  • Example: Chess playing involves playing chess (T), measured by the percentage of games won against opponents (P), using practice games played against itself (E).
  • A machine learning program, also known as a learner, improves from experience.

Machine Learning Process

  • Data storage, abstraction, generalization, and evaluation are the four fundamental components.

Components of the Learning Process

  • Data storage is a core component, enabling advanced reasoning through storing and retrieving large datasets.
  • Abstraction extracts knowledge from stored data by creating general concepts and models, including training models on datasets.
  • Generalization turns knowledge into a form applicable to similar future tasks by identifying relevant data properties.
  • Evaluation provides feedback to measure the learned knowledge's utility and drive improvements.

Machine Learning Applications

  • Data mining is the application of machine learning methods to large databases in order to construct simple, valuable models.
  • Retail: Studying consumer behavior.
  • Finance: Building models for credit applications, fraud detection, and stock market analysis.
  • Manufacturing: Optimization, control, and troubleshooting.
  • Medicine: Medical diagnosis.
  • Telecommunications: Network optimization and service quality maximization through call pattern analysis.
  • Science: Analyzing large datasets in physics, astronomy, and biology.
  • Artificial intelligence: Teaching systems to adapt without pre-programmed solutions.
  • Vision, speech recognition, and robotics: Finding solutions to complex problems.
  • Computer-controlled vehicles: Steering correctly on various roads.
  • Games: Developing programs for chess, backgammon, and Go.

Data Types and Forms

  • Unit of observation: The smallest entity with measured properties of interest.
  • Typical units of observation: a person, an object, a time point, a geographic region, or a measurement.

Examples and Features

  • "Example": A recorded instance of the unit of observation, also known as an "instance," "case," or "record."
  • "Feature": A recorded property or characteristic of examples, also known as "attribute" or "variable."
  • Cancer detection includes patients as units, cancer patients as examples, and gender, age, blood pressure, and pathology reports as features.
  • Pet selection has persons as units, pet owners as examples, and age, home region, and family income as features.
  • Spam email identification uses email messages as units, specific messages as examples, and words used in the messages as features.
  • Examples and features are commonly organized in a matrix format.
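
The matrix layout mentioned above can be sketched in Python as follows. The feature names come from the pet selection case in these notes; the rows and their values are invented purely for illustration, and the helper `column` is a hypothetical name.

```python
# Hypothetical data: each row is one example (a pet owner) and each
# column is one feature. All values are invented for illustration.
feature_names = ["age", "home_region", "family_income"]

examples = [
    [34, "north", 52000],
    [27, "south", 48000],
    [45, "east", 91000],
]

def column(matrix, names, feature):
    """Return the values of one feature across all examples."""
    index = names.index(feature)
    return [row[index] for row in matrix]

n_examples = len(examples)       # number of rows
n_features = len(feature_names)  # number of columns
ages = column(examples, feature_names, "age")
print(n_examples, n_features, ages)
```

Reading a row gives one example; reading a column gives one feature across all examples.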

Data Forms

  • Numeric data: Features measured in numbers.
  • Categorical data: Attributes with a limited number of values based on qualitative properties. Also referred to as "nominal" data.
  • Ordinal data: Nominal variables with categories in a specific order.
  • Example: "year," "price," and "mileage" are numeric, while "model," "color," and "transmission" are categorical.
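
As a rough sketch of how the three data forms might be handled differently in code: numeric features can be used as they are, nominal features have no order (so a one-hot encoding is one common choice), and ordinal features can be mapped to their rank. The record below is invented, and the "condition" feature and its ordering are assumptions added only to illustrate ordinal data.

```python
# Hypothetical used-car record; all values are invented for illustration.
car = {
    "year": 2015, "price": 8500.0, "mileage": 62000,          # numeric
    "model": "SE", "color": "red", "transmission": "manual",  # categorical
    "condition": "good",  # assumed ordinal feature, not from the notes
}

COLORS = ["blue", "red", "silver"]                        # nominal: no order
CONDITION_ORDER = ["poor", "fair", "good", "excellent"]   # ordinal: ordered

def one_hot(value, categories):
    """Encode a nominal value as a 0/1 vector with a single 1."""
    return [1 if value == c else 0 for c in categories]

def ordinal_rank(value, ordered_categories):
    """Encode an ordinal value as its position in the ordered list."""
    return ordered_categories.index(value)

color_encoded = one_hot(car["color"], COLORS)
condition_encoded = ordinal_rank(car["condition"], CONDITION_ORDER)
print(color_encoded, condition_encoded)
```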

Machine Learning Problem Classes

  • Learning associations, classification, and regression are fundamental classes.

Learning Associations

  • Association rule learning discovers interesting relationships between variables in large databases, called "association rules."
  • Supermarket chain analysis identifies patterns in customer purchases, such as customers buying onions and potatoes also buying hamburger.
  • Association rules take the form X ⇒ Y, read as "if people buy X, then they also buy Y."
  • Rules like these are used for cross-selling, promotional pricing, and product placement.
  • The conditional probability P(Y | X) estimates the likelihood that a customer buys product Y, given that they have already bought product X.
  • Customer attributes D (such as gender, age, and marital status) can be taken into account by estimating P(Y | X, D).
  • Algorithms for generating association rules: Apriori, Eclat, and FP-Growth.
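
The conditional probability behind a rule X ⇒ Y can be estimated directly from transaction counts. The sketch below uses a handful of invented shopping baskets; the function names `support` and `confidence` follow common usage in association rule mining but are not taken from these notes, and this is a direct count, not an implementation of Apriori, Eclat, or FP-Growth.

```python
# Hypothetical shopping baskets, invented for illustration.
transactions = [
    {"onions", "potatoes", "hamburger"},
    {"onions", "potatoes", "hamburger", "milk"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "hamburger"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(x, y, baskets):
    """Estimate P(Y | X): among baskets containing X, the share that also contain Y.

    Assumes X occurs in at least one basket (otherwise this divides by zero).
    """
    return support(set(x) | set(y), baskets) / support(x, baskets)

# Rule: {onions, potatoes} => {hamburger}
rule_confidence = confidence({"onions", "potatoes"}, {"hamburger"}, transactions)
print(round(rule_confidence, 3))
```

Here three baskets contain both onions and potatoes, and two of those also contain hamburger, so the estimate of P(Y | X) is 2/3.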

Classification

  • In machine learning, classification identifies the category a new observation belongs to, using a training set of data with known category memberships.
  • A typical problem: given new data, say “Score1 = 25” and “Score2 = 36”, what value should be assigned to “Result”? In other words, to which of the two categories or classes should the new observation be assigned?
  • Optical character recognition, face recognition, and speech recognition are real-life examples.
  • In medical diagnosis, inputs are patient information, and classes are illnesses.
  • Classification rules aid knowledge extraction, compression, and various decision-making processes.

Classification Rules

  • Rules can be used to classify patients as low-risk or high-risk based on variables like blood pressure and age.
  • Credit card companies classify applicants based on annual salary and age.
  • Astronomers label distant objects as stars, galaxies, or nebulas using digital images.
  • A discriminant is a rule or function that assigns labels to new observations.
  • Discriminant examples (two alternative rules):
  • IF Score1 + Score2 ≥ 60 THEN “Pass” ELSE “Fail”.
  • IF Score1 ≥ 20 AND Score2 ≥ 40 THEN “Pass” ELSE “Fail”.
  • Common classification algorithms:
  • Logistic regression
  • Naive Bayes algorithm
  • k-NN algorithm
  • Decision tree algorithm
  • Support vector machine algorithm
  • Random forest algorithm
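
The two discriminant rules above, and one of the listed algorithms (k-NN), can be sketched in Python and applied to the observation Score1 = 25, Score2 = 36 from the Classification section. The training set below is invented for illustration, and `knn_classify` is a minimal majority-vote sketch under that assumption, not a reference implementation.

```python
from collections import Counter

def rule_sum(score1, score2):
    """First discriminant from the notes: total score of at least 60."""
    return "Pass" if score1 + score2 >= 60 else "Fail"

def rule_thresholds(score1, score2):
    """Second discriminant from the notes: a minimum on each score."""
    return "Pass" if score1 >= 20 and score2 >= 40 else "Fail"

# Hypothetical training set of ((Score1, Score2), Result) pairs.
TRAINING_SET = [
    ((30, 45), "Pass"),
    ((28, 40), "Pass"),
    ((22, 38), "Pass"),
    ((15, 25), "Fail"),
    ((10, 20), "Fail"),
]

def knn_classify(point, training_set, k=3):
    """Label a point by majority vote among its k nearest training examples."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = sorted(training_set,
                     key=lambda example: squared_distance(point, example[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

new_observation = (25, 36)
by_sum = rule_sum(*new_observation)
by_thresholds = rule_thresholds(*new_observation)
by_knn = knn_classify(new_observation, TRAINING_SET)
print(by_sum, by_thresholds, by_knn)
```

The two hand-written rules disagree on this observation: the total of 61 satisfies the first rule, while Score2 = 36 falls below the second rule's threshold of 40. The choice of discriminant therefore changes the assigned class.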

Classes

  • Classification problems classify examples into categories.
  • The input variables in a classification problem may be real-valued or discrete.
  • Two-class problems are called "binary classification," and those with more than two classes are "multi-class."
  • Assigning multiple classes to an example is known as "multi-label classification."

Regression

  • Task: Predicting the value of a numeric variable based on observed values.
  • Output: The predicted values are integers or floating-point numbers.
  • Input: The independent variables may be discrete or real-valued.

General Approach

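  • A model expresses a mathematical relation y = f(x, θ) between the input x and the output y, where θ is a set of parameters.
  • The function f(x, θ) is called the regression function.
  • The machine learning algorithm optimizes the parameters in θ so as to minimize the approximation error.
  • Example (here θ = (a0, a1, a2, a3)):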

Price = a0 + a1 × (Age) + a2 × (Distance) + a3 × (Weight)
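
To make the roles of f(x, θ) and the approximation error concrete, here is a small sketch of the price model above. The data points and both parameter sets are invented, and mean squared error is used as one common choice of error measure; the notes do not say which error the original material uses.

```python
def predict(theta, age, distance, weight):
    """Regression function f(x, theta) for the price model above."""
    a0, a1, a2, a3 = theta
    return a0 + a1 * age + a2 * distance + a3 * weight

def mean_squared_error(theta, data):
    """Average squared gap between predicted and observed prices."""
    errors = [(predict(theta, *x) - y) ** 2 for x, y in data]
    return sum(errors) / len(errors)

# Hypothetical observations: ((age, distance, weight), price).
data = [
    ((1, 10, 1000), 9000),
    ((3, 40, 1100), 7000),
    ((5, 80, 1200), 5000),
]

theta_guess = (8000, 0, 0, 0)        # predicts the same price for every item
theta_better = (10000, -1000, 0, 0)  # price drops by 1000 per unit of age

error_guess = mean_squared_error(theta_guess, data)
error_better = mean_squared_error(theta_better, data)
print(error_guess, error_better)
```

A learning algorithm automates this comparison: it searches the space of θ values for the one with the smallest error instead of testing two hand-picked candidates.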

Regression Models

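  • Simple Linear Regression: one continuous independent variable, assumed to have a linear relationship with the dependent variable
  • Multivariate Linear Regression: more than one independent variable
  • Polynomial Regression: one continuous independent variable x, with the dependent variable modeled as a polynomial in x
  • Logistic Regression: the dependent variable is binary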
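
Two of these models are small enough to sketch directly. The first function fits a simple linear regression with the standard least-squares formulas; the second is the logistic (sigmoid) function that logistic regression uses to turn a linear score into a probability between 0 and 1. The data points are invented and chosen to lie exactly on the line y = 1 + 2x.

```python
import math

def fit_simple_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def logistic(z):
    """Map any real-valued score z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
intercept, slope = fit_simple_linear(xs, ys)
print(intercept, slope, logistic(0.0))
```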

Types of Learning

  • In general, machine learning algorithms are divided into three types of learning: supervised, unsupervised, and reinforcement.

Supervised Learning

  • Task: Learning a function that maps an input to an output, based on example input-output pairs.
  • Training: Each example is a pair consisting of an input and the desired output.
  • Process: The algorithm analyzes the training data and produces a function that maps new examples to outputs, ideally determining the correct class labels for unseen instances.
  • Both classification and regression problems are supervised learning problems.
  • Supervised learning can be thought of as a teacher guiding the learning process with correct answers.
  • The algorithm iteratively makes predictions on the training data, is corrected when it is wrong, and stops once it reaches the required level of performance.
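
The predict, correct, and stop loop described above can be illustrated with a perceptron, chosen here only because it is one of the simplest supervised learners; the notes do not name a specific algorithm for this step. The toy dataset (the logical AND of two inputs) and the stopping threshold are assumptions made for the sketch.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0

def accuracy(weights, bias, data):
    """Fraction of training examples that are classified correctly."""
    correct = sum(predict(weights, bias, x) == y for x, y in data)
    return correct / len(data)

def train_perceptron(data, required_accuracy=1.0, max_epochs=100):
    """Repeat predict-and-correct passes until performance is good enough."""
    weights = [0] * len(data[0][0])
    bias = 0
    for epoch in range(1, max_epochs + 1):
        for x, y in data:
            error = y - predict(weights, bias, x)  # 0 when the guess is right
            # The "teacher" correction: nudge the boundary toward the label.
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
        if accuracy(weights, bias, data) >= required_accuracy:
            return weights, bias, epoch
    return weights, bias, max_epochs

# Hypothetical labeled pairs (input, desired output): logical AND.
training_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias, epochs_used = train_perceptron(training_data)
final_accuracy = accuracy(weights, bias, training_data)
print(weights, bias, epochs_used, final_accuracy)
```

Because this toy dataset is linearly separable, the loop reaches the required accuracy well before `max_epochs`; on data that is not separable it would stop at the epoch limit instead.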

Unsupervised Learning

  • Task: Drawing inferences from input data without labeled responses.
  • Process: The observations carry no classification or categorization. Because there are no output values, there is no function to estimate.
  • Common method: Cluster analysis, used in exploratory data analysis to find hidden patterns or groupings in data.
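
As one illustration of cluster analysis, the sketch below runs k-means on one-dimensional data. k-means is a common clustering method but is not named in these notes, so treat the choice as an assumption; the data points and the starting centers (the smallest and largest value) are also invented for the example. No labels are supplied at any point.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Group unlabeled numbers around the nearest of the given centers."""
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    return centers, clusters

# Hypothetical unlabeled measurements.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(points, centers=[min(points), max(points)])
print(centers, clusters)
```

The two groups that emerge are a pattern discovered from the inputs alone, which is the sense in which unsupervised learning works without labeled responses.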

Reinforcement Learning

  • Task: Getting an agent to act in the world so as to maximize its rewards.
  • Process: Discover which actions yield the most reward. Actions may affect not only the immediate reward but also situations and, through that, all subsequent rewards.
  • Analogy: Consider teaching a dog a new trick. You cannot tell it what to do, but you can reward or punish it when it does the right or wrong thing. It has to work out which of its actions earned the reward or punishment. A similar method is used to train computers to do many tasks, such as playing backgammon or chess, scheduling jobs, and controlling robot limbs.
  • Reinforcement learning learns by trial and error, whereas supervised learning relies on examples provided by a knowledgeable expert.
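
Trial-and-error learning can be sketched with a very simplified setting: an agent repeatedly chooses one of three actions and only sees the reward it receives. This is an epsilon-greedy "multi-armed bandit", a standard teaching example that is not taken from these notes. It deliberately leaves out the part where actions change future situations, and the reward values, noise range, and parameter names are all assumptions.

```python
import random

def run_bandit(mean_rewards, steps=500, epsilon=0.1, seed=0):
    """Learn by trial and error which action pays best on average."""
    rng = random.Random(seed)
    n_actions = len(mean_rewards)
    counts = [0] * n_actions        # how often each action was tried
    estimates = [0.0] * n_actions   # running average reward per action
    total_reward = 0.0

    for step in range(steps):
        if step < n_actions:
            action = step                      # try every action once
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: random action
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment returns a noisy reward; the agent never sees
        # mean_rewards directly, only the rewards of actions it tries.
        reward = mean_rewards[action] + rng.uniform(-0.1, 0.1)

        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward

estimates, counts, total_reward = run_bandit([0.2, 0.5, 0.9])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, counts)
```

No one tells the agent which action is correct. It identifies the best one from the rewards it collects, and the occasional random choice keeps it from settling too early on an action that merely looked good at first.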
