Intro to Machine Learning

Questions and Answers

In the context of machine learning, what primarily drives a computer program to be considered as 'learning'?

  • Its capacity to store and retrieve large amounts of data.
  • Its ability to perform pre-programmed tasks without errors.
  • Its improvement in performance at specific tasks (T) as it gains more experience (E), measured by a performance metric (P). (correct)
  • Its speed in executing algorithms, regardless of outcome accuracy.

What is the main goal of machine learning?

  • To create algorithms that solve problems using fixed rules.
  • To build robots that can perform physical tasks.
  • To manually code every possible solution to a problem.
  • To develop systems that can automatically improve their performance based on experience. (correct)

How does machine learning enhance the precision of computer actions, such as predictions or robot control?

  • By increasing the computational speed of processing data.
  • By replacing algorithms with human intuition.
  • By continuously adjusting actions based on feedback to more accurately reflect correct performance. (correct)
  • By eliminating the need for data analysis.

In the context of machine learning, what does 'generalizing' primarily involve?

  • Identifying similarities across varied scenarios, enabling learned principles to be applied in diverse situations. (correct)

Which of the following is a key characteristic of supervised learning?

  • The algorithm is trained using a dataset with labeled inputs and desired outputs. (correct)

If an online store uses machine learning to predict what a customer will buy next, what type of learning is it?

  • Supervised Learning (correct)

What does designing a machine learning learner involve?

  • Choosing training experience, target function, representation of the target function, and a learning algorithm. (correct)

Which of these is an example of a classification problem in supervised learning?

  • Determining whether an email is spam or not spam. (correct)

When is regression typically used in supervised learning?

  • When the target variable is continuous. (correct)

In which scenario would unsupervised learning be most appropriate?

  • Grouping customers into different segments based on their purchasing behavior. (correct)

Which of the following distinguishes unsupervised learning from supervised learning?

  • Supervised learning uses labeled data, while unsupervised learning does not. (correct)

What is the primary goal of clustering in unsupervised learning?

  • To discover intrinsic groupings within the data. (correct)

What does association rule learning aim to identify?

  • Relationships between variables in a dataset to understand co-occurrence patterns. (correct)

What type of machine learning is reinforcement learning?

  • A type of learning that lies somewhere between supervised and unsupervised learning. (correct)

Which of the following describes reinforcement learning?

  • Learning from trial and error with feedback on correctness but not on how to improve. (correct)

In the context of evolutionary learning, what is the role of the 'fitness function'?

  • To quantify how well an individual solves a problem. (correct)

What is the purpose of the 'offspring generation' step in evolutionary learning?

  • To create new candidate solutions through processes like mutation and crossover. (correct)

What is the main purpose of 'feature selection' in the machine learning process?

  • Identifying the most relevant features for the problem. (correct)

What is the purpose of evaluating a machine learning model on data it was not trained on?

  • To test and evaluate its accuracy on unseen data. (correct)

In machine learning terminology, what is 'dimensionality'?

  • The number of elements in the input vector. (correct)

Within the context of machine learning notation, if 'x' is an input vector, what does 'xi' typically represent?

  • The 'i'th element of the input vector. (correct)

In neural networks, what do the weights wij signify?

  • They represent the strength of connections between nodes. (correct)

What information is provided by 'target vectors' in supervised learning?

  • Correct answers that the algorithm is learning about. (correct)

What does the 'activation function' in a neural network determine?

  • The firing of a neuron. (correct)

In machine learning, what does ‘weight space’ refer to?

  • A representation of the strengths of connection weights, using each weight as an axis. (correct)

What is the 'curse of dimensionality' in machine learning?

  • As the number of dimensions increases, the volume of the unit hypersphere does not increase with it (it shrinks toward zero), so data becomes sparse. (correct)

How does 'overfitting' affect model generalization?

  • It makes the learned function overly complicated, fitted to noise in the training data, so the model generalizes poorly. (correct)

What is the purpose of using a 'validation set' during training?

  • To assess how well the model is generalizing at each step. (correct)

What is the main purpose of randomly reordering the data?

  • To break up data that is ordered by class, so that every split sees examples of all classes. (correct)

How does multi-fold cross-validation help to validate data?

  • By splitting the data into multiple subsets and using each in turn as the validation set while training on the others. (correct)

In a confusion matrix, what does the element at position (i, j) represent?

  • The number of input patterns put into class i by the targets but into class j by the algorithm. (correct)

In the context of the confusion matrix, what does a 'true positive' indicate?

  • The model correctly predicts the positive class. (correct)

What does the Matthews Correlation Coefficient (MCC) measure?

  • Classification quality in a way that remains meaningful on unbalanced datasets, as a more robust alternative to balanced accuracy. (correct)

How is 'precision' defined in the context of machine learning algorithm variability?

  • How repeatable the algorithm's predictions are. (correct)

What is a 'mean'?

  • The average: the sum of the values divided by the number of values. (correct)

How is 'median' used?

  • It is the middle value of the sorted dataset. (correct)

What function does the mode serve?

  • It gives the most common value in the dataset. (correct)

What is variance?

  • A measure of spread, computed from the squared distances between each value and the mean. (correct)

If two variables are independent, what is the result of the covariance?

  • Zero. (correct)

Under which circumstance is co-variance positive?

  • When both variables tend to increase and decrease together. (correct)

Under the Mahalanobis distance, when must a test point lie close to the mean to count as part of the dataset?

  • When the data is tightly controlled (has little spread). (correct)

Flashcards

What is Machine Learning?

Machine learning modifies or adapts computer actions to become more accurate based on data.

Machine Learning Definition

A computer program learns from experience E, improving its performance P at tasks T.

Machine Learning Data

The experiences used to improve performance.

Types of Machine Learning

Supervised, Unsupervised, Reinforcement and Evolutionary Learning.

Supervised Learning

Training with labeled examples to generalize responses to all possible inputs.

Regression

A supervised learning task that fits a mathematical function to datapoints.

Classification

A supervised learning task that assigns inputs to predefined categories.

Unsupervised Learning

Explores unlabeled data to find patterns, without human supervision.

Clustering

A type of unsupervised learning that groups similar data points together.

Association

A type of unsupervised learning that discovers relationships between items.

Reinforcement Learning

Learning by receiving feedback (rewards/penalties) without explicit correction.

Evolutionary Learning

Uses principles of natural selection to optimize problem solutions.

Machine Learning Workflow

Collecting, preparing, choosing algorithm, training, evaluating, and applying model.

Machine Learning Inputs

Data fed into a machine learning algorithm.

Dimensionality

Number of elements in an input vector.

Weights (wij)

Weighted connections between nodes in a neural network.

Targets (t)

The target vector provides correct answers used for supervised learning.

Activation Function

Describes a neuron's firing in response to its weighted inputs.

Error (E)

Quantifies network inaccuracies based on outputs and targets.

Weight Space

A multi-dimensional space with one axis per weight, used to visualize the strengths of the connections into a neuron.

Curse of Dimensionality

As the number of dimensions grows, the volume of the unit hypersphere shrinks toward zero, so data becomes sparse.

Overfitting

Learning the noise and inaccuracies in the training data, which prevents generalization.

Training, Testing, Validation Sets

Training data builds the model, validation data tunes it, test data evaluates it.

Multi-fold Cross-Validation

Randomly partitions data into k subsets, validates on one, trains on others.

Confusion Matrix

A table of predicted vs. actual classifications

True positive

Outcome where the model correctly predicts the positive class.

False positive

Outcome where the model incorrectly predicts the positive class.

True negative

Outcome where the model correctly predicts the negative class.

False negative

Outcome where the model incorrectly predicts the negative class.

Accuracy

Percentage of true results (TP + TN) out of total cases.

Sensitivity

TP / (TP + FN)

Specificity

TN / (TN + FP)

Precision

TP / (TP + FP)

Recall

TP / (TP + FN)

ROC Curve

Plot of true positives (y-axis) against false positives (x-axis).

Unbalanced Datasets

Dataset with uneven distribution of classes.

Measurement Precision

Consistency of algorithm predictions on similar inputs.

Mean

Adding all data points and dividing by number of data points.

Median

Middle value in a sorted dataset.

Mode

Most frequent value in a dataset.

Variance

How spread-out the values are.

Standard Deviation

Square root of variance; typical deviation from the mean.

Covariance

A measure of how two variables vary together.

Bias

Error of a model that is inaccurate and does not match the data well.

Variance (of a model)

Error of a model that is imprecise: the higher the variance, the more its results vary.

Study Notes

Introduction to Machine Learning

  • Machine learning can be applied to disease prediction systems, price estimation, recommendation systems, and time series analysis
  • Suppose you run an online software store and collect data for each purchase such as computer type, web browser, country and time
  • A core problem involves predicting what the next person will buy based on collected data
  • The prediction problem can be approached by looking at past data that seems similar, on the assumption that similar people often act similarly
  • Supervised learning is a type of machine learning
  • Learning improves performance on a task through remembering, adapting and generalizing
  • Generalizing involves transferring knowledge from one context to use in different situations
  • Machine learning enables computers to modify or adapt their actions (such as making predictions or controlling a robot) so that these actions become more accurate
  • Machine learning accuracy is measured by how well actions reflect correct ones
  • Imagine playing Scrabble against a computer; initially, you might win every game before the computer adapts and learns to beat you
  • Once the computer has learned to beat one player, it can use the same strategies against other players

Definition of Machine Learning

  • A computer learns from experience E relative to tasks T and performance measure P, if its performance at tasks in T improves with experience E, as measured by P
  • Task: the behavior or task to be improved, such as classification or acting in an environment
  • Data: the experiences used to improve performance on the task
  • Measures of improvement: for example, increasing accuracy in prediction, or improved speed and efficiency

How Computer Science, Data Science, and ML relate

  • Machine learning is a subfield of computer science
  • Machine learning sits at the intersection of computer science, math, and statistics
  • Data analysis, traditional software, data science, and business domain expertise combine with computer science, maths, and statistics

Applications of Machine Learning

  • Machine learning is used in: medicine for disease diagnosis, computer vision, robotic control, natural language processing, speech recognition, machine translation, finance, and fraud detection

Algorithmic vs Machine Learning Solutions

  • An algorithmic solution uses data as input to a program to get the output
  • A machine learning solution uses data and the desired output to get a program

Learner Design

  • To design a learner: choose the training experience, target function, how to represent the target function, and a learning algorithm to infer it

Types of Machine Learning

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning
  • Evolutionary Learning

Supervised Learning

  • In supervised learning, a training set of examples with the correct responses (targets) is provided, and the algorithm generalizes from it to respond correctly to all possible inputs
  • Supervised learning is also called learning from exemplars
  • The training data is represented as a set of pairs (xi, ti), where xi are the inputs, ti are the targets, and the index i runs from 1 to some upper limit N

Supervised learning - Regression

  • Regression assumes the values come from some function and tries to find that function, so that it gives the output value y for any value of x
  • In statistics, regression means describing a curve so that it passes as close as possible to the datapoints
  • Regression is a problem of function approximation or interpolation: working out the values between the known values
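
As a minimal sketch of regression as function approximation, the Python below fits a straight line to a few datapoints by least squares and then interpolates at an x that was not in the data. The datapoints and the name `fit_line` are illustrative assumptions, and a straight line is only one possible choice of function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical datapoints lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)

# Interpolation: estimate y at x = 1.5, between two known points
prediction = slope * 1.5 + intercept
```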

Supervised learning - Classification

  • A classification problem takes input vectors and decides which class each one belongs to, based on training from examples of each class
  • Each example belongs to exactly one class, and the set of classes covers the whole possible output space
  • These two constraints aren't necessarily realistic, since an example might belong partially to two different classes
  • Examples include deciding whether an email is spam, recognizing handwritten characters, and classifying user behavior to predict churn

Classification Vs Regression

  • Both tasks use input features X1, ..., Xn and a target feature Y, with a set of training examples in which the feature values are given for each example
  • Given the input feature values of a new example, the task is to predict its target feature: in classification Y is discrete, and in regression Y is continuous

Unsupervised Learning

  • Unsupervised learning is the type of machine learning that happens without human supervision. A machine tries to find any patterns in data by itself.
  • It is applicable to clustering and association problems
  • Results may be less accurate than with supervised learning

Types of Unsupervised Learning

  • The two main types are clustering and association

Supervised vs Unsupervised Learning

  • With unsupervised learning, the machine learns without human supervision
  • Supervised learning happens under human supervision: the input data is labeled with answer keys that show the machine the desired outputs

Reinforcement Learning

  • Reinforcement learning is between supervised and unsupervised learning
  • The algorithm is told when the answer is wrong, but not how to correct it
  • The algorithm must work out how to get the answer right by trial and error
  • Reinforcement learning is sometimes called learning with a critic, because a monitor scores the answer but does not suggest improvements

Evolutionary Learning

  • Biological evolution can be seen as a learning process: organisms adapt to improve their survival rates
  • Evolutionary learning draws inspiration from biological evolution, particularly natural selection
  • Evolutionary learning also optimizes solutions to complex problems using algorithms that mimic the process of natural selection

Steps in Evolutionary Learning

  • Step 1 (initialization): generate candidate solutions randomly or heuristically
  • Step 2 (evaluation): evaluate each individual in the population on how well it solves the problem, using a fitness function
  • Step 3 (selection): select individuals with higher fitness values for the next stage of evolution
  • Step 4 (offspring generation): use the selected individuals to generate offspring through processes like mutation, crossover, or recombination
  • Step 5 (replacement): offspring replace some individuals in the existing population, according to criteria and strategies such as elitism
  • Step 6: repeat steps 2-5 for multiple generations until a maximum number of generations is reached
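
The steps above can be sketched in Python roughly as follows. Everything specific here is an assumption made for illustration: the toy problem (maximize the number of 1-bits in a bit string), the population size, one-point crossover, bit-flip mutation, and keeping a single elite individual.

```python
import random

def fitness(individual):
    """Fitness function for the toy problem: the number of 1-bits."""
    return sum(individual)

def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Initialization: random candidate solutions
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # repeat for a fixed number of generations
        # Evaluation: rank individuals by fitness, best first
        population.sort(key=fitness, reverse=True)
        # Selection: the fitter half become parents
        parents = population[: pop_size // 2]
        # Offspring generation: one-point crossover, then bit-flip mutation
        offspring = []
        while len(offspring) < pop_size - 1:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)
            child = mum[:cut] + dad[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        # Replacement with elitism: the best individual always survives
        population = [population[0]] + offspring
    return max(population, key=fitness)

best = evolve()
```

Because the elite individual is always carried over, the best fitness in the population never decreases from one generation to the next.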

The Machine Learning Workflow

  • The workflow includes data collection, data preparation, choosing a learning algorithm, and training
  • The trained model is then evaluated and applied to make predictions

The Machine Learning Process

  • Collect and prepare the data
  • Feature selection: identify the features that are most useful for examining the problem
  • Algorithm choice: choose an appropriate algorithm, using knowledge of what it can be used for
  • Parameter and model selection: identify appropriate parameter values, either set manually or found by experimentation
  • Training: given the dataset, algorithm and parameters, use computational resources to build a model of the data that can predict the outputs on new data
  • Evaluation: before deployment, test the model for accuracy on data it was not trained on

Terminology used in ML

  • Inputs are the data fed into the algorithm
  • A machine learning algorithm takes an input vector and produces an output as its answer for that input
  • An input vector consists of real numbers
  • The number of elements in the input vector is the dimensionality of the input

Notation

  • The input vector x has elements (x1, x2, ..., xm), written xi, where i runs from 1 to the number of input dimensions, m
  • wij is the weighted connection between nodes i and j
  • The weights are analogous to synapses in the brain, and are arranged into a matrix W
  • The output vector is y, with elements yj, where j runs from 1 to the number of output dimensions, n
  • Writing y(x, W) is a reminder that the output depends on the inputs and on the current set of weights of the network
  • The target vector is t, with elements tj, where j runs from 1 to the number of output dimensions, n
  • The targets t are the extra data needed for supervised learning: they provide the 'correct' answers that the algorithm is learning about
  • The activation function describes the firing of the neuron in response to the weighted inputs
  • The error (E) computes the inaccuracies of the network as a function of the outputs (y) and targets (t)
  • Weight space is a way of thinking about the weights that connect into a particular neuron: plot the strengths of the weights using one axis for each weight
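
To make the notation concrete, here is a small sketch of one layer of neurons. The threshold activation and the sum-of-squares error are assumptions chosen for illustration (the notes do not fix either choice), and all numbers are made up.

```python
def activation(h, threshold=0.0):
    """Assumed threshold activation: the neuron fires (1) if h exceeds the threshold."""
    return 1 if h > threshold else 0

def forward(x, W):
    """Compute the outputs y(x, W); W[i][j] is the weight from input i to neuron j."""
    n_outputs = len(W[0])
    return [activation(sum(x[i] * W[i][j] for i in range(len(x))))
            for j in range(n_outputs)]

def error(y, t):
    """Assumed sum-of-squares error between outputs y and targets t."""
    return sum((yj - tj) ** 2 for yj, tj in zip(y, t))

x = [1.0, 0.5]                   # input vector, dimensionality m = 2
W = [[0.4, -0.6], [0.2, 0.3]]    # weight matrix, m = 2 inputs by n = 2 neurons
t = [1, 1]                       # target vector

y = forward(x, W)   # neuron 0 sums to 0.5 and fires; neuron 1 sums to -0.45 and does not
E = error(y, t)
```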

The Curse of Dimensionality

  • The essence of the curse is the realization that as the number of dimensions increases, the volume of the unit hypersphere does not increase with it
  • The volume of the unit hypersphere depends on the number of dimensions, and tends to zero as that number grows
  • The unit hypersphere is the region we get if we start at the origin and draw all the points that are 1 unit away
  • In 2 dimensions we get a circle, in 3 dimensions it's a sphere, and in higher dimensions it's a hypersphere
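
A quick numerical check of this claim, using the standard closed form V_n = π^(n/2) / Γ(n/2 + 1) for the volume enclosed by the unit hypersphere in n dimensions. The formula is not given in the notes, and the function name is illustrative.

```python
import math

def unit_hypersphere_volume(n_dims):
    """Volume enclosed by the unit hypersphere (radius 1) in n_dims dimensions."""
    return math.pi ** (n_dims / 2) / math.gamma(n_dims / 2 + 1)

# Print the volume for a few dimensionalities
for d in (1, 2, 3, 5, 10, 20):
    print(d, unit_hypersphere_volume(d))
```

The volume grows up to about five dimensions and then shrinks toward zero, which is why data points become sparse in high-dimensional spaces.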

Overfitting

  • Training for too long can cause the model to overfit the data, learning its noise and inaccuracies
  • The model becomes too complicated and cannot generalize
  • To stop the learning process before the algorithm overfits, check how well it is generalizing at each time step
  • A validation set of data can be used for this check

Training, Testing and Validation Sets

  • Three datasets are needed: the training set to train the algorithm, the validation set to keep track of how well it is doing as it learns, and the test set to produce the final results
  • A typical proportion of training to testing to validation data is 50:25:25
  • How the data is split can matter: some datasets are presented with all the datapoints in class 1 first, then class 2, and so on
  • If the first few points are taken as the training set, results will be bad, since training did not see all the classes
  • To avoid this, reorder the data randomly first, or assign each datapoint randomly to one of the sets
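
A minimal sketch of shuffling before splitting, assuming the 50:25:25 proportion mentioned above. The function name, the fixed seed, and the toy dataset are illustrative.

```python
import random

def split_dataset(data, seed=0):
    """Shuffle, then split 50:25:25 into training, testing and validation sets."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # random reordering removes class ordering
    n = len(shuffled)
    n_train = n // 2
    n_test = n // 4
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    valid = shuffled[n_train + n_test:]
    return train, test, valid

# Hypothetical dataset presented in class order: (feature, class label)
data = [(i, 0) for i in range(10)] + [(i, 1) for i in range(10)]
train, test, valid = split_dataset(data)
```

Without the shuffle, the first half of this dataset would contain only class 0, so the training set would never see class 1.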

Multi-fold Cross-Validation

  • The dataset is randomly partitioned into K subsets; one subset is used as the validation set and the algorithm is trained on the others
  • A different subset is then left out and a new model is trained, repeating the process for all K subsets
  • The model with the lowest validation error is tested and used
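
The partitioning itself can be sketched as below. This only generates the index sets for each fold; training and scoring a model on them is left out, and the function name and seed are assumptions for illustration.

```python
import random

def k_fold_indices(n_items, k, seed=0):
    """Yield (training_indices, validation_indices) once per fold."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)         # random partition
    folds = [indices[i::k] for i in range(k)]    # k disjoint subsets
    for i in range(k):
        validation = folds[i]                    # the subset that is left out
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation

splits = list(k_fold_indices(n_items=10, k=5))
```

Each datapoint appears in exactly one validation set and in the training set of every other fold.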

The Confusion Matrix

  • A confusion matrix is a way to work out whether the results of a classification algorithm were good
  • It is a square matrix used for classification problems
  • It lists all the possible classes in both the horizontal and vertical directions: along the top of the table as the outputs and down the left-hand side as the targets
  • The matrix element at (i, j) tells us how many input patterns were put into class i by the targets, but class j by the algorithm
  • This allows the performance of algorithms to be measured and compared
  • The elements on the leading diagonal are the data that were classified correctly
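
A small sketch of building such a matrix, with targets indexing the rows and algorithm outputs indexing the columns. The class labels and example data are made up for illustration.

```python
def confusion_matrix(targets, outputs, n_classes):
    """matrix[i][j] counts patterns with target class i that the algorithm put in class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, o in zip(targets, outputs):
        matrix[t][o] += 1
    return matrix

targets = [0, 0, 1, 1, 2, 2, 2]   # correct classes
outputs = [0, 1, 1, 1, 2, 0, 2]   # classes chosen by the algorithm
cm = confusion_matrix(targets, outputs, n_classes=3)

# The leading diagonal holds the correctly classified patterns
n_correct = sum(cm[i][i] for i in range(3))
```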

Accuracy Metrics - Confusion Matrix

  • A true positive is an outcome where the model correctly predicts the positive class
  • A true negative is an outcome where the model correctly predicts the negative class
  • A false positive is an outcome where the model incorrectly predicts the positive class
  • A false negative is an outcome where the model incorrectly predicts the negative class

Accuracy Metrics - Example

  • In the context of a wolf example, a "wolf" is a positive class, and "no wolf" is a negative class

Accuracy Definition

  • Accuracy is sum of true positives and negatives divided by total examples:
  • Accuracy = (#TP + #TN) / (#TP + #FP + #TN + #FN)
  • Sensitivity = #TP / (#TP + #FN)
  • Specificity = #TN / (#TN + #FP)
  • Precision = #TP / (#TP + #FP)
  • Recall = #TP / (#TP + #FN)
  • F1 = #TP / (#TP + (#FN + #FP)/2)
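
The formulas above translate directly into code. This is a sketch with made-up counts; the function name is illustrative, and it assumes none of the denominators is zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the measures listed above from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # identical to recall
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "recall": sensitivity,
        "f1": tp / (tp + (fn + fp) / 2),
    }

# Hypothetical counts
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Note that sensitivity and recall are the same quantity under two names, and that F1 computed this way equals 2 x precision x recall / (precision + recall).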

Receiver Operator Characteristic (ROC) curve

  • The ROC curve plots the percentage of true positives on the y-axis against the percentage of false positives on the x-axis
  • A single run of a classifier produces a single point on the plot; the point (0,1), with 100% true positives and 0% false positives, is a perfect classifier
  • Any classifier lying on the diagonal from (0,0) to (1,1) performs at chance level (assuming the positive and negative classes are equally common), so the learning effort was wasted
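
As a sketch, the single point that one classifier run contributes to the plot can be computed from its counts. Here the two percentages are taken as rates, that is, true positives as a fraction of all actual positives and false positives as a fraction of all actual negatives; this reading and the function name are assumptions.

```python
def roc_point(tp, fp, tn, fn):
    """Return (x, y) = (false positive rate, true positive rate) for one classifier run."""
    false_positive_rate = fp / (fp + tn)
    true_positive_rate = tp / (tp + fn)
    return false_positive_rate, true_positive_rate

perfect = roc_point(tp=10, fp=0, tn=10, fn=0)   # top-left corner of the plot
chance = roc_point(tp=5, fp=5, tn=5, fn=5)      # on the diagonal
```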

Unbalanced Datasets

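  • The measures above implicitly assume a balanced dataset, with similar numbers of positive and negative examples
  • If the dataset is unbalanced, the balanced accuracy can be computed as (sensitivity + specificity) / 2
  • A more robust measure for unbalanced datasets is the Matthews Correlation Coefficient (MCC):
  • MCC = (#TP x #TN - #FP x #FN) / sqrt((#TP + #FP)(#TP + #FN)(#TN + #FP)(#TN + #FN))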
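
A sketch of both measures for unbalanced data. Balanced accuracy is taken here as the average of sensitivity and specificity, which is the usual definition; the counts are made up, and the code assumes no factor in a denominator is zero.

```python
import math

def balanced_accuracy(tp, fp, tn, fn):
    """Average of sensitivity and specificity."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient, following the formula above."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator

# Hypothetical unbalanced run: 90 actual positives, 10 actual negatives
plain_accuracy = (90 + 5) / 100
balanced = balanced_accuracy(tp=90, fp=5, tn=5, fn=0)
```

In this example plain accuracy looks high even though half of the negatives are misclassified, while balanced accuracy exposes the problem.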

Measurement Precision

  • The accuracy of a learning system can also be evaluated using the word "precision" as it is used for measurement systems
  • A measurement is made by feeding in inputs and looking at the outputs
  • Besides comparing the outputs to target values, we can feed in a set of similar inputs and expect similar outputs
  • The variability of the algorithm's outputs is its precision, which tells how repeatable its predictions are

Averages

  • The mean is the average of a dataset: add up the points and divide by the number of points
  • The median is the middle value after sorting the dataset
  • The mode is the most common value, found by counting how many times each element appears
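
All three averages are available in Python's standard `statistics` module. The data values below are made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7]

mean = statistics.mean(data)       # (2 + 3 + 3 + 5 + 7) / 5
median = statistics.median(data)   # middle value of the sorted data
mode = statistics.mode(data)       # most frequent value
```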

Variance and Standard Deviation

  • The variance of a set of numbers measures how spread out the values are
  • It is computed from the squared distances between each element and the expected value of the set (the mean, μ), averaged over the elements
  • The square root of the variance is the standard deviation, written σ
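
A short sketch of the computation. It assumes the population form, which divides by the number of elements; the sample form divides by one less. The data values are made up.

```python
import math

def variance(values):
    """Mean of the squared distances between each value and the mean."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]    # mean is 5
var = variance(data)               # squared distances sum to 32, divided by 8
std = math.sqrt(var)               # standard deviation
```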

Covariance

  • Covariance is used to look at how two variables vary together
  • Covariance is computed as cov({xi}, {yi}) = E[(xi - μ)(yi - ν)], where μ and ν are the means of {xi} and {yi}
  • If two variables are independent, the covariance is 0 and the variables are said to be uncorrelated
  • If both variables increase and decrease at the same time, the covariance is positive
  • If variables display opposite patterns, covariance is negative

Calculate the average value for each variable

  • μ = (1,245 + 1,415 + 1,312 + 1,427 + 1,510 + 1,590) / 6 => μ = 1,416.5
  • ν = (100 + 123 + 129 + 143 + 150 + 197) / 6 => ν ≈ 140.3

Steps in finding differences

  • Find the difference between each value and the mean for both variables
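
Carrying the worked example through in code: the means and the differences follow the steps above, and the remaining steps (multiply paired differences, sum, and divide) are filled in as a sketch. The notes do not say whether to divide by n or by n - 1, so both are shown as an assumption.

```python
# Values of the two variables from the worked example above
xs = [1245, 1415, 1312, 1427, 1510, 1590]
ys = [100, 123, 129, 143, 150, 197]

n = len(xs)
mean_x = sum(xs) / n   # 1416.5
mean_y = sum(ys) / n   # about 140.3

# Difference between each value and its mean, for both variables
dx = [x - mean_x for x in xs]
dy = [y - mean_y for y in ys]

# Sum of the products of paired differences
sum_products = sum(a * b for a, b in zip(dx, dy))

# Assumption: divide by n for the population covariance,
# or by n - 1 for the sample covariance
cov_population = sum_products / n
cov_sample = sum_products / (n - 1)
```

The result is positive either way, consistent with the two variables tending to increase together.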

Covariance between all pairs of variables in a Dataset.

  • Covariance can be used to look at the correlation between all pairs of variables within a set of data
  • To do this, compute the covariance of each pair and put the results together in the covariance matrix
  • In the matrix, xi is a column vector describing the elements of the ith variable and μi is its mean
  • The matrix is square and symmetric: the elements on the leading diagonal are the variances, and the off-diagonal elements are the covariances

Mahalanobis Distance

  • The distance of a point from the center (mean) of the data can indicate whether the point belongs to the dataset
  • This distance should be assessed relative to the spread of the datapoints
  • Mahalanobis constructed a distance measure that takes this spread into account, published in 1936 and written as:
  • DM(x) = sqrt((x - μ)^T Σ^-1 (x - μ)), where μ is the mean vector and Σ^-1 is the inverse of the covariance matrix
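
A sketch of the distance for two-dimensional data, inverting the 2x2 covariance matrix by hand so that no external library is needed. The function name and the example covariance values are assumptions for illustration.

```python
import math

def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance of point x from mean mu, given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of the 2x2 covariance matrix
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Inverse covariance times (x - mu)
    tmp = [inv[0][0] * dx[0] + inv[0][1] * dx[1],
           inv[1][0] * dx[0] + inv[1][1] * dx[1]]
    # (x - mu)^T times the vector above, then the square root
    return math.sqrt(dx[0] * tmp[0] + dx[1] * tmp[1])

mu = (0.0, 0.0)
cov = [[4.0, 0.0], [0.0, 1.0]]   # wide spread along axis 0, tight along axis 1

along_wide_axis = mahalanobis_2d((2.0, 0.0), mu, cov)
along_tight_axis = mahalanobis_2d((0.0, 2.0), mu, cov)
```

Both points are 2 units from the mean in ordinary Euclidean terms, but the one lying along the tightly controlled axis is twice as far in Mahalanobis terms. With the identity matrix as covariance, the measure reduces to Euclidean distance.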

The Gaussian

  • The Gaussian distribution is key because of the Central Limit Theorem, which says that lots of small random numbers added together will produce something Gaussian

The Bias-Variance Tradeoff

  • A model can be bad for 2 reasons: it is not accurate and does not match the data well, or it is not precise and there is a lot of variation in its results
  • The first of these is the bias and the second is the variance
  • More complex classifiers tend to improve the bias, at the cost of higher variance
  • Making a model more specific reduces the variance but increases the bias
