Questions and Answers
In the context of machine learning, what primarily drives a computer program to be considered as 'learning'?
- Its capacity to store and retrieve large amounts of data.
- Its ability to perform pre-programmed tasks without errors.
- Its improvement in performance at specific tasks (T) as it gains more experience (E), measured by a performance metric (P). (correct)
- Its speed in executing algorithms, regardless of outcome accuracy.
What is the main goal of machine learning?
- To create algorithms that solve problems using fixed rules.
- To build robots that can perform physical tasks.
- To manually code every possible solution to a problem.
- To develop systems that can automatically improve their performance based on experience. (correct)
How does machine learning enhance the precision of computer actions, such as predictions or robot control?
- By increasing the computational speed of processing data.
- By replacing algorithms with human intuition.
- By continuously adjusting actions based on feedback to more accurately reflect correct performance. (correct)
- By eliminating the need for data analysis.
In the context of machine learning, what does 'generalizing' primarily involve?
Which of the following is a key characteristic of supervised learning?
If an online store uses machine learning to predict what a customer will buy next, what type of learning is it?
What does designing a machine learning learner involve?
Which of these is an example of a classification problem in supervised learning?
When is regression typically used in supervised learning?
In which scenario would unsupervised learning be most appropriate?
Which of the following distinguishes unsupervised learning from supervised learning?
What is the primary goal of clustering in unsupervised learning?
What does association rule learning aim to identify?
What type of machine learning is reinforcement learning?
Which of the following describes reinforcement learning?
In the context of evolutionary learning, what is the role of the 'fitness function'?
What is the purpose of the 'offspring generation' step in evolutionary learning?
What is the main purpose of 'feature selection' in the machine learning process?
What is the purpose of evaluating a machine learning model on data it was not trained on?
In machine learning terminology, what is 'dimensionality'?
Within the context of machine learning notation, if 'x' is an input vector, what does 'xi' typically represent?
In neural networks, what do the weights wij signify?
What information is provided by 'target vectors' in supervised learning?
What does the 'activation function' in a neural network determine?
In machine learning, what does ‘weight space’ refer to?
What is the 'curse of dimensionality' in machine learning?
How does 'overfitting' affect model generalization?
What is the purpose of using a 'validation set' during training?
What is the main purpose of randomly reordering the data?
How does multi-fold cross-validation help to validate data?
In a confusion matrix, what does the element at position (i, j) represent?
In the context of the confusion matrix, what does a 'true positive' indicate?
Which measure does Matthew's Correlation Coefficient compute?
How is 'precision' defined in the context of machine learning algorithm variability?
What is a 'mean'?
How is 'median' used?
What function does the mode serve?
What is variance?
If two variables are independent, what is the result of the covariance?
Under which circumstance is co-variance positive?
When does Mahalanobis Distance take place?
Flashcards
What is Machine Learning?
Machine learning modifies or adapts computer actions to become more accurate based on data.
Machine Learning Definition
A computer program learns from experience E, improving its performance P at tasks T.
Machine Learning Data
The experiences used to improve performance.
Types of Machine Learning
Supervised Learning
Regression
Classification
Unsupervised Learning
Clustering
Association
Reinforcement Learning
Evolutionary Learning
Machine Learning Workflow
Machine Learning Inputs
Dimensionality
Weights (wij)
Targets (t)
Activation Function
Error (E)
Weight Space
Curse of Dimensionality
Overfitting
Training, Testing, Validation Sets
Multi-fold Cross-Validation
Confusion Matrix
True positive
False positive
True negative
False negative
Accuracy
Sensitivity
Specificity
Precision
Recall
Plot of true positives vs. false positives.
Unbalanced Datasets
Measurement Precision
Mean
Median
Mode
Variance
Standard Deviation
Covariance
Bias
Variance
Study Notes
Introduction to Machine Learning
- Machine learning can be applied to disease prediction systems, price estimation, recommendation systems, and time series analysis
- Suppose you run an online software store and collect data for each purchase such as computer type, web browser, country and time
- A core problem involves predicting what the next person will buy based on collected data
- The prediction problem can be solved because the data has structure: similar people often act similarly, so purchases by similar customers suggest what the next person will buy
- Supervised learning is a type of machine learning
- Learning improves performance on a task through remembering, adapting and generalizing
- Generalizing involves transferring knowledge from one context to use in different situations
- Machine learning enables computers to modify or adapt their actions, such as making predictions or controlling a robot, so that these actions become more accurate
- Accuracy is measured by how well the chosen actions reflect the correct ones
- Imagine playing Scrabble against a computer: initially you might win every game, until the computer adapts and learns to beat you
- Having learned to beat one player, the computer can use the same strategies against other players
Definition of Machine Learning
- A computer learns from experience E relative to tasks T and performance measure P, if its performance at tasks in T improves with experience E, as measured by P
- Task: the behavior or task to be improved, such as classification or acting in an environment
- Data: the experiences used to improve performance in the task
- Measures of improvement: for example, increased prediction accuracy, or improved speed and efficiency
How Computer Science, Data Science, and ML relate
- Computer Science comprises machine learning
- Machine learning sits at the intersection of computer science, math, and statistics
- Data analysis, traditional software, data science, and business domain expertise combine with computer science, maths, and statistics
Applications of Machine Learning
- Machine learning is used in: medicine for disease diagnosis, computer vision, robotic control, natural language processing, speech recognition, machine translation, finance, and fraud detection
Algorithmic vs Machine Learning Solutions
- An algorithmic solution takes data as input to a hand-written program and produces output
- A machine learning solution takes data and the desired output and produces the program
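The contrast can be sketched with a toy example. The temperature data, the hand-chosen 30-degree rule, and the function names below are hypothetical, chosen only for illustration: the first function is a rule written by a programmer, while the second derives its rule from labelled examples.

```python
def rule_based_is_hot(temp_c):
    """Algorithmic solution: the programmer writes the rule by hand."""
    return temp_c >= 30.0


def learn_threshold(temps, labels):
    """Machine learning solution: derive the rule from data plus desired outputs.

    Takes the midpoint between the warmest 'not hot' example and the
    coolest 'hot' example. Assumes the two classes are separable.
    """
    hottest_negative = max(t for t, hot in zip(temps, labels) if not hot)
    coolest_positive = min(t for t, hot in zip(temps, labels) if hot)
    return (hottest_negative + coolest_positive) / 2.0


# Hypothetical labelled examples: temperature and whether it was judged "hot".
temps = [12.0, 18.0, 24.0, 29.0, 33.0, 36.0]
labels = [False, False, False, True, True, True]

threshold = learn_threshold(temps, labels)


def learned_is_hot(temp_c):
    """The 'program' produced by learning: a threshold taken from the data."""
    return temp_c >= threshold
```

The two functions disagree on a 27-degree day, because the learned threshold follows the examples rather than the programmer's guess.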
Learner Design
- To design a learner: choose the training experience, the target function, a representation for the target function, and a learning algorithm to infer it from the training examples
Types of Machine Learning
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- Evolutionary Learning
Supervised Learning
- In supervised learning, a training set of examples with the correct responses (targets) is provided, and the algorithm generalizes from it to respond correctly to all possible inputs
- Supervised learning is also called learning from exemplars
- The training data is written as a set of pairs (xi, ti), where xi are the inputs, ti are the targets, and the index i runs from 1 to some upper limit N
Supervised learning - Regression
- Regression assumes the values come from some underlying function and tries to find that function, so that it can give the output value y for any value of x
- In statistics, regression means fitting a curve so that it passes as close as possible to the data points
- Regression is therefore a problem of function approximation or interpolation: working out the value between known values
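As a minimal sketch of regression as function approximation, the snippet below fits a straight line by ordinary least squares and uses it to estimate a value between known points. The choice of a straight line and the data values are assumptions made for illustration; the notes do not prescribe a particular curve.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noisy samples from an unknown function.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

slope, intercept = fit_line(xs, ys)


def predict(x):
    """Output value of the fitted function for any x, including unseen ones."""
    return slope * x + intercept


# Interpolate between the known points x = 2 and x = 3.
estimate = predict(2.5)
```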
Supervised learning - Classification
- A classification problem takes input vectors and decides which class each belongs to, based on training from examples of each class
- Classification assumes that each example belongs to exactly one class and that the set of classes covers the whole output space
- These two constraints aren't necessarily realistic, since an example might belong partially to two different classes
- Examples include classifying whether an email is spam, recognizing handwritten characters, and classifying user behavior to predict churn
Classification Vs Regression
- Both use input features X1, ..., Xn and a target feature Y, with a set of training examples in which the values of the input features and the target are given
- For a new example, where only the input feature values are given, the goal is to predict the target feature: in classification Y is discrete, and in regression Y is continuous
Unsupervised Learning
- Unsupervised learning is the type of machine learning that happens without human supervision. A machine tries to find any patterns in data by itself.
- Applicable to clustering and association problems
- Results may be less accurate than those of supervised learning
Types of Unsupervised Learning
- The two main types are clustering and association
Supervised vs Unsupervised Learning
- Unsupervised learning: the machine learns without human supervision, from unlabeled data
- Supervised learning: the machine learns under human supervision; the input data is labeled with answer keys that show the machine the desired outputs
Reinforcement Learning
- Reinforcement learning sits between supervised and unsupervised learning
- The algorithm is told when its answer is wrong, but not how to correct it
- The algorithm must work out how to get the answer right by trial and error
- Reinforcement learning is sometimes called learning with a critic: a monitor scores the answer but does not suggest improvements
Evolutionary Learning
- Biological evolution can be seen as a learning process: organisms adapt to improve their survival rates
- Evolutionary learning draws inspiration from biological evolution, particularly natural selection
- It optimizes solutions to complex problems using algorithms that mimic the process of natural selection
Steps in Evolutionary Learning
- Step 1, initialization: generate a population of candidate solutions randomly or heuristically
- Step 2, evaluation: score each individual in the population on how well it solves the problem, using a fitness function
- Step 3, selection: select individuals with higher fitness values for the next stage of evolution
- Step 4, offspring generation: use the selected individuals to generate offspring through processes like mutation, crossover, or recombination
- Step 5, replacement: offspring replace some individuals in the existing population, according to criteria and strategies such as elitism
- Repeat steps 2-5 for multiple generations until a stopping criterion, such as a maximum number of generations, is reached
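The steps above can be sketched as a small genetic algorithm. Everything specific here is an assumption made for illustration: the toy problem (maximise the number of 1s in a bit string), keeping the fitter half as parents, one-point crossover, the mutation rate, and single-individual elitism.

```python
import random


def fitness(individual):
    """Fitness function for the toy problem: count the 1 bits."""
    return sum(individual)


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    # Step 1: initialization with random candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Step 2: evaluation with the fitness function.
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        # Step 3: selection, keeping the fitter half as parents.
        parents = ranked[: pop_size // 2]
        # Step 4: offspring generation by crossover and mutation.
        offspring = []
        while len(offspring) < pop_size - 1:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randint(1, n_bits - 1)
            child = mum[:cut] + dad[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        # Step 5: replacement with elitism (the best individual survives).
        population = [ranked[0]] + offspring
    return max(population, key=fitness), best_per_generation


best, history = evolve()
```

Because the best individual is always carried over, the best fitness recorded in `history` can never decrease from one generation to the next.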
The Machine Learning Workflow
- The workflow includes data collection, data preparation, choosing a learning algorithm, training, and evaluation
- Evaluating the trained model shows how good its predictions are
The Machine Learning Process
- Collect and prepare the data
- Feature selection: identify the features that are most useful for the problem under examination
- Choose an appropriate algorithm, using knowledge of what each algorithm is suited to
- Parameter and model selection: set any parameters to appropriate values, either manually or by experimentation
- Training: use the dataset, algorithm and parameters, along with computational resources, to build a model of the data that can predict the outputs for new data
- Evaluation: before deployment, test the model for accuracy on data it was not trained on
Terminology used in ML
- Inputs are the data fed into the algorithm, given as an input vector
- A machine learning algorithm takes an input vector and produces an output as its answer for that input
- An input vector consists of real numbers
- The number of elements in the vector is the dimensionality of the input
Notation
- The input vector x has elements (x1, x2, ..., xm), written xi, where i runs from 1 to the number of input dimensions, m
- wij is the weighted connection between nodes i and j
- The weights are analogous to synapses in the brain and are arranged into a matrix W
- The output vector is y, with elements yj, where j runs from 1 to the number of output dimensions, n
- The output is written y(x, W) as a reminder that it depends on the inputs and on the current set of weights of the network
- The target vector is t, with elements tj, where j runs from 1 to the number of output dimensions, n
- The targets t are the extra data that supervised learning needs: they provide the 'correct' answers that the algorithm is learning about
- The activation function describes the firing of the neuron as a response to the weighted inputs
- The error E computes the inaccuracies of the network as a function of the outputs y and targets t
- Weight space: the weights that connect into a particular neuron can be visualized by plotting their strengths, using one axis for each weight
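The notation can be tied together in a short sketch. The threshold activation function and the sum-of-squares error used here are assumptions picked for illustration; the notes only say that the activation responds to the weighted inputs and that E is a function of y and t.

```python
def neuron_output(x, w, threshold=0.0):
    """Activation function: fire (1) if the weighted sum exceeds the threshold."""
    h = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return 1 if h > threshold else 0


def layer_output(x, W, threshold=0.0):
    """y(x, W): W[i][j] is the weight from input node i to output neuron j."""
    n_outputs = len(W[0])
    return [
        neuron_output(x, [W[i][j] for i in range(len(x))], threshold)
        for j in range(n_outputs)
    ]


def sum_of_squares_error(y, t):
    """Error E as a function of the outputs y and the targets t."""
    return 0.5 * sum((y_j - t_j) ** 2 for y_j, t_j in zip(y, t))


# Hypothetical example with m = 3 inputs and n = 2 outputs.
x = [1.0, 0.0, 1.0]
W = [[0.5, -0.5],
     [0.3, 0.8],
     [-0.2, 0.1]]
t = [1, 1]

y = layer_output(x, W)
E = sum_of_squares_error(y, t)
```

Each column of `W` is one point in the weight space of the corresponding neuron, with one axis per incoming weight.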
The Curse of Dimensionality
- The essence of the curse is that as the number of dimensions increases, the volume of the unit hypersphere does not increase with it; beyond a few dimensions it shrinks towards zero
- The volume of the hypersphere depends on the number of dimensions
- The unit hypersphere is the region we get if we start at the origin and draw all the points that are 1 unit away
- In 2 dimensions we get a circle, in 3 dimensions it is a sphere, and in higher dimensions it is a hypersphere
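To see the effect numerically, the sketch below evaluates the volume of the unit hypersphere for a range of dimensions. The closed form V_n = π^(n/2) / Γ(n/2 + 1) is a standard result that is not stated in the notes; it is used here as an assumption.

```python
import math


def unit_hypersphere_volume(n):
    """Volume of the n-dimensional unit hypersphere: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


# Volumes for 1 to 20 dimensions.
volumes = {n: unit_hypersphere_volume(n) for n in range(1, 21)}

for n in (2, 3, 5, 10, 20):
    print(n, round(volumes[n], 4))
```

For n = 2 this gives the area of the unit circle and for n = 3 the volume of the unit sphere; the values then peak and fall away as n grows.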
Overfitting
- Training for too long can overfit the data: the model learns the noise and inaccuracies in the data as well as the underlying function
- The model becomes too complicated and cannot generalize
- To prevent this, check how well the model generalizes at each time step and stop the learning process before the algorithm overfits
- A validation set of data, separate from the training data, is used for this check
Training, Testing and Validation Sets
- Three datasets are needed: the training set to train the algorithm, the validation set to keep track of how well it is doing as it learns, and the test set to produce the final results
- A typical proportion of training to testing to validation data is 50:25:25
- How the data is split can matter: some datasets are presented with all the datapoints in class 1 first, then class 2, and so on
- If the first few points are taken as the training set, the results will be bad, since training did not see all the classes
- To avoid this, reorder the data randomly first, or assign each datapoint randomly to one of the sets
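A minimal sketch of the shuffle-then-split idea follows. The function name, the fixed seed, and the toy dataset (sorted by class, which is exactly the problematic ordering described above) are assumptions for illustration.

```python
import random


def split_dataset(data, train_frac=0.5, valid_frac=0.25, seed=0):
    """Randomly reorder the data, then cut it into train/validation/test sets."""
    rng = random.Random(seed)  # seeded so the split is repeatable
    shuffled = list(data)      # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Hypothetical dataset of (value, class) pairs, ordered by class.
data = [(i, 0) for i in range(50)] + [(i, 1) for i in range(50, 100)]

train, valid, test = split_dataset(data)
```

Without the shuffle, the first 50 points would all come from class 0 and the training set would never see class 1.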
Multi-fold Cross-Validation
- The dataset is randomly partitioned into K subsets; one subset is used as the validation set and the algorithm is trained on all the others
- A different subset is then left out and a new model is trained, and the process is repeated until every subset has been left out once
- The model with the lowest validation error is then tested and used
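The partitioning step can be sketched as below. Only the index bookkeeping is shown; training and scoring a model on each split is left out, and the function name and seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (training_indices, validation_indices) pairs."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)                        # random partition
    folds = [indices[i::k] for i in range(k)]   # k roughly equal subsets
    splits = []
    for i in range(k):
        validation = folds[i]                   # the subset left out this round
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        splits.append((training, validation))
    return splits


splits = k_fold_indices(n_samples=10, k=5)
```

Each datapoint appears in exactly one validation subset and in the training data of every other round.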
The Confusion Matrix
- The confusion matrix is a way to work out whether the results of a classification algorithm were good
- It applies to classification problems and is a square matrix
- It lists all the possible classes in both directions: along the top of the table as the outputs, and down the left-hand side as the targets
- The matrix element at (i, j) tells us how many input patterns were put into class i by the targets, but class j by the algorithm
- This lets the performance of algorithms be measured and compared
- The elements on the leading diagonal count the data classified correctly
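A small sketch of building the matrix, with rows indexed by target class and columns by the class the algorithm output. The class labels and example data are hypothetical.

```python
def confusion_matrix(targets, outputs, n_classes):
    """matrix[i][j] counts patterns with target class i that were output as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, o in zip(targets, outputs):
        matrix[t][o] += 1
    return matrix


# Hypothetical targets and algorithm outputs for a 3-class problem.
targets = [0, 0, 1, 1, 2, 2, 2]
outputs = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(targets, outputs, n_classes=3)

# The leading diagonal holds the correctly classified patterns.
n_correct = sum(cm[i][i] for i in range(3))
fraction_correct = n_correct / len(targets)
```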
Accuracy Metrics - Confusion Matrix
- A true positive is an example correctly predicted as the positive class
- A true negative is an example correctly predicted as the negative class
- A false positive is an example incorrectly predicted as the positive class
- A false negative is an example incorrectly predicted as the negative class
Accuracy Metrics - Example
- In the context of a wolf example, a "wolf" is a positive class, and "no wolf" is a negative class
Accuracy Definition
- Accuracy is the sum of the true positives and true negatives divided by the total number of examples:
- Accuracy = (#TP + #TN) / (#TP + #FP + #TN + #FN)
- Sensitivity = #TP / (#TP + #FN)
- Specificity = #TN / (#TN + #FP)
- Precision = #TP / (#TP + #FP)
- Recall = #TP / (#TP + #FN)
- F1 = #TP / (#TP + (#FN + #FP)/2)
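The formulas above translate directly into code. The counts used in the example are hypothetical, and the function assumes none of the denominators is zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the metrics listed above from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "recall": sensitivity,  # recall is the same quantity as sensitivity
        "f1": tp / (tp + (fn + fp) / 2),
    }


# Hypothetical counts from a binary classifier.
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Written this way, F1 is equal to the harmonic mean of precision and recall, 2PR / (P + R).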
Receiver Operator Characteristic (ROC) curve
- The ROC curve plots the percentage of true positives on the y-axis against the percentage of false positives on the x-axis
- A single run of a classifier produces a single point on the plot; the point (0, 1), with 100% true positives and 0% false positives, is a perfect classifier
- Any classifier lying on the diagonal from (0, 0) to (1, 1) performs at chance level (assuming the positive and negative classes are equally common), so the learning effort was wasted
Unbalanced Datasets
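- The accuracy measures above implicitly assume a balanced dataset, with similar numbers of positive and negative examples
- For unbalanced datasets, balanced accuracy can be computed as (Sensitivity + Specificity) / 2
- A more robust measure is Matthew's Correlation Coefficient (MCC):
- MCC = (#TP x #TN - #FP x #FN) / sqrt((#TP + #FP)(#TP + #FN)(#TN + #FP)(#TN + #FN))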
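A sketch of both measures for unbalanced data follows. Balanced accuracy is taken here as the mean of sensitivity and specificity, which is the usual definition; returning 0 for MCC when the denominator is zero is a common convention and is an assumption, as are the example counts.

```python
import math


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient, following the MCC formula above."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention assumed here for the undefined case
    return (tp * tn - fp * fn) / denominator


# Hypothetical unbalanced dataset: 95 negatives, 5 positives, and a
# classifier that always predicts "negative".
tp, fp, tn, fn = 0, 0, 95, 5

plain_accuracy = (tp + tn) / (tp + fp + tn + fn)
bal_acc = balanced_accuracy(tp, fp, tn, fn)
coefficient = mcc(tp, fp, tn, fn)
```

Plain accuracy looks high for this classifier even though it never finds a positive example; the other two measures expose that.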
Measurement Precision
- The word "precision" is also used as it is for measurement systems, giving a different way to evaluate a learning system
- A measurement is made by feeding in inputs and looking at the outputs
- Even before comparing the outputs to target values, something can be measured: if a set of similar inputs is fed in, similar outputs are expected
- This variability of the algorithm is its precision, which tells us how repeatable its predictions are
Averages
- The mean is the average of a dataset: add up all the points and divide by the number of points
- The median is the middle value after sorting the dataset
- The mode is the most common value, found by counting how many times each element appears
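The three averages can be sketched by following the descriptions literally. The data values are hypothetical, and the median shown handles the even-length case by averaging the two middle values, which is the usual convention but is not stated in the notes.

```python
from collections import Counter


def mean(values):
    """Add up the points and divide by the number of points."""
    return sum(values) / len(values)


def median(values):
    """Middle value after sorting (average of the two middle values if even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode(values):
    """Most common value, found by counting occurrences."""
    return Counter(values).most_common(1)[0][0]


values = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

average = mean(values)
middle = median(values)
most_common = mode(values)
```

Python's standard `statistics` module provides `mean`, `median` and `mode` functions that can be used instead of the hand-written versions.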
Variance and Standard Deviation
- The variance of a set of numbers measures how spread out the values are
- It is computed from the sum of squared distances between each element and the expected value of the set (the mean, μ)
- The square root of the variance is the standard deviation, represented by the symbol σ
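A short sketch of both quantities. Dividing the sum of squared distances by the number of points (the population variance) is an assumption made here; the notes only mention the sum of squared distances. The data values are hypothetical.

```python
import math


def variance(values):
    """Mean squared distance between each element and the mean (population form)."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


values = [2, 4, 4, 4, 5, 5, 7, 9]

var = variance(values)
sigma = standard_deviation(values)
```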
Covariance
- Covariance measures how two variables vary together
- It is computed as cov({xi}, {yi}) = E[({xi} – μ)({yi} – ν)], where μ and ν are the means of the two variables
- If two variables are independent, the covariance is 0 and the variables are said to be uncorrelated
- If both variables increase and decrease at the same time, the covariance is positive
- If one variable increases while the other decreases, the covariance is negative
Calculate the average value for each variable
- μ = (1,245 + 1,415 + 1,312 + 1,427 + 1,510 + 1,590) / 6 => μ = 1,416.5
- ν = (100 + 123 + 129 + 143 + 150 + 197) / 6 => ν ≈ 140.3
Steps in finding differences
- Find the difference between each value and the mean for both variables
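The worked example can be carried through to the covariance itself. The notes stop at the differences, so the remaining steps (multiplying the paired differences and averaging) follow the standard definition; the variable names x and y are placeholders, and both the population (divide by N) and sample (divide by N - 1) forms are shown because the notes do not say which is intended.

```python
# The two variables from the worked example above.
x = [1245, 1415, 1312, 1427, 1510, 1590]
y = [100, 123, 129, 143, 150, 197]

# Average value of each variable.
mu = sum(x) / len(x)
nu = sum(y) / len(y)

# Difference between each value and its mean.
dx = [xi - mu for xi in x]
dy = [yi - nu for yi in y]

# Multiply the paired differences, then average them.
products = [a * b for a, b in zip(dx, dy)]
cov_population = sum(products) / len(x)
cov_sample = sum(products) / (len(x) - 1)

print(mu, round(nu, 1), round(cov_population, 1), round(cov_sample, 1))
```

The result is positive, which matches the data: larger values of the first variable tend to go with larger values of the second.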
Covariance between all pairs of variables in a Dataset.
- Covariance can be used to look at the correlation between all pairs of variables within a set of data
- To do this, compute the covariance of each pair and put the results together into the covariance matrix
- Here xi is a column vector describing the elements of the ith variable, and μi is its mean
- The matrix is square and symmetric; the elements on the leading diagonal are the variances, and the off-diagonal elements are the covariances between pairs of variables
Mahalanobis Distance
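- The distance of a point from the center (mean) of the data can indicate whether the point belongs to the dataset
- This distance has to be assessed in relation to the spread of the datapoints
- The Mahalanobis distance, introduced by Mahalanobis in 1936, is a distance measure that takes this spread into account. It is written as:
- DM(x) = sqrt((x – μ)^T Σ^-1 (x – μ)), where μ is the mean of the data and Σ is its covariance matrix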
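A minimal sketch of the distance for two-dimensional data, with the 2x2 covariance matrix inverted by hand so that no external library is needed. The restriction to two dimensions, the function name, and the example values are assumptions for illustration.

```python
import math


def mahalanobis_2d(x, mu, cov):
    """sqrt((x - mu)^T Sigma^-1 (x - mu)) for 2-D points and a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix.
    inv = [[d / det, -b / det],
           [-c / det, a / det]]
    diff = [x[0] - mu[0], x[1] - mu[1]]
    # Sigma^-1 (x - mu)
    tmp = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
           inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # (x - mu)^T tmp
    return math.sqrt(diff[0] * tmp[0] + diff[1] * tmp[1])


mu = [0.0, 0.0]
# Hypothetical data that is more spread out along the first axis.
cov = [[4.0, 0.0],
       [0.0, 1.0]]

along_wide_axis = mahalanobis_2d([2.0, 0.0], mu, cov)
along_narrow_axis = mahalanobis_2d([0.0, 2.0], mu, cov)
```

Both points are 2 units from the mean in Euclidean terms, but the one lying along the direction in which the data is more spread out is closer in Mahalanobis terms. With the identity matrix as covariance, the measure reduces to the Euclidean distance.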
The Gaussian
- The Gaussian distribution is important because of the Central Limit Theorem: lots of small random effects, added together, produce something that is approximately Gaussian
The Bias-Variance Tradeoff
- A model can be bad for two reasons: it is not accurate, or it is not precise, meaning there is a lot of variation in its results
- The first of these is the bias and the second is the variance
- More complex classifiers tend to improve the bias, but at the cost of higher variance
- Making the model more specific reduces the variance but increases the bias