Week 1 - Introduction to Deep Learning PDF
Uploaded by IntegratedBiography
Summary
This document provides an introductory overview of deep learning, touching on related concepts like artificial intelligence and machine learning.
Introduction to Deep Learning
Week 1

Artificial Intelligence, Machine Learning and Deep Learning
DL is a branch of machine learning, which is itself a branch of artificial intelligence.
https://www.youtube.com/watch?v=uMzUB89uSxU

What is AI?
- AI can be broadly defined as technology that can learn and produce intelligent behavior without the need for human supervision.
- AI relies on algorithms to achieve a result that may or may not have anything to do with human goals or methods of achieving those goals.
- AI has advanced extremely quickly in the last few years, demonstrating the potential to revolutionize every aspect of our lives.

Examples of Using AI

More applications of Artificial Intelligence
- Machine translation, such as Google Translate
- Self-driving vehicles
- AI robots, such as Sophia
- Speech recognition applications, like Apple's Siri or OK Google

What is Machine Learning?
https://www.youtube.com/watch?v=ukzFI9rgwfU&list=PLEiEAq2VkUUI73199L-Aym2MnKjBxJ-4X
- Machine learning is an application of AI that gives systems the ability to learn on their own and improve from experience without being explicitly programmed.
- Machine learning is all about using your computer to "learn" how to deal with problems without "programming".

Applications of Machine Learning
- PayPal uses machine learning to detect fraud.
- Amazon uses machine learning to suggest what you might buy next.
- Banks use machine learning to approve loans.

How does Machine Learning work?

Machine Learning Workflow

Main Types of Machine Learning
- Supervised Learning: trains a model on known input and output data (labeled examples) so that it can predict future outputs from what it learned on past data.
  - Regression / Estimation: used for predicting a continuous value, e.g. the price of a house based on its characteristics.
  - Classification: used for predicting the class or category of a case, e.g. whether a cell is benign or malignant.
- Unsupervised Learning: data-driven (for example, identifying clusters). We do not supervise the model; we let it work on its own to discover information that may not be visible to the human eye.
  - Clustering: one of the most popular unsupervised machine learning techniques, used for grouping data points or objects that are somehow similar, e.g. finding similar patients, or customer segmentation in the banking field.

Comparison between Supervised and Unsupervised

List of common ML algorithms
- Linear Regression
- Logistic Regression
- Decision Tree
- Support Vector Machines
- K-NN
- Random Forest

Limitations of ML
- Data quality and quantity
- Feature extraction
- Model selection
- Computational resources

Structured vs Unstructured Data
Structured Data
- Data formatted into precisely defined fields (columns).
- Each feature, such as the size of the house, the number of bedrooms or the user's age, has a very well-defined meaning.
Unstructured Data
- Data stored in its native format and not processed until used.
- Data like raw audio signals, images or texts.

What are Neurons?
- Common estimates put the number of neurons in the human brain between 80 and 100 billion.
- These neurons are linked by a much larger number of connections, commonly estimated in the trillions.
- Neurons, or nerve cells, are the fundamental units of our brain and nervous system.
- Neurons are responsible for receiving input from the external world, for sending output (commands to our muscles), and for transforming the electrical signals in between.

What is a Neural Network?
An artificial neural network is a system designed to operate like a human brain.
- The data is fed as input to the neuron.
- The neuron processes the information provided as input.
- The output is the final value predicted by the artificial neuron.
A neural network is usually described as having different layers. The first layer is the input layer; it picks up the input signals and passes them to the next layer.
The next layer performs the calculations and feature extraction; it is called the hidden layer. Often there will be more than one hidden layer. Finally, there is an output layer, which delivers the final result.

What is Deep Learning?
https://www.youtube.com/watch?v=6M5VXKLf4D4&t=5s
- Deep learning is a branch of machine learning that uses neural networks with a cascade of layers of processing units to extract features from data and make predictions about new data.
- Deep learning methods are learning methods with a deep architecture.
- They automatically learn intermediate representations.
- "Deep learning" usually means "deep neural network".
- The "deep" in deep learning is not a reference to any kind of deeper understanding achieved by the approach; it simply stands for the idea of successive layers of representations. The number of layers that contribute to a model of the data is called the depth of the model.

Again, what is Deep Learning?

Why do we need Deep Learning?

Why is Deep Learning important?
- The amount of data keeps growing
- Explosion of features and datasets
- Focus on real-time decision making

ML vs DL: First example
The ML model:
- Takes images of both cat and dog as input.
- Extracts the different features of the images, such as shape, height, nose, eyes, etc.
- Applies the classification algorithm and predicts the output.
The deep learning model:
- Takes the images as input and feeds them directly to the algorithm, without requiring any manual feature extraction step.
- Passes the images through the different layers of the artificial neural network and predicts the final output.
ML needs to be told how it should make an accurate prediction, by feeding it more data.
DL is able to learn on its own.

ML vs DL: Second Example
[Figure: "Traditional" machine learning: handcrafted features -> learned classifier -> "cat". Deep, "end-to-end" learning: learned low-level features -> learned mid-level features -> learned high-level features -> learned classifier -> "cat".]

ML vs DL: How to choose?
Terminology

Learning Systems and Input-Output Relationships
Learning systems use relationships between inputs to produce predictions.

Labels
- The label is the thing we want to predict.
- It is like the y in a linear graph.

Features
- The features are the input.
- They are like the x values in a linear graph.
- Sometimes there can be many features (input values) with different weights: $y = b + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4$, where $b$ is the bias (intercept) and each $w_i$ is the weight of feature $x_i$.

Model
- A model defines the relationship between the label y and the features x.
- The model is the learned prediction/classification function, derived from training data, that shows how inputs relate to outputs.
[Figure: Input 1, Input 2, ..., Input n -> MODEL -> Output (prediction)]

Learning Phases
Two main phases:
- Training: input data are used to calculate the parameters of the model.
- Testing: the trained model is applied to inputs it was not trained on, and its outputs are compared with the expected ones.

Training
- The goal of training is to create a model that can answer a question, such as "what is the expected output?"
- The idea is to provide a set of inputs together with their expected outputs, so that after training the model can map new data to one of the categories it was trained on.

Data
Up to 80% of a project is about collecting data:
- What data is required?
- What data is available?
- How to select the data?
- How to collect the data?
- How to clean the data?
- How to prepare the data?
- How to use the data?
Data is a collection of facts.

Classification vs Regression
- Classification: categorizing input data points into one of multiple discrete classes or categories. For example, classifying emails as "spam" or "not spam".
- Regression: predicting a continuous numerical output based on input data, i.e. the output variable is a real or continuous value such as salary or weight. For example, predicting house prices from their characteristics.
Figure from Towardsdatascience.com

Binary vs Multi-class classification
- Binary: the task of classifying a given item into one of two classes. Ex: decide if an email is spam or not spam; determine if a patient has a disease or not.
- Multi-class: the task of classifying an item into one of more than two classes. Ex: classify a character as a digit (0…9); decide if an email is advertisement, phishing, hack or personal.
Figure from medium.com

Clustering
Grouping unlabeled input data based on similarities/differences, without predefined categories. An example is clustering similar news articles based on topics.
Figure from Towardsdatascience.com

Fitting, Underfitting and Overfitting
- Fitting refers to the process of training a model on some input data so that it can produce appropriate outputs when new inputs are provided.
- Overfitting is when the model focuses too much on the exact details of the training data. It might perform very well on the data it was trained on, but it fails badly on new, unseen data.
- Underfitting is when the model fails to identify and learn the true relationships between input and output data. It performs poorly even on data it has seen before.
Figure from mathworks.com

Outlier Detection
Automatically identify observations (data points) that deviate markedly from the remaining set of observations.

Training, Validation and Testing

Generalization
- Generalization describes a model's ability to react to new data. That is, after being trained on a training set, a model can digest new data and make accurate predictions.
- Evaluating a machine learning model means measuring its generalization.
- A model's ability to generalize is central to its success.

Generalization error and hyperparameters
- The error rate on new cases is called the generalization error (or out-of-sample error). By evaluating your model on the test set, you get an estimate of this error. This value tells you how well your model will perform on instances it has never seen before.
- Any value or configuration that you choose before training begins is a hyperparameter.
- Most machine learning algorithms have hyperparameter settings that we can use to control the algorithm's behavior. They are external to the model and are not learned from the data during training.
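
The ideas of training, testing, generalization error and hyperparameters can be illustrated with a small sketch. It is not taken from the slides: the synthetic data, the helper names (`make_data`, `split`, `fit`, `mean_squared_error`) and the choice of polynomial regression are all assumptions made for illustration. The polynomial `degree` plays the role of a hyperparameter (chosen before training), the fitted `weights` are the model parameters (computed during training), and the error on the held-out test set is an estimate of the generalization error.

```python
import numpy as np

def make_data(n=60, noise=0.1, seed=0):
    """Synthetic regression data (an assumption): y = sin(3x) + noise."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(3.0 * x) + rng.normal(0.0, noise, size=n)
    return x, y

def split(x, y, train_fraction=0.7, seed=1):
    """Shuffle the examples and split them into a training and a test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    cut = round(train_fraction * len(x))
    train, test = order[:cut], order[cut:]
    return x[train], y[train], x[test], y[test]

def fit(x, y, degree):
    """Training: compute the weights of a polynomial of the given degree.

    The features are the powers x^degree, ..., x^1, x^0, so the model is
    y = b + w1*x + w2*x^2 + ... and the weights are found by least squares.
    """
    design = np.vander(x, degree + 1)
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights

def mean_squared_error(weights, x, y):
    """Average squared difference between predictions and labels."""
    predictions = np.vander(x, len(weights)) @ weights
    return float(np.mean((predictions - y) ** 2))

x, y = make_data()
x_train, y_train, x_test, y_test = split(x, y)

results = {}
for degree in (1, 3, 12):  # hyperparameter values fixed before training
    weights = fit(x_train, y_train, degree)
    train_error = mean_squared_error(weights, x_train, y_train)
    test_error = mean_squared_error(weights, x_test, y_test)  # estimate of generalization error
    results[degree] = (train_error, test_error)
    print(f"degree={degree:2d}  train MSE={train_error:.4f}  test MSE={test_error:.4f}")
```

In this sketch the straight line (degree 1) is too simple for the curve and has a high error on both sets, which is underfitting. Raising the degree can only lower the training error, but the test error does not have to follow; a growing gap between the two is the sign of overfitting. In practice the hyperparameter would be chosen using a separate validation set, so that the test set is kept for the final estimate.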