Perceptron: A Machine Learning Model PDF
Document Details
Uploaded by TrendyHeisenberg3357
Summary
This document presents an introduction to Perceptron, focusing on fundamental concepts in machine learning. It covers traditional programming, different machine learning models, and their application to various domains. The material is suitable for undergraduate students.
Full Transcript
Neural Networks Module 1

Traditional programming Vs ML Vs DL

Traditional Programming
Traditional programming is a manual process: a person (the programmer) writes the rules that make up the program.

Machine Learning Programming
In machine learning, on the other hand, the algorithm automatically formulates the rules from the data.

ML Vs DL
Machine Learning (ML) is often mentioned together with AI, but it is a subset of AI. ML refers to AI systems that learn from data by means of an algorithm and keep improving over time without human intervention. Deep Learning (DL) is a subset of ML that uses multi-layer neural networks, typically trained on large data sets.

How can machines learn?
Datasets. Machine learning systems are trained on special collections of samples called datasets. The samples can include numbers, images, text or any other kind of data. It usually takes a lot of time and effort to create a good dataset.
Features. In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon. A feature is an attribute that is useful or meaningful to your problem.
Algorithm. The same task can be solved with different algorithms. Depending on the algorithm, the accuracy or the speed of getting results can differ. Sometimes, to achieve better performance, different algorithms are combined, as in ensemble learning.

Four groups of ML algorithms

Supervised Learning
In supervised learning there is a training set with labeled data.
Algorithm examples: Naive Bayes, Support Vector Machine, Decision Tree, K-Nearest Neighbours, Logistic Regression, Linear and Polynomial Regression.
Used for: spam filtering, language detection, computer vision, search and classification.

Unsupervised Learning
In unsupervised learning, you do not provide any labels to the program, allowing it to search for patterns independently.
Algorithm examples: K-means clustering, DBSCAN, Mean-Shift, Singular Value Decomposition (SVD), Principal Component Analysis (PCA), Latent Dirichlet Allocation (LDA), Latent Semantic Analysis, FP-growth.
Used for: segmentation of data, anomaly detection, recommendation systems, risk management, fake image analysis.

Semi-supervised Learning
Semi-supervised learning means that the input data is a mixture of labeled and unlabeled samples. A common application is a text document classifier. This is a situation where semi-supervised learning is ideal, because it would be nearly impossible to find a large number of labeled text documents. Speech analysis is another application.
The most basic disadvantage of any supervised learning algorithm is that the dataset has to be hand-labeled, either by a Machine Learning Engineer or a Data Scientist. This is a very costly process, especially when dealing with large volumes of data. The most basic disadvantage of unsupervised learning is that its application spectrum is limited. To counter these disadvantages, the concept of semi-supervised learning was introduced.
In this type of learning, the algorithm is trained on a combination of labeled and unlabeled data. Typically, this combination contains a very small amount of labeled data and a very large amount of unlabeled data. The basic procedure is that the programmer first clusters similar data using an unsupervised learning algorithm and then uses the existing labeled data to label the rest of the unlabeled data.

Reinforcement Learning
This is very similar to how humans learn: through trial and error. Unlike in supervised learning, humans do not need constant supervision to learn effectively. Games are very useful for reinforcement learning research because they provide ideal data-rich environments. An example of reinforcement learning is in the development of driverless cars.
In general, a reinforcement learning agent is capable of sensing and interpreting its environment, acting, and learning via trial and error.

How to build machine learning model
The steps to building a machine learning model include:
1. Explore the data and choose the type of algorithm
2. Prepare and clean the dataset
3. Split the prepared dataset and perform cross validation
4. Perform machine learning optimization
5. Deploy the model

Machine Learning-Project flow
1. Collecting Data: As you know, machines initially learn from the data that you give them.
2. Preparing the Data: After you have your data, you have to prepare it.
3. Choosing a Model
4. Training the Model
5. Evaluating the Model
6. Parameter Tuning
7. Making Predictions

Use Google Colab
https://towardsdatascience.com/workflow-of-a-machine-learning-project-ec1dba419b94 - Machine Learning Workflow
https://www.w3schools.com/python/numpy/numpy_creating_arrays.asp - Numpy array
https://www.tutorialspoint.com/scikit_learn/index.htm - Scikit learn
https://www.w3schools.com/python/pandas/pandas_dataframes.asp - pandas

DEEP LEARNING

Deep Learning
Machine Learning is a subset of artificial intelligence that helps you build AI-driven applications. Deep Learning is a subset of machine learning that uses vast volumes of data and complex algorithms to train a model. It belongs to the broader family of machine learning methods and makes use of neural networks (loosely modeled on the neurons in our brain) to mimic human brain-like behavior. DL is an ML approach that uses deep (more than one layer) neural networks to analyze data and produce output accordingly.
Applications
* Self driving cars
* Fraud detection
* NLP (Natural Language Processing)
* Health care
* Language translation
* Automatic game playing
* Chatbots
* Computer vision

DL application in Biomedicine
DL application in Image Processing
Deep learning architectures

Father of Deep Learning
Geoffrey Everest Hinton CC FRS FRSC (born 6 December 1947) is a British-Canadian cognitive psychologist and computer scientist, most noted for his work on artificial neural networks. Hinton, LeCun and Bengio received the 2018 Turing Award and are often called the fathers of the deep learning revolution. Yann André LeCun is a French computer scientist working primarily in the fields of machine... neural networks (CNN), and is a founding father of convolutional nets.

Artificial neural network (ANN) is the underlying architecture behind deep learning. Based on ANN, several variations of the algorithms have been invented.
Convolutional neural networks: The architecture is particularly useful in image-processing applications.
Recurrent neural networks: Recurrent neural networks recognize the sequential characteristics of data and use patterns to predict the next likely scenario.

Supervised Pretrained Networks
* Convolutional Neural Networks (CNNs)
* Recurrent Neural Networks (LSTM, GRU)

Unsupervised Pretrained Networks
* Autoencoders
* Deep Belief Networks (DBNs)
* Generative Adversarial Networks (GANs)

ANN
The term "Artificial neural network" refers to a biologically inspired sub-field of artificial intelligence modeled after the brain. An artificial neural network is a computational network based on the biological neural networks that make up the structure of the human brain. Just as the human brain has neurons interconnected with each other, artificial neural networks have neurons that are linked to each other in the various layers of the network. These neurons are known as nodes.

How do our brains work?
The brain is a massively parallel information processing system.
Our brains are a huge network of processing elements. A typical brain contains a network of tens of billions of neurons (commonly estimated at about 86 billion).

Biological Neuron
Neurons & biological motivation
Basic unit: the neuron
*Dendrites: Input (carry impulses towards the cell body)
*Cell body (soma): Processor
*Synapse: Link (connects neurons together)
*Axon: Output (carries impulses away from the cell body)
*A neuron receives input from other neurons, and the inputs are combined.
*A neuron is connected to other neurons through about 10,000 synapses.

A processing element
Once the input exceeds a critical level, the neuron discharges a spike, an electrical pulse that travels from the cell body, down the axon, to the next neuron(s). The axon endings almost touch the dendrites or cell body of the next neuron. Transmission of an electrical signal from one neuron to the next is effected by synapses.

McCulloch-Pitts Neuron (MP Neurons)
The first computational model of a neuron was proposed by Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943. It was the first mathematical model of a biological neuron. Because this model imitates the functionality of a biological neuron, it is also called an artificial neuron, and it is the basic building block of neural networks.
The neuron has two possible states (1 and 0). An artificial neuron accepts binary inputs and produces a binary output based on a threshold value, which can be adjusted. The function aggregates its inputs and makes a decision based on the aggregation. Aggregation simply means the sum g(x) of the binary inputs. If the aggregated value reaches or exceeds the threshold, the output is 1; otherwise it is 0.

AND Function
An AND function neuron fires only when ALL the inputs are ON, i.e., g(x) ≥ 3 for the three-input example here.

OR Function
An OR function neuron fires if ANY of the inputs is ON, i.e., g(x) ≥ 1 here.

How do ANNs work?

Perceptron
The perceptron model is a more general computational model than the McCulloch-Pitts neuron.
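As a point of comparison, the McCulloch-Pitts neuron described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `mp_neuron`, `and_neuron` and `or_neuron` are made up for this example, and the firing condition is taken to be g(x) ≥ threshold, matching the AND and OR examples.

```python
from typing import Sequence


def mp_neuron(inputs: Sequence[int], threshold: int) -> int:
    """Hypothetical McCulloch-Pitts neuron: binary inputs, binary output."""
    if any(x not in (0, 1) for x in inputs):
        raise ValueError("M-P neuron inputs must be 0 or 1")
    g = sum(inputs)  # aggregation: plain sum of the binary inputs
    return 1 if g >= threshold else 0  # fire when the sum reaches the threshold


def and_neuron(inputs: Sequence[int]) -> int:
    # AND: fires only when every input is ON, so threshold = number of inputs
    return mp_neuron(inputs, threshold=len(inputs))


def or_neuron(inputs: Sequence[int]) -> int:
    # OR: fires when at least one input is ON, so threshold = 1
    return mp_neuron(inputs, threshold=1)


if __name__ == "__main__":
    for row in [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 1)]:
        print(row, "AND:", and_neuron(row), "OR:", or_neuron(row))
```

Note that the only adjustable quantity here is the threshold, and every input counts equally. The perceptron, by contrast, attaches a learned numerical weight to each input.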
The perceptron was developed by Frank Rosenblatt in 1957. It is simple and limited (a single-layer model): a single-layer perceptron can learn only linearly separable patterns. In machine learning, the perceptron is a supervised learning algorithm for binary classifiers.

Perceptron Model
Artificial Neural Networks Architecture

Perceptron
There are 4 constituents of a perceptron model. They are as follows:
* Input values
* Weights and bias
* Net sum
* Activation function

Weights
The perceptron overcomes some of the limitations of the M-P neuron by introducing numerical weights (a measure of importance) for the inputs, and a mechanism for learning those weights. A weight scales the importance of its input value up or down. Initial weights are commonly chosen at random, for example from a standard normal distribution or uniformly from [-1, 1]. In the perceptron, the weight coefficients are learned automatically. Inputs are no longer limited to Boolean values as in the M-P neuron; real-valued inputs are supported.

Bias
The bias is an adjustable numerical term added to the perceptron's weighted sum of inputs; it can improve the classification accuracy of the model. It is similar to the intercept in a linear equation.

Activation function
The activation function helps the neural network use important information while suppressing irrelevant data points. It is applied to the weighted sum described above and gives an output either in binary form or as a continuous value. The activation function applies a non-linear transformation to its input, which makes the network capable of learning and performing more complex tasks.

Perceptron Model
Let us now go through a step-by-step procedure to understand the way the perceptron model operates. Enter the pieces of information that serve as inputs in the first layer (input values). Multiply each input value by its weight (the learned coefficient). Add up all of the weighted input values.
Add the bias to this weighted sum. Pass the result to the activation function, which produces the output.

Perceptron Learning Rule
A perceptron accepts inputs, scales them by their weight values, then applies the transformation (activation) function to output the final result. In the Perceptron Learning Rule, the predicted output is compared with the known output. If they do not match, the error is used to adjust the weights and bias.

Perceptron Algorithm

Learning AND gate
The question is, what are the weights and bias for the AND perceptron? First, we need to understand that the output of an AND gate is 1 only if both inputs (in this case, x1 and x2) are 1. So, following the steps listed above:

Learning AND gate Row 1
From w1*x1+w2*x2+b, initializing w1 and w2 as 1 and b as -1, we get x1(1)+x2(1)-1. Passing the first row of the AND logic table (x1=0, x2=0), we get 0+0-1 = -1. From the Perceptron rule, if Wx+b≤0, then y`=0. Therefore, this row is correct, and no weight update is needed.

Learning AND gate Row 2
Passing (x1=0 and x2=1), we get 0+1-1 = 0. From the Perceptron rule, if Wx+b≤0, then y`=0. This row is correct, as the output is 0 for the AND gate.

Learning AND gate Row 3
Passing (x1=1 and x2=0), we get 1+0-1 = 0, so y`=0 again. From the Perceptron rule, this works for rows 1, 2 and 3.

Learning AND gate Row 4
Passing (x1=1 and x2=1), we get 1+1-1 = 1. Since 1 > 0, y`=1, so the perceptron rule is still valid for this row. Therefore, we can conclude that the model that achieves an AND gate using the Perceptron algorithm is x1(1)+x2(1)-1.

OR Gate
From the diagram, the OR gate is 0 only if both inputs are 0.

OR Gate Row 1
From w1x1+w2x2+b, initializing w1 and w2 as 1 and b as -1, we get 0+0-1 = -1. From the Perceptron rule, if Wx+b≤0, then y`=0. Therefore, this row is correct.

OR Gate Row 2
Passing (x1=0 and x2=1), we get 0+1-1 = 0. From the Perceptron rule, if Wx+b