Artificial Neural Network
Document Details
Alexandria University
Dr. Doaa B. Ebaid
Summary
This presentation covers artificial neural networks: their history, biological background, how they work, types of ANNs, activation functions, learning techniques, cost functions, and backpropagation. It also compares machine learning with deep learning.
Full Transcript
Artificial Neural Network
Dr. Doaa B. Ebaid

Outlines
▪ What is Artificial Neural Network (ANN)?
▪ History
▪ Biological Background
▪ How ANN works?
▪ Types of ANNs
▪ Activation Functions
▪ Learning in ANN
▪ Cost Functions
▪ Backpropagation
▪ ML vs Deep Learning

What is Artificial Neural Network (ANN)?
❏ The term "artificial neural network" (ANN) refers to a biologically inspired sub-field of artificial intelligence modeled after the brain. An artificial neural network is usually a computational network based on the biological neural networks that constitute the structure of the human brain.
❏ ANN uses the processing of the brain as a basis to develop algorithms that can be used to model complex patterns and prediction problems.
❏ It aims to mimic human thinking, allowing the computer to learn by incorporating new data.

History
[Timeline figure: "Neural Networks Explained", Eastgate Software]

Biological Background
❏ A neuron consists of:
▪ Cell body (nodes)
▪ Synapses (weights)
▪ Dendrites (inputs)
▪ Axon (output)

How do Artificial Neural Networks work?
[Diagram of a single neuron: inputs x1…xn, weights w1…wn, bias b]
Y = f(Σ), where Σ = w1·x1 + w2·x2 + … + wn·xn + b
Here b is the bias, the wi are the weights, and f is the activation function.

How do Artificial Neural Networks work?
Weights are parameters that control the strength of the connection between two neurons in different layers of the network. Each connection between neurons has an associated weight.
Aim: The weight determines the importance of the input value in the decision-making process. When data passes through the network, the weights adjust the input values before they reach the activation function. During training, these weights are updated to minimize the error between the predicted and actual outputs.
Learning process: Weights are updated during backpropagation to minimize the loss function, helping the network adjust to the correct patterns in the data.

Bias is an additional parameter that is added to the input of the activation function. It acts like a constant term in a linear equation.
Aim: The bias allows the activation function to be shifted to the left or right, giving the network more flexibility in learning patterns, even when all input features are zero. It ensures that the neuron can activate even if the input is zero, improving the network's ability to fit the data.
Learning process: Like weights, biases are updated during backpropagation, enabling the model to adjust and improve performance over time.
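To make the neuron formula Y = f(w1·x1 + … + wn·xn + b) concrete, here is a minimal Python (NumPy) sketch of a single neuron's forward computation. It is not from the slides; the names x, w, and b are illustrative, and sigmoid stands in for the generic activation f.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b):
    """Compute Y = f(w1*x1 + ... + wn*xn + b) for one neuron."""
    weighted_sum = np.dot(w, x) + b   # the Σ term from the slide
    return sigmoid(weighted_sum)      # f is the activation function

# Example: two inputs, two weights, and a bias
x = np.array([0.35, 0.9])   # input vector
w = np.array([0.1, 0.8])    # connection weights
b = 0.0                     # bias term (shifts the activation)
print(neuron_output(x, w, b))  # ≈ 0.68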
How do Artificial Neural Networks work?
ANNs incorporate the two fundamental components of biological neural nets: neurons (nodes) and synapses (weights). Neural networks are composed of a collection of nodes spread out across at least three layers:
▪ An input layer
▪ A "hidden" layer
▪ An output layer

Types of Artificial Neural Networks
▪ Perceptron NNs are simple, shallow networks with an input layer and an output layer.
▪ Multi-layer perceptron NNs add complexity to perceptron networks and include one or more hidden layers.
▪ Feed-forward NNs only allow their nodes to pass information to a forward node.
▪ Liquid state machine NNs feature nodes that are randomly connected to each other.

Activation Functions
The activation function in an ANN plays a crucial role in determining whether a neuron should be activated or not. It introduces non-linearity into the network, enabling the model to learn and represent complex patterns in data. The activation function takes the weighted sum of the inputs to a neuron (plus bias) and applies a transformation, producing the neuron's output.

The Role of Activation Functions (a sketch of common choices follows this list):
✓ Non-linearity: Activation functions make neural networks capable of learning and modeling complex data patterns.
✓ Output control: Controlling the output range helps the network normalize outputs and makes learning efficient by bounding the output values, especially in multi-layer networks.
✓ Gradient-based learning: Activation functions enable the computation of gradients during backpropagation, making training efficient.
✓ Problem mitigation: They help prevent common issues like vanishing gradients (through ReLU and similar functions) or dead neurons (via Leaky ReLU).
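As a companion to the list above, here is a minimal NumPy sketch of three activation functions the lecture mentions. The implementations are the standard textbook formulas, written here as an assumption rather than taken from the slides.

```python
import numpy as np

def sigmoid(z):
    """Bounded output in (0, 1); gradients vanish for large |z|."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """max(0, z): cheap and mitigates vanishing gradients,
    but a neuron stuck at z < 0 can 'die' (zero gradient)."""
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    """Like ReLU, but a small slope alpha for z < 0 keeps a
    gradient flowing and avoids dead neurons."""
    return np.where(z > 0, z, alpha * z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))     # [0.119 0.5   0.881]
print(relu(z))        # [0. 0. 2.]
print(leaky_relu(z))  # [-0.02  0.    2.  ]
```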
Learning in ANN
In an ANN, learning is the process of modifying the weights of the connections between the neurons of a network. Learning in ANN can be classified into three categories:
▪ Supervised learning
▪ Unsupervised learning
▪ Reinforcement learning

Supervised learning: During the training of an ANN under supervised learning, the input vector is fed to the network, which produces an output vector. This output vector is compared with the desired output vector. An error signal is produced if there is a difference between the actual and the desired output vector. Based on this error signal, the weights are adjusted until the actual output matches the desired output.

Unsupervised learning: During the training of an ANN under unsupervised learning, input vectors of similar type are combined to form clusters. When a new input pattern is applied, the neural network gives an output response indicating the class to which the input pattern belongs. There is no feedback from the environment as to what the desired output should be or whether it is correct. Hence, in this type of learning, the network itself must discover the patterns and features in the input data, and the relation between the input data and the output.

Reinforcement learning: During the training of a network under reinforcement learning (RL), the network receives some feedback from the environment, which makes it somewhat similar to supervised learning. However, the feedback obtained here is evaluative, not instructive: there is no teacher as in supervised learning. After receiving the feedback, the network adjusts its weights to obtain better feedback in the future.

Cost Functions
The cost function in an ANN is often referred to as the error function or loss function. It quantifies the difference between the predicted output of the network and the actual (or desired) output. The objective of training the ANN is to minimize this cost function, thereby reducing the error in predictions.

Mean Squared Error (MSE): MSE measures the average squared difference between the actual and predicted values. The goal is to minimize this difference, which in turn reduces the error. Used for regression problems where the output is continuous.

Binary Cross-Entropy: Cross-entropy measures the dissimilarity between the true labels and the predicted probabilities. Minimizing this function encourages the model to make predictions that are closer to the true labels. Used for binary classification.

Categorical Cross-Entropy: This function compares the predicted class probabilities with the actual class labels. The goal is to minimize this cost, making the predictions as close as possible to the actual class labels. Used for multi-class classification.
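To illustrate the three losses above, here is a minimal NumPy sketch. The formulas are the standard ones; the function names and the eps clipping guard are my additions, not the lecture's. (Note that the backpropagation example below uses the common scaled variant 0.5·(target − predicted)² rather than the mean over samples.)

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average squared difference (regression)."""
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy for labels in {0, 1} and predicted
    probabilities in (0, 1); eps guards against log(0)."""
    p = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Categorical cross-entropy for one-hot labels and one row of
    predicted class probabilities per sample (multi-class)."""
    p = np.clip(y_pred, eps, 1.0)
    return -np.mean(np.sum(y_true * np.log(p), axis=1))

# Tiny usage examples
print(mse(np.array([0.5]), np.array([0.69])))                      # 0.0361
print(binary_cross_entropy(np.array([1, 0]), np.array([0.9, 0.2])))  # ≈ 0.164
print(categorical_cross_entropy(np.array([[0, 1, 0]]),
                                np.array([[0.1, 0.8, 0.1]])))      # ≈ 0.223
```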
Backpropagation
Target: To minimize the error (cost), the network needs to adjust its weights. This is done by calculating the gradient of the cost function with respect to each weight.

Backpropagation is a method used to train ANNs: it reduces the difference between the model's predicted output and the actual output by adjusting the weights and biases in the network.
✓ The backpropagation algorithm involves two main steps: the forward pass and the backward pass.
✓ Backpropagation is often paired with optimization algorithms like gradient descent or stochastic gradient descent. It computes how the error changes with respect to the weights at each layer, starting from the output layer and moving backward through the network.
✓ The algorithm computes the gradient using the chain rule from calculus, allowing it to effectively navigate complex layers in the neural network to minimize the cost function.

Backpropagation: worked example
Consider the following ANN with a sigmoid activation function σ. Inputs: x1 = 0.35, x2 = 0.9. Initial weights: w1 = 0.1 (x1→h1), w2 = 0.4 (x1→h2), w3 = 0.8 (x2→h1), w4 = 0.6 (x2→h2), w5 = 0.3 (h1→y), w6 = 0.9 (h2→y).
(i) Perform a forward pass on the network.
(ii) Perform a reverse pass (training) once (target = 0.5).
(iii) Perform a further forward pass and comment on the result.
(A runnable sketch reproducing these numbers appears at the end of this transcript.)

(i) Forward pass:
h1 = σ(0.35·0.1 + 0.9·0.8) = σ(0.755) = 1/(1 + e^(−0.755)) = 0.68
h2 = σ(0.35·0.4 + 0.9·0.6) = σ(0.68) = 1/(1 + e^(−0.68)) = 0.6637
y = σ(h1·0.3 + h2·0.9) = σ(0.68·0.3 + 0.6637·0.9) = σ(0.80133) = 1/(1 + e^(−0.80133)) = 0.69

(ii) Reverse pass (training) once, with target = 0.5:
Error = 0.5·(target − predicted)² = 0.5·(0.5 − 0.69)² = 0.01805
Update rule: Wnew = Wold + η·(target − predicted)·x_input, where η is the learning rate and the term (target − predicted)·x_input serves as the gradient in this simplified rule.
With η = 1:
w5new = 0.3 + 1·(0.5 − 0.69)·0.68 = 0.3 + (−0.19)·0.68 = 0.1708
w6new = 0.9 + (−0.19)·0.6637 = 0.7739
w1new = 0.1 + (−0.19)·0.35 = 0.0335
w2new = 0.4 + (−0.19)·0.35 = 0.3335
w3new = 0.8 + (−0.19)·0.9 = 0.629
w4new = 0.6 + (−0.19)·0.9 = 0.429

(iii) Further forward pass:
h1new = σ(0.35·0.0335 + 0.9·0.629) = σ(0.5778) = 1/(1 + e^(−0.5778)) = 0.6406
h2new = σ(0.35·0.3335 + 0.9·0.429) = σ(0.5028) = 1/(1 + e^(−0.5028)) = 0.6231
ynew = σ(h1new·0.1708 + h2new·0.7739) = σ(0.6406·0.1708 + 0.6231·0.7739) = σ(0.5916) = 1/(1 + e^(−0.5916)) = 0.64
Errornew = 0.5·(0.5 − 0.64)² = 0.0098
ErrorOld = 0.01805
Comment: after a single training step, the error has dropped from 0.01805 to 0.0098; the output (0.64) is now closer to the target (0.5) than before (0.69).

ML vs Deep Learning (DL)
Machine Learning is a broader field of AI where computers use data and algorithms to imitate the way humans learn, gradually improving their accuracy over time. It encompasses a variety of algorithms and techniques that enable machines to learn from data without being explicitly programmed for specific tasks.
Deep Learning is a subset of Machine Learning that focuses on neural networks with many layers (hence "deep"). It mimics the structure and function of the human brain's neural networks to automatically discover representations of data. DL models can automatically learn features from data, making them particularly effective for unstructured data like images, audio, and text.
[Comparison figures]

Popular DL Neural Networks and Applications
[Figure]

Advantages of ANNs
▪ Ability to learn from data: ANNs can learn complex patterns from large datasets through training, making them highly flexible for tasks like image recognition, speech processing, and natural language understanding.
▪ Non-linearity: ANNs can model complex, non-linear relationships between input and output, which makes them suitable for tasks where linear models fail.
▪ Adaptability: Once trained, ANNs can adapt to new data or changes in input characteristics, often without needing explicit instructions.
▪ Parallel processing: ANN computations can be done in parallel, especially in deep networks, leading to faster processing on modern hardware (GPUs, TPUs).
▪ Generalization: Neural networks, if trained well, can generalize from seen data to unseen data, making them useful for real-world tasks like classification, prediction, and pattern recognition.

Disadvantages of ANNs
▪ Large data requirements: ANNs often require large amounts of labeled data for effective training, which might not always be available.
▪ Black-box nature: ANNs lack interpretability and transparency. It is hard to understand how they make decisions or why they arrived at a specific output.
▪ Computationally expensive: Training deep neural networks can be resource-intensive, requiring significant computational power and time, especially for large datasets.
▪ Risk of overfitting: If not properly regularized, ANNs can overfit the training data, meaning they perform well on training data but poorly on unseen data.
▪ Hyperparameter tuning: Neural networks require careful tuning of many hyperparameters (like learning rate, number of layers, number of neurons), which can be a complex and time-consuming process.
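To close, here is a minimal NumPy sketch reproducing the backpropagation worked example above. It follows the lecture's simplified update rule Wnew = Wold + η·(target − predicted)·x_input, not the full chain-rule gradient; variable and function names are mine. Because the slides round intermediate values (e.g. −0.19 for the delta), the exact results below agree with the slides to about two decimal places.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Inputs and initial weights from the worked example
x1, x2 = 0.35, 0.9
w1, w2, w3, w4 = 0.1, 0.4, 0.8, 0.6   # input -> hidden
w5, w6 = 0.3, 0.9                      # hidden -> output
target, eta = 0.5, 1.0                 # desired output, learning rate

def forward(w1, w2, w3, w4, w5, w6):
    """Forward pass through the 2-2-1 network."""
    h1 = sigmoid(x1 * w1 + x2 * w3)
    h2 = sigmoid(x1 * w2 + x2 * w4)
    y = sigmoid(h1 * w5 + h2 * w6)
    return h1, h2, y

# (i) Forward pass
h1, h2, y = forward(w1, w2, w3, w4, w5, w6)
print(round(h1, 4), round(h2, 4), round(y, 2))      # 0.6803 0.6637 0.69
print("error:", round(0.5 * (target - y) ** 2, 4))  # ≈ 0.0181 (slides: 0.01805)

# (ii) Reverse pass with the lecture's simplified rule:
# w_new = w_old + eta * (target - predicted) * input_to_that_weight
delta = target - y                      # ≈ -0.19
w5, w6 = w5 + eta * delta * h1, w6 + eta * delta * h2
w1, w3 = w1 + eta * delta * x1, w3 + eta * delta * x2
w2, w4 = w2 + eta * delta * x1, w4 + eta * delta * x2

# (iii) Second forward pass: the error drops, as in the slides
h1, h2, y = forward(w1, w2, w3, w4, w5, w6)
print(round(y, 2), "new error:",
      round(0.5 * (target - y) ** 2, 4))  # 0.64, ≈ 0.0103 (slides round to 0.0098)
```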