Chapter 2 - Artificial Neural Networks (ANNs) PDF

Document Details

Uploaded by Deleted User

Dr. Mahmoud Elsabagh

Tags

artificial neural networks computational intelligence machine learning deep learning

Summary

This document is a lecture on artificial neural networks (ANNs) - A summary of a chapter on computational intelligence. It covers the fundamental concepts of ANNs, such as their structure, activation functions, and different types of ANNs. The document is presented through diagrams and figures.

Full Transcript

Computational Intelligence Chapter 2 Artificial Neural Networks (ANNs) Dr. Mahmoud Elsabagh Contents 1. Chapter 1: Introduction to Computational Intelligence 2. Chapter 2: Artificial Neural Networks (ANN) 3. Chapter 3: Deep Learning and...

Computational Intelligence Chapter 2 Artificial Neural Networks (ANNs) Dr. Mahmoud Elsabagh Contents 1. Chapter 1: Introduction to Computational Intelligence 2. Chapter 2: Artificial Neural Networks (ANN) 3. Chapter 3: Deep Learning and Convolutional Neural Networks (CNN) 4. Chapter 4: Evolutionary Computation 5. Chapter 5: Fuzzy Logic Systems 6. Chapter 6: Swarm Intelligence 7. Chapter 7: Machine Learning in Computational Intelligence 8. Chapter 8: Hybrid Systems in Computational Intelligence 9. Chapter 9: Advanced Topics in Computational Intelligence 10. Chapter 10: Computational Intelligence Applications in Industry Chapter 2: Artificial Neural Networks (ANNs) 3 Contents 1. Introduction to Artificial Neural Networks 2. Structure of an Artificial Neuron (Perceptron) 3. ANN Architecture: Input, Hidden, and Output Layers 4. Activation Functions: Bringing Non-Linearity 5. Backpropagation and Gradient Descent 6. Types of Artificial Neural Networks 7. Training Process of ANNs 8. Real-World Applications of ANNs 9. Challenges and Limitations of ANNs 1. Introduction to Artificial Neural Networks ➜ Artificial Neural Networks (ANNs) are among the most fundamental tools in computational intelligence, providing solutions to complex problems such as image recognition, natural language processing, and autonomous control. This lecture will cover the basic building blocks of ANNs, the mathematics behind them, different types of ANNs, and their real-world applications. Along with textual explanations, the key concepts will be illustrated through diagrams and figures. ➜ An Artificial Neural Network (ANN) is a computational model that mimics the structure of biological neural networks. ANNs are used for tasks such as pattern recognition, classification, regression, and more, by learning from data. The core of an ANN is the artificial neuron (also known as a perceptron), which processes inputs and generates outputs. Biological Background ➜ Neuron consists of: - Cell body - Dendrites - Axon - Synapses ➜ Neural activation : - Throught dendrites/axon - Synapses have different strengths ANN ingredients Simulation on ANN 1. Introduction to Artificial Neural Networks ➜ Key Components: - Inputs (X): Features or data points provided to the model. - Weights (W): Parameters that control the influence of inputs on the neuron's output. - Activation Function (f): Non-linear function that determines the neuron's output. - Output (Y): The final result after processing inputs through layers of neurons. 2. Structure of an Artificial Neuron (Perceptron) ➜ An artificial neuron (often referred to as a perceptron) processes multiple inputs by applying weights and an activation function to produce an output. 2. Structure of an Artificial Neuron (Perceptron) Perceptron Structure Artificial Neuron Example Input links Unit Output links (dendrites) (cell body) (axon) aj Wji ini = ai = ai SajWji g(ini) Perceptron Ij Wj O -1 W0 a1 W1 W2 in = a= a a2 SajWj g(in) a = g(-W0 + W1a1 + W2a2) g(in) = 0, { in0 3. ANN Architecture: Input, Hidden, and Output Layers ➜ An ANN typically consists of: - Input Layer: Receives raw data. - Hidden Layers: Perform intermediate computations; multiple layers allow the network to learn complex features. - Output Layer: Provides the final prediction or classification. 3. ANN Architecture: Input, Hidden, and Output Layers ➜ Architecture of an Artificial Neural Network ➜ Each layer consists of neurons, and each neuron in one layer is connected to all neurons in the next layer. Modelling a Neuron aj :Activation value of unit j ini   j Wj , iaj wj,I :Weight on the link from unit j to unit i inI :Weighted sum of inputs to unit i aI :Activation value of unit i g :Activation function 4. Activation Functions: Bringing Non-Linearity To allow the network to solve non-linear problems, activation functions are used after the weighted sum of inputs. 4. Activation Functions: Bringing Non-Linearity ➜ Key Points: - Activation functions introduce non-linearity, allowing the network to model more complex patterns. - ReLU is especially popular because of its simplicity and efficiency in training deep networks. Activation Functions Stept(x) = 1 if x >= t, else 0 Sign(x) = +1 if x >= 0, else –1 Sigmoid(x) = 1/(1+e-x) Identity Function Relu(x) = max(o,X) 5. Backpropagation and Gradient Descent ➜ Backpropagation is the learning algorithm used in ANNs, where the error (loss) is propagated backward through the network to update weights. The goal is to minimize the error using an optimization technique called gradient descent. ➜ Key Steps in Backpropagation: - Forward Pass: Compute the network output by propagating the input forward through the layers. - Calculate Loss: Measure the error between predicted output and actual output using a loss function (e.g., Mean Squared Error, Cross-Entropy). - Backward Pass: Update the weights by calculating the gradients of the loss with respect to each weight using the chain rule. - Weight Update: Update weights using the following rule: 6. Types of Artificial Neural Networks ➜ Feed-forward Neural Networks (FNNs): - The simplest type of neural network, where data flows in one direction—from input to output. - Used in tasks like classification and regression. Multiple layers: Input Hidden Output hidden layer(s) Feed-forward: Output links only connected to input links in the next layer Complex non-linear functions can be represented 6. Types of Artificial Neural Networks ➜ Convolutional Neural Networks (CNNs): - Primarily used for image-related tasks. They consist of convolutional layers that capture spatial hierarchies in the data. - Applications: Image classification, object detection, face recognition. ➜ CNN Architecture 6. Types of Artificial Neural Networks ➜ Recurrent Neural Networks (RNNs): - Used for sequential data, such as time series or text. RNNs maintain a hidden state that captures information from previous time steps. - Applications: Natural language processing, speech recognition, time-series forecasting. No restrictions on connections Behaviour more difficult to predict/ understand Apps: Voice to text 7. Training Process of ANNs ➜ Data Preprocessing: - Normalization: Input data is normalized to ensure that all features have similar scales. - Train-Test Split: The dataset is split into training, validation, and testing sets to evaluate the model. ➜ Hyperparameters: - Learning Rate (η): Controls how big the step is when updating weights. - Epochs: The number of times the entire dataset is passed through the network during training. - Batch Size: Number of samples processed before the model is updated. 8. Real-World Applications of ANNs ➜ Image Recognition: ○ CNNs are widely used in systems like facial recognition (e.g., on smartphones) and medical image analysis (e.g., tumor detection in MRIs). ➜ Natural Language Processing (NLP): ○ RNNs and transformers power machine translation, chatbots, and text generation. ➜ Autonomous Vehicles: ○ Neural networks are used for sensor fusion, object detection, and decision-making in self-driving cars (e.g., Tesla's Autopilot). ➜ Healthcare: ○ ANNs assist in diagnosis (e.g., identifying diseases from medical scans) and personalized medicine (predicting patient responses to treatments). 9. Challenges and Limitations of ANNs ➜ Data Requirements: ○ ANNs need a large amount of labeled data to learn effectively. ➜ Computational Power: ○ Training deep networks is computationally intensive, requiring GPUs or cloud computing resources. ➜ Black Box Nature: ○ Understanding how a trained ANN makes its decisions is often difficult, which raises concerns in critical applications (e.g., healthcare, legal systems). ➜ Overfitting: ○ ANNs may perform well on training data but poorly on unseen data if overfitting occurs. Regularization techniques such as dropout are used to mitigate this. Thanks! Any questions? 27

Use Quizgecko on...
Browser
Browser