Neural Network (10S3001) PDF
Institut Teknologi Del
Samuel I. G. Situmeang
Summary
This document is a presentation on neural networks. It discusses the concept, architectures, activation functions, and training methods of neural networks.
Full Transcript
NEURAL NETWORK
10S3001 – Artificial Intelligence
by Samuel I. G. Situmeang
Faculty of Informatics and Electrical Engineering

OBJECTIVES
Students are able to explain the concept of neural networks, including their common architectures and types.
Students are able to describe commonly used activation functions in neural networks.
Students are able to apply the simple perceptron algorithm to construct a classification model and use the classification model to perform accurate inference.

10S3001-AI | Institut Teknologi Del

BASIC CONCEPTS OF NEURAL NETWORK

NEURAL NETWORKS
The definition of a neural network, more properly referred to as an "artificial" neural network (ANN), is provided by the inventor of one of the first neurocomputers, Dr. Robert Hecht-Nielsen. He defines a neural network as:
"...a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs."
You can also think of an artificial neural network as a computational model inspired by the way biological neural networks in the human brain process information.
Neural networks and deep learning are major topics in computer science and in the technology industry. They currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing.

BIOLOGICAL MOTIVATION AND CONNECTIONS
The basic computational unit of the brain is a neuron.
Approximately 86 billion neurons can be found in the human nervous system, and they are connected by approximately 10¹⁴ to 10¹⁵ synapses. The diagram below shows a cartoon drawing of a biological neuron (left) and a common mathematical model (right).

The basic unit of computation in a neural network is the neuron, often called a node or unit. It receives input from other nodes or from an external source and computes an output. Each input has an associated weight ($w$), which is assigned on the basis of its importance relative to the other inputs. The node applies a function to the weighted sum of its inputs.

The idea is that the synaptic strengths (the weights $w$) are learnable and control the strength and direction of one neuron's influence on another: excitatory (positive weight) or inhibitory (negative weight). In the basic model, the dendrites carry the signals to the cell body, where they are all summed. If the final sum is above a certain threshold, the neuron can fire, sending a spike along its axon.

In the computational model, we assume that the precise timings of the spikes do not matter and that only the frequency of firing communicates information. We model the firing rate of the neuron with an activation function (e.g. the sigmoid function), which represents the frequency of the spikes along the axon.

NEURAL NETWORK ARCHITECTURE
From the previous explanation we can conclude that a neural network is made of neurons. Biologically, the neurons are connected through synapses, along which information flows (the weights in our computational model).
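As a minimal sketch of the computational neuron described above, the following Python snippet computes one neuron's output as the sigmoid of the weighted sum of its inputs. The function names (`sigmoid`, `neuron_output`), the explicit bias term, and the example numbers are illustrative assumptions, not something taken from the slides.

```python
import math


def sigmoid(z):
    """Squash any real number into (0, 1); read here as a firing rate."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by a sigmoid activation.

    `bias` is an assumption of this sketch: it shifts the point at which
    the neuron starts to fire, playing the role of the threshold.
    """
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


# Hypothetical numbers: one excitatory (positive) and one inhibitory
# (negative) connection.
rate = neuron_output(inputs=[1.0, 1.0], weights=[2.0, -0.5])
print(rate)
```

A larger positive weight pushes the output toward 1 (frequent firing), while a negative weight pulls it toward 0.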
When we train a neural network, we want the neurons to fire whenever they learn specific patterns from the data, and we model the firing rate using an activation function. But that is not everything:

Input nodes (input layer): No computation is done within this layer; the nodes just pass the information to the next layer (a hidden layer most of the time). A block of nodes is also called a layer.

Hidden nodes (hidden layer): Hidden layers are where intermediate processing or computation is done. They perform computations and then pass the resulting signals (information) from the input layer on to the following layer (another hidden layer or the output layer). It is possible to have a neural network without a hidden layer.

Output nodes (output layer): Here we finally use an activation function that maps to the desired output format (e.g. softmax for classification).

Connections and weights: The network consists of connections, each transferring the output of a neuron $i$ to the input of a neuron $j$. In this sense $i$ is the predecessor of $j$ and $j$ is the successor of $i$. Each connection is assigned a weight $w_{ij}$.

Activation function: The activation function of a node defines the output of that node given an input or set of inputs. A standard computer chip circuit can be seen as a digital network of activation functions that can be "ON" (1) or "OFF" (0), depending on the input. This is similar to the behavior of the linear perceptron in neural networks. However, it is the nonlinear activation function that allows such networks to compute nontrivial problems using only a small number of nodes.
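To make the contrast between an ON/OFF unit and a nonlinear activation concrete, here is a small Python sketch of a hard step function next to a few smooth or piecewise-linear alternatives, plus the softmax mentioned for the output layer. The particular selection (step, sigmoid, tanh, ReLU) is an assumption of this sketch, chosen because these functions are widely used; the text above does not enumerate them.

```python
import math


def step(z):
    """Hard ON/OFF unit: 1 when the input is positive, otherwise 0."""
    return 1 if z > 0 else 0


def sigmoid(z):
    """Smooth squashing of z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Smooth squashing of z into (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Pass positive values through unchanged and clip negatives to 0."""
    return max(0.0, z)


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
print(softmax([1.0, 2.0, 3.0]))
```

Each function except softmax takes a single number and returns a single number; softmax works on a whole vector of output scores at once.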
In artificial neural networks the activation function is also called the transfer function.

Learning rule: The learning rule is a rule or an algorithm that modifies the parameters of the neural network so that a given input to the network produces a favored output. This learning process typically amounts to modifying the weights and thresholds.

TYPES OF NEURAL NETWORKS
There are many classes of neural networks, and these classes also have sub-classes.
Source: http://www.asimovinstitute.org/neural-network-zoo/

Here we list the most commonly used ones, to keep things simple as we move on in learning about neural networks.
1. Feedforward Neural Network
1.1. Single-layer Perceptron
1.2. Multi-layer Perceptron (MLP)
1.3. Convolutional Neural Network (CNN)
2. Recurrent Neural Network (RNN)

FEEDFORWARD NEURAL NETWORK
A feedforward neural network is an artificial neural network in which connections between the units do not form a cycle. In this network, information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any), to the output nodes. There are no cycles or loops in the network.
We can distinguish two types of feedforward neural networks:
Single-layer Perceptron
Multi-layer Perceptron (MLP)

SINGLE-LAYER PERCEPTRON
This is the simplest feedforward neural network. It does not contain any hidden layer, which means it consists of only a single layer of output nodes. It is called single-layer because the input layer is not counted: no computation is done at the input layer, and the inputs are fed directly to the outputs via a series of weights.
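The layer-counting convention and the one-directional flow described above can be sketched in Python as follows. A network is represented as a list of (weights, biases) pairs, one per computing layer, so a single-layer perceptron is a list of length one. This representation, the `feedforward` helper, the sigmoid activation, and all numeric values are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def feedforward(x, layers, activation=sigmoid):
    """Propagate an input vector forward through each layer, once, in order.

    `layers` is a list of (weights, biases) pairs. weights[j][i] is the
    weight on the connection from unit i of the previous layer to unit j
    of this layer. The input layer does no computation, so it has no
    entry in `layers`.
    """
    signal = x
    for weights, biases in layers:
        signal = [
            activation(sum(w * s for w, s in zip(row, signal)) + b)
            for row, b in zip(weights, biases)
        ]
    return signal


# Single-layer perceptron: 2 inputs wired directly to 1 output node.
single_layer = [([[0.5, -0.5]], [0.0])]

# Multi-layer perceptron: 2 inputs -> 2 hidden units -> 1 output node.
mlp = [
    ([[1.0, 1.0], [-1.0, -1.0]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]

print(len(single_layer), feedforward([1.0, 1.0], single_layer))
print(len(mlp), feedforward([1.0, 1.0], mlp))
```

Because the loop visits each layer exactly once and never feeds a value back to an earlier layer, the same helper covers both the single-layer and the multi-layer case.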
[Figure: Simple Perceptron]

SINGLE-LAYER PERCEPTRON ALGORITHM
1. Initialize all weights and the bias (commonly $w_i = b = 0$).
2. While there is any input vector whose output-unit response differs from its target, do:
2.1 Set the input-unit activations $x_i = s_i$, where $i = 1, \dots, n$.
2.2 Compute the output-unit response:
$net = \sum_i x_i w_i + b$
$f(net) = \begin{cases} 1 & \text{if } net > \theta \\ 0 & \text{if } -\theta \le net \le \theta \\ -1 & \text{if } net < -\theta \end{cases}$
2.3 Correct the weights for each pattern that produces an error, according to:
$w_i(\text{new}) = w_i(\text{old}) + \Delta w$, where $i = 1, \dots, n$, with $\Delta w = \alpha t x_i$
$b(\text{new}) = b(\text{old}) + \Delta b$, with $\Delta b = \alpha t$
where:
$\alpha$ = the chosen learning rate
$\theta$ = the chosen threshold
$t$ = the target
2.4 Repeat the iteration until there is no further weight change ($\Delta w = 0$).

MULTI-LAYER PERCEPTRON
This class of networks consists of multiple layers of computational units, usually interconnected in a feedforward way. Each neuron in one layer has directed connections to the neurons of the subsequent layer.
In many applications, the units of these networks apply a sigmoid function as the activation function. MLPs are considerably more useful than single-layer perceptrons. One good reason is that they are able to learn non-linear representations (in most cases the data presented to us is not linearly separable).

CONVOLUTIONAL NEURAL NETWORK (CNN)
Convolutional neural networks are very similar to ordinary neural networks: they are made up of neurons that have learnable weights and biases. In a convolutional neural network (CNN or ConvNet, also called a shift-invariant or space-invariant network), the unit connectivity pattern is inspired by the organization of the visual cortex. Units respond to stimuli in a restricted region of space known as the receptive field. Receptive fields partially overlap, together covering the entire visual field.
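As a rough illustration of overlapping receptive fields, the Python sketch below slides a small window over a 1-D signal and computes one unit response per window position, reusing the same weights at every position. The 1-D setting, the function name `sliding_responses`, and the example kernel are assumptions chosen for brevity; CNNs for images work on 2-D inputs with learned kernels. Strictly speaking, the loop computes a cross-correlation (the kernel is not flipped), which is what deep learning libraries commonly implement under the name convolution.

```python
def sliding_responses(signal, kernel):
    """Return one response per receptive field (window) of the signal.

    Each unit sees only len(kernel) neighbouring values, and consecutive
    windows overlap because the window moves one position at a time.
    """
    width = len(kernel)
    responses = []
    for start in range(len(signal) - width + 1):
        field = signal[start:start + width]  # this unit's receptive field
        responses.append(sum(w * v for w, v in zip(kernel, field)))
    return responses


# Hypothetical input with a jump in the middle, and a kernel that
# responds to increases from left to right.
signal = [0, 0, 0, 1, 1, 1]
edge_kernel = [-1, 0, 1]
print(sliding_responses(signal, edge_kernel))  # prints [0, 1, 1, 0]
```

Only the two windows that straddle the jump respond, which is the sense in which each unit reacts to stimuli in its own restricted region.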
Unit response can be approximated mathematically by a convolution operation.
CNNs are variations of multilayer perceptrons that use minimal preprocessing. They are widely applied in image and video recognition, recommender systems, and natural language processing. CNNs require large amounts of data to train on.
[Figure: CNN for image classification]

RECURRENT NEURAL NETWORKS
In a recurrent neural network (RNN), connections between units form a directed cycle (they propagate data forward, but also backward, from later processing stages to earlier stages). This allows the network to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. This makes them applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition, and other general sequence-processing tasks.

APPLICATION OF DEEP LEARNING

AUTOMATIC SHORT ANSWER GRADING (ASAG)
ASAG is a scoring system in which short free-text answers are assessed automatically. IndoBERT is a state-of-the-art deep learning architecture specifically designed for Indonesian language processing. Situmeang et al. proposed an IndoBERT-based ASAG model for regression analysis.
Samuel Indra Gunawan Situmeang, Raja Muda Gading Tulen Sihite, Humasak Simanjuntak, and Junita Amalia. 2023. A Deep Learning-Based Regression Approach to Indonesian Short Answer Grading System. In Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology (SIET '23). Association for Computing Machinery, New York, NY, USA, 201–209.
https://doi.org/10.1145/3626641.3626929

COMMONLY USED ACTIVATION FUNCTIONS
An activation function, also known as a transfer function, is used to map inputs to outputs in a certain fashion. Every activation function (or non-linearity) takes a single number and performs a certain fixed mathematical operation on it.
Source: https://towardsdatascience.com/

There are many activation functions used in machine learning. Here are some activation functions you will often find in practice:
[Figure: commonly used activation functions]
Source: https://engmrk.com/

TRAINING AND TESTING

EXAMPLE
Build a perceptron network that represents the logical AND function, using binary inputs and bipolar outputs. Choose $\alpha = 1$ and $\theta = 0.2$.

EXAMPLE: TRAINING
Input-target patterns:
x1  x2   t
0   0   -1
0   1   -1
1   0   -1
1   1    1
[Figure: perceptron with inputs x1 and x2, weights w1 and w2, net input n, and output a = f(n)]

Training worksheet. With $\alpha = 1$, the weight changes are $\Delta w = x_i t$ and $\Delta b = t$, and the new weights are $w(\text{new}) = w(\text{old}) + \Delta w$ and $b(\text{new}) = b(\text{old}) + \Delta b$.

Input      | Target | Output       | Weight change | New weights
x1  x2  1  | t      | n   a = f(n) | Δw1  Δw2  Δb  | w1  w2  b
(initial)  |        |              |               | 0   0   0
Epoch 1
0   0   1  | -1     |              |               |
0   1   1  | -1     |              |               |
1   0   1  | -1     |              |               |
1   1   1  |  1     |              |               |

The term epoch is used because a single iteration is carried out with back-propagation: for example, if one iteration involves the steps a-b-c-d, then one epoch involves a-b-c-d-c-b-a. In the worksheet, each epoch is one pass over all four training patterns.
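The training procedure for this example can be sketched in Python as below. The sketch follows the update rule from the algorithm slide: weights change only when f(net) differs from the target, with Δw = α t x_i and Δb = α t. Two things are assumptions of this sketch rather than statements from the slides: the function names, and the order in which the patterns are presented. The patterns are visited in the order (1,1), (1,0), (0,1), (0,0), because with that order the run ends on the model quoted in the testing example, net = 2x1 + 3x2 - 4, with epoch 10 as the first error-free pass. Presenting the patterns in the order of the table also converges, but to a different separating line.

```python
def activation(net, theta):
    """Three-valued threshold function from the algorithm slide."""
    if net > theta:
        return 1
    if net < -theta:
        return -1
    return 0


def train_perceptron(patterns, alpha=1, theta=0.2, max_epochs=100):
    """Train until one full pass over the patterns changes no weight.

    `patterns` is a list of ((x1, x2), t) pairs. Returns the weights,
    the bias and the number of the first error-free epoch (or None if
    max_epochs, a safety cap added for this sketch, is reached first).
    """
    w = [0, 0]
    b = 0
    for epoch in range(1, max_epochs + 1):
        changed = False
        for x, t in patterns:
            net = sum(xi * wi for xi, wi in zip(x, w)) + b
            if activation(net, theta) != t:
                # Update rule: w_i += alpha * t * x_i and b += alpha * t.
                w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
                b += alpha * t
                changed = True
        if not changed:
            return w, b, epoch
    return w, b, None


# Logical AND with binary inputs and bipolar targets. The presentation
# order is an assumption of this sketch (see the paragraph above).
and_patterns = [((1, 1), 1), ((1, 0), -1), ((0, 1), -1), ((0, 0), -1)]
w, b, epoch = train_perceptron(and_patterns)
print(w, b, epoch)

# Testing: classify every training pattern with the learned model.
for x, t in and_patterns:
    net = sum(xi * wi for xi, wi in zip(x, w)) + b
    print(x, net, activation(net, 0.2), t)
```

The final loop reuses the training patterns as test inputs, so each printed row shows the input, the net value, the predicted class, and the target side by side.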
EXAMPLE: TRAINING
Worksheet for any epoch. With $\alpha = 1$, the weight changes are $\Delta w = x_i t$ and $\Delta b = t$, and the new weights are $w(\text{new}) = w(\text{old}) + \Delta w$ and $b(\text{new}) = b(\text{old}) + \Delta b$.

Input      | Target | Output       | Weight change | New weights
x1  x2  1  | t      | n   a = f(n) | Δw1  Δw2  Δb  | w1  w2  b
Epoch *
0   0   1  | -1     |              |               |
0   1   1  | -1     |              |               |
1   0   1  | -1     |              |               |
1   1   1  |  1     |              |               |

Formulas used in each row:
$net = \sum_i x_i w_i + b$
$f(net) = \begin{cases} 1 & \text{if } net > \theta \\ 0 & \text{if } -\theta \le net \le \theta \\ -1 & \text{if } net < -\theta \end{cases}$
$\Delta w = \alpha t x_i$
$\Delta b = \alpha t$
$w_i(\text{new}) = w_i(\text{old}) + \Delta w$
$b(\text{new}) = b(\text{old}) + \Delta b$

The iteration is stopped at epoch 10 because $f(net)$ already equals the target for every pattern.

EXAMPLE: TESTING
Model obtained from training:
$net = 2x_1 + 3x_2 - 4$

x1  x2   t
0   0   -1
0   1   -1
1   0   -1
1   1    1

If we test with $x_1 = 1$ and $x_2 = 0$ (as an example, using the same data as the training data), then $net = 2 \cdot 1 + 3 \cdot 0 - 4 = -2$ and $y = \mathrm{sign}(net) = \mathrm{sign}(-2) = -1$.

SUMMARY
An artificial neural network is a computational model inspired by the way biological neural networks in the human brain process information.

REFERENCES
S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach (4th Edition), Prentice Hall International, 2020. Chapter 19: Learning from Examples.
J. Han and M. Kamber, Data Mining: Concepts and Techniques (3rd Edition), Elsevier, 2012.
Neural Networks and Deep Learning, http://neuralnetworksanddeeplearning.com/index.html (accessed November 27, 2018).
CS231n Convolutional Neural Networks for Visual Recognition, http://cs231n.github.io/neural-networks-1/ (accessed November 27, 2018).
A Basic Introduction To Neural Networks, http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html (accessed November 27, 2018).
A Gentle Introduction To Neural Networks Series - Part 1, https://towardsdatascience.com/a-gentle-introduction-to-neural-networks-series-part-1-2b90b87795bc (accessed November 27, 2018).