Artificial Intelligence for Big Data, Neural Networks & Deep Learning

Summary

This presentation covers artificial intelligence for big data, focusing on neural networks and deep learning. It details the building blocks of deep learning, including gradient descent, backpropagation, non-linearities, and dropout, discusses convolutional neural networks (CNNs) and their applications, and works through a small backpropagation example. The presentation is appropriate for a student audience learning about these topics.

Full Transcript

Artificial Intelligence for Big Data: Neural Networks & Deep Learning for Big Data
Dr. Feras Al-Obeidat

Deep Neural Networks

To use ANNs for real-world problems, which involve hundreds or thousands of input variables, call for more complex models, and require the models to store more information, we need larger structures realized with large numbers of hidden layers. These networks are called deep neural networks, and using deep neural networks to model real data is termed deep learning. With the addition of nodes and their interconnections, deep neural networks can model unstructured input such as audio, video, and images.

The Building Blocks of Deep Learning

Gradient descent
Backpropagation
Non-linearities
Dropout

Deep Learning Basics and the Building Blocks

The more input data we have, along with its variations, the more accurate the generated models can be, which in turn requires more storage and computation power. Since that computation power and storage are available on big data analytics platforms, it is possible to experiment with large neural networks that have hundreds or thousands of nodes in the input layer and many hidden layers. These ANNs are called deep neural networks. Deep neural networks perform better in terms of accuracy and reliability as the amount of data increases. The use of these multi-layered neural networks for hypothesis generation is termed deep learning.

ANN versus Deep Neural Network

(Figure comparing a conventional ANN with a deep neural network.)

Cost Function of Deep Learning

Deep neural networks can train the connection weights without sacrificing intelligence: the model accurately represents the historical facts in the data while keeping a level of generalization that suits most mission-critical applications. The objective of all the learning methods is to minimize the cost function; the lower the cost function value, the more accurate the model.

Cost Function: Deep Neural Network

The quadratic cost is always positive, since it takes the square of the difference between the desired and actual outputs:

C(w, b) = 1/(2n) * Σx ‖y(x) - a‖²

where:
w: the collection of all the weights in the network
b: all the biases
n: the training data size (number of samples)
a: the vector of outputs from the network when x is the input
y(x): the desired output for input x

Gradient-Based Learning

Gradient descent, as applied to a deep neural network, essentially means choosing the weights and biases on the neuron connections so as to reduce the value of the cost function. The network is initialized to a random state (random weights and bias values) and the initial cost is calculated. The weights are then adjusted using the derivative of the cost with respect to each weight in the network.

Cost Function and the Gradient

(Figure: a cost curve with a dotted tangent line at one point.) The cost function represents the aggregate difference between the expected and the actual output of the deep neural network; the slope of the tangent at the current point indicates the direction in which the weights should be moved to reduce the cost.
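To make gradient-based learning concrete, here is a minimal Python sketch (illustrative, not taken from the slides) that minimizes the quadratic cost above for a single weight and bias on a handful of toy points; the data, learning rate, and number of iterations are assumptions chosen only for the example.

```python
# Minimal gradient-descent sketch for the quadratic cost (illustrative only).
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])   # toy inputs (assumed)
y = np.array([1.0, 3.0, 5.0, 7.0])   # toy targets, generated from y = 2x + 1

w, b = np.random.randn(), np.random.randn()   # random initial state
lr, n = 0.05, len(x)                          # learning rate (assumed) and sample count

for step in range(2000):
    a = w * x + b                             # forward pass: network output
    cost = np.sum((y - a) ** 2) / (2 * n)     # C(w, b) = 1/(2n) * sum of squared differences
    dC_dw = np.sum((a - y) * x) / n           # derivative of the cost w.r.t. w
    dC_db = np.sum(a - y) / n                 # derivative of the cost w.r.t. b
    w -= lr * dC_dw                           # step against the gradient
    b -= lr * dC_db

print(round(w, 2), round(b, 2))               # converges towards w = 2, b = 1
```

The same loop structure carries over to a deep network; the only difference is that the derivatives for every weight and bias are obtained with backpropagation, covered later in the deck.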
Non-Linearities

In many problems the feature space is not linearly separable: the classes cannot be divided by a straight line, and we need some kind of non-linear (for example, quadratic) equation to derive the decision boundary. Most real-world scenarios are of this second, non-linearly separable type.

Non-Linear Activations

Deep neural networks receive data at the input layer, process and mathematically map the data within the hidden layers, and generate output at the last layer. For the network to understand the feature space and model it accurately for prediction, we need some type of non-linear activation function. If the activation functions on all the neurons were linear, there would be no point in having a deep network, because a stack of linear layers collapses into a single linear transformation. To model complex feature spaces, we therefore require non-linearities in the nodes' activation functions. For more complex inputs, such as images and audio signals, the deep neural network models the feature space through the weights and biases on the connections, and the non-linear activation determines whether a neuron fires, and how strongly, for a given input signal. The typical non-linear functions deployed in deep neural networks are:

Sigmoid function: an S-shaped function whose output ranges between 0 and 1.

Tanh function: a variation of the sigmoid whose output ranges from -1 to 1.

Rectified linear unit (ReLU) function: outputs 0 for any negative value of x and equals x when x is positive.
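These three activations are one-liners; the following NumPy sketch (illustrative, not from the slides) simply evaluates them on a few sample inputs so their output ranges are easy to see.

```python
# Common non-linear activation functions (illustrative sketch).
import numpy as np

def sigmoid(x):
    # S-shaped curve, output in the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # rescaled sigmoid, output in the range (-1, 1)
    return np.tanh(x)

def relu(x):
    # 0 for negative inputs, x itself for positive inputs
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(np.round(sigmoid(x), 3))   # [0.119 0.378 0.5   0.622 0.881]
print(np.round(tanh(x), 3))      # [-0.964 -0.462  0.     0.462  0.964]
print(relu(x))                   # [0.  0.  0.  0.5 2. ]
```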
Dropout in Deep Learning

Dropout is a popular regularization technique used to prevent overfitting. When a deep neural network memorizes the training data, it does not generalize well enough to produce accurate results on new test data; this is termed overfitting. Dropout is primarily used to prevent overfitting and is simple to implement: during the training phase, the algorithm randomly selects nodes to be dropped, setting their activation value to 0. For example, with a dropout rate of 0.5, in each epoch every node has a 50% chance of not participating in the learning process.

(Figure: the same network shown with and without the dropped-out nodes.)

Dropout and Inverted Dropout

By dropping out nodes, an implicit penalty is added to the loss function. This prevents the model from memorizing the training data by learning interdependencies between particular neurons, both in their activation values and in the corresponding connection weights. Because the activation of a dropped-out unit is 0, the total signal reaching the subsequent nodes is reduced; to compensate, the activations of the participating nodes are scaled by 1 / (1 - drop_out_rate) during training (a factor of 2 for a rate of 0.5 in our case), so that the expected activation seen by the next layer stays the same. This technique is called inverted dropout.

Convolutional Neural Network (CNN)

A convolutional neural network (CNN) is a deep learning model commonly used for analyzing visual data such as images and videos. CNNs are particularly effective at recognizing patterns and features in image data because they are designed to automatically and adaptively learn spatial features through filters (kernels).

Components of a CNN

1. Convolutional layers: these layers apply convolution operations to the input, typically an image. A convolution operation slides a filter (kernel) over the input image to produce feature maps that capture essential features such as edges, textures, or shapes.
2. Pooling layers: pooling reduces the dimensionality of the feature maps, helping to minimize the computational cost and reduce the chance of overfitting.
3. Fully connected layers: after several convolutional and pooling layers, the high-level reasoning in the network is done by fully connected layers, in which each neuron is connected to every neuron in the previous layer.
4. Activation functions: activation functions such as ReLU (rectified linear unit) introduce non-linearity into the network, allowing it to learn more complex patterns.

CNN Applications

CNNs are widely used in applications such as:

Image classification: assigning a label to an entire image (e.g., "cat" or "lion").
Object detection: identifying and locating multiple objects within an image.
Segmentation: classifying each pixel in an image into categories.
Facial recognition: recognizing or verifying faces in images or videos.

Compared to traditional neural networks, the architecture and design of CNNs make them highly effective for image-related tasks.
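As a rough illustration of the convolution and pooling operations listed above, the following NumPy sketch (my own illustration, not code from the slides) slides a 3x3 vertical-edge kernel over a small grayscale image and then applies 2x2 max pooling to the resulting feature map; in a trained CNN the kernel values would be learned rather than hand-picked.

```python
# Minimal convolution and max-pooling sketch (illustrative only).
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image (no padding, stride 1) to build a feature map.
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    # Keep the maximum of each non-overlapping size x size block.
    oh, ow = fmap.shape[0] // size, fmap.shape[1] // size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = fmap[i * size:(i + 1) * size, j * size:(j + 1) * size].max()
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0                        # left half dark, right half bright
kernel = np.array([[-1.0, 0.0, 1.0],      # simple vertical-edge detector
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d(image, kernel)       # strong response along the vertical edge
pooled = max_pool(feature_map)            # downsampled summary of the feature map
print(feature_map.shape, pooled.shape)    # (4, 4) (2, 2)
```

A full CNN stacks many such filters per layer, applies a non-linear activation such as ReLU to each feature map, and feeds the final pooled maps into fully connected layers for classification.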
Backpropagation

So far we have traversed the ANN in one direction only, which is termed forward propagation. The ultimate goal in training the ANN is to derive the weights on each of the connections between the nodes so as to minimize the prediction error; the algorithm used to do this is backpropagation. The fundamental idea is that once we know the difference between the network's prediction and the actual value of the target variable for a training example, the error can be calculated. The error at the final output layer is a function of the activation values of the nodes in the previous hidden layer, and each hidden node contributes to the output error to a different degree. The idea is to fine-tune the weights on the connections so as to minimize the final output error; this effectively defines how the hidden units should behave given the input and how the output is intended to look. The algorithm receives the training inputs one at a time: it feeds forward to obtain a prediction by multiplying by the weights and applying the activation functions, computes the prediction error against the true label, and pushes the error back through the network in the reverse direction.

(Figure: backpropagation model.)

Backpropagation Steps

1. The inputs X arrive through the preconnected path.
2. The input is modeled using real weights W; the weights are usually selected randomly.
3. Calculate the output of every neuron, from the input layer through the hidden layers to the output layer.
4. Calculate the error in the outputs: Error = actual output - desired output.
5. Travel back from the output layer towards the hidden layers and adjust the weights so that the error decreases.
6. Keep repeating the process until the desired output is achieved.

Example: NN Backpropagation

The worked example uses a small 2-2-1 network: inputs x1 and x2 feed hidden nodes 3 and 4, which feed output node 5. The initial weights are w13 = 0.3, w23 = 0.2, w14 = 0.5, w24 = 0.5, w35 = 0.6, w45 = 0.4; the target output is 0.5 and the learning rate is 1. The inputs consistent with the slide arithmetic are x1 = 0.4 and x2 = 0.7, and every node uses the sigmoid activation 1/(1 + exp(-z)).

Forward pass:

H3 = x1*w13 + x2*w23 = 0.26, so oH3 = 1/(1 + exp(-H3)) = 0.5646363
H4 = x1*w14 + x2*w24 = 0.55, so oH4 = 1/(1 + exp(-H4)) = 0.6341356
M5 = oH3*w35 + oH4*w45 = 0.592436, so the network output is oM5 = 1/(1 + exp(-M5)) = 0.6439239

Error = target - oM5 = 0.5 - 0.6439239 = -0.1439239

Since the output has not reached the target, the weights need to be updated. First calculate the deltas (error rates), starting with the output layer:

delta5 = oM5*(1 - oM5)*(target - oM5) = -0.03299972
delta3 = oH3*(1 - oH3)*w35*delta5 = -0.004867237
delta4 = oH4*(1 - oH4)*w45*delta5 = -0.003062475

Now update the weights accordingly, hidden-to-output first:

delta_w35 = learningRate*delta5*oH3 = -0.01863284, so w35_new = w35 + delta_w35 = 0.5813672
delta_w45 = learningRate*delta5*oH4 = -0.0209263, so w45_new = w45 + delta_w45 = 0.3790737

and then input-to-hidden:

delta_w13 = learningRate*delta3*x1 = -0.001946895, so w13_new = w13 + delta_w13 = 0.2980531
delta_w23 = learningRate*delta3*x2 = -0.003407066, so w23_new = w23 + delta_w23 = 0.1965929
delta_w14 = learningRate*delta4*x1 = -0.00122499, so w14_new = w14 + delta_w14 = 0.498775
delta_w24 = learningRate*delta4*x2 = -0.002143732, so w24_new = w24 + delta_w24 = 0.4978563

Summary of the weight updates:

weight   original   updated
w13      0.3        0.2980531
w23      0.2        0.1965929
w14      0.5        0.498775
w24      0.5        0.4978563
w35      0.6        0.5813672
w45      0.4        0.3790737

Running the forward pass again with the updated weights gives a new network output of oM5 = 0.6383057, which is closer to the target of 0.5; repeating the process reduces the error further.
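The worked example can be reproduced end to end. The short Python sketch below is my reconstruction of the slide calculations, assuming the inferred inputs x1 = 0.4 and x2 = 0.7 and the sigmoid activation used throughout; it prints the same updated output as the slides.

```python
# Reproduces the 2-2-1 backpropagation example above (reconstruction, not original slide code).
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

x1, x2 = 0.4, 0.7                        # inputs (inferred from the slide arithmetic)
w13, w23, w14, w24 = 0.3, 0.2, 0.5, 0.5  # input-to-hidden weights
w35, w45 = 0.6, 0.4                      # hidden-to-output weights
target, lr = 0.5, 1.0

# Forward pass
oH3 = sigmoid(x1 * w13 + x2 * w23)       # 0.5646363
oH4 = sigmoid(x1 * w14 + x2 * w24)       # 0.6341356
oM5 = sigmoid(oH3 * w35 + oH4 * w45)     # 0.6439239 (error = -0.1439239)

# Deltas (local error signals)
delta5 = oM5 * (1 - oM5) * (target - oM5)    # -0.03299972
delta3 = oH3 * (1 - oH3) * w35 * delta5      # -0.004867237
delta4 = oH4 * (1 - oH4) * w45 * delta5      # -0.003062475

# Weight updates
w35 += lr * delta5 * oH3                 # 0.5813672
w45 += lr * delta5 * oH4                 # 0.3790737
w13 += lr * delta3 * x1                  # 0.2980531
w23 += lr * delta3 * x2                  # 0.1965929
w14 += lr * delta4 * x1                  # 0.498775
w24 += lr * delta4 * x2                  # 0.4978563

# Forward pass with the updated weights
oM5_new = sigmoid(sigmoid(x1 * w13 + x2 * w23) * w35 + sigmoid(x1 * w14 + x2 * w24) * w45)
print(round(oM5_new, 7))                 # approximately 0.6383057, closer to the 0.5 target
```

Wrapping these steps in a loop over many training examples, with the deltas computed layer by layer, is exactly the training procedure described in the backpropagation steps above.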
