Artificial Neural Networks Lecture PDF
Document Details
Taif University
Dr. Maha Jarallah
Summary
This lecture provides an introduction to artificial neural networks (ANNs), delving into the fundamental concepts of machine learning and their relationship to biological neural networks. It covers topics like different types of learning, the history of ANN development, and practical applications within various fields like medical diagnosis and data mining.
Full Transcript
Artificial Neural Networks
College of Computers and Information Technology, Taif University
Dr. Maha Jarallah
MAHA.J@TU.EDU.SA

Topics
▪ Machine Learning Basics
▪ Analogy between Biological Neural Networks (BNN) and Artificial Neural Networks (ANN)
▪ History of ANN
▪ Applications of ANN

Machine Learning
Neural networks are a branch of machine learning, and most neural networks in use are a form of deep learning.

Machine learning is a tool for transforming data into knowledge. Machine learning techniques are used to automatically find the valuable underlying patterns within complex data. Interest in machine learning has grown because of:
✓ the high rate of data/information creation by applications,
✓ the expansion of computation power over the years, and
✓ the development of better algorithms.
Machine learning can be used to:
❖ predict future events
❖ make all sorts of complex decisions.

Machine Learning versus Traditional Programming (slide figure)

Steps of Machine Learning
Step 1: Data Gathering
Step 2: Preparing the Data
Step 3: Selecting a Model
Step 4: Training the Model on the Data
Step 5: Evaluation
Step 6: Hyperparameter Tuning
Step 7: Prediction

Step 1: Data Gathering
To develop a machine learning model, the first step is to gather relevant data that can be used to train the model. This step is very important because the quality and quantity of the data we gather directly determine how well or badly the model will work.

Step 2: Preparing the Data
In this step, we wrangle the data collected in Step 1 and prepare it for training. We can clean the data by removing duplicates, correcting errors, dealing with missing values, converting data types, and so on.
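The cleaning operations just listed can be illustrated with a small sketch. Everything here is an assumption made for illustration: the record layout (`name`, `age`), the helper name `clean_records`, and the choice to fill missing ages with the mean of the known ages. It also assumes at least one age is known. It is one possible way to clean data, not a prescribed procedure.

```python
def clean_records(records):
    """Clean a list of {"name", "age"} dicts and return a new list.

    Hypothetical helper for illustration only.
    """
    # 1. Correct formatting errors and convert data types.
    normalized = []
    for r in records:
        name = r["name"].strip().title()
        age = None if r["age"] in (None, "") else float(r["age"])
        normalized.append((name, age))

    # 2. Remove duplicates while keeping the original order.
    unique = list(dict.fromkeys(normalized))

    # 3. Deal with missing values: fill them with the mean of the known ages.
    known = [age for _, age in unique if age is not None]
    mean_age = sum(known) / len(known)
    return [{"name": n, "age": mean_age if a is None else a} for n, a in unique]


raw = [
    {"name": " alice ", "age": "30"},  # stray spaces, age stored as text
    {"name": "Alice", "age": 30},      # duplicate of the first record once cleaned
    {"name": "bob", "age": None},      # missing value
    {"name": "Carol", "age": "50"},
]
cleaned = clean_records(raw)
```

In practice a data-frame library is often used for this step; plain Python is used here only to keep the sketch self-contained.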
We can also visualize the data, as this helps us see whether there are any relevant relationships between the different attributes and how we can take advantage of them.

The data set is then broken into two parts. The larger part (~80%) is used for training the model, while the smaller part (~20%) is used for evaluating (i.e., testing) the trained model's performance. The data can also be divided into three parts (training, development, evaluation). Using the same data for both training and evaluation would not give a fair assessment of the model's performance in real-world scenarios.

Step 3: Selecting a Model
There are various existing models developed by data scientists that can be used for different purposes. Different classes of models are good at modeling the underlying patterns of different types of datasets, and they are designed with different goals in mind. For instance, some models are better suited to dealing with text, while others may be better equipped to handle images.

Step 4: Training the Model on the Data
The bulk of the process is done at this stage. Training requires patience and experimentation.

The training process typically involves initializing the model's parameters, say X and Y, with random values.
The model then makes predictions using these initial values, and the predicted outputs are compared with the actual target values. The difference between the predictions and the actual values, often referred to as the error or loss, is calculated. The model's parameters are then adjusted to minimize this error, using an optimization algorithm such as gradient descent. This process of predicting, comparing, and updating the parameters is repeated many times, with each cycle referred to as a training step or iteration.

Step 5: Evaluation
With the model trained, it needs to be tested to see whether it would operate well in real-world situations. Evaluation puts the model in a scenario where it encounters data that were not part of its training.

Step 6: Hyperparameter Tuning
After the evaluation step, further improvements in the model's performance can be achieved by tuning hyperparameters. Hyperparameters are settings that govern the training process, such as the learning rate and the number of iterations. By carefully adjusting these hyperparameters, we can optimize the model's performance based on the feedback from previous training and evaluation steps.

Step 7: Prediction
This is the stage where we consider the model to be ready for practical applications. We can finally use our model to predict the outcome we are interested in.
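The steps above can be sketched end to end on a toy problem. This is a minimal illustration under stated assumptions, not a reference implementation: the data are synthetic points on the line y = 3x + 2, the model is a straight line with two parameters (here called `w` and `b`), the loss is the mean squared error, and the hyperparameter values (`learning_rate`, `iterations`) were picked by hand.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Steps 1-2: gather and prepare data (synthetic, assumed for illustration).
data = [(x / 10, 3 * (x / 10) + 2) for x in range(50)]
random.shuffle(data)

# Split: ~80% for training, ~20% for evaluation.
split = len(data) * 80 // 100
train, test = data[:split], data[split:]


def mean_squared_error(w, b, points):
    """Average squared difference between predictions and targets."""
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


# Step 6 settings: hyperparameters that govern the training process.
learning_rate = 0.05
iterations = 2000

# Step 4: initialise the parameters with random values, then repeat
# predict -> compare -> update (gradient descent on the training set).
w, b = random.random(), random.random()
for _ in range(iterations):
    grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
    grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

# Step 5: evaluate on data the model never saw during training.
test_error = mean_squared_error(w, b, test)

# Step 7: use the trained model on a new input.
prediction = w * 6.0 + b
```

Because the toy data contain no noise, the learned `w` and `b` should end up very close to 3 and 2. With real data, the test error is the quantity to watch when adjusting the hyperparameters.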
Types of Machine Learning

Supervised Learning
▪ Both the inputs and the desired outputs are provided.
▪ The network processes the inputs and compares its resulting outputs against the desired outputs.
▪ If the resulting output differs from the desired output, the generated error signal is used to adjust the weights.
▪ The error minimization process is supervised by a teacher.
▪ Supervised training is used to perform non-linear mapping in pattern classification nets, pattern association nets, and multilayer nets.

Unsupervised Learning
▪ The network is provided with inputs but not with desired outputs.
▪ The system itself must then decide what features it will use to group the input data.
▪ The input vectors of similar types are grouped together without a teacher.
▪ This is often referred to as self-organization.

Reinforcement Learning
Reinforcement learning (RL) mimics how humans learn: by interacting with the environment, repeating actions that earn a higher reward, and avoiding risky moves that lead to a low or negative reward. Unlike other ML paradigms, such as supervised and unsupervised learning, RL works in a trial-and-error fashion by interacting with its environment.

Supervised Learning vs. Unsupervised Learning (figure)
Image Credit: http://oliviaklose.com/machine-learning-2-supervised-versus-unsupervised-learning/

Types of learning:
▪ Supervised Learning
  ◦ Classification (Binary, Multiclass)
  ◦ Regression
▪ Unsupervised Learning
  ◦ Clustering
  ◦ Association

Regression (prediction) vs. Classification (figure)
Classification can be binary or multiclass.

Text Document Classification
Classifying documents or articles into categories such as Sports, Politics, Technology, Entertainment, etc.
Example: Given a news article, the model must classify it into one of several categories like "Sports," "Health," "Technology," "Business," etc.
Image source: https://krakensystems.co/blog/2018/text-classification

Clustering
Clustering is an important concept in unsupervised learning. It is used to find the hidden structures or patterns in uncategorized data. The clustering algorithm processes the uncategorized data and divides it into different clusters (groups) such that objects with many similarities remain in the same group and have few or no similarities with the objects of another group.

Association
Association rule learning is an unsupervised learning method. It is used to find relationships between the data items in a large dataset. It determines the sets of items that occur together in the dataset.

Neural Networks
Introduction to Biological and Artificial Neural Networks

What is a Neural Network?
The term ‘Neural’ is derived from the human ‘neuron’, which is the nervous system’s basic functional unit. These nerve cells are present in the brain and other parts of the human body.

Neural networks as a black box:
▪ Biological Neural Networks: the input is a physical condition (e.g. hot water poured on a hand); the output is a physical reaction (e.g. movement of the hand).
▪ Artificial Neural Networks: the input is an input pattern (e.g. binary A, B; Iris flower data); the output is a learned pattern (e.g. A AND B; learned classes of Iris flowers).

How the brain works
Our brain can be considered a highly complex, non-linear and parallel information-processing system. Knowledge is stored and processed in a neural network simultaneously throughout the whole network, rather than at specific locations. Learning is a fundamental and essential characteristic of biological neural networks. The ease with which they can learn led to attempts to emulate a biological neural network in a computer.

The brain consists of a densely interconnected set of nerve cells, or basic information-processing units, called neurons. The human brain incorporates nearly 10 billion neurons and 60 trillion connections (synapses) between them.
By using multiple neurons simultaneously, the brain can perform its functions much faster than the fastest computers today. Each neuron has a very simple structure, but an army of such elements constitutes a tremendous processing power.

Structure of Neurons in Brain
The typical nerve cell of the human brain comprises four parts:
▪ Dendrite: receives signals from other neurons.
▪ Soma (cell body): sums all the incoming signals to generate input.
▪ Axon: when the neuron fires, the signal travels down the axon to the other neurons.
▪ Synapses: the points of interconnection of one neuron with other neurons. The amount of signal transmitted depends upon the strength (synaptic weights) of the connections.
Image Source: cs231n.github.io

Artificial Neural Network
An artificial neural network is, in general, a biologically inspired network of artificial neurons configured to perform specific tasks. It consists of a number of very simple processors or nodes, also called neurons, which are analogous to the biological neurons in the brain. The neurons are connected by weighted links passing signals from one neuron to another.

Similarity of ANN with Biological Neural Network
An ANN is similar to biological neural systems in the following two ways:
◦ An ANN acquires knowledge through learning.
◦ An ANN’s knowledge is stored within inter-neuron connection strengths known as synaptic weights.

Analogy of ANN With Biological Neural Network
Biological Neural Network (BNN) : Artificial Neural Network (ANN)
Soma : Neuron (or Node)
Dendrite : Input
Axon : Output
Synapse : Weight

Definition of ANN
▪ An artificial neural network (ANN) is a massively parallel distributed processor that has a natural tendency for storing experiential knowledge and making it available for use.
◦ Knowledge is acquired by the network through a learning (training) process.
◦ The strength of the interconnections between neurons is implemented by means of the synaptic weights, which are used to store the knowledge.
▪ The learning process is a procedure for adapting the weights with a learning algorithm in order to capture the knowledge. More mathematically, the aim of the learning process is to map a given relation between the inputs and the output(s) of the network.

Architecture of a Typical ANN (figure): input signals enter the input layer, pass through the middle layer, and leave the output layer as output signals.

History of ANNs
Pioneering work
(a) The first neuron model: W. S. McCulloch and W. Pitts (1943), “A logical calculus of the ideas immanent in nervous activity,” Bulletin of Mathematical Biophysics, vol. 5, pp. 115-133. This paper marks the birth of neural networks and artificial intelligence. Neurons were modelled by Heaviside functions. A network with a sufficient number of such simple units, with synaptic connections set properly and operating synchronously, is in principle able to compute any computable function.
(b) The first learning rule: D. O. Hebb (1949), The Organization of Behavior: A Neuropsychological Theory, New York: Wiley. The effectiveness of a variable synapse between two neurons is increased by the repeated activation of one neuron by the other across that synapse.

1950s and 1960s: Early bright years for ANN research
▪ F. Rosenblatt (1958), “The perceptron: a probabilistic model for information storage and organization in the brain,” Psychological Review, vol. 65, pp. 386-408. (perceptron and delta learning rule)
▪ B. Widrow and M. E. Hoff, Jr. (1960), “Adaptive switching circuits,” IRE WESCON Convention Record, pp. 96-104. (Adaline and least mean squares learning rule)

ANN Areas of Applications
▪ System identification and control (e.g. vehicle control, process control)
▪ Game-playing and decision making (e.g. backgammon, chess, racing)
▪ Pattern and sequence recognition (e.g. radar systems, face identification, object recognition)
▪ Sequence recognition (e.g. gesture, speech, handwritten text recognition)
▪ Medical diagnosis (e.g. breast cancer detection)
▪ Financial applications (e.g. price prediction, or prediction of the best trading actions based on previous performance)
▪ Data mining (or knowledge discovery in databases, "KDD") (e.g. e-mail spam filtering, classification, clustering, novelty prediction)
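To make the earlier description of an artificial neuron concrete (weighted inputs, a Heaviside activation, and learning as the adaptation of weights), here is a minimal sketch of a single neuron that learns the A AND B pattern mentioned in the black-box comparison. It uses the classic perceptron update rule as one simple example of a learning algorithm. The function names, the convention that the step function returns 1 at exactly zero, the zero initial weights, the learning rate of 1, and the cap of 20 passes over the data are all assumptions chosen for illustration.

```python
def heaviside(z):
    """Step activation; returning 1 at z == 0 is a convention assumed here."""
    return 1 if z >= 0 else 0


def neuron_output(weights, bias, inputs):
    """Sum the weighted inputs (the soma's role), then apply the step function."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return heaviside(total)


# Training data for A AND B: (inputs, desired output).
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0, 0]    # synaptic weights, where the knowledge ends up stored
bias = 0
learning_rate = 1   # hyperparameter (assumed value)

for epoch in range(20):  # assumed upper bound on passes over the data
    errors = 0
    for inputs, target in samples:
        error = target - neuron_output(weights, bias, inputs)
        if error != 0:
            errors += 1
            # Perceptron rule: nudge each weight in the direction that reduces the error.
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    if errors == 0:  # every sample classified correctly, so stop
        break

learned = [neuron_output(weights, bias, inputs) for inputs, _ in samples]
```

After training, `learned` holds the neuron's output for each of the four input pairs. A single neuron of this kind can only learn linearly separable patterns such as AND, which is one reason multilayer networks are used for harder problems.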