Grade 10 AI Project Cycle Lesson 2 PDF
Summary
This document details the AI Project Cycle for Grade 10. It discusses rule-based and learning-based approaches to AI, with their advantages and disadvantages, using examples such as deciding whether an elephant may be spotted on safari and distinguishing between images of apples and bananas. It also outlines the three categories of data: training, validating, and testing data.
Full Transcript
AI PROJECT CYCLE GRADE 10

Rule Based Approach
Refers to AI modelling where the rules are defined by the developer. The machine follows the rules or instructions given by the developer and performs its task accordingly. For example, we have a dataset which tells us the conditions on the basis of which we can decide whether an elephant may be spotted or not while on safari. The parameters are: Outlook, Temperature, Humidity and Wind. Now, let's take various combinations of these parameters and see in which cases the elephant may be spotted and in which it may not. After looking through all the cases, we feed this data into the machine along with the rules which tell the machine all the possibilities.

The machine trains on this data and is now ready to be tested. While testing, we tell the machine that Outlook = Overcast, Temperature = Normal, Humidity = Normal and Wind = Weak. On the basis of this testing data, the machine will be able to tell whether the elephant is likely to be spotted and will display its prediction to us. This is known as a rule-based approach because we fed the data along with rules to the machine, and the machine, after being trained on them, is able to predict answers accordingly.

A drawback of this approach is that the learning is static. Once trained, the machine does not take into consideration any changes made to the original training dataset. That is, if you test the machine on data which differs from the rules and data it was trained on, the machine will fail and will not learn from its mistake. Once trained, the model cannot improve itself on the basis of feedback. Machine learning is therefore introduced as an extension of this: a learning-based model adapts to changes in data and rules and follows the updated path, while a rule-based model only does what it was taught once.
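The elephant-spotting idea above can be sketched as a small rule-based program. The rules and parameter values here are invented for illustration; the point is that every answer must be written out by the developer in advance, so the model stays static.

```python
# A minimal sketch of a rule-based model (hypothetical rules):
# the developer defines every rule by hand, and the machine
# simply looks the answer up -- it never learns or adapts.

# Each rule maps one combination of parameters to a prediction.
RULES = {
    ("Overcast", "Normal", "Normal", "Weak"):  "Spotted",
    ("Sunny",    "Hot",    "High",   "Strong"): "Not Spotted",
    ("Rainy",    "Mild",   "High",   "Weak"):  "Spotted",
}

def predict(outlook, temperature, humidity, wind):
    """Return the rule's answer, or fail if no rule covers the case."""
    key = (outlook, temperature, humidity, wind)
    if key not in RULES:
        # Static learning: an unseen combination simply breaks the model.
        return "Unknown - no rule defined"
    return RULES[key]

print(predict("Overcast", "Normal", "Normal", "Weak"))  # Spotted
print(predict("Sunny", "Cold", "Low", "Weak"))          # Unknown - no rule defined
```

Notice that the second call fails: the rule-based model cannot handle a case the developer never wrote a rule for, which is exactly the drawback described above.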
Learning Based Approach
Refers to AI modelling where the machine learns by itself. Under the learning-based approach, the AI model is trained on the data fed to it and is then able to design a model which is adaptive to changes in the data. That is, if the model is trained with X type of data and the machine designs the algorithm around it, the model will modify itself according to the changes which occur in the data, so that exceptions are handled.

For example, suppose you have a dataset comprising 100 images each of apples and bananas, in various shapes and sizes. These images are labelled as either apple or banana, so that all apple images carry the label 'apple' and all banana images carry the label 'banana'. The AI model is trained on this dataset and programmed in such a way that it can distinguish between an apple image and a banana image according to their features, and can predict the label of any image fed to it. After training, the machine is fed with testing data. The testing data might not contain images similar to those on which the model was trained. So, the model falls back on the features it learned during training and accordingly predicts whether the image is of an apple or a banana. In this way, the machine learns by itself, adapting to the new data flowing in. This is the machine learning approach, which introduces dynamic behaviour into the model.

The learning-based approach can further be divided into three parts.

Supervised and Unsupervised Learning in Machine Learning: https://www.youtube.com/watch?v=kE5QZ8G_78c&ab_channel=Simplilearn

Supervised Learning
In a supervised learning model, the dataset fed to the machine is labelled. In other words, the dataset is known to the person training the machine; only then is he/she able to label the data.
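The apple/banana example can be sketched as a tiny supervised-learning program. The feature values below are invented stand-ins for what would really be extracted from images; the sketch uses a simple nearest-neighbour rule, one of the simplest supervised methods, to show how labelled training data drives predictions on new, unseen inputs.

```python
# A toy supervised-learning sketch (invented feature values):
# every training example carries a label, and the model predicts
# the label of new data from the features it saw during training.

# Hypothetical features per fruit: (roundness 0-1, length in cm)
training_data = [
    ((0.90, 7.0),  "apple"),
    ((0.85, 8.0),  "apple"),
    ((0.30, 18.0), "banana"),
    ((0.25, 20.0), "banana"),
]

def predict(features):
    """1-nearest-neighbour: copy the label of the closest training example."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    closest = min(training_data, key=lambda item: distance(item[0], features))
    return closest[1]

# New, unseen inputs -- not identical to any training example:
print(predict((0.80, 7.5)))   # apple
print(predict((0.20, 19.0)))  # banana
```

The test inputs do not match any training example exactly, yet the model still labels them correctly from the features it learned, which is the adaptivity the paragraph above describes.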
A label is some information which can be used as a tag for data. For example, students get grades according to the marks they secure in examinations. These grades are labels which categorise the students according to their marks.

Unsupervised Learning
An unsupervised learning model works on an unlabelled dataset. This means that the data fed to the machine is random, and there is a possibility that the person training the model has no information about it. Unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them. They help the user understand what the data is about and what major features the machine has identified in it. For example, if you have random data of 1000 dog images and you wish to find some pattern in it, you would feed this data into an unsupervised learning model and train the machine on it. After training, the machine would come up with the patterns it was able to identify. The machine might come up with patterns already known to the user, such as colour, or it might come up with something quite unusual, such as the size of the dogs.

Unsupervised learning models can be further divided into two categories.

Dimensionality Reduction
We humans are able to visualise up to 3 dimensions only, but according to many theories and algorithms, various entities exist beyond 3 dimensions. For example, in Natural Language Processing, words are considered to be N-dimensional entities, which means we cannot visualise them as they exist beyond our visualisation ability. Hence, to make sense of them, we need to reduce their dimensions, and this is where a dimensionality reduction algorithm is used.

As we reduce the dimensions of an entity, the information it contains starts getting distorted. For example, if we have a ball in our hand, it is 3-dimensional right now.
But if we take its picture, the data transforms to 2-D, as an image is a 2-dimensional entity. As soon as we drop one dimension, at least 50% of the information is lost, because we no longer know anything about the back of the ball: was it the same colour at the back, or was it just a hemisphere? If we reduce the dimensions further, more and more information is lost. Hence, to reduce the dimensions while still being able to make sense of the data, we use dimensionality reduction.

Evaluation
Once a model has been made and trained, it needs to go through proper testing so that one can calculate the efficiency and performance of the model. Hence, the model is tested with the help of testing data (which was separated from the acquired dataset at the Data Acquisition stage), and the efficiency of the model is calculated on the basis of the parameters mentioned below.

Neural Networks
Neural networks are loosely modelled on how neurons in the human brain behave. The key advantage of neural networks is that they are able to extract data features automatically, without needing input from the programmer. A neural network is essentially a system of organising machine learning algorithms to perform certain tasks. It is a fast and efficient way to solve problems for which the dataset is very large, such as images.

https://www.youtube.com/watch?v=bfmFfD2RIcg&ab_channel=Simplilearn
https://www.youtube.com/watch?v=6M5VXKLf4D4&t=258s&ab_channel=Simplilearn

As seen in the figure, larger neural networks tend to perform better with larger amounts of data, whereas traditional machine learning algorithms stop improving after a certain saturation point.

This is a representation of how neural networks work. A neural network is divided into multiple layers, and each layer is further divided into several blocks called nodes. Each node has its own task to accomplish, which is then passed on to the next layer.
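The layer-and-node structure just described can be sketched in a few lines. All the weight and bias numbers below are made up for illustration; in a real network they would be learned from data during training. The sketch shows data entering at the input layer (no processing), passing through one hidden layer of three nodes, and arriving at a one-node output layer.

```python
# A minimal sketch of data flowing through a neural network's layers
# (weights and biases are invented numbers; real networks learn them).
import math

def sigmoid(x):
    """A common activation function squashing any value into (0, 1)."""
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias, applies sigmoid."""
    return [sigmoid(sum(w * i for w, i in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Input layer: two values in, no processing occurs here.
x = [0.5, 0.8]

# One hidden layer with 3 nodes (each row = one node's weights).
hidden = layer(x,
               weights=[[0.2, 0.4], [0.5, -0.1], [-0.3, 0.7]],
               biases=[0.1, 0.0, 0.2])

# Output layer with 1 node: hands the final result to the user.
output = layer(hidden, weights=[[0.6, -0.4, 0.3]], biases=[0.05])
print(output)  # a single value between 0 and 1
```

A deeper network would simply chain more `layer` calls, matching the point above that the number of hidden layers depends on the complexity of the task.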
The first layer of a neural network is known as the input layer. The job of the input layer is to acquire data and feed it to the neural network. No processing occurs at the input layer. Next come the hidden layers, in which all the processing occurs. Their name reflects the fact that these layers are hidden and not visible to the user.

Each node of these hidden layers runs its own machine learning algorithm on the data received from the input layer. The processed output is then fed to the subsequent hidden layer of the network. There can be multiple hidden layers in a neural network, and their number depends on the complexity of the function for which the network has been configured. The number of nodes in each layer can also vary accordingly. The last hidden layer passes the final processed data to the output layer, which then gives it to the user as the final output. Like the input layer, the output layer does not process the data it receives; it is meant for the user interface.

Some of the features of a neural network are listed below.

The Three Categories of Data
Training data
Validating data
Testing data

Training Data
Text categorization:
Input: The movie was very good → Label: Entertainment
Input: The brain tumor had complications → Label: Medical

Image recognition: the input is an image and the label identifies what it shows (e.g. 'movie', 'brain tumor').

Spam detection: the input is an email or text message which is analysed.

Sentiment Analysis: the input is a sentence or phrase from social media feeds such as Twitter or Facebook, or customer reviews from websites or surveys.
Input: The shirt I bought was a perfect fit and of good quality → Label: Positive
Input: This hospital did not provide the correct diagnosis → Label: Negative

Validating Data
Also called the secondary dataset, this data is used to check whether the newly developed model is correctly identifying the data when making predictions.
This step makes sure that the new model has not become specific to the primary dataset values when making predictions. If that is the case, corrections and tweaks are made to the project. The primary and secondary datasets are also re-run through the model until the desired accuracy is achieved.

Testing Data
All primary and secondary data come with relevant label tags on the data. The testing data is the final dataset, which provides no help in the form of tags to the model produced. This dataset paves the way for the machine model to enter the real world and start making predictions.

Data Warehousing
Data is always collected in bulk from various sources and in various formats. This is called data warehousing.
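The three categories of data described above can be sketched as a simple split of an acquired dataset. The 60/20/20 ratio used here is a common convention, not a fixed rule, and the dataset is a stand-in list of record numbers.

```python
# A sketch of splitting an acquired dataset into the three categories:
# training, validating and testing data (60/20/20 is a common convention).
import random

dataset = list(range(100))   # stand-in for 100 acquired records
random.seed(42)              # fixed seed so the split is repeatable
random.shuffle(dataset)      # shuffle so each split is a fair sample

n = len(dataset)
training   = dataset[:int(n * 0.6)]              # 60%: the model learns from this
validating = dataset[int(n * 0.6):int(n * 0.8)]  # 20%: check/tweak the model
testing    = dataset[int(n * 0.8):]              # 20%: final unseen evaluation

print(len(training), len(validating), len(testing))  # 60 20 20
```

Keeping the three subsets disjoint is the whole point: the testing data must stay unseen during training and validation so that it honestly represents the "real world" the model will face.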