AI PROJECT CYCLE REVISION NOTES
Prepared by: M. S. KumarSwamy, TGT(Maths)

What is the AI Project Cycle?
It is a step-by-step process that a person should follow to develop an AI project to solve a problem. The AI Project Cycle provides us with an appropriate framework which can lead us to our goal. The AI Project Cycle mainly has 5 stages:
1. Problem Scoping
2. Data Acquisition
3. Data Exploration
4. Modelling
5. Evaluation

1. What is Problem Scoping?
Identifying a problem and having a vision to solve it is called Problem Scoping. Scoping a problem is not easy: we need a deep understanding of it so that the picture becomes clearer while we work to solve it. For this, we use the 4Ws Problem Canvas to understand the problem better.

What is the 4Ws Problem Canvas?
The 4Ws Problem Canvas helps in identifying the key elements related to the problem. The 4Ws are:
1. Who
2. What
3. Where
4. Why

1. Who?: This block helps in analysing the people who are affected, directly or indirectly, by the problem. Here we find out who the 'Stakeholders' are: the people who face this problem and would benefit from the solution. Questions to discuss under this block:
1. Who are the stakeholders?
2. What do you know about them?

2. What?: This block helps to determine the nature of the problem: what is the problem, and how do we know that it is a problem? Under this block, we also gather evidence to prove that the problem we have selected actually exists. Questions to discuss under this block:
1. What is the problem?
2. How do you know that it is a problem?

3. Where?: This block helps us look into the situation in which the problem arises, its context, and the locations where it is prominent. The Where Canvas asks:
1. What is the context/situation in which the stakeholders experience the problem?

4. Why?
In the "Why" canvas, we think about the benefits which the stakeholders would get from the solution, and how it would help both them and society. Questions to discuss under this block:
1. What would be of key value to the stakeholders?
2. How would it improve their situation?

2. What is Data Acquisition?
This is the second stage of the AI Project Cycle. As the name suggests, this stage is about acquiring data for the project. Whenever we want an AI project to be able to predict an output, we first need to train it using data. For example, if you want to make an artificially intelligent system which can predict the salary of an employee based on his previous salaries, you would feed the data of his previous salaries into the machine. The previous salary data here is known as the Training Data, while the data set used for the next salary prediction is known as the Testing Data.

Data features refer to the type of data you want to collect. In the above example, data features would be salary amount, increment percentage, increment period, bonus, etc.

There can be various ways to collect the data. Some of them are:
1. Surveys
2. Web Scraping
3. Sensors
4. Cameras
5. Observations
6. APIs (Application Programming Interfaces)

One of the most reliable and authentic sources of information is the set of open-sourced websites hosted by the government. Some open-sourced Govt. portals are: data.gov.in, india.gov.in

3. What is Data Exploration?
While acquiring data, we must have noticed that data is a complex entity: it is full of numbers, and anyone who wants to make sense of it has to work patterns out of it. Thus, to analyse the data, you need to visualise it in some user-friendly format so that you can:
1. Quickly get a sense of the trends, relationships and patterns contained within the data.
2. Define a strategy for which model to use at a later stage.
3. Communicate these findings to others effectively.
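The training/testing split described above can be sketched in a few lines of Python. This is a minimal illustration: the salary figures and the 80/20 split ratio are made-up assumptions, not data from the notes.

```python
# Split an employee's salary history into Training Data and Testing Data.
# The salary values and the 80/20 split ratio are illustrative assumptions.

salaries = [30000, 32000, 35000, 37000, 40000, 42000, 45000, 47000, 50000, 53000]

split = int(len(salaries) * 0.8)       # first 80% for training
training_data = salaries[:split]       # fed to the model during training
testing_data = salaries[split:]        # held back to check the model's predictions

print(len(training_data), len(testing_data))  # → 8 2
```

Holding back part of the data like this is what allows the Evaluation stage later in the cycle to measure the model on data it has never seen.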
To visualise data, we can use various types of visual representations such as bar graphs, histograms, line charts and pie charts.

4. What is Modelling?
The graphical representation makes the data understandable for humans, as we can discover trends and patterns in it, but a machine can analyse the data only when it is in its most basic form of numbers (binary 0s and 1s). The ability to mathematically describe the relationship between parameters is the heart of every AI model. Generally, AI models can be classified as follows:

Rule Based Approach: This refers to AI modelling where the rules are defined by the developer. The machine follows the rules or instructions mentioned by the developer and performs its task accordingly. In this approach, we feed the data along with the rules to the machine, and the machine, once trained on them, is able to predict answers for similar inputs. A drawback of this approach is that the learning is static.

Learning Based Approach: This refers to AI modelling where the machine learns by itself. In this approach the AI model gets trained on the data fed to it and is then able to adapt to changes in the data. An advantage of this approach is that the learning is dynamic. The learning-based approach can further be divided into three parts: supervised learning, unsupervised learning and reinforcement learning. The first two are discussed below.

a) Supervised Learning: In a supervised learning model, the dataset fed to the machine is labelled. A label is a piece of information which can be used as a tag for data. For example, students get grades according to the marks they secure in examinations; these grades are labels which categorise the students according to their marks. There are two types of supervised learning models:
1. Classification: Here the data is classified according to the labels. This model works on a discrete dataset, which means the data need not be continuous.
2. Regression: Such models work on continuous data.
For example, if we wish to predict our next salary, we would put in the data of our previous salary, any increments, etc., and train the model. Here, the data fed to the machine is continuous.

b) Unsupervised Learning: An unsupervised learning model works on an unlabelled dataset; the data fed to the machine is random. This model is used to identify relationships, patterns and trends in the data fed into it. It helps the user understand what the data is about and what major features the machine has identified in it. Unsupervised learning models can be further divided into two categories:
1. Clustering: This refers to the unsupervised learning algorithm which can cluster unknown data according to the patterns or trends identified in it.
2. Dimensionality Reduction: We humans can visualise only up to 3 dimensions, but various entities exist beyond 3 dimensions. For example, in Natural Language Processing, words are considered to be N-dimensional entities. To make sense of them, a dimensionality reduction algorithm is used to reduce their dimensions.

Rule Based Approach                            | Learning Based Approach
-----------------------------------------------+-----------------------------------------------
The rules are defined by the developer.        | The machine learns by itself.
Learning is static.                            | Learning is dynamic.
Once trained, the machine does not take into   | Once trained, the machine does take into
consideration changes made in the original     | consideration changes made in the original
training dataset.                              | training dataset.

5. What is Evaluation?
Once a model has been made and trained, it needs to go through proper testing so that one can calculate the efficiency and performance of the model.
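The efficiency and performance parameters used in this testing can be computed directly from a model's predictions. Below is a minimal sketch for a binary task; the lists of actual and predicted labels are made-up illustrative values, not data from the notes.

```python
# Compute accuracy, precision, recall and F1 score for a binary task.
# The actual/predicted label lists are illustrative assumptions.

actual    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives

accuracy  = (tp + tn) / len(actual)           # fraction of all correct predictions
precision = tp / (tp + fp)                    # of predicted positives, how many were right
recall    = tp / (tp + fn)                    # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"accuracy={accuracy}, precision={precision}, recall={recall}, f1={round(f1, 2)}")
```

Accuracy alone can be misleading when one class is rare, which is why precision, recall and the F1 score are evaluated alongside it.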
Hence, the model is tested with the help of the Testing Data, and the efficiency of the model is calculated on the basis of the parameters mentioned below:
1. Accuracy
2. Precision
3. Recall
4. F1 Score

Neural Network
Neural networks are loosely modelled on how neurons in the human brain behave. Their key advantage is that they are able to extract data features automatically, without needing input from the programmer. They are a fast and efficient way to solve problems for which the dataset is very large, such as images. Larger neural networks tend to perform better with larger amounts of data, whereas traditional machine learning algorithms stop improving after a certain saturation point.

How does a Neural Network work?
A neural network is divided into multiple layers, and each layer is further divided into several blocks called nodes. The first layer of a neural network is known as the input layer; its job is to acquire data and feed it to the neural network. No processing occurs at the input layer. Next come the hidden layers, in which all the processing occurs. These layers are hidden and not visible to the user; there can be multiple hidden layers in a neural network system. The last hidden layer passes the final processed data to the output layer, which then gives it to the user as the final output.

Some of the features of a neural network are listed below:
1. Neural network systems are modelled on the human brain and nervous system.
2. They are able to automatically extract features without input from the programmer.
3. Every neural network node is essentially a machine learning algorithm.
4. They are useful when solving problems for which the dataset is very large.
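The layered structure described above (input layer, hidden layer, output layer) can be sketched as a single forward pass. Everything here is an illustrative assumption: the layer sizes, the weight and bias values, and the choice of a sigmoid activation at each node.

```python
import math

# Forward pass through a tiny neural network: 2 inputs, one hidden
# layer of 3 nodes, 1 output node. All weights/biases are made-up values.

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each node computes a weighted sum of its inputs plus a bias,
    # then passes the result through the activation function.
    return [sigmoid(sum(w * i for w, i in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

inputs = [0.5, -0.2]                         # input layer: no processing occurs here

hidden_w = [[0.1, 0.4], [-0.3, 0.8], [0.5, 0.5]]
hidden_b = [0.0, 0.1, -0.1]
hidden = layer(inputs, hidden_w, hidden_b)   # hidden layer: where the processing happens

output_w = [[0.7, -0.6, 0.2]]
output_b = [0.05]
output = layer(hidden, output_w, output_b)   # output layer: final result for the user

print(output)  # a single value between 0 and 1
```

In a real network, the weights would not be fixed by hand: they start random and are adjusted during training, which is what lets the network extract features automatically.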