




📝 AI Grade 10 (PT-1) https://www.youtube.com/watch?v=y70eVj7whxk (AI intro + Basics)

Introduction to AI

What is Artificial Intelligence?
A machine is artificially intelligent when it possesses the ability to mimic human traits: make decisions, predict outcomes, and learn and improve on its own. In other words, it can accomplish tasks by itself: collect data, understand it, analyse it, learn from it, and improve. Any machine that has been trained with data and can make decisions or predictions on its own can be termed AI.

AI is a form of intelligence, a type of technology, and a field of study. AI is the theory and development of computer systems (both machines and software) that enables machines to perform tasks which normally require human intelligence. Artificial Intelligence covers a broad range of domains and applications and is expected to impact every field in the future. Its core idea is building machines and algorithms capable of performing computational tasks that would otherwise require human-like brain functions.

What is Intelligence?
The ability to perceive or infer information, and to retain it as knowledge to be applied towards adaptive behaviours within an environment or context.

'Machines make their lives easier.' Explain. OR 'Life without machines is unimaginable.'
Humans have been developing machines which can make their lives easier. Machines are made with the intent of accomplishing tasks which are either too tedious or too time-consuming for humans. Hence, machines help us by working for us, sharing our load and making it easier to fulfil such goals. Life without machines today is unimaginable, and because of this, humans keep putting effort into making machines even more sophisticated and smart. As a result, we are surrounded by smart devices and gadgets like smartphones, smartwatches, smart TVs, etc.

How is a smartphone today different from the telephones we had in the last century?
Today’s phones can do much more than just call people. They help us navigate, recommend which songs to listen to or which movies to watch according to our likes and dislikes, connect us with like-minded people, make our selfies fun with face filters, help us maintain a record of our health and fitness, and a lot more.

Intelligence involves various mental abilities, like logic, reasoning, problem solving and planning. It is broadly classified on the following parameters:
1. Learning from experience
2. Identifying problems
3. Problem solving
4. Ability to make accurate decisions
5. Ability to prove outcomes
6. Ability to think logically
7. Ability to learn and improve

Types of intelligence
- Mathematical/logical: understanding numbers, symbols and logic in mathematics
- Linguistic: ability to write, understand and implement language processing skills
- Spatial: ability to perceive the visual world by connecting dots and finding relations
- Kinaesthetic: ability to be skilled in things related to the limbs and body movement
- Musical: ability to understand music, rhythm, etc.
- Intrapersonal: understanding yourself; self-awareness (knowing your weaknesses, strengths and feelings)
- Interpersonal: ability to understand others' feelings and knowing how to communicate
- Existential: related to the spiritual world
- Naturalist: ability to process matters related to the environment

Even though one person may be more skilled in one intelligence than another, it should be noted that all humans have all 9 of these intelligences, only at different levels. One might be an expert at painting, while another might be an expert in mathematical calculations; one is a musician, another an expert dancer.

Decision making

How do you make decisions?
The basis of decision making depends upon the availability of information and how we experience and understand it. Here, 'information' includes our past experience, intuition, knowledge, and self-awareness.
We can’t make “good” decisions without information, because then we have to deal with unknown factors and face uncertainty, which leads us to make wild guesses, flip coins, or roll dice. [BE QUES] Having knowledge, experience or insight about a situation helps us visualise what the outcomes could be, and how we can achieve or avoid those outcomes.

How do machines become Artificially Intelligent?
Machines become intelligent once they are trained with information which helps them achieve their tasks. AI machines also keep updating their knowledge to optimise their output (just as we gain intelligence from experience, AI does the same).

Applications of Artificial Intelligence
AI is becoming a crucial part of our everyday life, letting even some of the most complicated and time-consuming tasks be done at the touch of a button or by the simple use of a sensor.

1. Google: We surf the internet on Google without realising how efficiently it responds with accurate answers. It comes up with results to our search in a matter of seconds, and also suggests and autocorrects our typed sentences.
2. Siri / Google Assistant / Alexa / Bixby / Cortana: Pocket assistants that can do many tasks at a single command. They are some of the most common examples of the voice assistants which are a major part of our digital devices.
3. Google Maps: To help us navigate to places, apps like Uber and Google Maps come in handy.
4. FIFA / Sports: AI has greatly enhanced the gaming experience. Many games nowadays are backed by AI, which helps in enhancing graphics, coming up with new difficulty levels, encouraging gamers, etc.
5. Amazon: Shows us recommendations on the basis of what we like.
6. Social media: The recommendations are not limited to our preferences; they even cater to our need to connect with friends on social media platforms like Facebook and Instagram. Apps also send us customised notifications about our online shopping details, auto-create playlists according to our requests, and more. Taking selfies was never this much fun: Snapchat filters make them look cool.
7. Health apps and chatbots: AI is also being used to monitor our health. A lot of chatbots and other health apps are available which continuously monitor the physical and mental health of their users.
8. Others: Humanoids like Sophia getting citizenship, biometric security systems like the face locks in our phones, real-time language translators, weather forecasts, etc.

What is not AI?
Just as humans learn how to walk and then improve this skill through experience, an AI machine gets trained first on training data and then optimises itself from its own experience. This is what makes AI different from any other technological device or machine.

1. Washing machine: A fully automatic washing machine can work on its own, but it requires human intervention to select the washing parameters and to do the necessary preparation before each wash. This makes it an example of automation, not AI.
2. Air conditioner: An AC can be turned on and off remotely over the internet but still needs a human touch. This is an example of the Internet of Things (IoT). Similarly, we often hear about robots which follow a path or avoid obstacles but need to be primed accordingly each time; these too are not AI.
3. Automated projects: Many projects automate our surroundings with the help of sensors. Here too, since the bot or automation machine is not trained with any data, it does not count as AI.
4. TV: It is valid to say that not all devices termed "smart" are AI-enabled.
For example, a TV does not become AI-enabled just because it is smart; it gets the power of AI only when it is able to think and process on its own.

Definitions of Artificial Intelligence
Various organisations have coined their own definitions of Artificial Intelligence. Some of them are mentioned below:

NITI Aayog (National Strategy for Artificial Intelligence): AI refers to the ability of machines to perform cognitive tasks like thinking, perceiving, learning, problem solving and decision making.

World Economic Forum: Artificial intelligence (AI) is the software engine that drives the Fourth Industrial Revolution. It holds the promise of solving some of the most pressing issues facing society, but also presents challenges such as inscrutable “black box” algorithms, unethical use of data and potential job displacement. As rapid advances in machine learning (ML) increase the scope and scale of AI’s deployment across all aspects of daily life, and as the technology itself can learn and change on its own, multi-stakeholder collaboration is required to optimise accountability, transparency, privacy and impartiality to create trust.

AI, ML & DL

Artificial Intelligence (AI): Refers to any technique that enables computers to mimic human intelligence. It gives machines the ability to recognise a human face, to move and manipulate objects, to understand human voice commands, and to do other tasks. AI-enabled machines think algorithmically and execute what they have been asked for intelligently.

Machine Learning (ML): A subset of Artificial Intelligence which enables machines to improve at tasks with experience (data). The intention of Machine Learning is to enable machines to learn by themselves using the provided data and make accurate predictions/decisions.

Deep Learning (DL): Enables software to train itself to perform tasks with vast amounts of data. In Deep Learning, the machine is trained with huge amounts of data, which helps it train itself around the data.
Such machines are intelligent enough to develop algorithms for themselves. Of the three, Deep Learning is the most advanced form of Artificial Intelligence; Machine Learning is intermediately intelligent; and Artificial Intelligence covers all the concepts and algorithms which in some way or other mimic human intelligence.

AI Domains
An AI system becomes intelligent according to the training it gets. For training, the machine is fed with datasets, and the data fed in changes according to the application for which the AI algorithm is being developed. With respect to the type of data fed into the AI model, AI models can be broadly categorised into three domains:

1. Data Sciences
Data science is a domain of AI related to data systems and processes, in which the system collects numerous data, maintains datasets and derives meaning/sense out of them.
Example: price comparison websites. These websites are driven by lots and lots of data. If you have ever used one, you know the convenience of comparing the price of a product from multiple vendors in one place. Nowadays, price comparison websites can be found in almost every domain, such as technology, hospitality, automobiles, durables, apparel, etc.

2. Computer Vision
Computer Vision, abbreviated as CV, is a domain of AI that depicts the capability of a machine to get and analyse visual information and afterwards predict decisions about it. The entire process involves image acquisition, screening, analysing, identifying and extracting information. This extensive processing helps computers understand any visual content and act on it accordingly. Computer vision projects translate digital visual data into descriptions; this data is then turned into computer-readable language to aid the decision-making process. The main objective of this domain is to teach machines to collect information from pixels.
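The idea of "collecting information from pixels" can be illustrated with a toy example. This is a sketch, not a real computer vision library: a grayscale image is represented as a plain grid of intensity values, and the program extracts a simple piece of information from it.

```python
# Toy illustration: a grayscale "image" as a grid of pixel intensities
# (0 = black, 255 = white). Values here are made up for the example.
image = [
    [10,  12,  11],
    [9,  250, 248],
    [13, 247, 251],
]

def brightest_region(img):
    """Return the (row, col) of the brightest pixel: a crude way of
    extracting information from raw pixel values."""
    best = (0, 0)
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value > img[best[0]][best[1]]:
                best = (r, c)
    return best

print(brightest_region(image))  # (2, 2): the bright block sits in the lower right
```

A real CV system does the same thing at a far larger scale: it turns grids of pixel numbers into descriptions the rest of the program can act on.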
Examples of Computer Vision:
Self-driving cars: CV systems scan live objects and analyse them; based on this, the car decides whether to keep running or to stop.
Face lock in smartphones: Smartphones nowadays come with a face-lock feature in which the owner can set up his/her face as the unlocking mechanism. The front camera detects and captures the face and saves its features during setup; from then on, whenever the features match, the phone is unlocked.

3. Natural Language Processing
Natural Language Processing, abbreviated as NLP, is a branch of artificial intelligence that deals with the interaction between computers and humans using natural language. Natural language refers to language that is spoken and written by people, and NLP attempts to extract information from the spoken and written word using algorithms. The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a manner that is valuable.

Examples of Natural Language Processing:
Email filters: one of the most basic and earliest applications of NLP online. It started out with spam filters, uncovering certain words or phrases that signal a spam message.
Smart assistants: assistants like Apple's Siri and Amazon's Alexa recognise patterns in speech, then infer meaning and provide a useful response.

AI Ethics
The discipline that deals with right vs wrong and the moral obligations and duties of humans.

Bias:
1) AI biases are biases built into algorithms by the programmer.
2) E.g. facial recognition algorithms made by Microsoft and IBM all had biases when detecting people's gender.

Privacy:
1) The data collected for training the AI system should have some regulation in accessing and sharing personal data.
2) E.g. apps, during installation, seek access to the gallery, mic, etc.
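The keyword-based spam filtering mentioned under the NLP examples above can be sketched in a few lines. The phrase list here is a hypothetical example, not from any real filter: flag a message as spam if it contains any known spam phrase.

```python
# Hypothetical spam-phrase list for a toy keyword-based filter,
# as described in the email-filter example above.
SPAM_PHRASES = {"win a prize", "free money", "act now"}

def is_spam(message: str) -> bool:
    """Flag a message if any known spam phrase appears in it."""
    text = message.lower()
    return any(phrase in text for phrase in SPAM_PHRASES)

print(is_spam("Act NOW to win a prize!"))  # True
print(is_spam("Meeting moved to 3 pm"))    # False
```

Modern filters go well beyond fixed phrase lists, but this is roughly where NLP-based spam detection started.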
Moral issues: self-driving cars
An ethical dilemma arises when the car's algorithm must choose between hitting a small boy or damaging the car and injuring the person inside. The developer's morality is transferred into the machine: the choice made by the machine reflects the developer's values. The developer must consider such ethical dilemmas while developing the car's algorithm. The choices may differ from person to person, and one must understand that nobody is wrong in this case.

Data Privacy
The world of Artificial Intelligence revolves around data. Every company, whether small or big, is mining data from as many sources as possible. (Data mining: extracting and storing data from various areas.) More than 70% of the data collected till now has been collected in the last 3 years, which shows how important data has become in recent times.

Where do we collect data from? One of the major sources of data for many major companies is the device all of us have in our hands all the time: the smartphone.

Why do these apps collect data?
1. To provide us with facilities and features which have made our lives easier.
2. To give customised recommendations and notifications according to our choices.
3. To make the app more accurate and efficient.
It is not wrongly said that data is the new gold.

General notes: Another feature of smartphones nowadays is that they provide us with customised recommendations and notifications according to our choices. Some examples:

1. You are talking to a friend on a mobile network or on an app like WhatsApp and mention that you wish to buy new shoes and are looking for suggestions. You discuss shoes, and that is it. After some time, the online shopping websites start giving you notifications to buy shoes! They recommend some of their products and urge you to buy.
2. If you search on Google for a trip to Kerala or any other destination, just after the search, all the apps on your phone which support advertisements will start sending messages about packages you can buy for the trip.
3. Even when you are not using your phone and are talking face-to-face about a book you have read recently, with the phone locked nearby, the phone may end up giving notifications about similar books or messages about the same book once you operate it.
4. At the same time, if one does not want to share his/her data with anyone, one can opt for alternative applications of similar usage that keep your data private. For example, an alternative to WhatsApp is the Telegram app, which does not collect any data from us. But since WhatsApp is more popular and widely used, people go for it without thinking twice.

AI Bias
Another aspect of AI ethics is bias. Everyone has biases of their own; no matter how much one tries to be unbiased, we in some way or other have our own biases, even towards small things. Biases are not negative all the time: sometimes a bias is required to control a situation and keep things working. A machine is artificial and cannot think on its own; it can have intelligence, but we cannot expect a machine to have biases of its own. Any bias can transfer from the developer to the machine while the algorithm is being developed.

AI Access
Since Artificial Intelligence is still a budding technology, not everyone has the opportunity to access it. The people who can afford AI-enabled devices make the most of it, while those who cannot are left behind. Because of this, a gap has emerged between these two classes of people, and it widens with the rapid advancement of technology.

AI creates unemployment
AI is making people's lives easier; most things nowadays are done in just a few clicks.
In no time, AI will be able to do all the laborious tasks which we humans have been doing for long. Maybe in the coming years, AI-enabled machines will replace many people who work as labourers. This may start an era of mass unemployment where people with little or no skills may be left without jobs, while those who keep their skills up to date with what is required will flourish. This brings us to a crossroads: on one hand, AI is advancing and improving people's lives by working for them and doing some of their tasks; on the other hand are the lives of people who depend on laborious jobs and are not skilled to do anything else. Despite AI's promise of new opportunities, there are associated risks that need to be mitigated appropriately and effectively. To give a better perspective, the ecosystem and the sociotechnical environment in which AI systems are embedded need to be more trustworthy.

AI PROJECT CYCLE
The project cycle is a step-by-step process to solve problems using proven scientific methods and drawing inferences about them.
1. Problem Scoping: understanding the problem
2. Data Acquisition: collecting accurate and reliable data
3. Data Exploration: arranging the data uniformly
4. Modelling: creating models from the data
5. Evaluation: evaluating the project

PROBLEM SCOPING
Problem scoping refers to understanding a problem, finding out the various factors which affect it, and defining the goal or aim of the project.

Sustainable Development Goals
Sustainable development: to develop for the present without exploiting the resources of the future.

4Ws of Problem Scoping
A theme is a broad term which covers all the aspects of relevance under it. The 4Ws of problem scoping are Who, What, Where, and Why.
These Ws help in identifying and understanding the problem in a better and more efficient manner:
Who: helps us comprehend and categorise who is affected directly and indirectly by the problem; these people are called the stakeholders.
What: helps us understand and identify the nature of the problem; under this block, you also gather evidence to prove that the problem you have selected exists.
Where: the situation, context, and location in which the problem arises.
Why: why the given problem is worth solving.

Problem Statement Template
The template helps us summarise all the key points into one single template, so that whenever there is a need to look back at the basis of the problem in future, we can look at the Problem Statement Template and understand its key elements:
Our [stakeholder(s)] ... (Who)
have a problem that [issue, problem, need] ... (What)
when/while [context/situation] ... (Where)
An ideal solution would [benefit of solution for them] ... (Why)

DATA ACQUISITION
This stage is about acquiring data for the project. Data can be a piece of information or facts and statistics collected together for reference or analysis. Whenever we want an AI project to be able to predict an output, we need to train it first using data.
Training data: data used to train the model (previous/historical data).
Testing data: data used for making predictions; in one sense, testing what the model has been trained for.
For any AI project to be efficient, the training data should be authentic and relevant.

Data features: the type of data you want to collect.
Data sources (examples): surveys, web scraping, sensors, cameras, observations, APIs (Application Program Interface).
Data acquired from random websites might not be authentic, as its accuracy cannot be proved. It is necessary to find a reliable source of data from which authentic information can be taken. The data we collect should be open-source and not someone's property.
Extracting private data can be an offence. One of the most reliable and authentic sources of information is the open-source websites hosted by the government. These government portals have general information collected in a suitable format which can be downloaded and used wisely.
Some of the open-source government portals are: data.gov.in, india.gov.in

Data Exploration
The process of arranging the gathered data uniformly for a better understanding. Data can be arranged in the form of a table, a chart, or a database. To analyse the data, you need to visualise it in some user-friendly format so that you can:
- quickly get a sense of the trends contained within the data (in one sense, get an overview)
- easily communicate it to others
- interpret it more easily than the numerical form of the data
Visual representations: bar graphs, pie charts, line graphs, histograms, dot charts.

Modelling
Modelling is the process in which different models based on the visualised data are created and checked for their advantages and disadvantages. When machines access and analyse data, they need it in the most basic form of numbers (binary, 0s and 1s), and when it comes to discovering patterns and trends in data, the machine uses mathematical representations of the same. The ability to mathematically describe the relationship between parameters is at the heart of every AI model. Thus, whenever we talk about developing AI models, it is the mathematical approach to analysing data that we refer to.

Rule-based approach
Refers to AI modelling where the rules are defined by the developer. The machine follows the rules or instructions mentioned by the developer and performs its task accordingly.
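Such a fixed-rule setup can be sketched as a lookup table. The fruit "rules" below are hypothetical; the point is that the rules are written by the developer, so anything outside them simply fails, and the model cannot learn from that failure.

```python
# Sketch of a static, rule-based model: the "knowledge" is a fixed
# lookup written by the developer. Features and labels are made up.
RULES = {
    ("round", "red"):    "apple",
    ("round", "orange"): "orange",
}

def rule_based_predict(shape, colour):
    # Anything not covered by a rule (e.g. a banana: "long", "yellow")
    # matches nothing: the model reports failure instead of adapting.
    return RULES.get((shape, colour), "unknown - model cannot adapt")

print(rule_based_predict("round", "red"))    # apple
print(rule_based_predict("long", "yellow"))  # unknown - model cannot adapt
```

A learning-based model would instead update itself from new examples; here the `RULES` dictionary never changes after the developer writes it, which is exactly the "static learning" drawback described in these notes.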
This is known as a rule-based approach because we feed the data along with rules to the machine, and the machine, after being trained on them, is able to predict answers for the same. A drawback (or feature) of this approach is that the learning is static: the machine, once trained, does not take into consideration any changes made in the original training dataset (meaning no change can be made after the model is done). If you test the machine on a dataset different from the rules and data you fed it at the training stage, the machine will fail and will not learn from its mistake.
Example: if you provide data for apples and oranges, the model will give accurate answers for pictures of apples and oranges; but if you show it a banana out of nowhere, the machine will fail to work, and it will not learn from this. The model cannot improve itself on the basis of feedback.

Learning-based approach
The learning-based approach is based on a machine learning from experience with the data fed to it. It refers to AI modelling where the machine learns by itself. Under the learning-based approach, the AI model gets trained on the data fed to it and is then able to design a model which is adaptive to changes in the data. After training, the machine is fed with testing data.
Machine learning: a subset of artificial intelligence (AI) that gives machines the ability to learn automatically and improve from experience without being explicitly programmed for it. The machine learning approach introduces dynamicity into the model. (Dynamic meaning: it learns if it makes a mistake, without forgetting the previously fed data, unlike a static model.)

Supervised Learning
In a supervised learning model, the dataset fed to the machine is labelled. The dataset is known to the person training the machine; only then is he/she able to label the data. A label is some information which can be used as a tag for the data.
For example, students get grades according to the marks they secure in examinations. These grades are labels which categorise the students according to their marks.

Classification and Regression
Classification: the data is classified according to labels. For example, in the grading system, students are classified on the basis of the grades they obtain with respect to their marks in the examination. This model works on a discrete dataset, which means the data need not be continuous.
Regression: such models work on continuous data. For example, if you wish to predict your next salary, you would put in the data of your previous salary, any increments, etc., and train the model. Here, the data fed to the machine is continuous.

Unsupervised Learning
An unsupervised learning model works on an unlabelled dataset. This means that the data fed to the machine is random, and it is possible that the person training the model does not have any information about it. Unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them.
Clustering: an unsupervised learning technique which clusters unknown data according to the patterns or trends identified in it. The patterns observed might be ones known to the developer, or the algorithm might even come up with unique patterns of its own. The model predicts future data and outputs based on patterns or trends in the data; the dataset is usually unlabelled or random discrete data.
Dimensionality Reduction: we can visualise only up to 3 dimensions. To reduce the number of dimensions while still being able to make sense of the data, we use dimensionality reduction. A ball in our hand is 3-dimensional, but if we take its picture, the data is transformed into 2-D.

Reinforcement Learning
1. Learning through feedback or the trial-and-error method.
2.
The system works on a reward-or-penalty policy.
3. An agent performs an action (positive or negative) in the environment, which is taken as input by the system; the system then changes the state of the environment, and the agent is given a reward or a penalty.
Example: vending machines. Suppose you put a coin (action) into a juice vending machine (environment). The system detects the amount inserted (state), and you get the drink corresponding to the amount (reward). If the coin is damaged or there is some other problem, you get nothing (penalty).

Evaluation
Evaluation is the method of understanding the reliability of an AI model. It is based on the outputs received by feeding the test data to the model and comparing those outputs with the actual answers.

Neural Networks
Neural networks are loosely modelled on how neurons in the human brain behave. The key advantage of neural networks is that they are able to extract data features automatically without needing input from the programmer. A neural network is essentially a system of organising machine learning algorithms to perform certain tasks. It is a fast and efficient way to solve problems for which the dataset is very large, such as images. Larger neural networks tend to perform better with larger amounts of data, whereas traditional machine learning algorithms stop improving after a certain saturation point.

Working of Neural Networks
A neural network is divided into multiple layers, and each layer is further divided into several blocks called nodes. Each node has its own task to accomplish, which is then passed to the next layer. The first layer of a neural network is known as the input layer. The job of the input layer is to acquire data and feed it to the neural network; no processing occurs at the input layer. Next come the hidden layers.
Hidden layers are the layers in which all the processing occurs. Their name means that these layers are hidden and not visible to the user. Each node of a hidden layer has its own machine learning algorithm which it executes on the data received from the previous layer; the processed output is then fed to the subsequent hidden layer. There can be multiple hidden layers in a neural network, and their number depends on the complexity of the function for which the network has been configured. The number of nodes in each layer can also vary accordingly. The last hidden layer passes the final processed data to the output layer, which then gives it to the user as the final output. Like the input layer, the output layer does not process the data it acquires; it is meant for the user interface.

Features of a neural network:
- Neural network systems are modelled on the human brain and nervous system.
- They are able to automatically extract features without input from the programmer.
- Every neural network node is essentially a machine learning algorithm.
- They are useful for solving problems where the dataset is very large.

TRAINING DATA: This data is used to train the model to recognise repetitive patterns, using AI technologies like neural networks, so that it can make accurate predictions. Types of training data: image recognition, sentiment analysis, text recognition and spam detection.

VALIDATING DATA: Also called the secondary dataset. This data is used to check whether the newly developed model is correctly identifying the data when making predictions. This step ensures that the model has not become specific to the primary dataset's values when making predictions. If that is the case, corrections and tweaks are made to the project. The primary and secondary datasets are also re-run through the model until the desired accuracy is achieved.
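The split of one dataset into training, validation and testing subsets can be sketched as follows. The 70/20/10 ratio is an assumption for illustration, not a rule from these notes.

```python
import random

# Sketch: carve one dataset into training, validation and testing
# subsets. The 70/20/10 split ratio is an assumed example.
def split_dataset(samples, seed=42):
    shuffled = samples[:]                     # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)     # shuffle reproducibly
    n = len(shuffled)
    train = shuffled[: int(0.7 * n)]          # primary dataset
    validate = shuffled[int(0.7 * n): int(0.9 * n)]  # secondary dataset
    test = shuffled[int(0.9 * n):]            # unlabelled-at-use, final check
    return train, validate, test

train, validate, test = split_dataset(list(range(100)))
print(len(train), len(validate), len(test))  # 70 20 10
```

The key design point is that the three subsets are disjoint: the model never sees the validation or testing samples while training, which is what makes them a fair check.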
TESTING DATA: All primary and secondary data come with relevant label tags. The testing data is the final dataset, which provides no help in terms of tags to the model produced. This dataset paves the way for the model to enter the real world and start making predictions.

DATA WAREHOUSING: Data collected in bulk from various sources, in various formats.

Important Questions

1. Differentiate between what is AI and what is not AI with the help of an example.
AI machines:
1. AI machines are trained with data and algorithms.
2. AI machines learn from mistakes and experience and try to improve in their next iterations.
3. AI machines can analyse a situation and take decisions.
4. Example: AI-based drones capture real-time data during flight, process it in real time, and make human-independent decisions based on the processed data.
Machines that are not AI:
1. Smart machines which are not AI do not require training data; they work on algorithms alone.
2. They work on fixed algorithms and always operate at the same level of efficiency, which is programmed into them.
3. They cannot take decisions on their own.
4. Example: an automatic door in a shopping mall seems AI-enabled, but it is built with only sensor technology.

2. Akhil wants to learn how to scope the problem for an AI project. Explain to him: (a) the 4Ws Problem Canvas, (b) the Problem Statement Template.
The 4Ws Problem Canvas helps in identifying the key elements related to the problem. The 4Ws are Who, What, Where and Why.
- The "Who" block helps in analysing the people affected directly or indirectly by the problem.
- The "What" block helps us determine the nature of the problem.
- The "Where" block helps us look into the situation in which the problem arises, its context, and the locations where it is prominent.
The "Why" block suggests the benefits which the stakeholders would get from the solution and how it would benefit them as well as society. The answers to these questions lead to a problem statement, which is summarised in the Problem Statement Template. 3. Differentiate between Classification and Regression algorithms with the help of examples. a) Classification: The data is classified according to labels. For example, in a grading system, students are classified on the basis of the grades they obtain with respect to their marks in the examination. This model works on a discrete dataset, which means the data need not be continuous. b) Regression: Such models work on continuous data. For example, if you wish to predict your next salary, you would feed in the data of your previous salary, any increments, etc., and train the model. Here, the data fed to the machine is continuous. 4. What do you understand by the Data Exploration stage in the AI project cycle? Briefly explain any 3 data exploration techniques. Data Exploration refers to visualizing the data to determine patterns, relationships between elements and trends in the dataset, giving a clear meaning and understanding of the dataset. Data exploration is important as it helps the user select an AI model in the next stage of the AI project cycle. To visualize the data, various types of visual representations can be used, such as statistical functions, diagrams, charts, graphs, flows and so on. 5. "A great leader is recognized for their exceptional skill in making decisions." Is it correct to say that AI has exceptional qualities in the decision-making process? How does AI make decisions? AI systems make decisions on the basis of the training given to them and observed user behaviour. They analyse the patterns in data and make predictions accordingly. 6. Sirisha and Divisha want to make a model which will organize the unlabelled input data into groups based on features. 
Which learning model should they use and why? Clustering. This model can cluster unlabelled data based on patterns, and predictions can then be made accordingly. 7. Data analytics raises many ethical issues, especially when anyone starts making money from data for purposes different from the ones for which it was initially collected. Suggest any two points that one needs to keep in mind while accessing data from any data source. Only data which is available for public usage should be taken up. Personal datasets should only be used with the consent of the owner. One should never breach someone's privacy to collect data. Data should only be taken from reliable sources, as data collected from random sources can be wrong or unusable. Reliable sources of data ensure the authenticity of the data, which helps in proper training of the AI model. 8. List any two online and offline data sources. Offline: surveys, interviews, observations. Online: open-source government portals, Kaggle datasets. 9. How do the values of Precision and Recall affect the F1 Score? An ideal situation would be when we have a value of 1 (that is, 100%) for both Precision and Recall. In that case the F1 Score would also be an ideal 1 (100%), known as the perfect value for the F1 Score. As the values of both Precision and Recall range from 0 to 1, the F1 Score also ranges from 0 to 1. In conclusion, a model has good performance if its F1 Score is high. 10. How can data science be applied in e-commerce? Data science plays a crucial role in e-commerce, helping businesses make data-driven decisions and improve their operations. The key applications are: Personalized Recommendations: Data science algorithms analyze customer browsing and purchasing history to provide personalized product recommendations, enhancing user experience and boosting sales. Fraud Detection: Machine learning models can detect unusual transactions or patterns, helping to identify and prevent fraudulent activities in real time. 
In short, data science enables e-commerce companies to make smarter decisions, improve customer satisfaction, and drive growth. 11. An AI model is designed to predict if there is going to be a water shortage in your area in the near future or not. The confusion matrix for the same is given. Calculate Accuracy, Precision, Recall and F1 Score. 12. Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix on Heart Attack Risk. Also suggest which metric would not be a good evaluation parameter here and why. Evaluation https://www.youtube.com/watch?v=UnHdA57jXxc (Evaluation + Confusion Matrix) 1. Problem Scoping - understanding the problem and stakeholders 2. Data Acquisition - acquiring data from various sources 3. Data Exploration - arranging and interpreting data 4. Modelling - creating the desired working model 5. Evaluation - in short, evaluation is the testing of the model we made in the modelling stage: we test it on data to check whether it is working properly or needs more training. Model Evaluation is an integral part of the model development process. It helps to find the best model that represents our data and to estimate how well the chosen model will work in the future. Evaluation is the process of understanding the reliability of any AI model, based on its outputs, by feeding a test dataset into the model and comparing its predictions with the actual answers. Overfitting of Data? An overfitted model occurs when an algorithm fits too closely to its training data: it memorizes the training data rather than learning the underlying patterns. Therefore, to evaluate an AI model, we must not use the data that was used to build it. Because the model remembers the whole training dataset, it will always predict the correct label for any point in the training dataset, so testing on the training data will always give (misleadingly) correct output. This is known as overfitting. 
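A minimal sketch of why testing on the training set is misleading: a "model" that simply memorizes its training labels scores perfectly on the data it has seen, regardless of whether it learned anything. The toy data and the underlying rule (label is "yes" when x > 5) are assumptions for illustration only:

```python
# A "model" that memorizes (x, label) pairs instead of learning the rule.
train = {1: "no", 3: "no", 7: "yes", 9: "yes"}   # true rule: "yes" if x > 5

def memorizer(x):
    # Looks up memorized answers; falls back to a default guess for unseen inputs.
    return train.get(x, "no")

def accuracy(model, data):
    correct = sum(model(x) == label for x, label in data.items())
    return correct / len(data)

test = {2: "no", 4: "no", 6: "yes", 8: "yes"}     # unseen inputs
print(accuracy(memorizer, train))  # 1.0 -> looks perfect on training data
print(accuracy(memorizer, test))   # 0.5 -> true performance on unseen data
```

The 100% training accuracy is exactly the "always predicts the correct label for the training dataset" effect the notes describe; only the testing-set score reveals the model's real reliability.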
Evaluation - Concept Evaluation is the process of understanding the reliability of any AI model, based on its outputs, by feeding a test dataset into the model and comparing the predictions with the actual answers. Q) What data should be fed? (imp) Use the testing set, not the training dataset. If we use the training set, our model will simply remember the whole training set and will always predict the correct label for any point in it. This is known as overfitting. Model Evaluation Terminologies Two terms are used in evaluation: Prediction (output) and Reality. The prediction is the output given by the machine. The reality is the real scenario at the time the prediction was made. EXAMPLE 1: Let's say a forest is prone to forest fires. The machine predicts that a fire will break out in the forest, so the prediction is Positive (fire). Now, you are near that forest, and a fire has actually broken out, so the reality is also Positive. This makes it a TRUE POSITIVE situation: the positive prediction was true. EXAMPLE 2: The machine predicts that a fire will NOT break out, so the prediction is Negative (no fire). In reality, there is no fire, so the reality is also Negative. This makes it a TRUE NEGATIVE situation. (Why True Negative? Because the negative prediction was accurate and true.) EXAMPLE 3: The machine predicts that a fire will break out, so the prediction is Positive (fire). In reality, there is no fire. 
Therefore, the reality is Negative, which makes this a FALSE POSITIVE situation. (Why False Positive? Because the machine made a positive prediction, fire, which turned out to be false.) EXAMPLE 4: The machine predicts that a fire will NOT break out, so the prediction is Negative (no fire). In reality, a fire has actually broken out, so the reality is Positive. This makes it a FALSE NEGATIVE situation: the negative prediction was false. CONFUSION MATRIX A confusion matrix is a table with two rows and two columns that reports the number of true positives, false negatives, false positives, and true negatives. It provides a more insightful picture of not only the performance of a predictive model, but also which particular segments are being predicted correctly and incorrectly, and what type of errors are being made by the model. The confusion matrix allows us to understand the prediction results. It is not an evaluation metric in itself, but a record that helps in evaluation. Prediction and Reality can easily be mapped together with the help of the confusion matrix:

Case | Prediction | Reality
TP   | Yes        | Yes
TN   | No         | No
FP   | Yes        | No
FN   | No         | Yes

Evaluation Methods 1. Accuracy 2. Precision 3. Recall 4. F1 Score ACCURACY Accuracy measures how close the predictions are to reality: it is the percentage of correct predictions out of all the observations. A prediction is said to be correct if it matches the reality. There are two conditions in which the Prediction matches the Reality: True Positive and True Negative. TP and TN are always correct predictions. QUESTIONS: 1. Devendra is confused about when a prediction is said to be correct. Help him clear his confusion. A prediction can be said to be correct if it matches the reality. 
We have two conditions in which the Prediction matches the Reality: True Positive and True Negative. 2. Mention two conditions when prediction matches reality. We have two conditions in which the Prediction matches the Reality: True Positive and True Negative. 3. Is high accuracy equivalent to good performance? (Or: give an example where high accuracy is not suitable.) No, not always. Let us go back to the forest fire example. Assume that the model always predicts that there is no fire, but in reality there is a 2% chance of a forest fire breaking out. In that case, for 98 cases the model will be right, but for the 2 cases in which there was a forest fire, the model still predicted no fire. Here, True Positives = 0, True Negatives = 98, Total cases = 100. Therefore, accuracy becomes: (98 + 0)/100 = 98%. But this parameter is useless for us, as the actual cases where the fire broke out are not taken into account. Hence, there is a need to look at another parameter which takes such cases into account, known as Precision. 4. What percentage of accuracy is reasonable to show good performance? It depends on the kind of model. For example, if a model in the medical field has 99.9% accuracy, the remaining 0.1% is still a serious issue, since it represents a huge number of patients. A reasonable accuracy therefore varies from model to model. 5. Out of 300 images of lions and tigers, the model identified 267 images correctly. What is the accuracy of the model? 267/300 = 0.89, i.e., 89%. 6. If a model predicts there is no fire, where in reality there is a 3% chance of forest fire breaking out, what is the accuracy? In cases where a percentage is given, take 100% as the total number of cases. The 3 cases with an actual fire are False Negatives; the remaining 97 are True Negatives. So correct predictions = 97 out of 100, and the model is 97% accurate. 7. 
Out of 60 pictures of tiger cubs and kittens, of which 45 were kittens, how many tiger cubs must the model identify correctly, along with 35 correctly identified kitten images, so that the accuracy is more than 75%? Here the total number of images is 60, and the number of correctly identified images is 35 kittens plus the number of correctly identified tiger cubs. We can write this as: (35 + tiger cubs) / 60 > 0.75. Multiplying both sides by 60 gives: 35 + tiger cubs > 45. Subtracting 35 from both sides gives: tiger cubs > 10. So, to achieve an accuracy of more than 75%, the model must correctly identify at least 11 tiger cubs. PRECISION Precision is defined as the percentage of true positive cases out of all the cases where the prediction is positive. That is, it takes into account the True Positives and False Positives: Precision = TP / (TP + FP). RECALL Recall is defined as the percentage of true positive cases out of all the cases which are actually positive in reality. That is, it takes into account the True Positives and False Negatives: Recall = TP / (TP + FN). (Positive = when the prediction is Yes.) 1. Give an example where Precision matters (high Recall alone is not enough). Example: predicting whether a mail is Spam or Not Spam. False Positive: the mail is predicted as "spam" but it is "not spam". False Negative: the mail is predicted as "not spam" but it is "spam". Of course, too many False Negatives will make the spam filter ineffective, but False Positives may cause important mails to be missed; hence a model with low Precision is not usable here. 2. Which is more important, Recall or Precision? Choosing between Precision and Recall depends on the conditions in which the model has been deployed. In a case like forest fire, a False Negative can cost us a lot and is risky too: imagine no alert being given even when there is a forest fire. The whole forest might burn down. Another case where a False Negative can be dangerous is a viral outbreak. 
Imagine a deadly virus has started spreading, and the model which is supposed to predict a viral outbreak does not detect it. The virus might spread widely and infect a lot of people. On the other hand, there can be cases in which a False Positive costs us more than a False Negative. One such case is mining: imagine a model telling you that treasure exists at a point, and you keep digging there, but it turns out to be a false alarm. Here the False Positive case (predicting there is treasure when there is none) can be very costly. Similarly, consider a model that predicts whether a mail is spam or not: if the model wrongly flags mails as spam, people would not look at them and might eventually lose important information. Here too the False Positive condition (predicting a mail as spam when it is not) has a high cost. To conclude: to know whether our model's performance is good, we need both measures, Recall and Precision. For some cases you might have High Precision but Low Recall, or Low Precision but High Recall. Since both measures are important, there is a need for a parameter which takes both Precision and Recall into account. F1 SCORE The F1 Score can be defined as the measure of balance between Precision and Recall: F1 Score = 2 × (Precision × Recall) / (Precision + Recall). In conclusion, we can say that a model has good performance if its F1 Score is high. 1. Take a look at the formula and think: when can we get a perfect F1 Score? An ideal situation would be when we have a value of 1 (that is, 100%) for both Precision and Recall. In that case the F1 Score would also be an ideal 1 (100%), known as the perfect value for the F1 Score. As the values of both Precision and Recall range from 0 to 1, the F1 Score also ranges from 0 to 1. 
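The four metrics above can be computed directly from the confusion-matrix counts. A small sketch (the TP/TN/FP/FN values are made-up numbers for illustration, not from the question sets above):

```python
def evaluate(tp, tn, fp, fn):
    """Compute the four evaluation metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)   # of all positive predictions, how many were right
    recall = tp / (tp + fn)      # of all actual positives, how many were caught
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts: 60 TP, 25 TN, 5 FP, 10 FN (100 cases in total)
acc, prec, rec, f1 = evaluate(tp=60, tn=25, fp=5, fn=10)
print(f"Accuracy={acc:.2f} Precision={prec:.2f} Recall={rec:.2f} F1={f1:.2f}")
# Accuracy=0.85 Precision=0.92 Recall=0.86 F1=0.89
```

Note how F1 always lands between Precision and Recall, which is exactly the "balance" property the notes describe; it equals 1 only when both inputs are 1.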
******************************************************************************************************************************** DATA SCIENCE~ Domains of AI: Each domain has its own type of data which gets fed into the machine, and hence its own way of working with it. Data Science is a concept that unifies statistics, data analysis, machine learning and their related methods in order to understand and analyse actual phenomena with data. It employs techniques and theories drawn from many fields within the context of Mathematics, Statistics, Computer Science and Information Science. Data science example: * Rock, Paper & Scissors: https://www.afiniti.com/corporate/rock- paperscissors Applications of Data Science Data Science is not a new field. It majorly works around analysing data; when it comes to AI, this analysis helps in making the machine intelligent enough to perform tasks by itself. There exist various applications of Data Science in today's world. One of data science's main objectives is to move from generalization to personalization. 1) Fraud and Risk Detection* The earliest applications of data science were in finance. Companies were fed up with bad debts and losses every year; however, they had a lot of data which used to get collected during the initial paperwork while sanctioning loans. Over the years, banking companies learned to divide and conquer data via customer profiling, past expenditures, and other essential variables to analyse the probabilities of risk and default. Moreover, it also helped them push their banking products based on a customer's purchasing power. 2) Genetics & Genomics Data science applications also enable an advanced level of treatment personalization through research in genetics and genomics. The goal is to understand the impact of DNA on our health and find individual biological connections between genetics, diseases, and drug response. 
Data science techniques allow the integration of different kinds of data with genomic data in disease research, which provides a deeper understanding of genetic issues in reactions to particular drugs and diseases. As soon as we acquire reliable personal genome data, we will achieve a deeper understanding of human DNA, and advanced genetic risk prediction will be a major step towards more individualized care. 3) Internet Search When we talk about search engines, we think 'Google'. Right? But there are many other search engines like Yahoo, Bing, Ask, AOL, and so on. All these search engines (including Google) make use of data science algorithms to deliver the best results for our searched query in a fraction of a second. Considering the fact that Google processes more than 20 petabytes of data every day, had there been no data science, Google wouldn't have been the 'Google' we know today. 4) Targeted Advertising From the display banners on various websites to the digital billboards at airports, almost all of them are decided by using data science algorithms. This is the reason why digital ads have been able to get a much higher CTR (Click-Through Rate) than traditional advertisements: they can be targeted based on a user's past behaviour. Personalized ads increase CTR, giving more profit to both the advertiser and the platform serving the ad. 5) Website Recommendations Aren't we all used to the suggestions about similar products on Amazon? They not only help us find relevant products from the billions available, but also add a lot to the user experience. A lot of companies have fervently used this engine to promote their products in accordance with the user's interest and the relevance of information. Internet giants like Amazon, Twitter, Google Play, Netflix, LinkedIn, IMDb and many more use this system to improve the user experience. The recommendations are made based on a user's previous search results. 
6) Airline Route Planning The airline industry across the world is known to bear heavy losses. Except for a few airline service providers, companies are struggling to maintain their occupancy ratios and operating profits. With the steep rise in air-fuel prices and the need to offer heavy discounts to customers, the situation has worsened. It wasn't long before airline companies started using data science to identify strategic areas of improvement. Using data science, airline companies can: Predict flight delays. Decide which class of airplanes to buy. Decide whether to fly directly to the destination or take a halt in between (for example, a flight can take a direct route from New Delhi to New York, or it can choose to halt in some country on the way). Effectively drive customer loyalty programs. Project Cycle Revisit~ Problem Scoping Humans are social animals. We tend to organise and/or participate in various kinds of social gatherings all the time. We love eating out with friends and family, which is why we can find restaurants almost everywhere, and many of these restaurants arrange buffets to offer a variety of food items to their customers. Be it small shops or big outlets, every restaurant prepares food in bulk as they expect a good crowd to come and enjoy their food. But in most cases, after the day ends, a lot of food is left over, which becomes unusable for the restaurant as they do not wish to serve stale food to their customers the next day. So, every day, they prepare food in large quantities keeping in mind the probable number of customers walking into their outlet. But if the expectations are not met, a good amount of food gets wasted, which eventually becomes a loss for the restaurant, as they either have to dump it or give it to hungry people for free. Taken into account over a year, this daily loss becomes quite a big amount. 
The Problem Statement Template leads us towards the goal of our project, which can now be stated as: "To be able to predict the quantity of food dishes to be prepared for everyday consumption in restaurant buffets." Data Acquisition After finalising the goal of our project, let us now look at the various data features which affect the problem in some way or the other. Since any AI-based project requires data for testing and training, we need to understand what kind of data is to be collected to work towards the goal. In our scenario, the factors that would affect the quantity of food to be prepared for the next day's consumption in buffets would be: Total number of customers; Unconsumed dish quantity per day; Quantity of dish prepared per day; Price of dish; Dish consumption; Quantity of dish for the next day. Now let us understand how these factors are related to our problem statement. For this, we can use the System Maps tool to figure out the relationship of elements with the project's goal. Here is the system map for our problem statement. Relationships are defined by arrows. Within a system map, loops can be identified; the loops are important because they represent a specific chain of causes and effects. In this system map, you can see how the relationship of each element is defined with the goal of our project. Recall that positive arrows denote a direct relationship between elements, while negative ones show an inverse relationship. Data Exploration After creating the database, we now need to look at the data collected and understand what is required out of it. 
In this case, since the goal of our project is to be able to predict the quantity of food to be prepared for the next day, we need to have the following data: Name of dish; Quantity of that dish prepared per day; Quantity of the unconsumed portion of the dish per day. Modelling Once the dataset is ready, we train our model on it. In this case, a regression model is chosen, in which the dataset is fed as a data frame and the model is trained accordingly. Regression is a Supervised Learning model which takes in continuous values of data over a period of time. Since in our case the data is continuous data over 30 days, we can use the regression model to predict the next values in a similar manner. Here, the dataset of 30 days is divided in a ratio of 2:1 for training and testing respectively: the model is first trained on the 20-day data and then evaluated on the remaining 10 days. Evaluation Once the model has been trained on the training dataset of 20 days, it is time to see whether the model is working properly. Let us see how the model works and how it is tested. Step 1: The trained model is fed data regarding the name of the dish and the quantity produced for the same. Step 2: It is then fed data regarding the quantity of food left unconsumed for the same dish on previous occasions. Step 3: The model then works on the entries according to the training it got at the modelling stage. Step 4: The model predicts the quantity of food to be prepared for the next day. Step 5: The prediction is compared to the testing dataset value. From the testing dataset, ideally, the quantity of food to be produced for the next day's consumption should be the total quantity minus the unconsumed quantity. Step 6: The model is tested on the 10 testing datasets kept aside while training. Step 7: The prediction values for the testing dataset are compared to the actual values. 
Step 8: If the prediction values are the same as or very close to the actual values, the model is said to be accurate. Otherwise, either the model selection is changed or the model is trained on more data for better accuracy. Data Collection (imp) Data collection is nothing new in our lives; it has existed in our society for ages. Even when people did not have fair knowledge of calculations, records were still maintained in some way or the other to keep an account of relevant things. Data collection is an exercise which does not require even a tiny bit of technological knowledge. But when it comes to analysing the data, it becomes a tedious process for humans, as it is all about numbers and alphanumeric data. That is where Data Science comes into the picture: it not only gives us a clearer idea of the dataset, but also adds value to it by providing deeper and clearer analyses of it. And as AI gets incorporated into the process, predictions and suggestions by the machine become possible as well. For data-domain projects, the type of data used is mostly in numeric or alphanumeric format, and such datasets are curated in the form of tables. Such databases are very commonly found in any institution for record maintenance and other purposes. Some examples of datasets which you must already be aware of are: Banks: databases of loans issued, account holders, locker owners, employee registrations, bank visitors, etc. ATM machines: usage details per day, cash-denomination transaction details, visitor details, etc. Movie theatres: movie details, tickets sold offline, tickets sold online, refreshment purchases, etc. Sources of Data There exist various sources from which we can collect any type of data required, and the data collection process can be categorised in two ways: Offline and Online. While accessing data from any of the data sources, the following points should be kept in mind: 1. 
Data which is available for public usage only should be taken up. 2. Personal datasets should only be used with the consent of the owner. 3. One should never breach (interfere with) someone's privacy to collect data. 4. Data should only be taken from reliable sources, as data collected from random sources can be wrong or unusable. 5. Reliable sources of data ensure the authenticity and relevancy of the data, which helps in proper training of the AI model. Types of Data For Data Science, the data is usually collected in the form of tables. These tabular datasets can be stored in different formats. Some of the commonly used formats are: 1. CSV: CSV stands for Comma Separated Values. It is a simple file format used to store tabular data. Each line of the file is a data record, and each record consists of one or more fields separated by commas. Since the values of the records are separated by commas, these files are known as CSV files. 2. Spreadsheet: A spreadsheet is a piece of paper or a computer program used for accounting and recording data in rows and columns into which information can be entered. Microsoft Excel is a program which helps in creating spreadsheets. 3. SQL: SQL (Structured Query Language) is a domain-specific programming language designed for managing data held in different kinds of DBMS (Database Management Systems). It is particularly useful in handling structured data. Data Access After collecting the data, to be able to use it for programming purposes, we should know how to access it in a Python code. To make our lives easier, there exist various Python packages which help us access structured data (in tabular form) inside code. Let us take a look at some of these packages: NumPy: NumPy, which stands for Numerical Python, is the fundamental package for mathematical and logical operations on arrays in Python. 
It is a commonly used package when it comes to working with numbers. NumPy offers a wide range of arithmetic operations on numbers, giving us an easier approach to working with them. NumPy also works with arrays, where an array is a homogeneous collection of data: a set of multiple values of the same datatype. The values can be numbers, characters, booleans, etc., but only one datatype can be stored in a given array. In NumPy, the arrays used are known as ND-arrays (N-Dimensional Arrays), as NumPy supports creating n-dimensional arrays in Python. Pandas Pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. The name is derived from the term "panel data", an econometrics term for datasets that include observations over multiple time periods for the same individuals. It provides functionalities like: Introduction to pandas and Series; Introduction to a DataFrame; Data access from the DataFrame using iteration; Selecting data from a DataFrame; Deleting rows/columns from a DataFrame; DataFrame functions. Matplotlib This library is used for data visualization, e.g. creating histograms. It helps in understanding data patterns in a visual format. One of the greatest benefits of visualization is that it gives us visual access to huge amounts of data in easily digestible visuals. Matplotlib comes with a wide variety of plots.
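As a minimal illustration of the CSV format and data access described above, tabular data can be read even with Python's standard csv module (in practice, pandas' read_csv would typically be used). The dish names and quantities below are made up, echoing the restaurant-buffet example:

```python
import csv
import io

# A made-up buffet dataset in CSV form (normally this would be a file on disk).
raw = """dish,prepared,unconsumed
Paneer Tikka,50,8
Dal Makhani,40,5
Fried Rice,60,12
"""

# Each line is a record; fields within a record are separated by commas.
rows = list(csv.DictReader(io.StringIO(raw)))
for row in rows:
    # The rule from the project cycle: prepare total quantity minus unconsumed.
    to_prepare = int(row["prepared"]) - int(row["unconsumed"])
    print(f'{row["dish"]}: prepare about {to_prepare} portions tomorrow')
```

The same table could be loaded into a pandas DataFrame for the modelling stage; the csv module is shown here only because it needs no extra installation.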
