Artificial Intelligence Handouts PDF

Artificial Intelligence
By Dr Zafar. M. Alvi
Artificial Intelligence (CS607)
© Copyright Virtual University of Pakistan

Table of Contents:

1 Introduction
    1.1 What is Intelligence?
    1.2 Intelligent Machines
    1.3 Formal Definitions for Artificial Intelligence
    1.4 History and Evolution of Artificial Intelligence
    1.5 Applications
    1.6 Summary
2 Problem Solving
    2.1 Classical Approach
    2.2 Generate and Test
    2.3 Problem Representation
    2.4 Components of Problem Solving
    2.5 The Two-One Problem
    2.6 Searching
    2.7 Tree and Graphs Terminology
    2.8 Search Strategies
    2.9 Simple Search Algorithm
    2.10 Simple Search Algorithm Applied to Depth First Search
    2.11 Simple Search Algorithm Applied to Breadth First Search
    2.12 Problems with DFS and BFS
    2.13 Progressive Deepening
    2.14 Heuristically Informed Searches
    2.15 Hill Climbing
    2.16 Beam Search
    2.17 Best First Search
    2.18 Optimal Searches
    2.19 Branch and Bound
    2.20 Improvements in Branch and Bound
    2.21 A* Procedure
    2.22 Adversarial Search
    2.23 Minimax Procedure
    2.24 Alpha Beta Pruning
    2.25 Summary
    2.26 Problems
3 Genetic Algorithms
    3.1 Discussion on Problem Solving
    3.2 Hill Climbing in Parallel
    3.3 Comment on Evolution
    3.4 Genetic Algorithm
    3.5 Basic Genetic Algorithm
    3.6 Solution to a Few Problems using GA
    3.7 Eight Queens Problem
    3.8 Problems
4 Knowledge Representation and Reasoning
    4.1 The AI Cycle
    4.2 The dilemma
    4.3 Knowledge and its types
    4.4 Towards Representation
    4.5 Formal KR techniques
    4.6 Facts
    4.7 Rules
    4.8 Semantic networks
    4.9 Frames
    4.10 Logic
    4.11 Reasoning
    4.12 Types of reasoning
5 Expert Systems
    5.1 What is an Expert?
    5.2 What is an expert system?
    5.3 History and Evolution
    5.4 Comparison of a human expert and an expert system
    5.5 Roles of an expert system
    5.6 How are expert systems used?
    5.7 Expert system structure
    5.8 Characteristics of expert systems
    5.9 Programming vs. knowledge engineering
    5.10 People involved in an expert system project
    5.11 Inference mechanisms
    5.12 Design of expert systems
6 Handling uncertainty with fuzzy systems
    6.1 Introduction
    6.2 Classical sets
    6.3 Fuzzy sets
    6.4 Fuzzy Logic
    6.5 Fuzzy inference system
    6.6 Summary
    6.7 Exercise
7 Introduction to learning
    7.1 Motivation
    7.2 What is learning?
    7.3 What is machine learning?
    7.4 Why do we want machine learning
    7.5 What are the three phases in machine learning?
    7.6 Learning techniques available
    7.7 How is it different from the AI we've studied so far?
    7.8 Applied learning
    7.9 LEARNING: Symbol-based
    7.10 Problem and problem spaces
    7.11 Concept learning as search
    7.12 Decision trees learning
    7.13 LEARNING: Connectionist
    7.14 Biological aspects and structure of a neuron
    7.15 Single perceptron
    7.16 Linearly separable problems
    7.17 Multiple layers of perceptrons
    7.18 Artificial Neural Networks: supervised and unsupervised
    7.19 Basic terminologies
    7.20 Design phases of ANNs
    7.21 Supervised
    7.22 Unsupervised
    7.23 Exercise
8 Planning
    8.1 Motivation
    8.2 Definition of Planning
    8.3 Planning vs. problem solving
    8.4 Planning language
    8.5 The partial-order planning algorithm – POP
    8.6 POP Example
    8.7 Problems
9 Advanced Topics
    9.1 Computer vision
    9.2 Robotics
    9.3 Clustering
10 Conclusion

1 Introduction

This booklet is organized into chapters that elaborate on various concepts of Artificial Intelligence. The field is an emerging area of computer science, and a lot of work is underway to mature its concepts. In this booklet we will try to cover some important aspects and basic concepts that will give the reader an insight into the kinds of topics Artificial Intelligence deals with.

So far we have used the name of the field, Artificial Intelligence (commonly referred to as AI), without explaining the name itself. Let us now look at a simple but comprehensive way to define the field. To define AI, let us first try to understand what intelligence is.

1.1 What is Intelligence?

If you were asked a simple question, “How can we define intelligence?”, many of you would know exactly what it is, but most of you would not be able to define it exactly. Is it something tangible? We all know that it exists, but what actually is it?
Some of us will attribute intelligence to living beings and would be of the view that all living species are intelligent. But what about plants and trees? They are living species, but are they also intelligent? So can we say that intelligence is a trait of some living species? Let us try to understand the phenomenon of intelligence by using a few examples.

Consider the following image, where a mouse is trying to search a maze in order to find its way from the bottom left to the piece of cheese in the top right corner of the image. This problem can be considered a common real-life problem that we deal with many times in our lives, i.e. finding a path, maybe to a university, to a friend’s house, to a market, or in this case to the piece of cheese. The mouse tries various paths, as shown by the arrows, and can reach the cheese by more than one path. In other words, the mouse can find more than one solution to this problem. The mouse was intelligent enough to find a solution to the problem at hand. Hence the ability to solve problems demonstrates intelligence.

Let us consider another problem. Consider the sequence of numbers below:

1, 3, 7, 13, 21, ___

If you were asked to find the next number in the sequence, what would be your answer? To help you out, let us solve it for you. The rule is “add the next even number to the previous term”, i.e. if we add 2 to 1 we get 3, then we add 4 to 3 and get 7, then we add 6 to 7 and get 13, then we add 8 to 13 and get 21, and finally if we add 10 to 21 we get 31 as the answer. Again, answering the question requires a little bit of intelligence. The characteristic of intelligence comes in when we try to solve something: we check various ways to solve it, we check different combinations, and we do many other things to solve different problems.
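
The rule behind the sequence puzzle above can be checked with a few lines of Python. This is only a minimal sketch: the function name next_term and the assumption that the gaps between terms grow by a constant amount are ours, introduced for illustration, and are not part of the original puzzle.

```python
def next_term(seq):
    """Guess the next term of a sequence whose gaps grow by a constant step.

    Hypothetical helper, for illustration only.
    """
    # Gaps between consecutive terms, e.g. [2, 4, 6, 8] for the puzzle.
    gaps = [b - a for a, b in zip(seq, seq[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three terms")
    # Assumption: every gap exceeds the previous one by the same step.
    step = gaps[1] - gaps[0]
    if any(g2 - g1 != step for g1, g2 in zip(gaps, gaps[1:])):
        raise ValueError("gaps do not grow by a constant step")
    # The next gap is the last gap plus the step; add it to the last term.
    return seq[-1] + gaps[-1] + step


print(next_term([1, 3, 7, 13, 21]))  # prints 31
```

Note that the program only applies a rule we already gave it. Discovering which rule fits a new sequence is the part that calls for intelligence.
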
All this thinking, this memory manipulation capability, this numerical processing ability and a lot of other things add to one’s intelligence.

All of you have experienced college life. It was very easy for us to look at the timetable and go to the respective classes, without even caring how that timetable was actually developed. In simple cases developing such a timetable is simple. But in cases where we have hundreds of students studying in different classes, with only a few rooms and limited time in which to schedule all those classes, this gets tougher and tougher. The person who makes the timetable has to look into the time schedule, the availability of the teachers, the availability of the rooms, and many other things to fit all the items correctly within a fixed span of time. He has to consider many expressions and thoughts like “IF room A is free AND teacher B is ready to take the class AND the students of the class are not studying any other course at that time THEN the class can be scheduled”. This is a fairly simple one; things get complex as we add more and more parameters, e.g. if we were to consider that teacher B might teach more than one course and might prefer to teach in room C, and many other things like that. The problem gets more and more complex. We are pretty sure that none of us ever realized the complexity our teachers go through while developing these schedules for our classes. However, as we know, such timetables can be developed. All this information has to reside in the developer’s brain. His intelligence helps him to create such a schedule. Hence the ability to think, plan and schedule demonstrates intelligence.

Consider a doctor. He checks many patients daily, diagnoses their diseases, gives them medicine and prescribes behaviors that can help them get cured. Let us think a little and try to understand what he actually does.
Though checking a patient and diagnosing the disease is much more complex, we will keep our discussion very simple and will intentionally leave things out. A person goes to a doctor and tells him that he is not feeling well. The doctor asks him a few questions to clarify the patient’s situation. The doctor takes a few measurements to check the physical status of the person. These measurements might include the temperature (T), blood pressure (BP), pulse rate (PR) and things like that. For simplicity, let us consider that some doctor only checks these measurements and tries to come up with a diagnosis. He takes these measurements and, based on his previous knowledge, tries to diagnose the disease. His previous knowledge is based on rules like: “If the patient has a high BP and normal T and normal PR then he is not well”, “If only the BP is normal then whatever the other measurements may be the person should be healthy”, and many other such rules.

The key thing to notice is that by using such rules the doctor might classify a person as healthy or ill, and might as well prescribe different medicines to him, using the information observed from the measurements according to his previous knowledge. Diagnosing a disease involves much other complex information and many observations; we have just mentioned a very simple case here. However, the doctor is actually faced with solving a problem of diagnosis, having looked at some specific measurements. It is important to consider that a doctor who has a better memory to store all this precious knowledge, and a better ability to retrieve the correct portion of the knowledge for the correct patient, will be better able to classify a patient. This tells us that memory, and the correct and efficient manipulation of memory and information, also count towards one’s intelligence.

Things are not all that simple.
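
The doctor’s two example rules from the diagnosis discussion above can be written down as a small Python sketch. The function name diagnose, the coarse labels "normal" and "high", and the "unknown" fallback are assumptions made for illustration; the text gives only the two rules and says nothing about how raw readings map to labels.

```python
def diagnose(bp: str, t: str, pr: str) -> str:
    """Apply the two illustrative rules quoted in the text.

    Each argument is a coarse label such as "normal" or "high".
    Hypothetical helper; this is not a real diagnostic procedure.
    """
    # Rule: if the BP is normal, the person should be healthy,
    # whatever the other measurements may be.
    if bp == "normal":
        return "healthy"
    # Rule: high BP with normal T and normal PR means not well.
    if bp == "high" and t == "normal" and pr == "normal":
        return "not well"
    # The text gives no rule for the remaining combinations.
    return "unknown"


print(diagnose("high", "normal", "normal"))  # prints not well
print(diagnose("normal", "high", "high"))    # prints healthy
```

Even this toy version shows why memory and retrieval matter: the answer depends entirely on which rules are stored and on picking the right one for the patient at hand.
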
People don’t think about problems in the same manner. Let us give you an extremely simple problem. Just tell us about your height. Are you short, medium or tall? An extremely easy question! Well, you might think that you are tall, but your friend who is taller than you might say, “No! You are not.” The point is that some people might have such a distribution in their minds that people with a height of around 4 ft are short, around 5 ft are medium and around 6 ft are tall. Others might have a distribution in which people with a height of around 4.5 ft are short, around 5.5 ft are medium and around 6.5 ft are tall. Even with the same measurements, different people can reach completely different results because they approach the problem in different ways. Things can be even more complex when the same person, having observed the same measurements, solves the same problem in two different ways and reaches different solutions. But we all know that we answer such fuzzy questions very efficiently in our daily lives. Our intelligence actually helps us do this. Hence the ability to tackle ambiguous and fuzzy problems demonstrates intelligence.

Can you recognize a person just by looking at his/her fingerprint? Though we all know that every human has a distinct fingerprint pattern, a human generally can’t tell, just by looking at a fingerprint image, that this print must belong to person XYZ. On the other hand, having a distinct fingerprint is really important information, as it serves as a unique ID for every human in this world.

Let us consider 5 different people and ask a sixth one to have a look at different images of their fingerprints. We ask him to somehow learn the patterns that make the five prints distinct in some manner. After having seen the images several times, that sixth person might find something that makes the prints distinct.
For example, one of them has fewer lines in the print, another has sharply curved lines, some might have a larger distance between the lines in the print and some a smaller one, and there may be many such features. The point is that after some time, which may be hours or days or maybe even months, that sixth person will be able to look at a new fingerprint from one of those five persons and might, with some degree of accuracy, recognize which one of the five it belongs to. Even with only 5 people the problem was hard to solve. His intelligence helped him to learn the features that distinguish one fingerprint from another. Hence the ability to learn and recognize demonstrates intelligence.

Let us give one last thought and then we will get to why we have discussed all this. A lot of us regularly watch television. Suppose that you mute the volume of your TV set. If you are watching a VU lecture, you will somehow perceive that the person standing in front of you is not singing a song, or anchoring a musical show, or playing some sport. So just by observing the sequence of images of the person you are able to extract meaningful information from the video. Your intelligence helped you to perceive and understand what was happening on the TV. Hence the ability to understand and perceive demonstrates intelligence.

1.2 Intelligent Machines

The discussion in the above section has a lot of consequences when we see it from a different perspective. Let us show you something really interesting now, and informally define the field of Artificial Intelligence at the same time.

What if?
A machine searches through a mesh and finds a path?
A machine solves problems like the next number in the sequence?
A machine develops plans?
A machine diagnoses and prescribes?
A machine answers ambiguous questions?
A machine recognizes fingerprints?
A machine understands?
A machine perceives?
A machine does MANY MORE SUCH THINGS!
A machine behaves as HUMANS do? HUMANOID!!!

We will have to call such a machine intelligent. Is this real or natural intelligence? NO! This is Artificial Intelligence.

1.3 Formal Definitions for Artificial Intelligence

In their book “Artificial Intelligence: A Modern Approach”, Stuart Russell and Peter Norvig comment on artificial intelligence in a very comprehensive manner. They present the definitions of artificial intelligence according to eight recent textbooks. These definitions can be broadly categorized under two themes. The ones in the left column of the table below are concerned with thought processes and reasoning, whereas the ones in the right column address behavior.

Systems that think like humans | Systems that act like humans
------------------------------ | ----------------------------
“The exciting new effort to make computers think … machines with minds, in the full and literal sense” (Haugeland, 1985) | “The art of creating machines that perform functions that require intelligence when performed by people” (Kurzweil, 1990)
“[The automation of] activities that we associate with human thinking, activities such as decision making, problem solving, learning …” (Bellman, 1978) | “The study of how to make computers do things at which, at the moment, people are better” (Rich and Knight, 1991)
“The study of mental faculties through the use of computational models” (Charniak and McDermott) | “A field of study that seeks to explain and emulate intelligent behavior in terms of computational processes” (Schalkoff, 1990)
“The study of the computations that make it possible to perceive, reason and act” (Winston, 1992) | “The branch of computer science that is concerned with the automation of intelligent behavior” (Luger and Stubblefield, 1993)

To make computers think like humans, we first need to devise ways to determine how humans think. This is not that easy.
For this we need to get inside the actual functioning of the human brain. There are two ways to do this:

Introspection: trying to catch our own thoughts as they go by.
Psychological experiments: concerned with the scientific study of mental life.

Only once we succeed in developing some sort of comprehensive theory of how humans think can we come up with computer programs that follow the same rules. The interdisciplinary field of cognitive science brings together computer models from AI and experimental techniques from psychology to try to construct precise and testable theories of the workings of the human mind.

The issue of acting like humans comes up when AI programs have to interact with people or when they have to do something physically that humans usually do in real life. Examples include a natural language processing system holding a dialog with a person, intelligent software giving out a medical diagnosis, a robotic arm sorting manufactured goods on a conveyor belt, and many other such scenarios.

Keeping in view all the above motivations, let us give a fairly comprehensive comment: Artificial Intelligence is an effort to create systems that can learn, think, perceive, analyze and act in the same manner as real humans.

People have also tried to understand the phenomenon of Artificial Intelligence from a different viewpoint. They call this strong and weak AI.

Strong AI means that machines act intelligently and have real conscious minds. Weak AI says that machines can be made to act as if they are intelligent. That is, weak AI treats the brain as a black box and just emulates its functionality, while strong AI actually tries to recreate the functions of the inside of the brain as opposed to simply emulating behavior.

The concept can be explained by an example.
Suppose you have a very intelligent machine that does a lot of tasks with a lot of intelligence. On the other hand, you have a very trivial species, e.g. a cat. If you throw both of them into a pool of water, the cat will try to save her life and will swim out of the pool. The “intelligent” machine would die in the water without any effort to save itself. The cat had strong intelligence; the machine didn’t. If the machine had strong artificial intelligence, it would have used its knowledge to cope with this totally new situation in its environment. But the machine only knew what we taught it, or in other words only knew what was programmed into it. It never had the inherent capability of intelligence which would have helped it to deal with this new situation.

Most researchers are of the view that strong AI can’t actually ever be created and that whatever we study and understand while dealing with the field of AI is related to weak AI. A few are also of the view that we can get to the essence of strong AI as well. This is a standing debate, but the purpose here was to introduce you to another way of thinking about the field.

1.4 History and Evolution of Artificial Intelligence

AI is a young field. It has inherited its ideas, concepts and techniques from many disciplines like philosophy, mathematics, psychology, linguistics, biology etc. From a long tradition in philosophy, theories of reasoning and learning have emerged. From over 400 years of mathematics we have formal theories of logic, probability, decision-making and computation. From psychology we have the tools and techniques to investigate the human mind and ways to represent the resulting theories. Linguistics provides us with theories of the structure and meaning of language. From biology we have information about the network structure of the human brain and all the theories on the functions of different human organs.
Finally, from computer science we have the tools and concepts to make AI a reality.

1.4.1 First recognized work on AI
The first work that is now generally recognized as AI was done by Warren McCulloch and Walter Pitts (1943). Their work was based on three sources:

The basic physiology and function of neurons in the human brain
Propositional logic
Turing’s theory of computation

They proposed an artificial model of the human neuron. In their model a neuron is a bi-state element, i.e. on or off, and its state depends on the response to stimulation by a sufficient number of neighboring neurons. They showed, for example, that some network of connected neurons could compute any computable function, and that all the logical connectives can be implemented by simple net structures. They also suggested that suitably connected networks could learn, but they didn’t pursue this idea much at that time. Donald Hebb (1949) demonstrated a simple updating rule for modifying the connection strengths between neurons, such that learning could take place.

1.4.2 The name of the field as “Artificial Intelligence”
In 1956 some U.S. researchers got together and organized a two-month workshop at Dartmouth. There were altogether only 10 attendees. Allen Newell and Herbert Simon dominated the workshop. Although all the researchers had some excellent ideas and a few even had demo programs such as checkers players, Newell and Simon already had a reasoning program, the Logic Theorist. The program came up with proofs for logic theorems. The Dartmouth workshop didn’t lead to any new breakthroughs, but it did introduce all the major people who were working in the field to each other. Over the next twenty years these people, their students and colleagues at MIT, CMU, Stanford and IBM dominated the field of artificial intelligence.
The most lasting and memorable thing that came out of that workshop was an agreement to adopt the new name for the field: Artificial Intelligence. So this was when the term was actually coined.

1.4.3 First program that thought humanly
In the early years AI met dramatic success. The researchers were highly motivated to try out AI techniques on problems that had not yet been solved, and many of them met great successes. Newell and Simon’s early success was followed up with the General Problem Solver. Unlike the Logic Theorist, this program was designed to attack a problem by imitating the steps that humans take when solving it. Though it catered for a limited class of problems, it was found to address those problems in a way very similar to humans. It was probably the first program that imitated the human approach to thinking.

1.4.4 Development of Lisp
In 1958, at the MIT AI Lab, McCarthy defined the high-level language Lisp, which became the dominant AI programming language in the following years. Though McCarthy had the tools required to implement programs in this language, access to scarce and expensive computing resources was a serious problem. Thus he and other researchers at MIT invented time sharing. Also in 1958 he published a paper titled Programs with Common Sense, in which he described the Advice Taker, a hypothetical program that can be seen as the first complete AI system. Unlike the other systems of that time, it was to cater for general knowledge of the world. For example, he showed how some simple rules could help a program generate a plan to drive to an airport and catch a plane.

1.4.5 Microworlds
Marvin Minsky (1963), a researcher at MIT, supervised a number of students who chose limited problems that appeared to require intelligence to solve. These limited domains became known as Microworlds.
Some of them developed programs that solved calculus problems; others developed programs that accepted input statements in a very restricted subset of the English language and generated answers to them. An example statement and answer:

Statement: If Ali is 2 years younger than Umar and Umar is 23 years old, how old is Ali?
Answer: Ali is 21 years old.

In the same era a few researchers also met significant success in building neural networks, but neural networks will be discussed in detail in the section titled “Learning” in this book.

1.4.6 Researchers started to realize problems
In the beginning, AI researchers very confidently predicted their upcoming successes. Herbert Simon said in 1957:

It is not my aim to surprise or shock you -- but the simplest way I can summarize is to say that there are now in the world machines that think, that learn and that create. Moreover, their ability to do these things is going to increase rapidly until -- in a visible future -- the range of problems they can handle will be coextensive with the range to which the human mind has been applied.

In 1958 he predicted that computers would be chess champions, and that an important new mathematical theorem would be proved by machine. Over the years it became clear that such statements and claims were overly optimistic. A major problem that AI researchers started to realize was that, though their techniques worked fairly well on one or two simple examples, most of them failed when tried on a wider selection of problems and on more difficult tasks.

One of the problems was that early programs often didn’t have much knowledge of their subject matter, and succeeded by means of simple syntactic manipulations. For example, Weizenbaum’s ELIZA program (1965), which could apparently engage in serious conversation on any topic, actually just borrowed and manipulated the sentences typed into it by a human. Many language translation programs tried to translate sentences by simply replacing words, without catering for the context in which they were used, and hence totally failed to preserve the meaning of the sentence being translated. The famous retranslation of “the spirit is willing but the flesh is weak” as “the vodka is good but the meat is rotten” illustrates the difficulties encountered.

The second kind of difficulty was that many problems that AI was trying to solve were intractable. Most AI programs in the early years attacked a problem by trying out different combinations of steps until the right solution was found. This didn’t always work: on many intractable problems the number of combinations was too large for this approach to succeed.

A third problem arose because of fundamental limitations on the basic structures being used to generate intelligent behavior. For example, in 1969 Minsky and Papert’s book Perceptrons proved that although perceptrons could be shown to learn anything they were capable of representing, they could represent very little.

In brief, these developments made researchers realize that as they tried harder and more complex problems, the pace of their success decreased, so they refrained from making highly optimistic statements.

1.4.7 AI becomes part of Commercial Market
Even after realizing the basic hurdles and problems in the way of achieving success in this field, the researchers went on exploring new ground and techniques. The first successful commercial expert system, R1, began operation at Digital Equipment Corporation (McDermott, 1982).
The program basically helped to configure orders for new computer systems. A detailed study of expert systems comes later in this book. For now, consider an expert system to be a program that solves a certain problem by using previously stored information about the rules and facts of the domain to which that problem belongs.

In 1981, the Japanese announced the “Fifth Generation” project, a 10-year plan to build intelligent computers running Prolog in much the same way that ordinary computers run machine code. The project proposed to achieve full-scale natural language understanding along with many other ambitious goals. By this time people had begun to invest in the field, and many AI projects were commercially funded and accepted.

1.4.8 Neural networks reinvented
Although computer science had rejected the concept of neural networks after Minsky and Papert’s Perceptrons book, in the 1980s at least four different groups reinvented the back propagation learning algorithm, which was first found in 1969 by Bryson and Ho. The algorithm was applied to many learning problems in computer science, and the widespread dissemination of the results in the collection Parallel Distributed Processing (Rumelhart and McClelland, 1986) caused great excitement. People tried out back propagation neural networks as a solution to many learning problems and met great success.

The diagram above summarizes the history and evolution of AI.

1.5 Applications
Artificial Intelligence finds application in a lot of areas, not only in computer science but in many other fields as well. We will briefly mention a few of the application areas here; various applications of the field are covered in detail later in this booklet.
Many information retrieval systems, like the Google search engine, use artificially intelligent crawlers and content-based searching techniques to improve the efficiency and accuracy of information retrieval.

A lot of computer games, such as chess, 3D combat games and even many arcade games, use intelligent software to make the user feel as if the machine on which the game is running is intelligent.

Computer vision is a new area where people are trying to give machines the sense of visual perception. Computer vision applications help to accomplish tasks which previously required human vision, e.g. recognizing human faces, understanding and interpreting images, analyzing medical scans and innumerable other tasks.

Natural language processing is another area, which tries to make machines speak and interact with humans just like humans themselves. This requires a lot from the field of Artificial Intelligence.

Expert systems form probably the largest industrial application of AI. Software like MYCIN and XCON/R1 has been successfully employed in the medical and manufacturing industries respectively.

Robotics is another branch linked with the applications of AI, where people are trying to develop robots which can be called humanoids. Organizations have developed robots that act as pets, visitor guides etc.

In short, there are vast applications of the field, and a lot of research work is going on around the globe in its sub-branches. As mentioned previously, during the course of the booklet you will find details of many applications of AI.
1.6 Summary
- Intelligence can be understood as a trait of some living species
- Many factors and behaviors contribute to intelligence
- Intelligent machines can be created
- To create intelligent machines we first need to understand how the real brain functions
- Artificial intelligence deals with making machines think and act like humans
- It is difficult to give one precise definition of AI
- The history of AI is marked by many interesting happenings through which the field gradually evolved
- In the early years people made optimistic claims about AI, but soon they realized that it’s not all that smooth
- AI is employed in various fields like gaming, business, law, medicine, engineering, robotics, computer vision and many others
- This book will guide you through basic concepts and some core algorithms that form the fundamentals of Artificial Intelligence
- AI has enormous room for research and possesses a diverse future

Lecture No. 4 -10

2 Problem Solving
In chapter one, we discussed a few factors that demonstrate intelligence. Problem solving was one of them; we referred to it using the examples of a mouse searching a maze and the next-number-in-the-sequence problem.

Historically, people viewed the phenomenon of intelligence as strongly related to problem solving. They used to think that a person who is able to solve more and more problems is more intelligent than others.

In order to understand how exactly problem solving contributes to intelligence, we need to find out how intelligent species solve problems.

2.1 Classical Approach
The classical approach to solving a problem is pretty simple. Given a problem at hand, use the hit and trial method to check various solutions to that problem. This hit and trial approach usually works well for trivial problems and is referred to as the classical approach to problem solving.

Consider the maze searching problem.
The mouse travels through one path and finds that the path leads to a dead end. It then backtracks somewhat, goes along some other path, and again finds that there is no way to proceed. It goes on searching in this way, trying different solutions, until a sequence of turns in the maze takes it to the cheese. Hence, of all the solutions the mouse tries, the one that reached the cheese was the one that solved the problem.

Suppose a toddler has to switch on the light in a dark room. He sees a switchboard with a number of buttons on it. He presses one and nothing happens; he presses the second one and the fan turns on; he goes on trying different buttons till at last the room is lit and his problem is solved.

Consider another situation, in which we have to open the combination lock of a briefcase. Most of you will have seen such a lock: it has several dials, and we adjust the individual digits to obtain a combination that opens the lock. If we don’t know the correct combination of digits, we usually try 0-0-0, 7-7-7, 7-8-6 or any such combination. We are solving this problem in the same manner as the toddler did in the light switch example.

All these examples have one thing in common: different intelligent species use a similar approach to solve the problem at hand. This approach is essentially the classical way in which intelligent species solve problems. Technically, we call this hit and trial approach the “Generate and Test” approach.

2.2 Generate and Test
This is a technical name given to the classical way of solving problems, where we generate different combinations to solve our problem, and the one which solves the problem is taken as the correct solution. The rest of the combinations that we try are considered incorrect solutions and hence are discarded.
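The Generate and Test idea can be sketched in a few lines of Python using the briefcase lock from the earlier example. This is only an illustrative sketch: the function name `generate_and_test`, the three-dial lock and the secret combination `SECRET` are assumptions made for the example, not part of the handout.

```python
from itertools import product


def generate_and_test(candidates, is_solution):
    """Try candidates one by one and return the first that passes the test."""
    for candidate in candidates:
        if is_solution(candidate):
            return candidate      # the "correct solution"
        # otherwise the candidate is an "incorrect solution" and is discarded
    return None                   # every candidate failed the test


# Hypothetical secret combination of a three-dial lock (assumed for this example).
SECRET = (7, 8, 6)


def opens_lock(combination):
    """The tester: does this combination open the lock?"""
    return combination == SECRET


# The generator: every combination from 0-0-0 to 9-9-9, in order.
all_combinations = product(range(10), repeat=3)
found = generate_and_test(all_combinations, opens_lock)
print(found)
```

The generator and the tester are kept separate on purpose, so the same loop could be reused for, say, maze paths by supplying a different generator and test.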
[Figure: Generate and Test. A Solution Generator passes Possible Solutions to a Tester, which outputs Correct Solutions and rejects Incorrect Solutions.]

The diagram above shows a simple arrangement of a Generate and Test procedure. The box on the left labeled “Solution Generator” generates different solutions to the problem at hand; e.g. in the case of the maze searching problem, the solution generator can be thought of as a machine that generates different paths inside the maze. The “Tester” checks whether a possible solution from the solution generator solves our problem or not. Again, in the case of maze searching, the tester can be thought of as a device that checks whether a path is a valid path for the mouse to reach the cheese. If the tester verifies the solution to be a valid path, the solution is taken to be the “Correct Solution”. If, on the other hand, the solution is incorrect, it is discarded as an “Incorrect Solution”.

2.3 Problem Representation
All the problems that we have seen till now were trivial in nature. When the magnitude of the problem increases and more parameters are added, e.g. the problem of developing a time table, we have to come up with procedures better than the simple Generate and Test approach. Before even thinking of developing techniques to solve problems systematically, we need to know one more thing about problem solving, namely problem representation. The key to problem solving is a good representation of the problem. A natural way to represent problems is to use graphics and diagrams to develop a clear picture of the problem in your mind. As an example, consider the diagram below.

OFF | OFF | OFF
    ON | OFF | OFF
        ON | ON | OFF
        ON | OFF | ON
    OFF | ON | OFF
        ON | ON | OFF
        OFF | ON | ON
    OFF | OFF | ON
        ON | OFF | ON
        OFF | ON | ON

It shows the toddler’s problem of switching on the light in graphical form. Each rectangle represents a state of the switchboard.
OFF | OFF | OFF means that all three switches are OFF. Similarly, OFF | ON | OFF means that the first and the last switches are OFF and the middle one is ON. Starting from the state in which all the switches are OFF, the child can proceed in any of three ways by switching one of the switches ON. This brings the toddler to the next level in the tree. From here he can explore the other options, till he gets to a state where the switch corresponding to the light is ON. Hence our problem is reduced to finding a node in the tree in which the position corresponding to the light switch is ON. Observe how representing a problem in a nice manner clarifies the approach to be taken in order to solve it.

2.4 Components of Problem Solving
Let us now be a bit more formal in dealing with problem solving and look at the components that constitute it, namely: Problem Statement, Goal State, Solution Space and Operators. We will discuss each of them in detail.

2.4.1 Problem Statement
This is the essential component through which we get to know exactly what the problem at hand is. The two major things that we learn about the problem are the information about what is to be done and the constraints with which our solution should comply. For example, we might say that given an infinite amount of time, one will be able to solve any problem one wishes to solve. But an “infinite amount of time” is not a practical assumption. Hence whenever we address a problem we have to see how much time our solution may take at most. Time is not the only constraint. The availability of resources and all the other parameters laid down in the problem statement tell us about the rules that have to be followed while solving the problem.
For example, taking the same example of the mouse, our problem statement will tell us things like: the mouse has to reach the cheese as soon as possible, and if it is unable to find a path within an hour it might die of hunger. The statement might also tell us that the mouse is located in the lower left corner of the maze and the cheese in the top left corner, that the mouse can turn left or right, that it may or may not be allowed to move backward, and so on. Thus it is the problem statement that gives us a feel for what exactly to do and helps us start thinking about how exactly things will work in the solution.

2.4.2 Problem Solution
While solving a problem, we should know what our ultimate aim is, that is, what the output of our procedure should be in order to solve the problem. For example, in the case of the mouse, the ultimate aim is to reach the cheese. The state of the world in which the mouse is beside the cheese, and probably eating it, defines the aim. This state of the world is also referred to as the Goal State, or the state that represents the solution of the problem.

2.4.3 Solution space
In order to reach the solution we need to try various strategies. We might or might not follow a systematic strategy in all cases. Whatever we follow, we have to go through a certain number of states to reach the solution. For example, the mouse being in the lower left corner of the maze is a state, namely the start state. The mouse being stuck in some corner of the maze is a state. Being stuck somewhere else is another state. Traveling along a path is yet another state, and finally the mouse reaching the cheese is the state called the goal state. The set consisting of the start state, the goal state and all the intermediate states is called the solution space.
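To make the idea of a solution space concrete, the sketch below enumerates every state of the toddler’s three-switch board from section 2.3, starting from the state in which all switches are OFF. It is a minimal illustration under stated assumptions: states are tuples of booleans (True meaning ON), the only action is switching one OFF switch ON, and the light is assumed to be wired to the third switch (`LIGHT = 2`), which the handout does not specify.

```python
def successors(state):
    """States reachable by switching ON one switch that is currently OFF."""
    for i, is_on in enumerate(state):
        if not is_on:
            yield state[:i] + (True,) + state[i + 1:]


def solution_space(start):
    """Collect the start state and every state reachable from it."""
    seen = {start}
    pending = [start]
    while pending:
        state = pending.pop()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return seen


START = (False, False, False)   # OFF | OFF | OFF
LIGHT = 2                       # assumed position of the light switch

space = solution_space(START)
goal_states = {s for s in space if s[LIGHT]}
print(len(space), len(goal_states))
```

With three two-way switches the space contains 2 x 2 x 2 = 8 distinct states, and any state with the light switch ON counts as a goal state under the assumption above.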
2.4.4 Traveling in the solution space
We have to travel inside this solution space in order to find a solution to our problem. Traveling inside a solution space requires something called “operators”. In the mouse example, turn left, turn right and go straight are the operators which help us travel inside the solution space. In short, the action that takes us from one state to another is referred to as an operator. So while solving a problem we should clearly know which operators we can use to reach the goal state from the start state. The sequence of these operators is the solution to our problem.

2.5 The Two-One Problem
In order to explain the four components of problem solving in a better way, we have chosen a simple but interesting problem to help you grasp the concepts. The diagram below shows the setting of our problem, called the Two-One Problem.

Start: 11?22
Goal: 22?11
Legal Moves: Slide, Hop
Rules: 1s move right; 2s move left; only one move at a time; no backing up

A simple problem statement is as follows. You are given a rectangular container that has 5 slots in it. Each slot can hold only one coin at a time. Place Rs.1 coins in the two left slots, keep the center slot empty and put Rs.2 coins in the two right slots. A simple representation can be seen in the diagram above, where the Start State shows the coins placed as just described. Our aim is to reach a state of the container in which the left two slots contain Rs.2 coins, the center slot is empty and the right two slots contain Rs.1 coins, as shown in the Goal State. There are certain simple rules to play this game. They are listed in the diagram under the heading “Rules”. The rules define the constraints under which the problem has to be solved. The legal moves are the Operators that we can use to get from one state to the other.
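The components of the Two-One Problem listed in the diagram (start state, goal state, legal moves and rules) can be written down directly and handed to a systematic search. The sketch below is one possible encoding, not the handout’s own: the board is assumed to be a 5-character string with `_` for the empty slot, a hop is assumed to jump over exactly one coin of either kind into the empty slot, and a plain breadth-first search is used to look for a sequence of operators.

```python
from collections import deque

START, GOAL = "11_22", "22_11"

# (position of the moving coin relative to the empty slot, coin, operator label)
# 1s only move right, so they enter the gap from its left; 2s only move left.
OPERATORS = ((-1, "1", "S"), (-2, "1", "H"), (1, "2", "S"), (2, "2", "H"))


def moves(state):
    """Yield (operator label, next state) for every legal slide or hop."""
    gap = state.index("_")
    for offset, coin, label in OPERATORS:
        src = gap + offset
        if 0 <= src < len(state) and state[src] == coin:
            cells = list(state)
            cells[gap], cells[src] = coin, "_"
            yield label, "".join(cells)


def solve(start=START, goal=GOAL):
    """Breadth-first search; returns a list of (operator, resulting state)."""
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if state == goal:
            path = []
            while parent[state] is not None:
                previous, label = parent[state]
                path.append((label, state))
                state = previous
            return path[::-1]
        for label, nxt in moves(state):
            if nxt not in parent:
                parent[nxt] = (state, label)
                frontier.append(nxt)
    return None  # goal not reachable


path = solve()
for label, state in path:
    print(label, state)
```

Because coins never move backward, a wrong move cannot be undone, which is why hit and trial can get stuck while a search that keeps track of all alternatives does not.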
As an example of these operators, we can slide a coin to its left or right if the neighboring slot is empty, or we can hop a coin over a single slot into an empty one. The rules say that Rs.1 coins can slide or hop only towards the right. Similarly, the Rs.2 coins can slide or hop only towards the left. You can only move one coin at a time.

Now let us try to solve the problem in a trivial manner, using the hit and trial method without addressing the problem systematically.

Trial 1
[Figure: the container at the Start State and after each of Moves 1 to 5.]

In Move 1 we slide a 2 to the left, then we hop a 1 to the right, then we slide the 2 to the left again, then we hop the 2 to the left, and then we slide the 1 to the right. At this point at least one 2 and one 1 are at the positions required in the goal state, but we are stuck: there is no other valid move which takes us out of this state. Let us consider another trial.

Trial 2
[Figure: the container after each move of Trial 2.]

Starting from the start state we first hop a 1 to the right, then we slide the other 1 to the right, and then suddenly we get STUCK!! Hence solving the problem through hit and trial might not give us the solution.

Let us now try to address the problem in a systematic manner. Consider the diagram below. Starting from the start state, if we hop, we get stuck. If we slide, we can carry on further. Keeping this observation in mind, let us now develop all the possible combinations that can arise after we slide.

[Figure: tree of all states reachable from the start state 11?22, with each link labeled H (hop) or S (slide). Hopping from the start state leads to the dead ends ?1122 and 1122?; sliding leads to 1?122 and 112?2, from which the tree continues down to the goal state.]

The diagram above shows a tree-like structure enumerating all the possible states and moves. Looking at this diagram we can easily figure out the solution to our problem. This tree-like structure represents the “Solution Space” of the problem. The labels on the links are H and S, representing the hop and slide operators respectively. Hence H and S are the operators that help us travel through this solution space in order to reach the goal state from the start state.

We hope that this example clarifies the terms problem statement, start state, goal state, solution space and operators. It will be a nice exercise to design your own simple problems and try to identify these components in them in order to develop a better understanding.

2.6 Searching
All the problems that we have looked at can be converted to a form where we start from a start state and search for a goal state by traveling through a solution space. Searching is a formal mechanism for exploring alternatives.

Most solution spaces can be represented as a graph in which nodes represent different states and edges represent the operators which take us from one state to another. If we can get to grips with algorithms for searching in graphs and trees, we’ll be all set to perform problem solving in an efficient manner.

2.7 Tree and Graphs Terminology
Before studying the searching techniques defined on trees and graphs, let us briefly review some underlying terminology.

[Figure: a tree with root A and nodes A to J.]

- “A” is the “root node”
- “A, B, C …. J” are “nodes”
- “B” is a “child” of “A”
- “A” is an ancestor of “D”
- “D” is a descendant of “A”
- “D, E, F, G, I, J” are “leaf nodes”
- Arrows represent “edges” or “links”

The diagram above is just to refresh your memory of the terminology of a tree. As for graphs, there are undirected and directed graphs, which can be seen in the diagram below.
[Figure: a Directed Graph and an Undirected Graph, each on the nodes A to I.]

Let us first consider a couple of examples to learn how graphs can represent important information with the help of nodes and edges.

Graphs can be used to represent city routes.

Graphs can be used to plan actions.

We will use graphs to represent problems and their solution spaces. One thing to be noted is that every graph can be converted into a tree by replicating nodes. Consider the following example.

[Figure: a city map graph with nodes S, A, B, C, D, E, F and G and numbered edges, shown together with the tree obtained from it by taking S as the root.]

The graph in the figure represents a city map with cities labeled S, A, B, C, D, E, F and G. By following a simple procedure we can convert this graph to a tree. Start from node S and make it the root of your tree, then check which nodes are adjacent to it. In this case A and D are adjacent to it. Hence, in the tree, make A and D children of S. Go on proceeding in this manner and you’ll get a tree with a few nodes replicated. Depending on the starting node you can get a different tree. But recall that when solving a problem we usually know the start state and the end state, so we will be able to transform our problem graphs into problem trees. If we can develop an understanding of algorithms that are defined for tree searching and tree traversal, we will be in better shape to solve problems efficiently.

We know that problems can be represented as graphs, and we are familiar with the components of problem solving. Let us now address problem solving in a more formal manner and study the searching techniques in detail, so that we can systematically approach the solution to a given problem.
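The graph-to-tree procedure described above can be sketched as a short recursive function. This is an illustration under assumptions: the graph is given as an adjacency dictionary, the small graph `GRAPH` below is a hypothetical fragment (it only borrows from the text the fact that A and D are adjacent to S), and a node is not repeated along a single root-to-leaf path so that the unfolding stops, while the same node may still appear on several branches.

```python
def graph_to_tree(graph, root):
    """Unfold an undirected graph into a nested-dict tree rooted at `root`.

    A node may be replicated on different branches, but is never repeated
    on the path from the root to itself (otherwise the tree would be infinite).
    """
    def expand(node, path):
        return {
            neighbor: expand(neighbor, path | {neighbor})
            for neighbor in graph[node]
            if neighbor not in path
        }

    return {root: expand(root, frozenset({root}))}


# Hypothetical example graph (not the handout's city map).
GRAPH = {
    "S": ["A", "D"],
    "A": ["S", "B", "D"],
    "B": ["A"],
    "D": ["S", "A"],
}

tree = graph_to_tree(GRAPH, "S")
print(tree)
```

In the resulting tree, A appears both as a child of S and as a child of D, which is the replication of nodes the text refers to.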
2.8 Search Strategies
The search strategies and algorithms that we will study are primarily of four types: blind/uninformed, informed/heuristic, any-path/non-optimal and optimal-path search algorithms. We will discuss each of these using the same mouse example.

Suppose the mouse does not know where or how far away the cheese is, and is totally blind to the configuration of the maze. The mouse would blindly search the maze without any hints to help it decide whether to turn left or right at a junction. It would use a pure hit and trial approach and check all combinations till one takes it to the cheese. Such searching is called blind or uninformed searching.

Consider now that the cheese is fresh and its smell has spread through the maze. The mouse will now use this smell as a guide, or heuristic (we will comment on this word in detail later), to guess the position of the cheese and choose the best of the alternative choices. As the smell gets stronger, the mouse knows that the cheese is closer. Hence the mouse is informed about the cheese through the smell and thus performs an informed search in the maze.

For now you might think that an informed search will always give us a better solution and will always solve our problem. This might not be true, as you will find out when we discuss the word heuristic in detail later.

When solving the maze search problem, we saw that the mouse can reach the cheese by different paths. In the diagram above two possible paths are shown.

In any-path/non-optimal searches we are concerned with finding any one solution to our problem. As soon as we find a solution, we stop, without considering that there might be a better way to solve the problem which takes less time or fewer operators.

Contrary to this, in optimal-path searches we try to find the best solution.
For example, in the diagram above the optimal path is the blue one, because it is shorter and requires fewer operators. Hence in optimal searches we find the solutions that are least costly, where the cost of a solution may be defined differently for each problem.

2.9 Simple Search Algorithm
Let us now state a simple search algorithm that will give you an idea of the sort of data structures used while searching and of the stopping criteria for a search. The strength of the algorithm is that we will be able to use it for both Depth First Search (DFS) and Breadth First Search (BFS).

Let S be the start state.
1. Initialize Q with the start node, Q = (S), as the only entry; set Visited = (S)
2. If Q is empty, fail. Else pick node X from Q
3. If X is a goal, return X; we’ve reached the goal
4. (Otherwise) Remove X from Q
5. Find all the children of state X not in Visited
6. Add these to Q; add the children of X to Visited
7. Go to Step 2

Here Q represents a priority queue. The algorithm is simple and doesn’t need much explanation. We will use this algorithm to implement blind (uninformed) searches. The algorithm can, however, be used to implement informed searches as well. The critical step in the Simple Search Algorithm is the picking of a node X from Q according to a priority function. Let us call this function P(n). While using this algorithm for any of the techniques, our priority will be to reduce the value of P(n) as much as we can. In other words, the node with the highest priority will be the one with the smallest value of P(n), where n is the node referred to as X in the algorithm.

2.10 Simple Search Algorithm Applied to Depth First Search
Depth First Search dives into a tree deeper and deeper to find the goal state.
We will use the same Simple Search Algorithm to implement DFS by keeping our priority function as

P(n) = 1 / height(n)

As mentioned previously, we give priority to the element with the minimum P(n); hence the node with the largest value of height will have the highest priority to be picked from Q. The following sequence of diagrams will show you how DFS works on a tree using the Simple Search Algorithm.

We start with a tree containing nodes S, A, B, C, D, E, F, G, and H, with H as the goal node. In the bottom left table we show the two queues Q and Visited. According to the Simple Search Algorithm, we initialize Q with the start node S, shown below.

If Q is not empty, pick the node X with the minimum P(n) (in this case S, as it is the only node in Q). Check if X is the goal (in this case X is not the goal). Hence find all the children of X not in Visited and add them to Q and Visited. Go to Step 2.

Again check if Q is not empty, and pick the node X with the minimum P(n) (in this case either A or B, as both of them have the same value for P(n)). Check if X is the goal (in this case A is not the goal). Hence, find all the children of A not in Visited and add them to Q and Visited. Go to Step 2.

Go on following the steps in the Simple Search Algorithm till you find a goal node. The diagrams below show you how the algorithm proceeds.

Here, when we pick H in the 5th row of the table and check whether it is the goal, the algorithm says YES, and hence we return H, as we have reached the goal state. The path followed by the DFS is shown by green arrows at each step. The diagram below also shows that DFS didn't have to search the entire search space; by traveling through only half the tree, the algorithm was able to find the solution.
Hence, simply by selecting a specific P(n), our Simple Search Algorithm was converted to a DFS procedure.

2.11 Simple Search Algorithm Applied to Breadth First Search

Breadth First Search explores the breadth of the tree first and progresses downward level by level. Now we will use the same Simple Search Algorithm to implement BFS by keeping our priority function as

P(n) = height(n)

As mentioned previously, we give priority to the element with the minimum P(n); hence the node with the smallest value of height will have the highest priority to be picked from Q. In other words, the smaller the depth/height, the greater the priority. The following sequence of diagrams will show you how BFS works on a tree using the Simple Search Algorithm.

We start with a tree containing nodes S, A, B, C, D, E, F, G, and H, with H as the goal node. In the bottom left table we show the two queues Q and Visited. According to the Simple Search Algorithm, we initialize Q with the start node S.

If Q is not empty, pick the node X with the minimum P(n) (in this case S, as it is the only node in Q). Check if X is the goal (in this case X is not the goal). Hence find all the children of X not in Visited and add them to Q and Visited. Go to Step 2.

Again, check if Q is not empty, and pick the node X with the minimum P(n) (in this case either A or B, as both of them have the same value for P(n)). Remember, n refers to the node X. Check if X is the goal (in this case A is not the goal). Hence find all the children of A not in Visited and add them to Q and Visited. Go to Step 2.

Now we have B, C and D in the list Q. B has height 1, while C and D are at height 2. As we are to select the node with the minimum P(n), we will select B and repeat. The following sequence of diagrams shows how the algorithm proceeds till it reaches the goal state.
When we pick H in the 9th row of the table and check whether it is the goal, the algorithm says YES, and hence we return H, since we have reached the goal state. The path followed by the BFS is shown by green arrows at each step. The diagram below also shows that BFS travels a significant area of the search space if the solution is located somewhere deep inside the tree. Hence, simply by selecting a specific P(n), our Simple Search Algorithm was converted to a BFS procedure.

2.12 Problems with DFS and BFS

Though DFS and BFS are simple searching techniques which can get us to the goal state very easily, both of them have their own problems. DFS has small space requirements (linear in depth) but has two major problems: DFS can run forever in search spaces with infinite-length paths, and DFS does not guarantee finding the shallowest goal. BFS guarantees finding the shallowest path even in the presence of infinite paths, but it has one great problem: BFS requires a great deal of space (exponential in depth). We can still come up with a better technique which caters for the drawbacks of both these techniques. One such technique is progressive deepening.

2.13 Progressive Deepening

Progressive deepening actually emulates BFS using DFS. The idea is simply to apply DFS down to a specific level. If you find the goal, exit; otherwise repeat DFS down to the next lower level. Go on doing this until you either reach the goal node or the full height of the tree is explored. For example, apply DFS to level 2 in the tree; if it reaches the goal state, exit; otherwise increase the level of DFS and apply it again, until you reach level 4. You can increase the level of DFS by any factor. An example will further clarify your understanding.
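Before the worked example, the procedure just described can be sketched in Python. This is a minimal sketch, not the handout's own code: the function names, the max_depth, first_limit and step parameters, and the 15-node tree below (S plus A to N arranged as a complete binary tree, with I as the goal) are all assumptions made for illustration, since the tree diagram is not reproduced here.

```python
# Hypothetical tree: S plus nodes A..N arranged as a complete binary tree.
TREE = {
    "S": ["A", "B"],
    "A": ["C", "D"], "B": ["E", "F"],
    "C": ["G", "H"], "D": ["I", "J"],
    "E": ["K", "L"], "F": ["M", "N"],
}


def depth_limited_dfs(node, goal, limit):
    """DFS that never goes more than `limit` levels below `node`."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in TREE.get(node, []):
        path = depth_limited_dfs(child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None


def progressive_deepening(start, goal, max_depth, first_limit=1, step=1):
    """Repeat depth-limited DFS with a growing limit.

    Returns (path, limit at which the search stopped); path is None on failure.
    """
    limit = min(first_limit, max_depth)
    while True:
        path = depth_limited_dfs(start, goal, limit)
        if path is not None:
            return path, limit
        if limit >= max_depth:
            return None, limit
        limit = min(limit + step, max_depth)


# Limits 2 then 4, as in the example in the text.
by_twos = progressive_deepening("S", "I", max_depth=4, first_limit=2, step=2)
# Limits 1, 2, 3: raising the limit one level at a time.
by_ones = progressive_deepening("S", "I", max_depth=4)
```

Each pass is an ordinary DFS, so memory use stays linear in the current depth limit; the price is that the shallow levels are re-explored on every pass.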
Consider the tree on the previous page with nodes from S to N, where I is the goal node. Apply DFS to level 2 in the tree. The green arrows in the diagrams below show how DFS will proceed to level 2.

After exploring to level 2, the progressive deepening procedure will find that the goal state has still not been reached. Hence it will increase the level by a factor of, say, 2, and will now perform a DFS in the tree to depth 4. The blue arrows in the diagrams below show how DFS will proceed to level 4.

As soon as the procedure finds the goal state it will quit. Notice that it finds a solution at minimum depth, like BFS, provided the depth limit is raised one level at a time; with a larger increment, as in this example, the goal found lies within the current limit but need not be the shallowest one. Imagine that there are a number of solutions below level 4 in the tree. The procedure would travel only a small portion of the search space and, without large memory requirements, would find the solution.

2.14 Heuristically Informed Searches

So far we have looked into procedures that search the solution space in an uninformed manner. Such procedures are usually costly with respect to time, space, or both. We now focus on a few techniques that search the solution space in an informed manner, using something called a heuristic. Such techniques are called heuristic searches. The basic idea of a heuristic search is that, rather than trying all possible search paths, you try to focus on paths that seem to be getting you closer to your goal state, using some kind of "guide". Of course, you generally can't be sure that you are really near your goal state. However, we might be able to use a good guess for the purpose.
Heuristics are used to help us make that guess. It must be noted that heuristics don't always give us the right guess, and hence don't always lead to correct solutions. In other words, educated guesses are not always correct.

Recall the example of the mouse searching for cheese. The smell of cheese guides the mouse in the maze; in other words, the strength of the smell tells the mouse how far it is from the goal state. Here the smell of cheese is the heuristic, and it is quite accurate.

Similarly, consider the diagram below. The graph shows a map in which the numbers on the edges are the distances between cities; for example, the distance between city S and city D is 3, and between B and E it is 4.

Suppose our goal is to reach city G starting from S. There can be many choices: we might take S, A, D, E, F, G, or travel from S to A, to E, to F, and to G. At each city, when deciding which city to go to next, we would like some sort of information to guide us toward the city from which the distance to the goal is minimum. If someone can tell us the straight-line distance of G from each city, then it might help us as a heuristic in deciding our route.

Consider the graph below. It shows the straight-line distances from every city to the goal. Now, cities that are closer to the goal should be our preference. These straight-line distances, also known as "as the crow flies" distances, shall be our heuristic.

It is important to note that heuristics can sometimes misguide us. In the example we have just discussed, one might try to reach city C, as it is closest to the goal according to our heuristic, but in the original map you can see that there is no direct link between city C and city G. Even if someone reaches city C using the heuristic, he won't be able to travel to G from C directly; hence the heuristic can misguide.
The catch here is that crow-flight distances do not tell us whether two cities are directly connected.

Similarly, in the example of the mouse and the cheese, consider that the maze has fences fixed along some of the paths, through which the smell can pass. Our heuristic might guide us onto a path which is blocked by a fence; hence again the heuristic is misguiding us.

The conclusion, then, is that heuristics do help us reduce the search space, but it is not at all guaranteed that we'll always find a solution. Still, many people use them, as most of the time they are helpful. The key lies in how we use the heuristic. Consider the notion of a heuristic function. Whenever we choose a heuristic, we come up with a heuristic function which takes the heuristic as input and gives us a number corresponding to that heuristic. The search will now be guided by the output of the heuristic function. Depending on our application, we might give priority to either larger numbers or smaller numbers. Hence to every node/state in our graph we will assign a heuristic value, calculated by the heuristic function.

We will start with a basic heuristically informed search called Hill Climbing.

2.15 Hill Climbing

Hill Climbing is basically a depth first search with a measure of quality that is assigned to each node in the tree. The basic idea is: proceed as you would in DFS, except that you order your choices according to some heuristic measurement of the remaining distance to the goal. We will discuss hill climbing with an example. Before going to the actual example, let us give an analogy that explains why the name Hill Climbing has been given to this procedure.

Consider a blind person climbing a hill. He cannot see the peak of the hill.
The best he can do is, from a given point, to try steps in all possible directions; wherever he finds that a step takes him higher, he takes that step and reaches a new, higher point. He goes on doing this until no possible step in any direction takes him higher, and this would be the peak; hence the name hill climbing. Notice that each step we take gets us closer to our goal, which in this example is the peak of a hill. Such a procedure can, however, run into some problems.

Foothill Problem: Consider the diagram of a mountain below. Before reaching the global maximum, that is, the highest peak, the blind man will encounter local maxima, the intermediate peaks that lie below the maximum height. At each of these local maxima, the blind man gets the impression of having reached the global maximum, as none of the steps takes him to a higher point. Hence he might reach a local maximum and think that he has reached the global maximum, thus getting stuck in the middle of searching the solution space.

Plateau Problem: Similarly, consider another problem, as depicted in the diagram below. On mountains where flat areas called plateaus are frequently encountered, the blind person might again get stuck. When he reaches a portion of the mountain which is totally flat, whatever step he takes gives him no improvement in height; hence he gets stuck.

Ridge Problem: Consider another problem: you are standing on what seems like a knife-edge contour running generally from northeast to southwest. If you take a step in one direction it takes you lower; on the other hand, when you step in some other direction it gives you no improvement.

All these problems can be mapped to situations in our solution space searching. If we are at a state and the heuristics of all the available options take us to a lower value, we might be at a local maximum.
Similarly, if all the available heuristics take us to no improvement, we might be at a plateau. The same is the case with a ridge, as we can encounter such states in our search tree. The solution to all these problems is randomness. Try taking random steps of random length in random directions, and you might get out of the place where you are stuck.

Example

Let us now take you through an example of searching a tree using hill climbing, to end our discussion of hill climbing. Consider the diagram below. The tree corresponds to our problem of reaching city M starting from city S. In other words, our aim is to find a path from S to M. We now associate a heuristic with every node, namely the straight-line distance from the path-terminating city to the goal city.

When we start at S we see that if we move to A we will be left with 9 units to travel. As moving to A gives us an improvement in reaching our goal, we move to A, in exactly the same manner as the blind man moves up a step that gives him more height.

Standing on A we see that C takes us closer to the goal; hence we move to C. From C we see that city I gives us more improvement; hence we move to I and then finally to M.

Notice that we traveled only a small portion of the search space and reached our goal. Hence the informed nature of the search can help reduce space and time.

2.16 Beam Search

You just saw how the hill climbing procedure works through the search space of a tree. Another procedure, called beam search, proceeds in a similar manner. Out of n possible choices at any level, beam search follows only the best k of them; k is a parameter which we set, and the procedure considers only that many nodes at each level.
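The level-by-level selection of the best k nodes can be sketched in Python as below. This is a rough sketch under stated assumptions: the function name beam_search, the tree layout and the heuristic values in H are all hypothetical, chosen only to be consistent with the walkthrough that follows (goal L, k = 2), and a smaller heuristic value is assumed to mean a better node.

```python
# Hypothetical search tree and heuristic values (lower = better).
TREE = {
    "S": ["A", "B"],
    "A": ["C", "D"], "B": ["E", "F"],
    "C": ["G", "H"], "E": ["I", "J"],
    "H": ["K", "L"], "J": ["M", "N"],
}
H = {
    "S": 10, "A": 8, "B": 9,
    "C": 5, "D": 7, "E": 6, "F": 8,
    "G": 5, "H": 3, "I": 6, "J": 4,
    "K": 2, "L": 0, "M": 3, "N": 1,
}


def beam_search(start, goal, k):
    """Return the nodes kept at each level, or None if the goal is never kept."""
    beam = [start]
    levels = [beam]
    while beam:
        if goal in beam:
            return levels
        # Gather every child of the nodes currently in the beam ...
        candidates = [c for node in beam for c in TREE.get(node, [])]
        # ... and keep only the k with the best heuristic values.
        beam = sorted(candidates, key=H.get)[:k]
        levels.append(beam)
    return None


levels = beam_search("S", "L", k=2)
```

Because nodes outside the best k are discarded for good, beam search can miss a goal that lies below a discarded node; a larger k trades memory for a lower chance of that happening.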
The following sequence of diagrams will show you how beam search works in a search tree. We start with a search tree with L as the goal state and k = 2; that is, at every level we will consider only the best 2 nodes. Standing on S, we observe that the only two nodes available are A and B, so we explore both of them, as shown below.

From here we have C, D, E and F as the available options. Again we select the two best of them, and we explore C and E, as shown in the diagram below.

From C and E we have G, H, I and J as the available options, so we select H and J; similarly, at the last level we select L and N, of which L is the goal.

2.17 Best First Search

Just as beam search considers the best k nodes at every level, best first search considers all the open nodes so far and selects the best amongst them. The following sequence of diagrams will show you how a best first search procedure works in a search tree.

We start with a search tree as shown above. From S we observe that A is the best option, so we explore A.

At A we now have C, G, D and B as the options. We select the best of them, which is D.

At D we have S, G, B, H, M and J as the options. We select H, which is the best of them.

At last, from H we find L to be the best. Hence best first search is a greedy approach that looks for the best amongst the available options, and hence can sometimes reduce the searching time. All these heuristically informed procedures are considered better than uninformed ones, but they do not guarantee the optimal solution, as they depend on the quality of the heuristic being used.

2.18 Optimal Searches

So far we have looked at uninformed and informed searches. Both have their advantages and disadvantages.
But one thing lacking in both is that whenever they find a solution, they immediately stop. They never consider that there might be more than one solution to the problem, and that a solution they have ignored might be the optimal one.

The simplest approach to finding the optimal solution is this: find all the possible solutions using either an uninformed or an informed search, and once you have searched the whole search space and no other solution exists, choose the best amongst the solutions found. This approach is analogous to the brute force method and is also called the British Museum procedure.

But in reality, exploring the entire search space is rarely feasible and at times not even possible. For instance, if we just consider the tree corresponding to a game of chess (we will learn about game trees later), the effective branching factor is 16 and the effective depth is 100. The number of branches in an exhaustive survey would be on the order of 16^100, or about 10^120. Hence a huge amount of computation power and time is required to solve optimal search problems in a brute force manner.

2.19 Branch and Bound

In order to solve our problem of optimal search without using a brute force technique, people have come up with different procedures. One such procedure is called the branch-and-bound method.

The simple idea of branch and bound is the following: the length of the complete path from S to G is 9. Also note that while traveling from S to B along S D A B we have already covered a distance of 9 units. So traveling further from S D A B to some other node will only make the path longer. So we ignore any further paths ahead of the path S D A B.

We will show this with a simple example. The diagram above shows the same city road map, with the distances between cities labeled on the edges. We convert the map to a tree as shown below.
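Before walking through the tree, the bounding idea can be sketched in Python. This is one possible sketch, not the handout's procedure verbatim: partial paths are expanded in order of the distance covered so far, and any partial path that is already as long as the best complete path found is discarded. The road map is not reproduced here, so the edge list is hypothetical; only the distances S to D = 3 and B to E = 4 come from the text, and the rest were chosen so that the shortest route from S to G has length 9 and the path S D A B also has length 9. The names EDGES, GRAPH and branch_and_bound are illustrative.

```python
import heapq

# Hypothetical road map (only S-D = 3 and B-E = 4 are taken from the text).
EDGES = [
    ("S", "A", 2), ("S", "D", 3), ("A", "B", 3), ("A", "D", 3),
    ("B", "C", 2), ("B", "E", 4), ("D", "E", 1), ("E", "F", 2),
    ("F", "G", 3),
]

# Build an undirected adjacency map: GRAPH[city][neighbour] = distance.
GRAPH = {}
for a, b, dist in EDGES:
    GRAPH.setdefault(a, {})[b] = dist
    GRAPH.setdefault(b, {})[a] = dist


def branch_and_bound(start, goal):
    """Return (length, path) of a shortest route, or (inf, None) if none exists."""
    best_cost, best_path = float("inf"), None
    frontier = [(0, [start])]            # (distance so far, partial path)
    while frontier:
        cost, path = heapq.heappop(frontier)
        if cost >= best_cost:
            continue                     # bound: cannot beat the best route
        city = path[-1]
        if city == goal:
            best_cost, best_path = cost, path
            continue
        for nxt, dist in GRAPH[city].items():
            if nxt not in path:          # do not revisit a city on this path
                heapq.heappush(frontier, (cost + dist, path + [nxt]))
    return best_cost, best_path


result = branch_and_bound("S", "G")
```

On this assumed map the partial path S D A B reaches length 9 and is discarded once the route of length 9 to G is known, which is the pruning described in the text.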
We proceed in a Best First Search manner. Starting at S we see that A is the best option, so we explore A.

The options now are B and D, the children of A, and D, the child of S. Among these, D, the child of S, is the best option, so we explore D.

From here the best option is E, so we go there, then to B, then to D.

Here we have E, F and A as equally good options, so we select arbitrarily and move to, say, A, then E.

When we explore E we find that if we follow this path further, our path length will increase beyond 9, which is the distance from S to G. Hence we block all the further sub-trees along this path, as shown in the diagram below.

We then move to F, as that is the best option at this point, with a value of 7, and then to C. We see that C is a leaf node, so we bound C too, as shown in the next diagram.

Then we move to B on the right-hand side of the tree and bound the sub-trees ahead of B, as they also exceed the path length 9.

We go on proceeding in this fashion, bounding the paths that exceed 9, and hence we are saved from traversing a considerable portion of the tree. The subsequent diagrams complete the search until it has found the optimal solution, which lies along the right-hand branch of the tree.

Notice that we have saved ours
