Early Systems in Artificial Intelligence PDF

UNIT 2
EARLY SYSTEMS IN ARTIFICIAL INTELLIGENCE

STUDY GOALS

On completion of this unit, you will have learned …

– about important approaches that have defined the field of artificial intelligence in the past and that continue to influence it today.
– why expert systems are important and how they have contributed to artificial intelligence and computer science.
– about advances brought about in the Prolog programming language.
– the definition of machine learning and how it contributes to artificial intelligence.

2. EARLY SYSTEMS IN ARTIFICIAL INTELLIGENCE

Introduction

Throughout the history of artificial intelligence, a diverse set of approaches has been explored to tackle the problem of emulating cognitive processes and capabilities. Some of these have been virtually abandoned by the scientific community while others are still being actively pursued to this day. However, most of them have experienced great variability in popularity over the course of the last 70 years of research in artificial intelligence. Notably, even abandoned branches of artificial intelligence have brought forth valuable insights into different aspects of intelligence, highlighting the intricacy of cognitive processes and dispelling many early misconceptions about the supposed simplicity of perception and cognition-related tasks.

In this unit, three important branches of artificial intelligence research are introduced, each representing a major vantage point on artificial intelligence that has significantly advanced research in the field.

2.1 Overview of Expert Systems

As the name suggests, the goal of expert systems is to emulate the decision and solution finding process of an expert. The word “expert” refers to a human being with specialized knowledge and experience in a given field, such as medicine or mechanics.
Since problems in any given domain may be similar to each other, but never quite alike, solving problems in a given domain cannot be accomplished by memorization alone. Rather, problem-solving is supplemented by a method for matching experiential knowledge to new problems and application scenarios. Expert systems are therefore composed of a body of formalized knowledge and an inference engine that uses the knowledge base to draw conclusions.

With respect to the representation of knowledge, three main approaches to expert systems can be distinguished:

– Case-based systems store examples of concrete problems together with a successful solution. When presented with a novel, previously unseen case, the system tries to retrieve a solution to a similar case and apply this solution to the case at hand. The key challenge is to define a suitable similarity measure to compare problem settings.
– Rule-based systems represent the knowledge base in the form of facts and if-A-then-B-type rules that describe relations between facts.
– If the problem class to be solved can be categorized as a decision problem, the knowledge can be represented in the form of decision trees. The latter are typically generated by analyzing a set of examples.

Decision tree: A visual representation of multi-decision processes in the form of a tree diagram is called a decision tree. Decision alternatives can be read from the branches with their respective ramifications (nodes).

The inference engine, on the other hand, implements rules of logical reasoning to derive new facts, rules, and conclusions not explicitly contained in the given corpus of the knowledge base.

Historically, expert systems are an outgrowth of earlier attempts at implementing a so-called general problem solver. This approach is primarily associated with the researchers Herbert A. Simon and Allen Newell, who, in the late 1950s, used a combination of insights from cognitive science and mathematical models of formal reasoning to build a system intended to solve arbitrary problems by successive reduction to simpler problems. While this attempt ultimately has to be considered a failure when compared to its lofty goals, it has nevertheless proven highly influential in the development of cognitive science.

One of the initial insights gained from the attempt at general problem solving was that the construction of a domain-specific problem solver should, at least in principle, be easier to achieve. This led the way to thinking about systems that combined domain-specific knowledge with apposite, domain-dependent reasoning patterns. Edward Feigenbaum, who worked at Stanford University, the leading academic institution for the subject at the time, defined the term expert system and built the first practical examples while leading the Heuristic Programming Project. The first notable application was DENDRAL, a system for identifying organic molecules. Given data and rules, the next step was to establish expert systems to help with medical diagnoses of infectious diseases. The expert system that evolved out of this was called MYCIN, which had a knowledge base of around 600 rules. However, it took until the 1980s for expert systems to reach the height of research interest, leading to the development of commercial applications.

The main achievement of expert systems was their role in pioneering the idea of a formal, yet accessible representation of knowledge. This representation was explicit in the sense that it was formulated as a set of facts and rules that were suitable for creation, inspection, and review by a domain expert. This approach thus clearly separates domain-specific business logic from the general logic needed to run the program, with the latter encapsulated in the inference engine.
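The separation between a knowledge base and a generic inference engine described above can be sketched in a few lines of Python. This is only a minimal illustration of forward chaining over if-A-then-B rules, not a reconstruction of DENDRAL, MYCIN, or any other historical system; the rule set, the fact names (such as `engine_cranks`), and the function `forward_chain` are hypothetical and chosen purely for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): if all premises are known, conclude.
Rule = Tuple[FrozenSet[str], str]

# Hypothetical knowledge base (illustrative only, not from a real system).
RULES: List[Rule] = [
    (frozenset({"engine_cranks", "no_spark"}), "ignition_fault"),
    (frozenset({"ignition_fault"}), "check_spark_plugs"),
    (frozenset({"battery_flat"}), "charge_battery"),
]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Generic inference engine: apply rules until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire the rule if every premise is already a known fact.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


observations = {"engine_cranks", "no_spark"}
derived = forward_chain(observations, RULES)
# Show only the facts that were inferred, not the ones that were given.
print(sorted(derived - observations))
```

Note that `forward_chain` contains no domain knowledge at all: swapping `RULES` for a different rule set yields a program for a different purpose, which is the property the text attributes to expert systems.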
In stark contrast, more conventional programming approaches implicitly represent both internal control and business logic in the form of program code that is hard to read and understand for people who are not IT experts. At least in principle, the approach championed by expert systems enabled even non-programmers to develop, improve, and maintain a software solution. Moreover, it introduced the idea of rapid prototyping since the fixed inference engine enabled the creation of programs for entirely different purposes simply by changing the set of underlying rules in the knowledge base.

Rapid prototyping: This is a procedure in which prototypes are built and evaluated as quickly as possible. Through rapid prototyping, feedback on important functionalities can be obtained from prospective users early in the development process.

However, a major downside of the classical expert system paradigm, which finally led to a sharp decline in its popularity, was also related to the knowledge base. As expert systems were engineered for a growing number of applications, many interesting use cases required larger and larger knowledge bases in order to satisfactorily represent the domain in question. This proved problematic in two different respects. Firstly, the computational complexity of inference grows faster than linearly in the number of facts and rules. This means that for many practical problems the system’s answering times were prohibitively high. Secondly, as a knowledge base grows, proving its consistency by ensuring that no constituent parts contradict each other becomes exceedingly challenging.

Consistency: This refers to a set of logical propositions that is free of contradictions, so that all propositions can be true at the same time. A set of propositions in which the statements cannot all be true at the same time is called inconsistent.

The construction of inference engines for expert systems highlighted the need for a programming language that facilitated the formulation of logical rules and reasoning processes. To this end, the programming language Prolog, meaning “programming in logic” or “programmation en logique” in French, became relevant.

2.2 Introduction to Prolog

Prolog was created by French computer scientists Alain Colmerauer and Philippe Roussel, with the logician Robert Kowalski further developing the language. It was first implemented in the early 1970s. The main motivation for creating Prolog was to use it in the development of systems for natural language processing and artificial intelligence. The aim of this section is not to gain programming proficiency in Prolog, but rather to gain an appreciation of the language as a tool for solving logical problems and of its contribution to the development of artificial intelligence and the design of programming languages.

At the most basic level, a digital computer processes information in the form of 0/1 values designated as bits. Clearly, this form of representation is not ideally suited for human interpretation and manipulation. In order to facilitate the programming of such a device, programming languages have been designed to provide abstractions over the fundamental technical layer that are closer to human thinking, algorithmic description, and reasoning patterns. The most important difference between programming languages stems from the degree and kind of abstractions that are introduced. Most of the computer languages developed during the period in which Prolog was conceived and implemented were imperative languages, that is, languages in which a program is written as a series of instructions for the machine to follow in order to produce a desired outcome or solution.
In contrast, Prolog is based on a declarative programming paradigm. The programmer specifies characteristics of the desired solution and the programming language interpreter then constructs a sequence of processing steps to reach the given goal. A prominent example of this paradigm is the structured query language (SQL) for relational databases. A typical query is given by a statement specifying the table from which records are to be retrieved together with one or more conditions the records should fulfill. The database management system then automatically generates an execution plan (a sequence of processing steps) that produces the outcome as specified by the query. Analogously, a Prolog program consists of a collection of facts and rules that relate the facts to one another. Program execution is then initiated by formulating a query using the aforementioned knowledge base.

Declarative programming: This is a programming style in which the programmer specifies the properties of the sought solution but not the algorithm, that is, the sequence of operations that lead to a solution.

Before we dive more deeply into Prolog, consider the following analogy:

1. As a human being, you have a brain that is full of data, facts, numbers, and bits and pieces of knowledge accumulated throughout the course of your life. Think of this as your knowledge base.
2. You also understand rules that go with facts, many of which you have observed and applied over time. Think of these as logic rules that when applied result in good decisions.
3. You are also curious and want to learn about changing your surroundings for the better, and so you often ask questions. To come up with answers, you draw on the facts, apply the rules, and get reasonable, common sense answers that hopefully constitute solutions to perceived problems.

Prolog is designed to formalize these processes in the form of first order logic.
Prolog’s structure is made up of predicates and clauses. A predicate is a Boolean function that assigns a truth value to some object X. As such, predicates are commonly used to describe properties of objects. The term “clause” denotes a logical expression formed from a finite number of literals.

First order logic: This is a branch of mathematical logic, which is also called predicate logic.

Prolog programs typically start by declaring facts and relationships. For example:

– A and B are both male.
– A and B have the same father.
– A and B have the same mother.
– A and B are not the same.

Another relationship declaration could be between a person and a piece of property. For example, the statement “Joachim owns a book” declares a relationship of ownership between Joachim and the book.

Once basic relationships are declared, facts can be considered, questions can be asked, variables can be included, goals can be formulated, and patterns can be matched. What follows is a very small selection of very basic statements illustrating the nature of the language. In Prolog, constants (atoms) and predicate names start with a lowercase letter, whereas variables start with a capital letter.

Table 1: Example of the Prolog Language

Prolog language construct | Prolog syntax | Meaning and output
Fact | lectures(smith, dlmaiai01). | Establishes the fact that Dr. Smith teaches the course DLMAIAI01. It is an example of a Prolog clause.
Predicate (professor/1) | professor(smith). professor(jones). professor(meyer). | Defines the one-argument predicate professor by three facts. Drs. Smith, Jones, and Meyer are professors.
Rule | technicalCourse(X) :- engineeringCourse(X). | All engineering courses are technical courses. Note the use of the variable X!
Query | ?- lectures(smith, dlmaiai01). | Does Dr. Smith teach DLMAIAI01?
Goal | ?- lectures(smith, X). | What courses does Dr. Smith teach? Note the use of the variable X!

Source: Created on behalf of IU (2019).
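To make the idea of querying a knowledge base more concrete, the facts, rule, query, and goal from Table 1 can be imitated with a small Python sketch. This is not how a Prolog interpreter is implemented: real Prolog uses full unification and backtracking over arbitrary rules, whereas the sketch below only matches a query pattern against stored facts and hard-codes the single rule from the table. The names `FACTS`, `match`, and `query` are hypothetical helpers introduced for illustration, and the fact `engineeringCourse(dlmaiai01)` is an assumption added so that the rule has something to act on. Following the Prolog convention, lowercase strings stand for constants and capitalized strings for variables.

```python
from typing import Dict, List, Optional, Tuple

Term = Tuple[str, ...]  # (predicate, arg1, arg2, ...)

# Facts mirroring Table 1; the last fact is an assumption for illustration.
FACTS: List[Term] = [
    ("lectures", "smith", "dlmaiai01"),
    ("professor", "smith"),
    ("professor", "jones"),
    ("professor", "meyer"),
    ("engineeringCourse", "dlmaiai01"),
]


def is_variable(token: str) -> bool:
    """Prolog convention: variables start with a capital letter."""
    return token[:1].isupper()


def match(pattern: Term, fact: Term) -> Optional[Dict[str, str]]:
    """Return variable bindings if the pattern matches the fact, else None."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    bindings: Dict[str, str] = {}
    for p, f in zip(pattern[1:], fact[1:]):
        if is_variable(p):
            # A variable must be bound to the same value everywhere it occurs.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def query(pattern: Term) -> List[Dict[str, str]]:
    """Collect all bindings that satisfy the pattern; [] means 'no'."""
    results: List[Dict[str, str]] = []
    for fact in FACTS:
        bindings = match(pattern, fact)
        if bindings is not None:
            results.append(bindings)
    # Hard-coded rule: technicalCourse(X) :- engineeringCourse(X).
    if pattern[0] == "technicalCourse":
        results += query(("engineeringCourse",) + pattern[1:])
    return results


print(query(("lectures", "smith", "dlmaiai01")))  # yes/no style query
print(query(("lectures", "smith", "X")))          # goal with a variable
print(query(("technicalCourse", "X")))            # answered via the rule
```

A non-empty result list plays the role of Prolog answering "yes" (with the variable bindings that make the query true), and an empty list plays the role of "no". The caller only states what should hold; the search over the facts is left to `query`, which is the declarative flavor the section describes.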
As Prolog was uniquely well adapted to handling logic and querying knowledge bases, it became instrumental in a variety of commercial applications (Roth, 2002), as listed below:

– Environmental studies modeling weather phenomena were conducted at Penn State University, with Prolog used to build a system for weather forecasting and air pollution dispersion.
– The University of Surrey in the United Kingdom developed several systems for water utilities, with Prolog used for water distribution and planning, especially in cases of emergency.
– Manufacturing always seeks to reduce costs. Prolog gained some recognition at Boeing, the aircraft manufacturer, through the development of a system called CASEy, which directs shop floor workers in the application of electrical parts and in how to follow proper operational procedures. This led to a reduction in assembly times.

2.3 Pattern Recognition and Machine Learning (ML)

The field of machine learning is as old as artificial intelligence itself. However, it only recently became the dominant paradigm in artificial intelligence research. One of the most often cited operational definitions of machine learning was coined by the American researcher Tom Mitchell (1997, p. 2): “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” This definition underlines the fact that learning from data is a key characteristic of machine learning. To this end, it draws upon a multitude of methods, ranging from classical statistics to more algorithmically motivated approaches.

In order to gain a better overview of machine learning, it is helpful to distinguish between some prominent approaches to learning.
Basically, the following three types can be distinguished (Russell & Norvig, 2022; Marsland, 2014; Murphy, 2012):

Supervised learning

Supervised learning operates on labeled data sets, meaning that the learning examples consist of object descriptions in the form of features, together with labels pertaining to these objects. The task can then be described as using the given examples to identify a mapping between feature values and outputs that enables the learner to predict the label for hitherto unseen objects. Depending on the kind of output that is to be produced, one distinguishes between regression and classification. In regression, the output is a continuous numerical variable. Thus, regression aims at finding real-valued functions that represent a mapping between the input space of features and the output space of associated values. On the other hand, if the output is restricted to a limited set of values, one speaks of classification. Common examples include labeling e-mail messages as spam or finding images of certain content in large image databases.

Unsupervised learning

Unsupervised learning operates on data without any labeling information. The primary goal is to identify structures or patterns in the data. The most prominent examples of unsupervised learning techniques include clustering (i.e., finding groups of data points with high similarity), dimensionality reduction techniques (i.e., constructing low-dimensional projections of potentially high-dimensional feature spaces that, at the same time, preserve an interesting structure), and statistical techniques that estimate the probability density functions of random variables.

Reinforcement learning

Reinforcement learning considers a learning agent in an environment. The agent can perform actions that influence its internal state and that of the environment. A reward function is employed to judge the utility of the performed actions with respect to a stated goal. Since the agent creates its own learning data through trial and error testing of action alternatives, no prior data collection is necessary. Due to the setting of the learning problem, reinforcement learning is often associated with, or guided by, results from game and decision theory. A prominent example of an artificial intelligence system that employs reinforcement learning techniques is AlphaZero. Instances of this system learned to play the games Go, chess, and shogi at a superhuman level using only knowledge of the basic rules and extensive self-play.

Agent: In the field of artificial intelligence, the term agent refers to an autonomous entity that perceives its environment and acts on it in a goal-oriented manner.

2.4 Use Cases

In this unit, three major currents in the development of artificial intelligence have been identified. The following list provides a brief but by no means exhaustive overview of the multitude of problems that are being addressed using artificial intelligence.

Health care
◦ Wearable devices, not unlike a wristwatch, can monitor vital signs, such as blood pressure and body temperature. From this data, an artificial intelligence agent or expert system can dispense advice relative to the conditions of the wearer.
◦ Given multiple medical conditions, a prescription agent can suggest treatment options in terms of the optimal combination of prescriptions in order to avoid negative side-effects.
◦ An artificial intelligence agent can be used to monitor a physician’s patients and their respective needs to ensure that appointment deadlines are met, especially when there are many patients.

Automobiles and transportation
◦ While automobiles are not yet fully autonomous in normal traffic, the average car is equipped with numerous sensing devices to assist the driver in remaining safe.
◦ Artificial intelligence sensors can detect technical problems originating from the car as well as medical conditions emanating from the driver, such as a driver’s alcoholic breath.

Banking
◦ Major fraud has been detected years after it occurred. Even minor day-to-day irregularities are detectable with pattern recognition technologies using artificial intelligence.
◦ Counterfeit signatures are more easily detected due to the scanning of originals into a database.
◦ Robo-advising is also now offered by banks and broker-dealers. Based on a rich variety of securities and an investor’s risk profile, a robo-advisor can be used to construct an appropriate portfolio.

Manufacturing
◦ Certainty in correctly assembling more than 3,000 aircraft parts can now be achieved with expert systems.
◦ Artificial intelligence technology is good at evaluating all of the many different possibilities facing the design of new products. It can, therefore, be used to assist in the creative design process.

Education
◦ In online instruction, personalization can significantly improve the quality of teaching. This is particularly important when individual support is difficult due to the high number of participants, such as in Massive Open Online Courses (MOOCs).
◦ The timely grading of test results both quantitatively in terms of numerical grades and qualitatively in terms of verbal responses can be enhanced with artificial intelligence technologies.

Retail
◦ Websites can track how interests change based on the number of website visits and purchases made. In cases where website visitors can be identified, artificial intelligence can make personalized purchase predictions.
◦ Chatbots of the future will know callers via voice recognition and display patience, good humor, good manners, and even kindness. To this end, chatbots support customer retention and customer service.
◦ Market segmentation used to be based on geographical regions, such as state, province, or county.
It is now possible on a street-by-street basis.

The options are endless and personal. For example, your vacuum cleaner will have learned your living room layout without missing a corner, minimizing the distance traveled, and your personal instantaneous translator will help you communicate with the locals on your next overseas holiday.

The examples above briefly summarize applications in which artificial intelligence will probably play a significant role. The following case study illustrates just how this technology will assist in meeting urgent company needs.

BUSINESS CASE STUDY

This case study involves the Mizuho Bank in Japan, which is headquartered in Tokyo and has over 500 branches. The issue at hand was customer service interaction, which is varied and complex in the banking industry. It is not like a warehouse where all one has to do is consult an inventory list and answer customer questions with “yes we have that part, and the price is…”. Banking questions often involve international transfers, local regulations, tax issues, fraud, investment advice, and lending interest rates.

Mizuho’s ambition was to analyze customer conversations in real time using a natural language processing (NLP) algorithm so that employees answering customer enquiries had the best information available on their computer screens, enabling them to give good, real-time responses. This case illustrates how artificial intelligence technology can be used as a tool to help humans at work.

The bank’s objective was to improve employee performance in responding to customer calls, especially those of new employees with less practical experience. The bank’s methods included the “Cloud”, internet, NLP algorithms, statistics, and continuous learning as a result of the algorithm listening to phone calls. The results included:

1. A higher level of customer service
2. A reduction in employee response time to customer questions
3. A reduction in call center staff quality improvement training

However, assume that customer conversations were recorded without a customer’s permission. Do you think this constitutes an ethical violation? If so, why? If not, why not?

SUMMARY

This unit has highlighted three main currents in artificial intelligence research that have shaped the field at various times throughout its history.

Expert systems try to emulate the knowledge and decision-making capabilities of human experts. To this end, facts and their governing rules from a certain domain are encoded in a machine-readable form in a knowledge base. An inference engine operates on that knowledge base in order to derive new and hitherto unknown facts and relations that can be used to make decisions or solve problems in the domain in question.

Prolog was introduced as the primary example of logic programming. Logic programming tries to implement first order logical reasoning. Facts, rules, and relations are formulated via predicates and clauses. Programming is done in a declarative way by querying the knowledge base of facts and rules without explicitly specifying the steps that lead to a solution.

As with artificial intelligence, the scientific field of machine learning was established in the late 1950s. Machine learning employs algorithmically motivated techniques and approaches that draw upon statistics to learn from data. Depending on the kind of data used, different types of learning can be distinguished. While supervised learning depends on labels, unsupervised learning is more concerned with the identification of structures and regularities in data. Reinforcement learning is based on the concept of an agent that explores its environment through actions that lead to a reward based on their utility for reaching a goal.
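As a small illustration of the supervised learning setting summarized above, and of Mitchell's definition in terms of experience E, task T, and performance measure P, the following Python sketch classifies one-dimensional points with a 1-nearest-neighbor rule. The data, the labels "low" and "high", and the function names are made up for illustration; nearest-neighbor classification is simply one convenient technique and is not prescribed by the unit. Here E is the list of labeled examples, T is predicting the label of an unseen point, and P is accuracy on a held-out test set.

```python
from typing import List, Tuple

Example = Tuple[float, str]  # (feature value, label)


def predict(x: float, experience: List[Example]) -> str:
    """1-nearest-neighbor: return the label of the closest known example."""
    nearest = min(experience, key=lambda example: abs(example[0] - x))
    return nearest[1]


def accuracy(experience: List[Example], test_set: List[Example]) -> float:
    """Performance measure P: fraction of test points labeled correctly."""
    correct = sum(predict(x, experience) == label for x, label in test_set)
    return correct / len(test_set)


# Hypothetical toy data: points below 5 are "low", points above are "high".
TEST_SET: List[Example] = [(2.0, "low"), (4.0, "low"), (6.0, "high"), (8.0, "high")]
SMALL_EXPERIENCE: List[Example] = [(4.5, "low"), (9.5, "high")]
LARGER_EXPERIENCE: List[Example] = SMALL_EXPERIENCE + [(5.5, "high"), (1.0, "low")]

print(accuracy(SMALL_EXPERIENCE, TEST_SET))   # point 6.0 is misclassified
print(accuracy(LARGER_EXPERIENCE, TEST_SET))  # more examples fix that error
```

With only two labeled examples, the point 6.0 lies closer to the "low" example at 4.5 than to the "high" example at 9.5 and is therefore mislabeled; adding two further examples corrects this, so performance on T, as measured by P, improves with experience E on this toy data.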
