UNIT 4
MODERN ARTIFICIAL INTELLIGENCE SYSTEMS

STUDY GOALS

On completion of this unit, you will have learned …

– about the interrelationship between computing and artificial intelligence.
– to appreciate the advances in computing technologies since the 1950s and the new insights and opportunities this continues to provide.
– about the difference between the weak or narrow artificial intelligence used today versus the future possibility of Artificial General Intelligence.
– about the potential benefits of modern systems in artificial intelligence relative to computer vision and natural language processing (NLP).

4. MODERN ARTIFICIAL INTELLIGENCE SYSTEMS

Introduction

The two research fields of computer science and artificial intelligence work hand in hand to produce advances. However, they are not always perfectly synchronized, and it can be argued that the current wave of progress in deep learning, in particular, and artificial intelligence, in general, is to a considerable extent driven by recent advances in data storage and computing capabilities. Reciprocally, the last decline of interest in connectionist models of machine intelligence and learning, which preceded the current period of renewed enthusiasm, would not have occurred, at least not in such a pronounced form, if today's distributed computing and storage technology had been available.

4.1 Recent Developments in Hardware and Software

The 1950s, when computing technology developed into an industry, was a period of great excitement. Many universities and government-funded programs created innovative computing devices, nearly all of which ran on vacuum tubes, in which the flow of electrons in a vacuum was used as a switch to turn bits of information on or off. The UNIVAC computer was the first commercial machine at a time when the whole world had only about 100 computers. During this time, Alan Turing (1950) also published the seminal paper "Computing Machinery and Intelligence", proposing the concept of machine intelligence. Interactions between humans and machines were demonstrated, and high-level languages such as FORTRAN, COBOL, and Lisp, which relieved the programmer of tedious bit-level operations, appeared on the market.

In the 1960s, the rate of change accelerated, with an emphasis on computing capacity, integrated circuit design (rather than vacuum tubes), networks, and operating systems. An integrated circuit operates on semiconducting material, performing the same function as larger electronic components. The transformation from vacuum tubes to transistors and integrated circuits had vast commercial implications: it lowered costs, reduced the weight and space of machinery, increased reliability, and decreased operational energy needs. The net effect was that computers became more affordable and simultaneously more powerful. Moore's Law was also coined during this period and became widely accepted, although it is arguably running out of validity today. What were then called "third generation computers" included the IBM-360 and 700, DEC's PDP-8 (Mini), and the CDC-6600, the first supercomputer of the day. The process of miniaturization continues today.

Moore's Law: This law states that complexity, as measured by the number of transistors on a chip, doubles every two years.
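To make the doubling rule concrete, here is a minimal, hedged sketch (in Python) of the arithmetic implied by Moore's Law; the starting transistor count and the time horizon are illustrative assumptions, not historical data.

```python
# Minimal sketch of the doubling arithmetic behind Moore's Law. The starting
# count (about 2,300 transistors, roughly the level of early-1970s
# microprocessors) and the 20-year horizon are illustrative assumptions only.

def projected_transistors(initial_count: float, years: float, doubling_period: float = 2.0) -> float:
    """Project a transistor count assuming it doubles every `doubling_period` years."""
    return initial_count * 2 ** (years / doubling_period)

if __name__ == "__main__":
    for year in range(0, 21, 4):
        print(f"after {year:2d} years: ~{projected_transistors(2300, year):,.0f} transistors")
```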
In the 1970s, hardware became larger (in terms of the total installation) and smaller (in terms of component size), while software capability and the ease of programming improved. During this time, Microsoft Corporation was established by Bill Gates and Paul Allen, while Apple Computer was founded by Steve Wozniak and Steve Jobs. Word processing started with a program called WordStar, and spreadsheet technology with a program called VisiCalc. E-mail began on ARPANET, a communication mechanism for transferring files among defense and academic personnel. The programming language C, which gave rise to C++ in the 1980s, promoted the structured programming paradigm. The intent was to create clarity in software development by avoiding go-to constructs, breaking large programs into smaller functions, and using repetition in logic.

A lot of the progress made in the 1980s can be attributed to the private sector. Much of the personal computing technology we consider ordinary by today's standards was established during this period, including, for example, MS-DOS, the microcomputer operating system adopted by IBM. Many of the companies that made large contributions at the time are not around anymore, such as Tandy, Commodore, and Atari. Among the technologies that have survived their founding days are Adobe Photoshop and TCP/IP, the protocol that carries information across the internet. The whole concept of the World Wide Web (WWW), the idea of formatting online content under the HTTP protocol, developed out of Switzerland's CERN research. This laid the foundation for computer networks and the WWW. In addition, more and more computational work was carried out on workstations during this time.

In the 1990s, hyperlinks became popular as a mechanism for connecting web pages. This made it possible to link related information and browse through it. One of the biggest global developments was the rise of the Windows operating system with Windows 3.0 in 1990, followed by Windows 95 and 98. Microsoft Office also became a worldwide standard and led to productivity increases due to the network effect. The Java language also became a commercial product during this period. On the hardware side, IBM's special computer Deep Blue beat the champion chess player Garry Kasparov, which had implications for artificial intelligence. Nokia, a Finnish communications company, also introduced the first smartphone.

Network effect: When the benefits of a product or service increase depending on the number of users, this is known as the network effect. A familiar example of this effect is the use of internet platforms.

Since the year 2000, hardware and software have become more integrated, producing new products and services. New product applications combining hardware and software relate to wearable devices and augmented reality (AR), of which Microsoft's HoloLens is an example. AR glasses can display various information to the person wearing them. Systems like HoloLens, Oculus Rift, and similar developments combine business and entertainment on the one hand and technology and content on the other. Applications may be found in 3D tourism and 3D product catalogues. Since the year 2000, the open-source browser Firefox has also risen to prominence, competing with Google's Chrome. In the new products realm, Bitcoin, a cryptocurrency, was announced, which has since experienced considerable volatility. The ultimate acceptance or rejection of Bitcoin and other cryptocurrencies is yet to be established.

Cloud Computing

The developed industrial world is moving away from individually owning products and consumables, in both computer hardware and conventional consumer products.
In major cities, the per-use leasing of cars (car sharing) helps reduce traffic congestion and the occupation of parking spaces. Similar offerings for bicycles have been established, enabling more people to share bicycles rather than having to own them personally. Using Uber or Lyft taxis, and possibly even autonomous ones in the future, is less expensive than owning a car in large and mid-sized cities. Renting a vacation home is also less costly than owning one, as the latter requires considerable financial means and entails ongoing expenses.

In computing, similar sharing is also occurring and is referred to as cloud computing. The reasons for the emergence of cloud computing are summarized below:

– Artificial intelligence is advancing rapidly and requires huge computational and data storage resources.
– It is more efficient to share hardware and software resources than to have each client make their own investments, duplicating the effort.
– Building a computer resource sharing facility constitutes an economic opportunity for organizations that have the skills and resources to establish one and use it correctly.

A sample of cloud computing providers includes Amazon, IBM, SAS, Microsoft, Salesforce, Sun Microsystems, Oracle, and others. Cloud computing can also be considered an extension of the internet paradigm, in which data and communications are becoming more democratic and less exclusive, with new economic opportunities being created in the form of paid advertising and data-related services.

In the 1980s, the concept of time sharing became popular, which was a forerunner to cloud computing. Time sharing differed from cloud computing in that it concentrated on the sharing of hardware resources that were always available to everybody. The cost to subscribers was billed in terms of machine cycle time and storage units. In some respects, computing became a commodity like electricity, water, and natural gas.

A few definitions are useful before we proceed to cloud computing and its application to artificial intelligence.

Virtual computers
Virtual computers are single virtual machines created inside a server environment or cloud facility to serve a single client as needed. Multiple virtual machines can operate on the same physical hardware at the same time, simultaneously drawing on data and processing resources.

Cloud computing
Cloud computing is parallel, geographically distributed, and virtualized.

Grid computing
Grid computing gets its name from the electric grid. It is a parallel and geographically distributed architecture that commonly consists of heterogeneous nodes that perform different workloads or applications. Resources may be owned by multiple entities cooperating for the mutual good.

Cluster computing systems
Cluster computing systems are also parallel and geographically distributed, with resources available at runtime. The difference to grid computing is that in the cluster computing case we have connected stand-alone computers operating in unison, i.e., on the same task.

The main benefit of cloud technology for artificial intelligence is its ability to provide a set of highly scalable computational and data storage resources.
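As a hedged, purely local illustration of the "many processors working in unison on one task" idea behind cluster and cloud computing, the following sketch splits a toy workload across a pool of worker processes on a single machine; the record-scoring function and the worker count are invented placeholders, and a real cluster or cloud would distribute the work across separate machines.

```python
# Minimal local sketch of the "many workers, one task" idea behind cluster
# and cloud computing. A process pool on one machine stands in for the
# geographically distributed nodes a real cluster or cloud would provide.
from concurrent.futures import ProcessPoolExecutor

def score_record(record: int) -> int:
    """Toy per-record workload (a placeholder for real model inference)."""
    return record * record

if __name__ == "__main__":
    records = list(range(1_000))
    # Scaling up is a matter of adding workers (or, in the cloud, machines).
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(score_record, records, chunksize=100))
    print(f"processed {len(results)} records; last result = {results[-1]}")
```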
Cloud Case: Artificial Intelligence in Supply Chains

In order to demonstrate how cloud technology and artificial intelligence go hand in hand to solve important real-world application scenarios, we will now examine the case of supply chain management.

Supply chain management, which is a very significant area of application for artificial intelligence, is just one of many examples of commercial scenarios that benefit from cloud technology. The cloud is a physically immense server installation. It can be located anywhere to provide communication and information exchange between worldwide supply chain partner firms. Finally, one cloud can be connected with any number of other clouds. For example, if one cloud installation deals with tourism services, it may be logically connected to another cloud structure focused on public transportation.

Nowadays, supply chains are worldwide. They connect individual enterprises that coordinate their operations for the benefit of the end consumer and for mutual benefit. Cloud computing is the supply chain connecting mechanism, with artificial intelligence ensuring that supply chains operate more efficiently. When applied to supply chain management, artificial intelligence

– uses natural language processing (NLP) to scan contracts, retrieve chat logs and orders, and speed up payments along the chain.
– uses machine learning to confirm trends, quantifying the flow of goods along the chain, with the objective of being at the right place at the right time.
– forecasts the demand for specific products and shares this information with all partners along the supply chain (a minimal forecasting sketch follows after this list).
– optimizes warehouse operations with respect to sending, receiving, picking, and storing products.
– operates autonomous transportation vehicles.
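As a hedged illustration of the demand-forecasting item above, the following sketch applies simple exponential smoothing to a made-up weekly sales series; the data, the smoothing factor, and the function name are illustrative assumptions, not part of any particular vendor's offering.

```python
# Minimal demand-forecasting sketch: simple exponential smoothing over a
# made-up weekly sales history. Real supply chain systems would use far
# richer models and data; this only illustrates the idea of producing a
# forward-looking demand figure that can be shared with partners.

def exponential_smoothing(history, alpha=0.3):
    """Return the one-step-ahead forecast for a list of past demand values."""
    forecast = history[0]
    for observed in history[1:]:
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast

if __name__ == "__main__":
    weekly_units_sold = [120, 135, 128, 150, 162, 158, 171]  # illustrative data
    next_week = exponential_smoothing(weekly_units_sold)
    print(f"forecast for next week: ~{next_week:.0f} units")
```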
The challenge to making all of the above processes work is data. It must be complete, descriptive, accurate, and available in real time to all members of the supply chain.

Assume company X is a large manufacturing enterprise that operates, like most enterprises do, as a sophisticated supply chain stretching over several continents. Managing and coordinating a chain of legally independent companies, which are in part mutually dependent upon each other, has become more complicated. This is due to several reasons: trade laws have changed, technology has required new approaches to the assembly and marketing of finished goods, and, most importantly, supply chain members have insisted on maintaining incompatible IT systems. There are many performance measures in supply chains, but for this case let us limit performance to supply chain-wide inventory levels, on-time product deliveries in perfect quality all the time, and a satisfactorily automated payment system. Let us further assume that such a system has been in operation for eight years and is becoming harder to maintain as product sales volume has doubled and product classes have increased. In order to manage the introduction of cloud computing and artificial intelligence into a business's IT management system, the following key points need to be considered:

– A business case in favor of the cloud needs to be developed in order to convince participants along the supply chain that supporting the concept is worth the effort. Such a case study has to start with a complete understanding of the shortcomings of the current supply chain IT management system. The proposed replacement must address risk, governance, cloud technology, and artificial intelligence applicable to the new IT environment.
– The introduction of cloud computing requires opportunity-focused thinking with respect to costs and benefits. Issues such as new opportunities for collaboration, new services, improved data security, error reduction, improved accountability, risk, and faster customer response times need to be considered.
– Working with stakeholders to create value is an important ingredient in selling an artificial intelligence-related cloud project. Stakeholders are all member companies in the supply chain, including major customers who buy the end product and ultimately provide funding on a continuing basis.

Cloud Artificial Intelligence Service Specialization

Since advanced analytics and artificial intelligence constitute major application areas of cloud computing, relevant suppliers such as Google, Microsoft, and Amazon have integrated machine learning and artificial intelligence offerings into their portfolios. While some of these services are free to try, others require a usage fee. The fees are based on usage volume, storage space, proprietary data, frequency, and other criteria. Artificial intelligence libraries and applications are centered on

– chatbots that conduct simulated conversations with customers and search digital and graphic databases, formulating answers verbally, many times faster than a human operator.
– NLP technology offered as a cloud platform service in many areas of interest, such as the translation of websites.
– visual content classification services that have the capacity to classify a high volume of client images.
– documents from a customer source, such as text, graphics, or photography, which can be presented for machine analysis (e.g., for plagiarism or style detection).

Quantum Computing

Current chip designs are approaching the physical limits of the semiconductor-based hardware paradigm. Thus, the exponential gains in size, complexity, and execution speed of processing units, as seen in the last decade, are likely to come to an end in the foreseeable future. One way around these limits is given by parallel and distributed technologies, as already outlined. Another is to consider entirely novel computational paradigms. Quantum computing belongs to the latter category.

A classical computer represents information in the form of bits, elementary units that can only take on either one of the values 0 or 1 at any given time. In contrast, quantum computing employs the quantum theoretical concept of superposition, derived from quantum physics, which implies that any sum of quantum states is itself a quantum state and that any quantum state can be expressed as the sum of two or more other quantum states. Applying this paradigm to the representation of information leads to the concept of the qubit: a unit of memory that can not only be in the two states of a classical bit, either 0 or 1, but also in a superposition of these states. A quantum computer is a device that is based on this representation of information and the pertaining opportunities for information processing it brings about.

Quantum physics: This is a branch of physics that describes the behavior of elementary particles and their interactions.
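To make the idea of a qubit slightly more concrete, the following hedged sketch (using NumPy) represents a single qubit as a two-component vector of amplitudes, applies a Hadamard gate to put the basis state |0⟩ into an equal superposition, and reads off measurement probabilities as squared amplitude magnitudes. This is a toy calculation, not a simulation of real quantum hardware.

```python
# Toy single-qubit calculation: a state is a 2-component complex vector of
# amplitudes, a gate is a 2x2 unitary matrix, and measurement probabilities
# are the squared magnitudes of the amplitudes.
import numpy as np

ket_0 = np.array([1.0, 0.0], dtype=complex)          # the classical-like state |0>

hadamard = np.array([[1, 1],
                     [1, -1]], dtype=complex) / np.sqrt(2)

superposed = hadamard @ ket_0                         # equal superposition of |0> and |1>
probabilities = np.abs(superposed) ** 2               # squared amplitudes

print("amplitudes:   ", superposed)                   # both roughly 0.707
print("probabilities:", probabilities)                # [0.5, 0.5]
```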
Key points of quantum computing

– The term quantum in quantum computing was adopted from a branch of physics called quantum mechanics. Quantum mechanics was developed in the search for a mathematical description of the behavior of subatomic particles.
– Quantum computing is a new technology that could significantly increase our ability to process information. While it is not yet commercially available, it is being researched by governments and major corporations.
– Quantum computing does not necessarily bring about improvements in all computational tasks. It has been shown that for many important applications, classical algorithms are of comparable performance to their quantum counterparts.
– Quantum computing is expected to have a considerable impact on cryptography. On the one hand, it has the potential of rendering today's most popular encryption algorithms useless; on the other hand, it could enable entirely new cipher methods.
– Since quantum mechanics is a theory that is probabilistic in nature and quantum computing has been shown to be especially well suited for search and optimization tasks, it stands to reason that artificial intelligence will also be impacted by quantum computing.

4.2 Narrow Versus General Artificial Intelligence

A more recent theme in artificial intelligence research is the clear distinction between several related and yet diverse forms of artificial intelligence. In the most general terms, artificial intelligence can be defined as the mechanistic implementation of sensory perception processes, cognition, and problem-solving capabilities. Throughout the history of artificial intelligence as a scientific discipline, researchers have addressed this daunting endeavor by breaking the challenge down to a manageable size, implementing systems that perform specialized functions in controlled environments. This approach is now termed Artificial Narrow Intelligence (ANI) or Weak Artificial Intelligence. It is the opposite of the open-ended, flexible, and domain-independent form of intelligence expressed by human beings, which is commonly termed Artificial General Intelligence (AGI) or Strong Artificial Intelligence.

Artificial Narrow Intelligence

In defining the term Artificial Narrow Intelligence (ANI), it is useful to think of it as all of the artificial intelligence currently in existence and realizable within the foreseeable future. As such, currently existing systems in typical application areas like self-driving vehicles, translation between languages, sales forecasting, NLP, and facial recognition all fall under the concept of ANI.

The word narrow indicates that the type of intelligence in question only pertains to one domain at a time. For example, a given device or system may be able to play chess, but it will not be able to play another strategy game like Go or shogi, let alone perform completely different tasks such as translation. In short, narrow means both a display of intelligence in the sense of the ability to solve a complex problem and a display of intelligence relative to only one task.

Artificial General Intelligence

For Artificial General Intelligence (AGI), the cognitive versatility of a human being is the reference point against which this form of artificial intelligence is measured and judged. The goal is not only to replicate specific instances of sensory data interpretation, language interpretation, or other forms of intelligent behavior, but the full range of human cognitive abilities. Clearly, this entails displaying all the capabilities currently represented by Weak Artificial Intelligence as well as the ability to generalize across domain boundaries, that is, applying things learned in one task to different but related tasks, including motivation and volition. Philosophical sources on the matter (Searle, 1980), in particular, go one step further by also requiring that AGI possess consciousness or self-awareness.
It is an exceptionally difficult task to imagine the development of an artificial intelligence device that simultaneously has all of the following abilities:

– the cognitive ability to learn and function in several domains
– the possession of human-level intelligence across all domains
– the possession of multi-domain problem-solving abilities at the average human level
– independent problem-solving ability
– the ability to think abstractly without direct reference to past experience
– the ability to perceive the whole environment in which it operates
– the ability to entertain hypotheticals for which it has no prior experience
– the ability to motivate itself and the possession of self-awareness

Moreover, the concept of superintelligence, the idea that artificial intelligence acquires cognitive abilities beyond what is possible for humans by engaging in a recursive cycle of self-improvement, is contingent on it first reaching a state of Artificial General Intelligence.

4.3 Natural Language Processing (NLP) and Computer Vision

Natural language processing is a major application domain of current artificial intelligence techniques, and this has been the case throughout the history of artificial intelligence. It consists of three main constituent parts: (1) speech recognition, the identification of words in spoken language and the transformation of speech into text; (2) language understanding, the extraction of meaning from words and sentences and reading comprehension; and (3) language generation, the ability to express information and meaning in the form of well-formed sentences or longer texts.

The ultimate goal is to interpret and use language at a human level. This would not only enable humans to communicate with machines in their natural language, but also allow for a number of interesting language-centered applications, ranging from automatic translation between different languages to the generation of text excerpts, digests, or complete works of literature. While this goal has not yet been achieved, natural language processing is making remarkable progress, which can be observed in the following developments:

– the development of virtual assistants on commercial phones and laptop computers that are ever more responsive to complex inquiries
– the development of enhanced machine translations between two different human languages, which are continually improving and which can now also be run on commercially available smartphones, tablets, or notebooks
– the development of key-word extraction to analyze volumes of text, for example, to assist with media reporting
– the application of sentiment analysis to e-mail and social media texts to assess the writer's mood and emotional attitude towards the subject
– the ability of voice-recognition software to identify speakers
– the ability of speech-recognition software to recognize words, as measured by the accuracy rate and by how well the system can keep up with an ongoing conversation in real time

Taking into account how much our human faculty of reasoning and logical inference is based on language, it is easy to see that the ability to process language is intimately tied to the problem of intelligence itself. As a case in point, consider the now famous Turing Test for the presence of artificial intelligence. Alan Turing (1950) proposed this test as a way of determining whether a machine could be considered intelligent.
The test involves a machine and a human subject, both answering a series of questions from an interrogator via telegraphic connections. If the interrogator cannot identify which of the conversation partners is a human and which is a machine, the machine is considered intelligent. Clearly, this test scenario, which Turing himself called "the imitation game", critically hinges on the ability of the machine to process natural language.

Natural language processing as a technical discipline started in the mid-1950s, during a time of heightened geopolitical tension between the United States and the former Soviet Union. American government institutions had a high demand for English and Russian translators, so translation was outsourced to machines. While preliminary results were promising, translation turned out to be far more complex than initially estimated, and substantial progress in the technology failed to materialize. In its 1966 report, the Automatic Language Processing Advisory Committee (ALPAC) therefore described the technology as "hopeless", temporarily ending funding for natural language processing research and initiating a natural language processing winter.

Almost 20 years later, in the early 1980s, the subject regained interest as a result of three developments:

– Computing power increased in line with Moore's Law, thereby enabling more computationally demanding natural language processing methods.
– A paradigm shift occurred. The first wave of language models was characterized by a grammar-oriented approach that tried to implement ever more complex sets of rules to tackle the complexities of natural everyday language. This changed towards models based on a statistical and decision-theoretic foundation. One of the first approaches was the use of decision-tree analyses rather than man-made, hand-coded rules governing the use of words. Decision tree models lead to hard if-then choices.
– Further refinement of natural language processing was achieved by the use of probability theory in a technique called part-of-speech tagging. This technique uses the stochastic Markov model to describe a dynamic system like speech. In a Markov model, only the last state of the system, together with a set of transition rules, defines the next state, as opposed to approaches that consider the whole history (a minimal sketch of this idea follows after this list).
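As a hedged, minimal sketch of the Markov idea referenced above: the toy transition table below, with invented probabilities over three coarse part-of-speech tags, encodes the property that only the current tag (not the whole history) determines the distribution of the next tag. A real tagger would estimate such probabilities from an annotated corpus and would also model which words each tag emits.

```python
# Toy Markov model over part-of-speech tags. The tag set and transition
# probabilities are invented for illustration.

transitions = {
    "DET":  {"NOUN": 0.8, "VERB": 0.1, "DET": 0.1},
    "NOUN": {"VERB": 0.6, "NOUN": 0.3, "DET": 0.1},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}

def sequence_probability(tags):
    """Probability of a tag sequence as a product of one-step transitions.
    The Markov property means each factor depends only on the previous tag."""
    prob = 1.0
    for previous, current in zip(tags, tags[1:]):
        prob *= transitions[previous][current]
    return prob

if __name__ == "__main__":
    print(sequence_probability(["DET", "NOUN", "VERB", "NOUN"]))  # 0.8 * 0.6 * 0.4
    print(sequence_probability(["DET", "VERB", "VERB", "DET"]))   # 0.1 * 0.1 * 0.5
```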
Overall, the shift towards statistical, decision-theoretic, and machine learning models increased the robustness of natural language processing methods in terms of their ability to cope with previously unencountered constellations. Moreover, it opened up the opportunity to learn and improve by making use of the growing corpora of literature available in electronic form.

Inside Natural Language Processing

The process of natural language understanding is based on numerous constituent parts, such as syntax, semantics, or speech recognition. Discernment of the individual components listed below helps us understand the science of natural language processing.

Syntax
◦ Syntax describes the grammar of a language and, in particular, the prescribed sequence of words in phrases. In language translation between two or more languages, obviously more than one grammar simultaneously comes into play.

Semantics
◦ Semantics refers to the meaning of words. In natural language processing, it answers the question of the meaning and interpretation of a word in a given context.

Speech recognition
◦ Speech recognition takes the recorded sound of a person speaking and converts it to text. The exact opposite is called text-to-speech. Speech-to-text is difficult because voice recognition has to deal with dialects and highly variable pronunciation. As human speech has practically no pauses between words, speech-to-text systems have the difficult task of segmenting words in order to process entire sentences.

Text summaries
◦ Text summaries produce readable summaries of volumes of text on known subjects. An academic or research association may conduct an annual meeting during which many research papers are presented. These papers can be summarized and analyzed for conference reports.

There are many processes operating inside a natural language processing system, and the following examples provide a sample of those that address both syntax and semantics; a minimal code sketch of two of them follows after the list.

Terminology extraction
◦ Terminology extraction programs analyze text and semi-automatically count frequently used words in many languages. The frequency of certain terms, as well as the frequency of co-occurrence with other terms, can provide valuable hints about the topic of a text.

Part-of-speech tagging
◦ Part-of-speech tagging is a method of finding what part of speech a given word in a sentence represents. For example, the word book can be a noun, as in the sentence "I just bought this book", or it can be a verb, as in the sentence "I just booked a table at a restaurant".

Parsing
◦ Parsing refers to grammatically analyzing a sentence or a string of characters. In natural languages, grammar can be quite ambiguous, resulting in sentences with multiple meanings. Constituency parsing builds parse trees with inner nodes, i.e., non-terminal grammatical objects (such as phrases) that need to be further broken down in order to arrive at the terminal nodes, which represent the actual words present in the parsed sentence. Dependency parsing, in contrast, builds the parse tree solely from the terminal nodes, linking the words directly without the help of intermediate grammatical constructs.

Word stemming
◦ The objective of word stemming is to trace derived words back to their origin. For example, the word "opening" can be a noun, as in "the opening of an art show". It can also be a verb, as in "She is opening the door". Word stemming traces the word back to the stem "open".

Machine translation
◦ Machine translation between two or more natural languages is considered the most difficult task to do well. It requires multi-language grammar knowledge, semantics, and facts about one or more domains.

Named entity recognition
◦ Named entity recognition is the task of identifying and classifying words in terms of categories such as names of people, objects, and places. Capitalization of words provides hints but is on its own insufficient to discern named objects. For example, the grammatical rules of capitalization in English are quite different to those found in German.

Relationship extraction
◦ Relationship extraction takes text and identifies the relationships between named objects, such as that between a father and son or between a mother and daughter.

Sentiment analysis
◦ Sentiment analysis aims to discern the prevailing attitude, emotional state, or mood of the author based on word choice.

Disambiguation
◦ Disambiguation of words in sentences deals with the multiple meanings of words. To do this, computers are given a dictionary of words and associated options for the meaning of the words. This enables NLP to make the best choice in a given context. However, new situations always arise and exceptions are possible.

Questioning and answering
◦ Questioning and answering is widely used and highly popular in commercial applications. Answers can be of a simple yes/no form or a specific one-word answer, or the process can be very complex. The question must be understood by the computer, the answer extracted from databases, and then verbalized in the form of an answer.
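As a hedged, self-contained sketch of two of the processes above (terminology extraction by term frequency and a deliberately naive form of word stemming), consider the following example; the sample sentence and the suffix rules are invented for illustration, and a real system would rely on a library such as NLTK or spaCy instead.

```python
# Self-contained sketch of terminology extraction via term frequencies and
# a (deliberately naive) suffix-stripping stemmer. These toy rules stand in
# for the far more careful algorithms used in practice.
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Lowercase word tokenizer based on a simple regular expression."""
    return re.findall(r"[a-z]+", text.lower())

def naive_stem(word: str) -> str:
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

if __name__ == "__main__":
    text = ("I just booked a table at a restaurant, and I am reading "
            "a book about booking systems.")
    stems = [naive_stem(token) for token in tokenize(text)]
    print(Counter(stems).most_common(3))  # 'book' (from booked/book/booking) ranks first
```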
Computer Vision

Seeing and understanding the content of an image is second nature for humans. It is very difficult for computers to do likewise, but it is the ultimate goal of computer vision. Computer vision aims to help computers see and understand the content of images just as well as humans do, and in some cases even better (Szeliski, 2021). In this sense, computer vision is a subfield of artificial intelligence that fits into the scheme of our studies, as pictured below:

Figure 8: Computer Vision and Artificial Intelligence
Source: Created on behalf of IU (2019).

To accomplish machine vision, machine learning is required to identify the content of images. While this process is still far from perfect, substantial advances have been achieved. For humans, seeing and knowing what one is seeing comes naturally. Humans can easily describe the informational content of an image or moving picture and recognize the face of a person they have seen before. Computer vision aims to teach machines to do the same thing.

Developing computer vision is not an idle academic curiosity. It has many influential practical applications associated with substantial market opportunities and improvements in the quality of life. It also carries significant risks. For example, computer vision techniques are used in semi-autonomous driving, robotic control, surveillance technology, and medical image analysis.

Image Acquisition and Signal Processing

Conceptually, the acquisition of image data and the application of signal processing operations, such as filtering, smoothing, or similar image manipulation techniques, has to be distinguished from vision, with the latter defined as the cognitive interpretation of the image content. In this section, we will therefore examine image acquisition in human and computer vision in order to identify similarities and differences.

Human vision

In the process of human seeing, light enters the cornea, which is the transparent area at the front of the human eye. Behind the cornea, an adjustable pupil controls the amount of light entering the eye. Behind the pupil is an oblong lens with an adjustable curvature. Adjustments are made by tightening or loosening attached muscles. A relatively flat curvature enables seeing distant objects. Conversely, a more curved lens brings near objects into focus. The inner eyeball is comprised of a jelly-like substance through which light travels to the retina. The retina is at the back of the eyeball and contains millions of photoreceptors in the form of cones and rods. These cells detect light signals and convert them into electrical signals that are fed into the brain via the optic nerve. The process of converting light into electrical signals is called transduction. These signals are interpreted by the brain, and an understanding of the image content is formed.

Camera vision

The technical counterpart of the eye is the camera. Camera technology has a long-standing history reaching as far back as antiquity with the camera obscura. Notable advances in camera vision are introduced below.

Camera obscura: This is a natural optical phenomenon that occurs when an image is projected through a small hole in a screen or wall, resulting in a reversed and inverted image on the surface opposite to the opening. A pinhole camera is based on the same physical principle.
Pinhole cameras

Unlike the human eye and all conventional photographic cameras, pinhole cameras have no lens. They are made up of a sealed box with a small opening for light to enter. The incoming light projects an inverted image of the scene in front of the aperture onto the back wall of the camera. This process takes advantage of the fact that light travels in straight lines (over short distances).

Film cameras

Film cameras differ from pinhole cameras in that they have a lens through which light has to travel, as well as photographic film located in a sealed box that is exposed by the incoming light. Chemical reactions alter the material of the film, thereby conserving the image. Having a lens opens up the possibility of dynamic focusing and improvements in image quality.

Digital cameras

In digital cameras, a light-capturing sensor is used in place of photographic film. Light enters through a lens that projects the image onto a sensor chip, which in turn captures the image in the form of millions of individual elements called pixels. Conceptually, these millions of pixels emulate the function of the millions of light-sensitive cells in the retina of the human eye. A digital image can then be defined as a string of pixel numbers, representing light intensity and color, that can subsequently be manipulated via image processing tools.

Computer Vision: From Features to Understanding Images

A simple thought experiment is sufficient to show that a direct mapping of pixel-wise image content to a semantically meaningful image interpretation is infeasible. To this end, visualize a very simple scene consisting of a single object in front of a uniform monochrome background. Imagine how the pixel values change when the object changes orientation, the camera zoom is varied, or differently colored lamps are used to light the scene. A significant part of any computer vision processing pipeline therefore consists of the extraction of salient image features above the abstraction level of the pixel. Typical examples of such features are edges (locations with a pronounced change in pixel values), corners (locations where two or more edges join or an edge rapidly changes direction), blobs (uniform subareas in a picture), and ridges (axes of symmetry).
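As a hedged, minimal sketch of this feature-extraction step, the following example builds a tiny synthetic grayscale image as a NumPy array and marks an edge wherever the intensity jump to a neighboring pixel exceeds a threshold; the image, the threshold, and the neighbor-difference operator are illustrative stand-ins for the Sobel-style filters a real pipeline would use.

```python
# Toy edge detection on a synthetic 6x6 grayscale image: an "edge" is any
# pixel where the intensity jump to the right or downward neighbor is large.
import numpy as np

image = np.zeros((6, 6), dtype=float)
image[:, 3:] = 1.0              # bright right half -> a vertical edge between columns 2 and 3

# Differences to the right-hand and lower neighbors (cropped to a common size).
dx = np.abs(np.diff(image, axis=1))[:-1, :]   # horizontal intensity changes
dy = np.abs(np.diff(image, axis=0))[:, :-1]   # vertical intensity changes

edges = np.maximum(dx, dy) > 0.5              # threshold chosen arbitrarily

print(edges.astype(int))                      # 1s mark the detected vertical edge
```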
These image features provide the input for the pattern recognition and machine learning techniques employed to derive semantically interesting image content, such as the recognition of objects like license plates or traffic signs for security or autonomous driving, faces for sorting photo collections with respect to a depicted person, or the discovery of malignant tissue in medical images.

A typical computer vision pipeline thus contains the following steps:

– An image acquisition mechanism, such as a digital camera, is used to acquire an image in a form that is suitable for further computational processing.
– Techniques from the field of signal and image processing, such as sharpening or contrast enhancement, may be employed to improve suitability for subsequent processing steps.
– Based on the pixel content of the image, higher-level image features are extracted to abstract from the raw pixel data.
– The acquired higher-level features are subjected to pattern recognition and machine learning techniques to infer semantically meaningful image content.

Computer vision is, therefore, a discipline that employs methods, approaches, and techniques from numerous fields of scientific study, as indicated in the figure below.

Figure 9: Approaches and Techniques of Computer Vision
Source: Created on behalf of IU (2019).

SUMMARY

This unit has illustrated how artificial intelligence and computer science technologies have advanced simultaneously. For example, techniques for distributed data storage and computing have been crucially important enablers for progress in artificial intelligence. Cloud computing has reduced the cost of data and its processing while enabling seamless scaling of computational and data storage resources according to demand. These cost reductions, in combination with insights and knowledge gained from artificial intelligence itself, have driven economic growth.

While the history of artificial intelligence research has mainly focused on the emulation of cognitive abilities that are highly specialized in their task specificity (Artificial Narrow Intelligence), the goal of reproducing the full range and richness of human cognitive abilities (Artificial General Intelligence) continues to fascinate and occupy philosophers and artificial intelligence researchers alike.

Computer vision and natural language processing are two high-profile application areas of artificial intelligence that have penetrated markets with innovations that remain in high demand. These applications contribute to human well-being while also posing significant ethical and political challenges.