Unit 3: Neuroscience and Cognitive Science

UNIT 3
NEUROSCIENCE AND COGNITIVE SCIENCE

STUDY GOALS

On completion of this unit, you will have learned …
– how neuroscience describes the anatomical and physiological composition of the brain.
– how cognitive science unites different scientific disciplines in the search for models of cognitive processes.
– some of the most salient relations and connections between neuroscience, cognitive science, and artificial intelligence, together with their implications for human and machine intelligence.

3. NEUROSCIENCE AND COGNITIVE SCIENCE

Introduction

The goal of artificial intelligence (AI) can be defined as the mechanical reproduction of intelligent behavior. This evidently poses an enormous scientific and engineering challenge. It should, therefore, come as no surprise that throughout the history of artificial intelligence, researchers and engineers have sought inspiration from the study of natural systems that exhibit the traits and characteristics that artificial intelligence tries to emulate. The only known working examples of such systems can be found in human and animal brains and their associated cognitive functions and abilities. Therefore, the goal of this unit is to familiarize you with the basic tenets of neuroscience and cognitive science, which deal with the study of human and animal nervous systems and the broader scientific endeavor aiming to model and understand cognitive functions, respectively.

3.1 Neuroscience and the Human Brain

As a scientific discipline, neuroscience tries to identify the relevant anatomical structures that form nervous systems and their functional roles. As such, it belongs to the broader field of biology, combining anatomy, physiology, cytology, and the chemical and developmental sciences. In the following section, the focus is primarily on the human brain, as based on our current knowledge it constitutes the most complex and most capable brain specimen.

Nervous system: The sum total of all cells in the body concerned with the forwarding and processing of sensory and control signals is referred to as the nervous system.

Brain Anatomy and Physiology

Anatomically, the brain is a lump of soft tissue that weighs between 1.2 and 1.4 kg in adults, with considerable variation between individuals. Nevertheless, there is no evidence to suggest that brain size is connected with mental capacity. The outer layer of the brain is a highly wrinkled structure with a large surface area that fits closely within the available cranial volume. On a coarse scale, its constituent parts are the cerebrum, the cerebellum, and the brain stem. The latter functions as an interface or relay station between the brain and the spinal cord, which in turn branches out into the peripheral nervous system. It is the locus of control for basic perseverance and maintenance functions, such as heart-rate regulation, breathing, regulation of body temperature, and the wake-sleep cycle. The cerebellum is a structure adjacent to the brain stem underneath the cerebrum. Its main function is motor control, such as steering movement, upholding balance, and maintaining body posture. The largest fraction of the brain is constituted by the cerebrum. All higher functions, such as the interpretation of sensory input, emotions and reasoning, and speech and language understanding, reside here.

Viewed from above, the cerebrum is split into two halves called hemispheres.
An anatomical structure called the corpus callosum forms the connection between these halves, enabling the exchange of signals and communication between the two hemispheres of the brain. The right hemisphere controls the left side of the body, and the left hemisphere controls the right side of the body. Despite this crossover, the brain halves are, generally, highly symmetrical with respect to their function. In contrast, one commonly finds broad claims of functional specialization, also called lateralization, of high-level cognitive functions in popular psychology. The notion that logical and analytical thinking resides in the left hemisphere, whereas creativity is situated in the right hemisphere, might serve as an often-encountered example. Such claims are inaccurate and misleading since most reliable evidence for actual lateralization pertains to more low-level perceptual functions. One notable example of a hemispheric asymmetry is given by Broca's and Wernicke's areas, which play an important role in language processing. These brain regions are usually found in the hemisphere opposite to the dominant hand, which is the brain hemisphere that controls the dominant side of the body.

Lateralization: This refers to functional differentiation across the central body plane, that is, left-right differentiation.

Apart from the hemisphere structure, the brain can be compartmentalized into four main lobes, as indicated in the figure below. This division is based anatomically on the most visibly distinct fissures of the brain surface. Despite the fact that the vast majority of observable brain functions are based on the complex interaction of many of the brain's constituent parts, one can justifiably attribute a certain functional specialization to these lobes.

Figure 2: The Frontal, Parietal, Occipital, and Temporal Lobes of the Brain
Source: Created on behalf of IU (2019), based on the Mayo Foundation for Medical Education and Research, all rights reserved. Used with permission, 2019.

The frontal lobe is responsible for many cognitive abilities that are commonly referred to as higher mental faculties, such as judgement, planning, problem-solving, intelligence, and self-awareness. It is also involved in complex motor control tasks and speech. The parietal lobe, by contrast, is mostly concerned with the interpretation of sensory input. As such, it constructs our spatial-visual perception and interprets visual, auditory, and touch signals. The temporal lobe is involved in the understanding of language, the formation of memory, and sequencing and organization. It is also involved in complex vision tasks, such as the recognition of objects and faces. The role of the occipital lobe lies in performing the early stages of visual signal processing and interpretation.

On a cellular level, the human brain is, on average, composed of about 86 billion (8.6 × 10^10) connected nerve cells called neurons, which are responsible for information processing, and an approximately tenfold higher number of glia cells, which are responsible for the protection, nourishment, and structural support of neurons.

Figure 3: A Neuron
Source: Created on behalf of IU (2019), based on Baillot, 2018.

Above is a representation of a human nerve cell or neuron. The cell body is called the soma. Signals from other neurons reach the soma via branched structures called dendrites. The soma then processes the incoming information and produces a corresponding output that is sent down the axon.
The length of the axon can be 10 to 1,000 times the diameter of the soma. At its end, it branches off into axon terminals that constitute the points of contact for dendrites from neurons further down the signal flow. The difference between soma and axons manifests itself in the distinction, visible to the naked eye, between the gray and white matter seen in brain cross-sections. Gray matter is formed by the cell bodies, whereas white matter is composed of axons.

Brain Function Summary

The human brain functions as the regulator of all our bodily functions 24 hours a day, 7 days a week. On a basic level, an interconnected network of neurons controls the functioning of our body's routine needs, such as breathing, blood pressure, and mobility. Communication between the brain and the body takes place along the spinal column. The brain is responsible for the processing of sensory input in the form of the following modalities:

1. Vision (sight)
2. Audition (hearing)
3. Gustation (taste)
4. Olfaction (smell)
5. Tactition (touch)
6. Thermoception (temperature)
7. Nociception (pain)
8. Equilibrioception (balance)
9. Proprioception (body awareness)

Even more importantly for the subject of this course, the brain is responsible for motivation—the promotion of behaviors that are considered beneficial for the organism, including attention, learning, memory, planning, problem-solving, understanding language, and the ability to form complex ideas. It is important to remember that the brain perceives inputs, processes these inputs based on what it already knows, and then initiates some sort of action. For example, most human brains will recognize the effect of touching a hot plate and react accordingly, at least most of the time.

Figure 4: The Perception of Inputs and the Cognition Process
Source: Created on behalf of IU (2019).

3.2 Cognitive Science

Whereas neuroscience focuses on the study of the anatomy and physiology of nervous systems, cognitive science, by contrast, takes a wider view and examines cognition and cognitive processes in their own right, abstracting from biological actualities to elucidate corresponding functional relationships. Evolutionary and developmental aspects are also addressed. The typical cognitive processes studied in cognitive science are as follows:

– behavior
– intelligence
– language
– memory
– perception
– emotion
– reasoning
– learning

Cognition: This is the mental process of gaining knowledge and understanding as a result of thinking, experience, and the senses.

Approaches, History, and Methods

Fitting its aspiration to be an encompassing study of the mind, the defining characteristic of cognitive science, as a field of scientific endeavor, is its interdisciplinary approach. It draws upon knowledge from a diverse set of disciplines:

– philosophy
– psychology
– neuroscience
– linguistics
– anthropology
– artificial intelligence

From the earliest times, humans have been compelled to think about the origins and workings of the mind. Thus, like artificial intelligence, the intellectual history of cognitive science can be traced back to the dawn of philosophy in antiquity. However, current approaches and methods used in cognitive science derive from twentieth-century developments. Driving forces at the time were George Miller's studies on mental representations and the limitations of short-term memory, Noam Chomsky's work on formal grammars and his scathing critique of the psychological paradigm of radical behaviorism, and early efforts in artificial intelligence.
However, it was not until 1975 that the term "cognitive science" was coined, and a common understanding of this discipline emerged across the various scientific fields.

The scientific sub-disciplines of cognitive science are as diverse as its methodological approaches. Empirical data in the study of cognitive processes is commonly derived from typical experimental methods used in the various disciplines that concern themselves with the study of the mind. The three main approaches outlined below underlie the majority of empirical findings.

1. Brain imaging: This is a tool commonly used in medicine and neuroscience that enables the tracing of neural activity while the brain is performing complex mental tasks.
2. Behavioral experiments: Frequently used in psychology, behavioral experiments allow us to draw conclusions about the processing of stimuli.
3. Simulation via computational modeling: This technique allows us to verify theoretical ideas about the functional processes involved in mental activities by comparing simulated outcomes with real-world behavioral data.

Key Concepts, Influences, and Critique

The representational theory of mind is the prevalent paradigm uniting the majority of work in cognitive science. According to this framework, cognition is achieved by employing computational procedures on mental constructs that can be likened to data structures in computer science. These mental objects or data structures could represent concrete objects in the sense of physical entities or abstractions that pertain to the mental domain alone, such as images, concepts, logical propositions, or analogies. The computational procedures are correspondingly variegated and include deduction, search and matching, and the like.

Cognitive science as a field of research also includes contributions from numerous other specialized disciplines. Through its systemic approach, it also influences the thinking in many associated subject areas. It has thus made relevant contributions to behavioral economics (a newer branch of economics that studies how people actually behave in economic decision-making instead of postulating perfectly rational actors) and the study of cognitive biases and the judgement of risk. However, some of its most noteworthy contributions relate to linguistics, the philosophy of language, and an understanding of the functional roles and interplay of brain structures.

Cognitive bias: This refers to partiality in valuation or judgement that stands in the way of an objective consideration of a given situation. It denotes an often systematic, repeatedly occurring, deviation from rationality.

Despite its marked successes, conventional approaches to cognitive science have also been subject to critique. For example, cognitive science has only recently considered the role of emotion in human thinking and the problem of consciousness. Additionally, as a result of its focus on the individual mind, it has tended to neglect important aspects of cognition, such as its social dimension and issues to do with embodiment and the impact of the physical environment.

3.3 The Relationship Between Neuroscience, Cognitive Science, and Artificial Intelligence

The preceding sections gave an overview of neuroscience and cognitive science. When the subject of this course, artificial intelligence, is considered in relation to these fields, exciting connections begin to emerge.
Biological Neural Networks and the Mind

While the question of whether the brain is the locus of cognition, or just an organ to cool the blood, was debated in Greek antiquity, today we know that the former hypothesis is correct. Not only do we have a wealth of documented cases where specific brain lesions due to accident or illness lead to specific functional impairments, we also know that marked changes in what we would colloquially term the character of a person result from measurable neurological damage. Thus, without resorting entirely, or at least partly, to metaphysical explanations of the origins of the mind, we now have to accept that the brain is the physical base of mental states. This does not mean that we can readily explain every aspect of the mind or cognition in terms of underlying neurological processes. If this were the case, the broad scope of cognitive science, as outlined above, would be superfluous.

Paraphrasing the work of Siegel (2012), the human mind can be described as a human faculty, which is an emerging and self-organizing relational process embodied in the human persona. It is also a facility regulating the flow of energy and information, which is complex, open, non-linear, and takes place simultaneously inside and outside the body. To clarify this definition, the following key terms are defined:

Faculty
"Faculty" refers to all the mental and physical abilities a person is endowed with, which can exhibit considerable variation between individuals.

Self-organizing
"Self-organizing" refers to a process of spontaneous ordering arising from local interactions. For example, clouds in the sky are considered to be self-organizing, as they move with the wind from warm to cold air, store and release moisture, and remain at a particular altitude for a certain period of time.

Emerging
"Emerging" refers to something arising or being given rise to. To stay with our previous example, that which gives rise to the formation of clouds in the sky is often a heat source on the ground, likely to be a plowed field's dark soil absorbing heat from the sun. The resulting column of warm air rising towards the sky forms a cloud.

Relational processes
"Relational processes" signify that there is a significant relationship between the human persona and outside objects and processes, not least in the form of other minds.

A sample of specific human faculties represented by the human mind is given below:

– Conscience: This is the human faculty that judges the difference between right and wrong based on an individual's value system.
– Self-awareness: This is the conscious awareness of being and introspection.
– Judgement: This is the ability to consider evidence and other sources of knowledge in order to make decisions.
– Language: This is the ability to use languages to express ideas.
– Imagination: This is the ability to see possibilities beyond what is immediately being perceived.
– Memory: This is the ability to recall coded and stored information in the brain.
– Thinking: This is the faculty to search for possible reasons or causes.

Neuroscience, Cognitive Science, and Artificial Neural Networks

The relationship between our brain and the many manifestations of mental activities still requires further research, which is also true of the relationship between neural processes and their representation in the form of computational models. Since the beginning of the information technology and computer era, researchers have been fascinated with the prospect of reproducing mental faculties in computational machinery.
This process has always been a two-way exchange. On the one hand, computer scientists, and in particular artificial intelligence researchers, have looked into philosophical, psychological, and neurological models of cognitive capabilities as inspiration for their endeavors. On the other hand, researchers in cognitive processes have built and employed computational models to gain insight into otherwise hard-to-test notions about the functioning of the mind or the neural circuitry found in organisms.

One of the most prominent outflows of such research activities is the computational model of neural activity pioneered by Warren McCulloch and Walter Pitts in the 1940s, variants of which are still being used in connectionist machine learning models today. According to such models, a neuron's function can be characterized in the following way: the cell receives input in the form of electrochemical signals from other neurons that are located upstream in the information processing flow. It then modulates the input according to how often two nerve cells are activated together. The more often this happens, the greater the upregulation of the connection between the neurons. Each neuron takes the sum of all its inputs weighted in this manner and, if the total excitation exceeds some predefined threshold, sends an impulse along the axon, its outgoing connection. This working mechanism is depicted in the figure below based on the following steps:

1. The input is received via input connections that model connection strength via weight parameters w_n.
2. The weighted inputs are summated.
3. The resulting sum S is then subjected to the activation function f(S).
4. Finally, the activation function value is distributed to output connections.

Figure 5: Schematic Depiction of an Artificial Neuron
Source: Created on behalf of IU (2019), based on Knowino, 2010.

Information processing and the learning of input-output associations is, in a limited way, already possible with a single computational unit working according to the schema given above. Nevertheless, the analogy to biological neural systems is generally carried one step further by building networked structures that can be organized in a layered scheme.

Figure 6: Schematic Diagram of a 1-Layer Neural Network
Source: Created on behalf of IU (2019), based on Peixeiro, 2019.
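The computation in steps 1 through 4 above can be written down in a few lines. The following is a minimal sketch, not part of the original text: the inputs, weights, and threshold are invented purely for illustration, and a hard step function stands in for the activation f(S), in the spirit of the McCulloch-Pitts model.

```python
import numpy as np

def artificial_neuron(inputs, weights, threshold=0.0):
    """Steps 1-3: weight the inputs, sum them, apply a step activation."""
    s = np.dot(weights, inputs)            # weighted sum S
    return 1.0 if s > threshold else 0.0   # f(S): fire only above the threshold

def layer(inputs, weight_matrix, threshold=0.0):
    """A single layer: every unit sees the same inputs but has its own weight vector."""
    return np.array([artificial_neuron(inputs, w, threshold) for w in weight_matrix])

# Hypothetical numbers purely for illustration
x = np.array([0.5, -1.0, 2.0])             # incoming signals
W = np.array([[0.4, 0.1, 0.3],             # weights of unit 1
              [-0.2, 0.6, 0.1]])           # weights of unit 2
print(layer(x, W))                         # step 4: outputs passed downstream -> [1. 0.]
```

In this toy layer, the first unit's weighted sum is positive and exceeds the threshold, so it fires; the second unit's sum is negative, so it stays silent.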
Concerning the flow of information processing through the network, two approaches can be found in the literature. These are the feed forward and recurrent approaches.

The feed forward approach
In this type of network, processing, and thus the flow of information, only proceeds in one direction—upstream to downstream. Every node in the network receives inputs, does the processing based on its associated weights and transfer function, and passes the signal onto the connected neurons in the next layer without looping. One typically distinguishes three types of layers: (1) the input layer, (2) one or more hidden layers, and (3) the output layer that encodes the response of the network.

The recurrent approach
In recurrent networks, the flow of information follows a directed graph where the succession of nodes along the graph encodes the temporal succession of processing steps performed by the network. This temporal aspect and the resulting dynamic behavior of the network make this network class particularly well suited to applications that have a time component, such as the processing of a time-series, speech, or handwriting recognition. Such networks also commonly contain memory units that can store information about previous states of the network or its constituent parts.

Clearly, the aforementioned approaches to the creation of learning systems have been inspired by theories of neural information processing. However, this relationship is often over-emphasized in popular sources, such as magazine and news articles, and at times even sensationalized beyond what could be considered factually warranted. It is prudent to keep in mind that these artificial neural networks implement highly simplified models of neural activity that abstract away many of the complexities of biological neural activity. Thankfully, deep learning, the dominant paradigm of neural-inspired machine learning models, emphasizes in its name the concrete property of its pertaining network models, i.e., depth of layering, over vague allusions to the functioning of biological neural networks.

The Human Brain, Its Artificial Representations, and Computer Hardware

Even without referring to computational schemes that explicitly draw inspiration from the neural architecture of our brains, one often finds comparisons between the complexity of current computing machinery to the brain in the popular science discourse. To this end, the complex mobile chip designs of today have transistor counts on the order of 10^10. Thus, the number of transistors in a modern central processing unit (CPU) already approaches the number of neurons in the human brain.

Since a transistor is the most primitive switching unit imaginable and the representation of the function of a single neuron requires a sizeable number of transistors, a more interesting comparison lies in the juxtaposition of the number of units in the largest artificial neural network models existing today with biological counterparts. To this end, the largest current artificial nets have unit counts of between 10^6 and 10^7; however, since the 1980s this number has been doubling roughly every 2.4 years. For reference, the number of neurons in humans is about 10^11, in bees around 10^6, and in frogs 10^8.

It has to be noted, however, that reducing general human intelligence to one number and comparing it to a representative number for machine intelligence is not very meaningful (see, for example, Russell and Norvig, 2022). Nor does comparing the neuron count of the human brain to transistor numbers in a CPU, or to unit counts in network models, lead to any profound conclusions about the state of artificial intelligence. It is no more than an interesting metric.
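Purely to get a feel for the orders of magnitude involved (and keeping in mind the caveats above, this is a back-of-the-envelope extrapolation, not a forecast), one can combine the figures quoted in this section: starting from roughly 10^7 units and doubling every 2.4 years, the neuron count of the human brain would be reached in a few decades.

```python
import math

units_now = 1e7          # upper end of the unit counts quoted above
neurons_human = 1e11     # approximate neuron count of the human brain
doubling_time = 2.4      # years per doubling, as quoted above

doublings = math.log2(neurons_human / units_now)   # about 13.3 doublings needed
years = doublings * doubling_time
print(f"{doublings:.1f} doublings -> about {years:.0f} years")   # roughly three decades
```

As the counter-arguments discussed later in this unit point out, such exponential trends tend to flatten into s-curves, so the number should be read as an illustration of scale rather than a prediction.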
Human and Machine Intelligence

If we look at the history and current state of artificial intelligence, most research and development continues to focus on building systems that try to solve specific tasks, such as playing certain strategy games, identifying objects in images or videos, controlling particular types of robots to achieve a certain goal, or translating written text. Nevertheless, since the beginning of artificial intelligence research, there has been a strong current of thought directed towards the construction of a system that matches or even exceeds human mental capacity in all its diversity. When comparing problem-solving with existing artificial intelligence models and the capabilities of the human mind, many striking differences are easily discernible. In the following, we look at three manifest disparities.

Learning efficiency
While there are current task-specific artificial intelligence models that clearly exceed human ability in a particular application domain, they typically achieve this superior performance by processing vastly more training data than humans are ever able to use. As an example, consider DeepMind's AlphaZero. Various versions of this artificial intelligence system have learned to play the games Go, chess, and shogi at a superhuman level simply by being given the rules of the game and having extensive opportunities for self-play. While learning, these systems have played millions of games against themselves in order to reach their final playing capacity—much more than even the most elite human players manage to play in their entire career. Put another way, the human brain seems to achieve an almost as high performance using much less data.

Generalization and transfer
While the general architecture of the artificial intelligence system mentioned in the previous example was the same, no matter whether the game under consideration was Go, chess, or shogi, in each instance the system was only able to play the particular game it had been trained on. Yet there are numerous examples of human players achieving expert-level proficiency in more than one of these games—often reporting interesting influences and inspiration in their strategic thinking in one game as compared to others.

Imagination
Continuing with this example, master-level human players have noted that artificial intelligence occasionally comes up with moves that are purposeful, yet entirely novel and deeply surprising to human experts. While this could still be considered to be a certain type of creativity, it is still the case that, by and large, imaginativeness has not up until now been a strong point of artificial intelligence systems.

Unsurprisingly, these and many other deficiencies of artificial intelligence systems with respect to the full spectrum of human capabilities have prompted researchers to attempt to close the gaps. Some noteworthy attempts are summarized below.

Transfer learning
The core idea of transfer learning is to take an existing model trained on a particular task and, with a small amount of further training, apply it to a different, yet related task. This technique is common in deep learning-based methods for object recognition in images or videos. In this domain, a system that has been trained to detect object "A" is repurposed to detect object "B". This approach works because of the particular way in which objects are represented in such deep network models. The network constructs hierarchical representations of image properties in which early layers detect very general properties like edges or corners that are relevant for the recognition of many different object classes.
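As a concrete sketch of this idea (not part of the original text), the snippet below assumes a recent PyTorch/torchvision installation and repurposes an ImageNet-pretrained ResNet-18: the early, general-purpose layers are frozen, and only a newly attached classification head is trained on the new task. The five-class target problem and the random batch are purely illustrative placeholders.

```python
import torch
from torch import nn
from torchvision import models

# An ImageNet-pretrained backbone stands in for the "model trained on task A".
# (Assumes torchvision >= 0.13 for the weights argument.)
model = models.resnet18(weights="DEFAULT")

# Freeze the early layers: their edge/corner-like features transfer across tasks.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a fresh head for the new task "B"
# (here, a hypothetical 5-class problem).
model.fc = nn.Linear(model.fc.in_features, 5)

# Only the new head's parameters are optimized, so "a small amount of
# further training" adapts the model to task B.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch (random data in place of real images).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 5, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

In practice, one would fine-tune on a labeled dataset for task B and possibly unfreeze some of the later layers, but the division of labor is the same: reuse the general early representations, retrain only the task-specific head.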
Meta learning
Meta learning takes one step back from the learning of concrete tasks and is concerned with the problem of learning to learn. To this end, it tries to abstract from individual learning scenarios to find successful common strategies and approaches.

Generative adversarial networks (GAN)
This approach, developed by Ian Goodfellow at the University of Montreal (Goodfellow et al., 2014), constitutes another path by which artificial intelligence is approaching human imagination and creativity. A GAN is composed of two deep, multilayered neural networks that work in opposition to each other, hence the term adversarial. One of these networks tries to generate data from a certain category. Consequently, it is called the generator network. The other network is presented with generated, artificial data as well as real-world data from the same category. The task of the second network is to decide which data is real and which has been generated. Thus, the latter network is referred to as a discriminator network. Both networks are then optimized simultaneously. The generator has to create ever more lifelike instances of synthetic data in order to keep up with the improving ability of the discriminator to discern instances of real and generated data. The following image shows a schematic representation of a GAN, using image processing as an example.

Figure 7: Generative Adversarial Network (GAN)
Source: Created on behalf of IU (2019), based on Ahn, 2017.
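The adversarial training loop just described can be sketched in a few dozen lines. The example below is a minimal illustration, not the architecture from the figure: it uses PyTorch, replaces images with samples from a one-dimensional Gaussian as the "real" data, and keeps both networks deliberately tiny.

```python
import torch
from torch import nn

# "Real" data: samples from a Gaussian the generator has to imitate.
def real_batch(n):
    return torch.randn(n, 1) * 1.5 + 4.0

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # Discriminator step: label real samples 1 and generated samples 0.
    real = real_batch(64)
    fake = generator(torch.randn(64, 8)).detach()
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake), torch.zeros(64, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: try to make the discriminator output 1 for fakes.
    fake = generator(torch.randn(64, 8))
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print(generator(torch.randn(1000, 8)).mean().item())  # should drift towards ~4.0
```

In line with the description above, the two networks are optimized in alternation: the discriminator learns to separate real from generated samples, and the generator improves until its output distribution approaches that of the real data.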
Super Intelligence

Building on the notion of artificial intelligence as a human equivalent is the idea of super intelligence, which is the belief in the possibility of an artificially created intelligence which could exceed the capabilities of the human mind. Strikingly, adherents of this idea commonly think that the achievement of such a level of intelligence is not brought about by human scientists or engineers, but by intelligent machines themselves. In this line of thinking, a machine that has reached a versatile and open-ended level of intelligence equivalent to human level could use its capacities to acquire vast amounts of existing knowledge (because it has no memory capacity or retrieval limitations) and then use that knowledge together with its problem-solving capacity to improve itself. The resulting next-generation artificial intelligence would then, in turn, use its superior resources and capacities for self-improvement to create a subsequent version of ever-improved intelligence. Thus, a runaway evolution of ever increasingly intelligent machines could take place, quickly surpassing anything humanly imaginable. The point at which this exponential growth in machine intelligence kicks off is often referred to as the technological singularity.

The two most influential thinkers behind the creation and subsequent popularization of the idea of a technological singularity, as described above, are Vernor Vinge and Ray Kurzweil. In the early 1990s, Vinge (1993) predicted that greater-than-human intelligence would be achieved in the upcoming thirty years by either technological or biological means or a combination thereof. He also believed that a technological singularity is a process that gets triggered at a point in time when artificial intelligence systems become sufficiently developed. Subsequently, continuous improvements occur, which feed into themselves and thus keep accelerating.

Ray Kurzweil is another renowned proponent of the concept of singularity. Early in his career, he gained recognition as an inventor and futurist by contributing to the fields of scanning and speech recognition, to name just a few. In his 2005 book, "The Singularity is Near", he discussed the concept of singularity intensively, and thereby influenced the scientific community worldwide, as well as inspiring many books and films about the future of AI. Kurzweil predicted that advances in intelligence would be non-biological and based on artificially created substrate rather than neurons, with the potential to become "a trillion times more powerful" (2005, p. 25) than any intelligence that existed at the time. The successor to this famous book, "The Singularity is Nearer", has already been announced and should be published in 2024.

However, there are other lines of argumentation that call into question whether such a development is likely or even possible (see, for example, Walsh, 2017). A summary of some common counter-arguments to the concept of a technological singularity is listed below.

Summary of common counter-arguments to the concept of a technological singularity

– An explosion of artificial intelligence cannot happen until machine intelligence surpasses the human variety in all domains. This is not yet the case, and we do not know whether this will be the case in the near future. Playing chess or the game of Go is not sufficient evidence for the development of an Artificial General Intelligence (AGI) which is at least at the same level as human intelligence. However, the recent advances of the language model GPT-3, which is able to produce human-like text in various cases, have fueled this discussion.
– Arguably, the generality of human intelligence is contingent on many mental factors, such as emotion, motivation, the feeling of autonomy and agency, and even, to some extent, our biases and seeming cognitive shortcomings. The logic of the artificial intelligence explosion seems to assume that a machine can achieve or mimic those while retaining control over its more mechanical and computer-like aspects, such as virtually unbounded memory and computational speed.
– In looking at the history of achievements in the field of artificial intelligence, practical successes, while steady, have also been rather modest. It is true that new discoveries in artificial intelligence are being made and have been demonstrated to be scientifically and economically successful. However, increased complexity is brought about by every new discovery. Taking this into consideration, the next discovery is likely to be more difficult. Walsh calls this the "diminishing returns" argument (2017, p. 61).
– Similarly, while forecasting large returns from artificial intelligence may be observable in day-to-day applications, trends may not continue forever, given that no trend ever does. Eventually, exponential performance curves turn into s-curves, which taper off at the top.
– The argument in support of the theory of an artificial intelligence explosion exclusively focuses on the individual mind. However, much of what makes up the strength of the human mind is its social dimension. Times when the most gifted individuals could absorb the total amount of available knowledge are long gone. Our lives are shaped by the collective intelligence of our society and its division of labor. Even on their own, the brightest individual could not establish the quality of life and freedom to pursue the quest for all of human knowledge, at least in our developed societies.

THINKING EXERCISE

Consider what a hypothetical society with advanced artificial intelligence might look like in thirty or more years from now. In what field will artificial intelligence play the greatest role? What effect will artificial intelligence have on this field and on society more generally? Do you think it is possible that rogue governments, corporations, or other criminal elements could hijack an advanced artificial intelligence system and exploit it for their own ends? Do you believe, as Kurzweil does, that the content of a human brain working in analog could in future be downloaded onto a digital storage device and preserved?
SUMMARY

In this unit, we focused on scientific disciplines closely related to and uniquely informing research in artificial intelligence. Focusing on neuroscience, this unit provided some basic anatomical facts about cells in the nervous system and the brain. The coarse-scale structure of the brain—the division of the main lobes—was also described and the physiological relevance of the main constituent parts outlined.

Expanding upon neuroscience, cognitive science was introduced to give a broader point of view and a more general understanding of cognitive processes and phenomena, utilizing contributions from diverse academic fields, including philosophy, psychology, linguistics, and anthropology. The unit also explored the multitude of interrelations between these fields of study and their connection to artificial intelligence.

UNIT 4
MODERN ARTIFICIAL INTELLIGENCE SYSTEMS

STUDY GOALS

On completion of this unit, you will have learned …
– about the interrelationship between computing and artificial intelligence.
– to appreciate the advances in computing technologies since the 1950s and the new insights and opportunities this continues to provide.
– about the difference between weak or narrow artificial intelligence used today versus the future possibility of Artificial General Intelligence.
– about the potential benefits of modern systems in artificial intelligence relative to computer vision and natural language processing (NLP).

4. MODERN ARTIFICIAL INTELLIGENCE SYSTEMS

Introduction

The two research fields of computer science and artificial intelligence work hand in hand to produce advances. However, they are not always perfectly synchronized, and it can be argued that the current wave of progress in deep learning, in particular, and artificial intelligence, in general, is to a considerable extent driven by recent advances in data storage and computing capabilities. Reciprocally, the last decline of interest in connectionist models of machine intelligence and learning that preceded the current period of renewed enthusiasm would not have occurred—at least not in such a pronounced form—if today's distributed computing and storage technology had been available.

4.1 Recent Developments in Hardware and Software

In the 1950s, computing technology developed into an industry, which was a period of great excitement. Many universities and government-funded programs created innovative computing devices, nearly all of which ran on vacuum tubes, in which the flow of electrons moving in a vacuum was used as a switch to turn bits of information on or off. The UNIVAC computer was the first commercially available machine at a time when the whole world only had about 100 computers. During this time, Alan Turing (1950) also published the seminal paper "Computing Machinery and Intelligence" proposing the concept of machine intelligence. Interactions between humans and machines were demonstrated, and high-level languages, such as FORTRAN, COBOL, and Lisp, which relieved the programmer of tedious bit-level operations, appeared on the market.

In the 1960s, the rate of change accelerated with an emphasis on computing capacity and integrated circuit design rather than vacuum tubes, networks, and operating systems. An integrated circuit operates on semiconducting material, performing the same function as larger electronic components.
The transformation from vacuum tubes to transistors and integrated circuits had vast commercial implications in lowering costs, reducing the weight and space of machinery, increasing reliability, and decreasing operational energy needs. The net effect was that computers became more affordable and simultaneously more powerful. Moore's Law, which is arguably running out of validity today, was also coined and became widely accepted. What were then called "third generation computers" included the IBM-360 and 700, DEC's PDP-8 (Mini), and the CDC-6600, the first supercomputer of the day. The process of miniaturization continues today.

Moore's Law: This law states that complexity, as measured by the number of transistors on a chip, doubles every two years.

In the 1970s, advances in the size of hardware led to it becoming larger (in terms of the total installation) and smaller (in terms of component size) while software capability and the ease of programming improved. During this time, Microsoft Corporation was established by Bill Gates and Paul Allen, while Apple Computer was founded by Steve Wozniak and Steve Jobs. Word processing started with a program called WordStar and spreadsheet technology with a program called VisiCalc. E-mail initially began as ARPANET—a communication mechanism for transferring files for defense and academic personnel. The programming language C, which gave rise to C++ in the 1980s, promoted the structured programming paradigm. The intent was to create clarity in software development by avoiding go-to constructs, breaking large programs into smaller functions, and using repetition in logic.

A lot of progress made in the 1980s can be attributed to the private sector. Much of the personal computing technology we consider to be ordinary by today's standards was established during this period, including, for example, MS-DOS, the microcomputer operating system adopted by IBM. Many of the companies that made large contributions at the time are not around anymore, such as Tandy, Commodore, and Atari. Surviving their founding days are Adobe Photoshop and TCP/IP, the protocol that carries information across the internet. The whole concept of the World Wide Web (WWW), the idea of formatting online content under the protocol of HTTP, developed out of Switzerland's CERN research. This laid the foundation for computer networks and the WWW. In addition, more and more computational work was carried out on workstations during this time.

In the 1990s, hyperlinks became popular as a mechanism for connecting web pages. This made it possible to link related information and browse through it. One of the biggest global developments was the rise of the Windows operating system with Windows 3.0 in 1990, followed by Windows 95 and 98. Microsoft Office also became a worldwide standard and led to productivity increases due to the network effect. The Java language also became a commercial product during this period. On the hardware side, IBM's special computer Deep Blue beat the champion chess player Garry Kasparov, which had implications for artificial intelligence. Nokia, a Finnish communications company, also introduced the first smart phone.

Network effect: When the benefits of a product or service increase depending on the number of users, this is known as the network effect. A familiar example of this effect is the use of Internet platforms.

Since the year 2000, hardware and software have become more integrated, producing new products and services.
New product applications combining hardware and software relate to wearable devices and augmented reality (AR), of which Microsoft's HoloLens is an example. AR glasses can display various kinds of information to the person wearing the glasses. AR systems like HoloLens, Oculus Rift, and similar developments combine business and entertainment on the one hand and technology and content on the other. Applications may be found in 3D tourism and 3D product catalogues. Since the year 2000, the open-source browser Firefox has also risen to prominence, competing with Google's Chrome. In the new products realm, Bitcoin, a cryptocurrency, was announced, which has since experienced considerable volatility. Ultimate acceptance or rejection of Bitcoin and other cryptocurrencies is yet to be established.

Cloud Computing

The developed industrial world is moving away from individually owning products and consumables in both computer hardware and conventional consumer products. In major cities, the per-use leasing of cars (car sharing) helps reduce traffic congestion and the occupation of parking spaces. Similar offerings for bicycles have been established, enabling more people to share bicycles rather than having to personally own them. Using Uber or Lyft taxis, and possibly even autonomous ones in future, is less expensive than owning a car in large and mid-sized cities. Renting a vacation home is also less costly than owning one, as the latter requires considerable financial means and entails ongoing expenses. In computing, similar sharing is also occurring and is referred to as cloud computing. The reasons for the emergence of cloud computing are summarized below:

– Artificial intelligence is advancing rapidly and requires huge computational and data storage resources.
– It is more efficient to share hardware and software resources than have each client make their own investments, duplicating the effort.
– Building a computer resource sharing facility constitutes an economic opportunity for organizations that have the skills and resources to establish one and use it correctly.

A sample of cloud computing providers includes Amazon, IBM, SAS, Microsoft, Salesforce, Sun Microsystems, Oracle, and others. Cloud computing can also be considered to be an extension of the internet paradigm in which data and communications are becoming more democratic and less exclusive, with new economic opportunities being created in the form of paid advertising and data-related services. In the 1980s, the concept of time sharing became popular, which was a forerunner to cloud computing. Time sharing differed from cloud computing in that it was concentrated on the sharing of hardware resources that were always available to everybody. The cost to subscribers was billed in terms of machine cycle time and storage units. In some respects, computing became a commodity like electricity, water, and natural gas.

A few definitions are useful before we proceed to cloud computing and its application to artificial intelligence.

Virtual computers
Virtual computers are single virtual machines created inside a server environment or cloud facility to serve a single client as needed. Multiple virtual machines can operate on the same physical hardware at the same time, simultaneously drawing on data and processing resources.

Cloud computing
Cloud computing is parallel, geographically distributed, and virtualized.

Grid computing
Grid computing gets its name from the electric grid.
It is a parallel and geographically distributed architecture that commonly consists of heterogeneous nodes that perform different workloads or applications. Resources may be owned by multiple entities cooperating for the mutual good.

Cluster computing systems
Cluster computing systems are also parallel and geographically distributed with resources available at runtime. The difference from grid computing is that in the cluster computing case we have connected stand-alone computers operating in unison, i.e., on the same task.

The main benefit of cloud technology for artificial intelligence is its ability to provide a set of highly scalable computational and data storage resources.

Cloud Case—Artificial Intelligence in Supply Chains

In order to demonstrate how cloud technology and artificial intelligence go hand in hand to solve important real-world application scenarios, we will now examine the case of supply chain management.

Supply chain management, which is a very significant area of application for artificial intelligence, is just one of the many examples of commercial scenarios that benefit from cloud technology. The cloud is a physically immense server installation. It can be located anywhere to provide communication and information exchange between worldwide supply chain partner firms. Finally, one cloud can be connected with any number of other clouds. For example, if one cloud installation deals with tourism services, it may be logically connected to another cloud structure focused on public transportation.

Nowadays, supply chains are worldwide. They connect individual enterprises that coordinate their operations for the benefit of the end consumer and for mutual benefit. Cloud computing is the supply chain connecting mechanism, with artificial intelligence ensuring that supply chains operate more efficiently. When applied to supply chain management, artificial intelligence

– uses natural language processing (NLP) to scan contracts, retrieve chat logs and orders, and speed up payments along the chain.
– uses machine learning to confirm trends, quantifying the flow of goods along the chain, with the objective of being at the right place and at the right time.
– forecasts the demand for specific products and shares this information with all partners along the supply chain.
– optimizes warehouse operations with respect to sending, receiving, picking, and storing products.
– operates autonomous transportation vehicles.

The challenge to making all of the above processes work is data. It must be complete, descriptive, accurate, and available in real time to all members in the supply chain. Assume company X is a large manufacturing enterprise and operates, like most enterprises do, as a sophisticated supply chain stretching over several continents. Managing and coordinating a chain of legally independent companies, which are in part mutually dependent upon each other, has become more complicated. This is due to different reasons: trade laws have changed, technology has required new approaches to the assembly and marketing of finished goods, and, most importantly, supply chain members have insisted on maintaining incompatible IT systems. There are many performance measures in supply chains, but for this case let us limit performance to supply chain-wide inventory levels, on-time product deliveries in perfect quality all the time, and a satisfactorily automated payment system.
Let us further assume that such a system has been in operation for eight years and is becoming harder to maintain as product sales volume has doubled and product classes have increased. In order to manage the introduction of cloud computing and artificial intelligence into a business's IT management system, the following key points need to be considered:

– A business case in favor of the cloud needs to be developed in order to convince participants along the supply chain that supporting the concept is worth the effort. Such a case study has to start with a complete understanding of the shortcomings of the current supply chain IT management system. The proposed replacement must address risk, governance, cloud technology, and artificial intelligence applicable to the new IT environment.
– The introduction of cloud computing requires opportunity-focused thinking with respect to costs and benefits. Issues such as new opportunities for collaboration, new services, data security improvement, error reduction, improved accountability, risk, and faster customer response times need to be considered.
– Working with stakeholders to create value is an important ingredient in selling an artificial intelligence-related cloud project. Stakeholders are all member companies in the supply chain, including major customers who buy the end product and ultimately provide funding on a continuing basis.

Cloud Artificial Intelligence Service Specialization

Since advanced analytics and artificial intelligence constitute major application areas of cloud computing, relevant suppliers such as Google, Microsoft, and Amazon have integrated machine learning and artificial intelligence offerings into their portfolios. While some of these services are free to try, others require a usage fee. The fees are based on usage volume, storage space, proprietary data, frequency, and other criteria. Artificial intelligence libraries and applications are centered on

– chatbots that conduct simulated conversations with customers and search digital and graphic databases, formulating answers verbally, many times faster than a human operator.
– NLP technology offered as a cloud platform service in many areas of interest, such as the translation of websites.
– visual content classification services that have the capacity to classify a high volume of client images.
– documents from a customer source, such as in text graphics or photography, which can be presented for machine analysis (e.g., for plagiarism or style detection).

Quantum Computing

Current chip designs are approaching the physical limits of the semiconductor-based hardware paradigm. Thus, the exponential gains in size, complexity, and execution speed of processing units, as seen in the last decade, are likely to come to an end in the foreseeable future. One way around these limits is given by parallel and distributed technologies, as already outlined. Another way is by considering entirely novel computational paradigms. Quantum computing belongs to the latter category.

A classical computer represents information in bit form—elementary units that can only take on either one of the values 0 or 1 at any given time. In contrast, quantum computing employs the quantum theoretical concept of superposition, derived from quantum physics, which implies that any sum of quantum states is itself a quantum state and that any quantum state can be expressed as the sum of two or more other quantum states.

Quantum physics: This is a branch of physics that describes the behavior of elementary particles and their interactions.
Applying this paradigm to the representation of information leads to the concept of the qubit—a unit of memory that can not only be in the two states of a classical bit—either 0 or 1—but also in a superposition of these states. A quantum computer is a device that is based on this representation of information and the pertaining opportunities for information processing it brings about.

Key points of quantum computing

– The term quantum in quantum computing was adopted from a branch of physics called quantum mechanics. Quantum mechanics was developed in the search for a mathematical description of the behavior of subatomic particles.
– Quantum computing is a new technology that could significantly increase our ability to process information. While it is not yet commercially available, it is being researched by governments and major corporations.
– Quantum computing does not necessarily bring about improvements in all computational tasks. It has been shown that for many important applications, classical algorithms are of comparable performance to their quantum counterparts.
– Quantum computing is expected to have a considerable impact on cryptography. On the one hand, it has the potential of rendering today's most popular encryption algorithms useless; on the other hand, it could enable entirely new cipher methods.
– Since quantum mechanics is a theory that is probabilistic in nature and quantum computing has been shown to be especially well suited for search and optimization tasks, it stands to reason that artificial intelligence will also be impacted by quantum computing.
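To make the idea of superposition slightly more tangible, the following sketch (not part of the original text) represents a single qubit as a normalized vector of two amplitudes and shows how a Hadamard gate turns the classical state |0⟩ into an equal superposition; measurement probabilities are the squared magnitudes of the amplitudes.

```python
import numpy as np

# Basis states of one qubit: |0> and |1>
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# The Hadamard gate maps |0> to the equal superposition (|0> + |1>) / sqrt(2)
H = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)

state = H @ ket0
probabilities = np.abs(state) ** 2   # Born rule: squared amplitudes
print(state)          # [0.707..., 0.707...]
print(probabilities)  # [0.5, 0.5] -> measuring yields 0 or 1 with equal probability
```

A classical bit corresponds to a state vector that is exactly ket0 or ket1; the superposition produced above has no classical counterpart, and it is this richer state space that quantum algorithms exploit.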
4.2 Narrow Versus General Artificial Intelligence

A more recent theme in artificial intelligence research is the clear distinction between the several related and yet diverse forms of artificial intelligence. In the most general terms, artificial intelligence can be defined as the mechanistic implementation of sensory perception processes, cognition, and problem-solving capabilities. Throughout the history of artificial intelligence as a scientific discipline, researchers have addressed this daunting endeavor by breaking down the challenge into a manageable size by implementing systems that perform specialized functions in controlled environments. This approach is now termed Artificial Narrow Intelligence (ANI) or Weak Artificial Intelligence. It is the opposite of the open-ended, flexible, and domain-independent form of intelligence expressed by human beings, which is commonly termed Artificial General Intelligence (AGI) or Strong Artificial Intelligence.

Artificial Narrow Intelligence

In defining the term Artificial Narrow Intelligence (ANI), it is useful to think of it as all of the artificial intelligence currently in existence and which will be realizable within the foreseeable future. As such, currently existing systems in the typical application areas like self-driving vehicles, translation between languages, sales forecasting, NLP, and facial recognition all fall under the concept of ANI.

The word narrow indicates that the type of intelligence in question only pertains to one domain at a time. For example, a given device or system may be able to play chess, but it will not be able to play another strategy game like Go or shogi, let alone perform completely different tasks such as translation. In short, narrow means both a display of intelligence in the sense of the ability to solve a complex problem and a display of intelligence relative to only one task.

Artificial General Intelligence

For Artificial General Intelligence (AGI), the cognitive versatility of a human being is the reference point against which this form of artificial intelligence is measured and judged. The goal is not only to replicate specific instances of sensory data interpretation, language interpretation, or other forms of intelligent behavior, but the full range of human cognitive abilities. Clearly, this entails displaying all the capabilities currently represented by Weak Artificial Intelligence as well as the ability to implement elements of generalization across domain boundaries—that is, implementing things learned in one task to different but related tasks, including motivation and volition. Philosophical sources on the matter (Searle, 1980), in particular, go one step further by also requiring that AGI possesses consciousness or self-awareness. It is an exceptionally difficult task to imagine the development of an artificial intelligence device that simultaneously had all of the following abilities:

– the cognitive ability to learn and function in several domains
– the possession of human-level intelligence across all domains
– the possession of multi-domain problem-solving abilities at the average human level
– independent problem-solving ability
– the ability to think abstractly without direct reference to past experience
– the ability to perceive the whole environment in which it operates
– the ability to entertain hypotheticals for which it has no prior experience
– the ability to motivate itself and the possession of self-awareness

Moreover, the concept of superintelligence—the idea that artificial intelligence acquires cognitive abilities beyond what is possible for humans by engaging in a recursive cycle of self-improvement—is contingent on it first reaching a state of Artificial General Intelligence.

4.3 Natural Language Processing (NLP) and Computer Vision

The subject of natural language processing is a major application domain of current artificial intelligence techniques, which has been the case throughout the history of artificial intelligence. It consists of three main constituent parts: (1) speech recognition—the identification of words in spoken language and the transformation of speech into text; (2) language understanding—the extraction of meaning from words and sentences and reading comprehension; and (3) language generation—the ability to express information and meaning in the form of well-formed sentences or longer texts.

The ultimate goal is to interpret and use language at a human level. This would not only enable humans to communicate with machines in their natural language, but also allow for a number of interesting language-centered applications ranging from automatic translation between different languages to the generation of text excerpts, digests, or complete works of literature.
While this goal has not yet been achieved, natural language processing is making remarkable progress, which can be observed by the following developments: the development of virtual assistants on commercial phones and laptop computers that are ever more responsive to complex inquiries the development of enhanced machine translations between two different human lan- guages, which are continually improving and which can now also be run on commer- cially available smartphones, tablets or notebooks the development of key-word extraction to analyze volumes of text, for example, to assist with media reporting the application of sentiment analysis to e-mail and social media texts to assess the writ- er’s mood and emotional attitude towards the subject the ability of voice-recognition software to identify speakers the ability of speech-recognition software to recognize words measured by the accuracy rate and how well the system can keep up with an ongoing conversation in real time 65 Taking into account how much our human faculty of reasoning and logical inference is based on language, it is easy to see that the ability to process language is intimately tied to the problem of intelligence itself. As a case in point, consider the now famous Turing Test for the presence of artificial intelligence. Alan Turing (1950) proposed this test as a way of determining whether a machine could be considered intelligent. The test involves a machine and a human subject, with both answering a series of questions from an interrog- ator via telegraphic connections. If the interrogator cannot identify which of the conversa- tion partners is a human and which is a machine, the machine is considered intelligent. Clearly, this test scenario, which Turing himself called “the imitation game”, critically hinges on the ability of the machine to process natural language. Natural language processing, as a technical discipline, started in the mid-1950s during a time of heightened geopolitical tension between the United States and the former Soviet Union. American government institutions had a high demand for English and Russian translators such that translation was outsourced to machines. While preliminary results were promising, translation turned out to be far more complex than initially estimated, with substantial progress in the technology failing to materialize. In 1964, the Automatic Language Processing Advisory Committee (ALPAC) therefore described natural language processing technology as “hopeless”, temporarily ending funding for natural language processing research and initiating a natural language processing winter. Almost 20 years later in the early 1980s, the subject regained interest as a result of three events: Computing power increased in line with Moore’s Law, thereby enabling more computa- tionally demanding natural language processing methods. A paradigm shift occurred. The first wave of language models was characterized by a grammar-oriented approach that tried to implement ever more complex sets of rules to tackle the complexities of natural everyday language. This changed towards models based on a statistical and decision-theoretic foundation. One of the first approaches was the use of decision-tree analyses rather than man-made and hand-coded rules gov- erning the use of words. Decision tree models lead to hard if-then choices. Further refinement of natural language processing was achieved by the use of probabil- ity theory in a technique called part-of-speech tagging. 
This technique uses the stochastic Markov model to describe a dynamic system like speech. In a Markov model, only the last state of the system, together with a set of transition rules, defines the next state, as opposed to approaches that consider the whole history.

Overall, the shift towards statistical, decision-theoretic, and machine learning models increased the robustness of natural language processing methods in terms of their ability to cope with previously unencountered constellations. Moreover, it opened up the opportunity to learn and improve by making use of the growing corpora of literature available in electronic form.

Inside Natural Language Processing

The process of natural language understanding is based on numerous constituent parts, such as syntax, semantics, or speech recognition. Discernment of the individual components listed below helps us understand the science of natural language processing.

Syntax
◦ Syntax describes the grammar of a language and, in particular, the prescribed sequence of words in phrases. In language translation between two or more languages, obviously more than one grammar simultaneously comes into play.
Semantics
◦ Semantics refers to the meaning of words. In natural language processing, it answers the question of the meaning and interpretation of a word in a given context.
Speech recognition
◦ Speech recognition takes the recorded sound of a person speaking and converts it to text. The exact opposite is called text-to-speech. Speech-to-text is difficult because voice recognition has to deal with dialects and highly variable pronunciation. As human speech has practically no pauses between words, speech-to-text systems have the difficult task of segmenting words in order to process entire sentences.
Text summaries
◦ Text summaries produce readable summaries of volumes of text on known subjects. An academic or research association may conduct an annual meeting during which many research papers are presented. These papers can be summarized and analyzed for conference reports.

There are many processes operating inside a natural language processing system, and the following examples provide a sample of those that address both syntax and semantics.

Terminology extraction
◦ Terminology extraction programs analyze text and semi-automatically count frequently used words in many languages. The frequency of certain terms, as well as the frequency of their co-occurrence with other terms, can provide valuable hints about the topic of a text.
Part-of-speech tagging
◦ Part-of-speech tagging is a method of finding what part of speech a given word in a sentence represents. For example, the word book can be a noun, as in the sentence "I just bought this book", or it can be a verb, as in the sentence "I just booked a table at a restaurant".
Parsing
◦ Parsing refers to grammatically analyzing a sentence or a string of characters. In natural languages, grammar can be quite ambiguous, resulting in sentences with multiple meanings. Constituency parsing builds parse trees with inner nodes, i.e., non-terminal grammatical constituents (such as noun or verb phrases) that need to be broken down further in order to arrive at the terminal nodes, which represent the actual words present in the parsed sentence. Dependency parsing, by contrast, builds the parse tree solely on the basis of the terminal nodes, linking the words directly through grammatical dependencies without the help of intermediate grammatical constructs.
Word stemming
◦ The objective of word stemming is to trace derived words to their origin.
For example, the word “opening” can be a noun, as in “the opening of an art show”. It can also be a verb, as in “She is opening the door”. Word stemming traces the word back to the stem “open”. Machine translation ◦ Machine translations between two or more natural languages are considered the most difficult to do well. It requires multi-language grammar knowledge, semantics, and facts about one or more domains. Named entities recognition ◦ Named entities recognition is the task of identifying and classifying words in terms of categories such as names of people, objects, and places. Capitalization of words pro- vides hints, but on its own is insufficient to discern named objects. For example, the grammatical rules of capitalization in English are quite different to those found in German. Relationship extraction ◦ Relationship extraction takes text and identifies the relationships between named objects, such as that between a father and son or between a mother and daughter. Sentiment analysis ◦ Sentiment analysis aims to discern the prevailing attitude, emotional state, or mood of the author based on word choice. Disambiguation ◦ Disambiguation of words in sentences deals with the multiple meanings of words. To do this, computers are given a dictionary of words and associated options for the meaning of the words. This enables NLP to make the best choice in a given context. However, new situations always arise and exceptions are possible. Questioning and answering ◦ Questioning and answering is widely used and highly popular in commercial applica- tions. Answers can be of a simple yes/no form with a specific one-word answer, or the process can be very complex. The question must be understood by the computer, the answer extracted from databases, and then verbalized in the form of an answer. Computer Vision Seeing and understanding the content of an image is second nature for humans. It is very difficult for computers to do likewise, but it is the ultimate goal of computer vision. Com- puter vision aims to help computers see and understand the content of images just as well as humans do if not also in some cases even better than humans do (Szeliski, 2021). In this sense, computer vision is a subfield of artificial intelligence that fits into the scheme of our studies, as pictured below: 68 Figure 8: Computer Vision and Artificial Intelligence Source: Created on behalf of IU (2019). To accomplish machine vision, machine learning is required to identify the content of images. While this process is still far from perfect, substantial advances have been ach- ieved. For humans, seeing and knowing what one is seeing comes naturally. Humans can easily describe the informational content of an image or moving picture and recognize the face of a person they have seen before. Computer vision aims to teach machines to be able to do the same thing. Developing computer vision is not an idle academic curiosity. It has many influential prac- tical applications associated with substantial market opportunities and improvements in the quality of life. It also carries significant risks. For example, computer vision techniques are used in semi-autonomous driving, robotic control, surveillance technology, and medi- cal image analysis. 
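As a small illustration of the machine learning step described above, assigning content labels to an image, the following Python sketch classifies a photograph with a pretrained convolutional network. It assumes the torchvision library (version 0.13 or newer) with its bundled ImageNet weights and a local image file named example.jpg; both the library choice and the file path are assumptions made for this example, not part of the original text.

```python
# Sketch: label the content of a photo with a pretrained ImageNet classifier.
# Assumes torchvision >= 0.13 and an image file "example.jpg" (hypothetical path).
import torch
from PIL import Image
from torchvision import models

weights = models.ResNet18_Weights.DEFAULT          # pretrained ImageNet weights
model = models.resnet18(weights=weights).eval()    # small convolutional network
preprocess = weights.transforms()                   # matching resize/normalization

image = Image.open("example.jpg").convert("RGB")
batch = preprocess(image).unsqueeze(0)               # shape: (1, 3, H, W)

with torch.no_grad():
    probabilities = model(batch).softmax(dim=1)[0]

# Report the three most probable content labels.
top = probabilities.topk(3)
for prob, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{weights.meta['categories'][idx]}: {prob:.2f}")
```

Off-the-shelf models of this kind label everyday photographs quite reliably, but they remain far from the human-level scene understanding described above.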
Image Acquisition and Signal Processing

Conceptually, the acquisition of image data and the application of signal processing operations, such as filtering, smoothing, or similar image manipulation techniques, has to be distinguished from vision, with the latter defined as the cognitive interpretation of the image content. In this section, we will therefore examine image acquisition in human and computer vision in order to identify similarities and differences.

Human vision

In the process of human seeing, light enters the cornea, the transparent area at the front of the human eye. Behind the cornea, the iris forms an adjustable opening, the pupil, by which the amount of light entering the eye is controlled. Behind the pupil is an oblong lens with an adjustable curvature. Adjustments are made by tightening or loosening attached muscles. A relatively flat curvature enables seeing distant objects. Conversely, a more curved lens brings near objects into focus. The inner eyeball is filled with a jelly-like substance through which light travels to the retina. The retina lines the back of the eyeball and contains millions of photoreceptors in the form of cones and rods. These cells detect light signals and convert them into electrical signals that are fed into the brain via the optic nerve. The process of converting light into electrical signals is called transduction. These signals are interpreted by the brain, and an understanding of the image content is formed.

Camera vision

The technical counterpart of the eye is the camera. Camera technology has a long-standing history reaching as far back in time as antiquity with the camera obscura. Notable advances in camera vision are introduced below.

Camera obscura
This is a natural optical phenomenon that occurs when an image is projected through a small hole in a screen or wall, resulting in a reversed and inverted image on the surface opposite to the opening. A pinhole camera is based on the same physical principle.

Pinhole cameras

Unlike the human eye and all conventional photographic cameras, pinhole cameras have no lens. They are made up of a sealed box with a small opening for light to enter. The incoming light projects an inverted image of the scene in front of the aperture onto the back wall of the camera. This process takes advantage of the fact that light travels in straight lines (over short distances).

Film cameras

Film cameras differ from pinhole cameras in that they have a lens through which light has to travel, as well as photographic film located in a sealed box that is exposed by the incoming light. Chemical reactions alter the material of the film, thereby conserving the image. Having a lens opens up the possibility for dynamic focusing and improvements in image quality.

Digital cameras

In digital cameras, a light-capturing sensor is used in place of photographic film. Light enters through a lens that projects the image onto a sensor chip, which in turn captures the image in the form of millions of individual elements called pixels. Conceptually, these millions of pixels emulate the function of the millions of light-sensitive cells in the retina of the human eye. A digital image can then be defined as an array of pixel values representing light intensity and color, which can subsequently be manipulated via image processing tools, as the short example below illustrates.
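The following minimal Python sketch, assuming only the NumPy library, makes this definition concrete: the "image" is nothing but an array of intensity numbers, and elementary image processing amounts to arithmetic on those numbers. The tiny 8-by-8 image is, of course, an invented example.

```python
import numpy as np

# A tiny 8x8 grayscale "sensor readout": one 8-bit intensity value (0-255) per pixel.
# A color image would simply add a third axis, e.g., shape (8, 8, 3) for RGB.
image = np.zeros((8, 8), dtype=np.uint8)
image[2:6, 2:6] = 200                     # a bright square on a dark background

# Elementary image processing operations work directly on the pixel numbers.
brighter = np.clip(image.astype(int) + 40, 0, 255).astype(np.uint8)  # raise brightness
mirrored = image[:, ::-1]                                            # flip horizontally

print(image.shape, image.dtype)           # (8, 8) uint8
print(image.mean(), brighter.mean())      # mean intensity before and after brightening
```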
70 Computer Vision—From Features to Understanding Images A simple thought experiment is sufficient to show that direct mapping of pixel-wise image content to a semantically meaningful image interpretation is infeasible. To this end, visu- alize a very simple scene consisting of a single object in front of a uniform monochrome background. Imagine how the pixel values change when the object changes orientation, the camera zoom is varied, or differently colored lamps are used to light the scene. A sig- nificant part of any computer vision processing pipeline consists of the extraction of sali- ent image features above the abstraction level of the pixel. Typical examples of such fea- tures are edges (locations with a pronounced change in pixel values), corners (locations where two or more edges join or an edge rapidly changes direction), blobs (uniform subar- eas in a picture), and ridges (the axes of symmetry). These image features provide the input for pattern recognition and machine learning tech- niques employed to derive semantically interesting image content, such as the recogni- tion of objects like license plates, traffic signs for security or autonomous driving, faces for sorting photo collections with respect to a depicted person, or the discovery of malignant tissue in medical images. A typical computer vision pipeline thus contains the following steps: An image acquisition mechanism, such as a digital camera, is used to acquire an image in a form that is suitable for further computational processing. Techniques from the field of signal and image processing, such as sharpening or con- trast enhancement, may be employed to improve suitability for subsequent processing steps. Based on the pixel content of the image, higher-level image features are extracted to abstract from the raw pixel data. The acquired higher-level features are subjected to pattern recognition and machine learning techniques to infer semantically meaningful image content. Computer vision is, therefore, a discipline that employs methods, approaches, and techni- ques from numerous fields of scientific study, as indicated in the figure below. 71 Figure 9: Approaches and Techniques of Computer Vision Source: Created on behalf of IU (2019). SUMMARY This unit has illustrated how artificial intelligence and computer science technologies have advanced simultaneously. For example, techniques for distributed data storage and computing have been crucially impor- tant enablers for progress in artificial intelligence. Cloud computing has reduced the cost of data and its processing while enabling seamless scaling of computational and data storage resources according to demand. These cost reductions, in combination with insights and knowledge gained from artificial intelligence itself, have driven eco- nomic growth. While the history of artificial intelligence research has mainly focused on the emulation of cognitive abilities that are highly specialized in their task specificity (Artificial Narrow Intelligence), the goal of reproducing the full range and richness of human cognitive abilities (Artificial General Intelligence) continues to fascinate and occupy philosophers and artifi- cial intelligence researchers alike. Computer vision and natural language processing are two high-profile application areas of artificial intelligence that have penetrated markets with innovations that remain in high demand. These applications contribute to human well-being at the same time as posing significant ethical and political challenges. 
UNIT 5
APPLICATIONS OF ARTIFICIAL INTELLIGENCE

STUDY GOALS

On completion of this unit, you will have learned …

– how artificial intelligence techniques will aid the coming mobility revolution.
– about the ways the medicine and health care sectors can benefit from artificial intelligence.
– to distinguish between the multitude of ways artificial intelligence is used to support current financial processes as well as enable entirely new business models in the financial sector.
– how artificial intelligence is employed in retail to automate workflows, optimize supply chains, and help in tailoring services to customers.

5. APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Introduction

While artificial intelligence as a scientific discipline is related to developments in other scientific disciplines, it is not solely a subject of academic interest. In order to demonstrate the broad impact that artificial intelligence has on the economy and society as a whole, this unit is dedicated to applications of artificial intelligence. We will start with a focus on mobility, followed by medicine, banking and financial services, and the retail sector.

5.1 Mobility and Autonomous Vehicles

By mobility, we mean how people and their cargo move from point A to point B, now and in the future. This section therefore focuses on the role of artificial intelligence in future and emerging mobility trends:

car and ridesharing and the general trend away from vehicle ownership
the development of autonomous vehicles equipped with sensing devices to support driverless mobility
advances in networking between different modes of transport, such as trains, trolleys, and buses, creating a seamless journey spanning multiple modalities

Economic and social forces, combined with developments in artificial intelligence and engineering, are bringing about rapid changes in mobility, making it faster, less expensive, safer, and more efficient (Khamis, 2021). In the future, mobility is likely to evolve further to become more personalized, interconnected, and sustainable. This visionary, but at the same time feasible, concept of mobility is referred to as smart mobility.

Smart mobility
This is a networked form of mobility that uses data and artificial intelligence to connect different transport modalities. It includes vehicle sharing and autonomous, self-driving vehicles.

There are two views of how the revolution in mobility is likely to continue. One view is that change will come gradually; an opposing view is that it will disrupt everything very quickly. The argument in favor of gradualism is that industry prefers to keep current assets deployed until they are fully depreciated while also experimenting with and testing new technologies. This policy is already visible in the market for new automobiles (Sjafrie, 2019). New automobiles are fitted with self-driving technology without altering the driver-car relationship that has existed for more than one hundred years. Many of the self-driving technologies that are built into more technologically advanced automobiles have already been trialed and tested, although the self-driving mode is still restricted to well-defined settings.

For intermodal mobility, i.e., travel requiring two or more modes of transport per trip, such as bus and train, the supporting mobility ecosystem has to be much more sophisticated than it once was. New ventures are currently emerging, with new services, solutions, and products to enable multiple modes of mobility.
The goal is for mobility to be seamlessly integrated and dependable, and for it to be more sustainable and efficient than is currently the case with individual automobiles (Khamis, 2021).

Importance of Extended Mobility

The nature of mobility affects national and global economies in far-reaching ways. Just consider the impact of autonomous driving technology on the automotive industry. Fully automated vehicles can operate in a 24/7 mode. Combined with a shared approach to vehicle usage, this means that consumer mobility needs can be satisfied with far fewer automobiles. As a result, automobile ownership is likely to become less and less attractive, leading to a considerable decline in car sales. Car rentals, truck rentals, taxis, and parking garages will, therefore, likely face disruptions. Compared to cars built with internal combustion engines, electric cars require one-fifth of the parts and are thus much easier to manufacture and maintain (Woolsey, 2018). The word "parts" refers to pre-assembled components, a major example being the transmission, which electric cars do not have because electric motors deliver enough torque to each wheel.

Autonomous driving aims to reduce accidents and injuries, which will have a positive effect on hospital emergency room capacity needs and insurance rates (see Sjafrie, 2019 and Găiceanu, 2021 for an analysis of various statistics). In the United States, infrastructure projects like road maintenance and bridge repairs are financed by a fuel tax applied to each gallon (3.79 l) of gasoline purchased at the retail level. Taxes are applied at both the state and federal levels, and all funds are placed and administered in public trusts. Comparable, though differently structured, public infrastructure financing schemes apply in other countries. With fewer cars using gasoline, this form of taxation will change. With fewer cars being owned by individuals, licensing fee structures are also likely to change in the future.

Other Mobility Considerations

New mobility solutions may range from non-owner pods in predominantly urban areas to customized, personally owned, personally driven automobiles with self-driving ability (Sjafrie, 2019). The transition between personally owned means of transportation, such as cars or bicycles, and public transportation will be seamless. New car features represent new selling opportunities, including advertising and entertainment content. They enable in-vehicle services like navigation and data analytics about the vehicle, its owner, and its drivers, irrespective of whether the owner is a person or a leasing company (Khamis, 2021). While many of these features are already in use, there is always room for improvement and for new, innovative offerings. Commercial product and service providers will strive to make mobility safe, pleasant, and cost-effective. On the other hand, too many distractions and too much complexity created by the new mobility ecosystem could, over time, prove to be too demanding for both the average driver and his or her passengers.

In the past, our mobility ecosystem was composed only of a transport infrastructure, including roads, airports, train stations, and bridges, with their corresponding traffic rules. Nowadays, our mobility systems include yet another component: data. The amount and the variety of available data will increase drastically in the coming years. A wide range of traffic-related analyses can be performed based on this data, as the brief sketch below illustrates.
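The following minimal Python sketch gives one concrete idea of such an analysis: computing average speeds per road segment from vehicle telemetry. The records, column names, and the congestion threshold are invented solely for illustration and do not refer to any real data source or system.

```python
# Illustrative sketch with made-up telemetry records; all values are assumptions.
import pandas as pd

records = pd.DataFrame(
    {
        "road_segment": ["A1", "A1", "A1", "B7", "B7"],
        "speed_kmh":    [92.0, 45.0, 51.0, 118.0, 121.0],
        "hour":         [8, 8, 9, 8, 9],
    }
)

# One simple "traffic-related analysis": average speed per road segment and hour,
# a possible input for congestion detection or route planning.
avg_speed = records.groupby(["road_segment", "hour"])["speed_kmh"].mean()
congested = avg_speed[avg_speed < 60]   # crude congestion threshold (assumption)

print(avg_speed)
print("Possibly congested:", list(congested.index))
```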
Communication signals exchanged between vehicles and their surroundings, for example, provide a rich setting for machine learning.

The Relationship Between Artificial Intelligence and Automobiles

The automobile industry is teaching autonomous vehicles to drive prior to their having been granted permission for independent use on public roads. Mechanically, some aspects of self-driving, such as acceleration, braking, and steering, have been possible for some time. However, the ability of artificial intelligence (the "brain") to connect all the different variables in order to make timely, practical decisions is new. Many automobile companies, parts suppliers, and automotive start-ups are developing self-driving automobiles. To achieve the required capabilities, a broad collection of technologies is employed (Khamis, 2021); among them are radar, high-resolution cameras, GPS, and cloud services.

Tesla, the prime example of a relatively new challenger in the automotive space, offers convenience services to drivers, especially in conjunction with their personal smartphones, as well as services such as artificial intelligence-based predictive vehicle maintenance. Many cars that are commercially available today feature pre-self-driving capabilities, such as forward collision alerts, front pedestrian alerts, and automatic braking at a speed of
