ArchComputer_Organization_and_Architecture_8th_-_William_Stallings-lec2.pdf
CHAPTER 1 INTRODUCTION

1.1 Organization and Architecture
1.2 Structure and Function
    Function
    Structure
1.3 Key Terms and Review Questions

This book is about the structure and function of computers. Its purpose is to present, as clearly and completely as possible, the nature and characteristics of modern-day computers. This task is a challenging one for two reasons.

First, there is a tremendous variety of products, from single-chip microcomputers costing a few dollars to supercomputers costing tens of millions of dollars, that can rightly claim the name computer. Variety is exhibited not only in cost, but also in size, performance, and application.

Second, the rapid pace of change that has always characterized computer technology continues with no letup. These changes cover all aspects of computer technology, from the underlying integrated circuit technology used to construct computer components to the increasing use of parallel organization concepts in combining those components.

In spite of the variety and pace of change in the computer field, certain fundamental concepts apply consistently throughout. To be sure, the application of these concepts depends on the current state of technology and the price/performance objectives of the designer. The intent of this book is to provide a thorough discussion of the fundamentals of computer organization and architecture and to relate these to contemporary computer design issues. This chapter introduces the descriptive approach to be taken.

1.1 ORGANIZATION AND ARCHITECTURE

In describing computers, a distinction is often made between computer architecture and computer organization. Although it is difficult to give precise definitions for these terms, a consensus exists about the general areas covered by each (e.g., see [VRAN80], [SIEW82], and [BELL78a]); an interesting alternative view is presented in [REDD76].
Computer architecture refers to those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical execution of a program. Computer organization refers to the operational units and their interconnections that realize the architectural specifications. Examples of architectural attributes include the instruction set, the number of bits used to represent various data types (e.g., numbers, characters), I/O mechanisms, and techniques for addressing memory. Organizational attributes include those hardware details transparent to the programmer, such as control signals; interfaces between the computer and peripherals; and the memory technology used.

For example, it is an architectural design issue whether a computer will have a multiply instruction. It is an organizational issue whether that instruction will be implemented by a special multiply unit or by a mechanism that makes repeated use of the add unit of the system. The organizational decision may be based on the anticipated frequency of use of the multiply instruction, the relative speed of the two approaches, and the cost and physical size of a special multiply unit.

Historically, and still today, the distinction between architecture and organization has been an important one. Many computer manufacturers offer a family of computer models, all with the same architecture but with differences in organization. Consequently, the different models in the family have different price and performance characteristics. Furthermore, a particular architecture may span many years and encompass a number of different computer models, its organization changing with changing technology. A prominent example of both these phenomena is the IBM System/370 architecture. This architecture was first introduced in 1970 and included a number of models.
The customer with modest requirements could buy a cheaper, slower model and, if demand increased, later upgrade to a more expensive, faster model without having to abandon software that had already been developed. Over the years, IBM has introduced many new models with improved technology to replace older models, offering the customer greater speed, lower cost, or both. These newer models retained the same architecture so that the customer’s software investment was protected. Remarkably, the System/370 architecture, with a few enhancements, has survived to this day as the architecture of IBM’s mainframe product line.

In a class of computers called microcomputers, the relationship between architecture and organization is very close. Changes in technology not only influence organization but also result in the introduction of more powerful and more complex architectures. Generally, there is less of a requirement for generation-to-generation compatibility for these smaller machines. Thus, there is more interplay between organizational and architectural design decisions. An intriguing example of this is the reduced instruction set computer (RISC), which we examine in Chapter 13.

This book examines both computer organization and computer architecture. The emphasis is perhaps more on the side of organization. However, because a computer organization must be designed to implement a particular architectural specification, a thorough treatment of organization requires a detailed examination of architecture as well.

1.2 STRUCTURE AND FUNCTION

A computer is a complex system; contemporary computers contain millions of elementary electronic components. How, then, can one clearly describe them?
The key is to recognize the hierarchical nature of most complex systems, including the computer [SIMO96]. A hierarchical system is a set of interrelated subsystems, each of the latter, in turn, hierarchical in structure until we reach some lowest level of elementary subsystem.

The hierarchical nature of complex systems is essential to both their design and their description. The designer need only deal with a particular level of the system at a time. At each level, the system consists of a set of components and their interrelationships. The behavior at each level depends only on a simplified, abstracted characterization of the system at the next lower level. At each level, the designer is concerned with structure and function:

• Structure: The way in which the components are interrelated
• Function: The operation of each individual component as part of the structure

In terms of description, we have two choices: starting at the bottom and building up to a complete description, or beginning with a top view and decomposing the system into its subparts. Evidence from a number of fields suggests that the top-down approach is the clearest and most effective [WEIN75].

The approach taken in this book follows from this viewpoint. The computer system will be described from the top down. We begin with the major components of a computer, describing their structure and function, and proceed to successively lower layers of the hierarchy. The remainder of this section provides a very brief overview of this plan of attack.

[Figure 1.1 A Functional View of the Computer: operating environment (source and destination of data), data movement apparatus, control mechanism, data storage facility, data processing facility]

Function

Both the structure and functioning of a computer are, in essence, simple. Figure 1.1 depicts the basic functions that a computer can perform.
In general terms, there are only four:

• Data processing
• Data storage
• Data movement
• Control

The computer, of course, must be able to process data. The data may take a wide variety of forms, and the range of processing requirements is broad. However, we shall see that there are only a few fundamental methods or types of data processing.

It is also essential that a computer store data. Even if the computer is processing data on the fly (i.e., data come in and get processed, and the results go out immediately), the computer must temporarily store at least those pieces of data that are being worked on at any given moment. Thus, there is at least a short-term data storage function. Equally important, the computer performs a long-term data storage function. Files of data are stored on the computer for subsequent retrieval and update.

[Figure 1.2 Possible Computer Operations: four panels, (a) through (d), each showing Movement, Control, Storage, and Processing]

The computer must be able to move data between itself and the outside world. The computer’s operating environment consists of devices that serve as either sources or destinations of data. When data are received from or delivered to a device that is directly connected to the computer, the process is known as input–output (I/O), and the device is referred to as a peripheral. When data are moved over longer distances, to or from a remote device, the process is known as data communications.

Finally, there must be control of these three functions. Ultimately, this control is exercised by the individual(s) who provides the computer with instructions. Within the computer, a control unit manages the computer’s resources and orchestrates the performance of its functional parts in response to those instructions.
At this general level of discussion, the number of possible operations that can be performed is few. Figure 1.2 depicts the four possible types of operations. The computer can function as a data movement device (Figure 1.2a), simply transferring data from one peripheral or communications line to another. It can also function as a data storage device (Figure 1.2b), with data transferred from the external environment to computer storage (read) and vice versa (write). The final two diagrams show operations involving data processing, on data either in storage (Figure 1.2c) or en route between storage and the external environment (Figure 1.2d).

The preceding discussion may seem absurdly generalized. It is certainly possible, even at a top level of computer structure, to differentiate a variety of functions, but, to quote [SIEW82],

    There is remarkably little shaping of computer structure to fit the function to be performed. At the root of this lies the general-purpose nature of computers, in which all the functional specialization occurs at the time of programming and not at the time of design.

Structure

Figure 1.3 is the simplest possible depiction of a computer. The computer interacts in some fashion with its external environment. In general, all of its linkages to the external environment can be classified as peripheral devices or communication lines. We will have something to say about both types of linkages.

[Figure 1.3 The Computer: the computer (storage, processing) linked to peripherals and communication lines]

[Figure 1.4 The Computer: Top-Level Structure: three nested views, COMPUTER (I/O, main memory, system bus, CPU), CPU (registers, ALU, internal bus, control unit), and CONTROL UNIT (sequencing logic, control unit registers and decoders, control memory)]

But of greater concern in this book is the internal structure of the computer itself, which is shown in Figure 1.4.
There are four main structural components:

• Central processing unit (CPU): Controls the operation of the computer and performs its data processing functions; often simply referred to as processor.
• Main memory: Stores data.
• I/O: Moves data between the computer and its external environment.
• System interconnection: Some mechanism that provides for communication among CPU, main memory, and I/O. A common example of system interconnection is by means of a system bus, consisting of a number of conducting wires to which all the other components attach.

There may be one or more of each of the aforementioned components. Traditionally, there has been just a single processor. In recent years, there has been increasing use of multiple processors in a single computer. Some design issues relating to multiple processors crop up and are discussed as the text proceeds; Part Five focuses on such computers.

Each of these components will be examined in some detail in Part Two. However, for our purposes, the most interesting and in some ways the most complex component is the CPU. Its major structural components are as follows:

• Control unit: Controls the operation of the CPU and hence the computer
• Arithmetic and logic unit (ALU): Performs the computer’s data processing functions
• Registers: Provides storage internal to the CPU
• CPU interconnection: Some mechanism that provides for communication among the control unit, ALU, and registers

Each of these components will be examined in some detail in Part Three, where we will see that complexity is added by the use of parallel and pipelined organizational techniques. Finally, there are several approaches to the implementation of the control unit; one common approach is a microprogrammed implementation. In essence, a microprogrammed control unit operates by executing microinstructions that define the functionality of the control unit.
With this approach, the structure of the control unit can be depicted, as in Figure 1.4. This structure will be examined in Part Four.

1.3 KEY TERMS AND REVIEW QUESTIONS

Key Terms

• arithmetic and logic unit (ALU)
• central processing unit (CPU)
• computer architecture
• computer organization
• control unit
• input–output (I/O)
• main memory
• processor
• registers
• system bus

Review Questions

1.1. What, in general terms, is the distinction between computer organization and computer architecture?
1.2. What, in general terms, is the distinction between computer structure and computer function?
1.3. What are the four main functions of a computer?
1.4. List and briefly define the main structural components of a computer.
1.5. List and briefly define the main structural components of a processor.

CHAPTER 2 COMPUTER EVOLUTION AND PERFORMANCE

2.1 A Brief History of Computers
    The First Generation: Vacuum Tubes
    The Second Generation: Transistors
    The Third Generation: Integrated Circuits
    Later Generations
2.2 Designing for Performance
    Microprocessor Speed
    Performance Balance
    Improvements in Chip Organization and Architecture
2.3 The Evolution of the Intel x86 Architecture
2.4 Embedded Systems and the ARM
    Embedded Systems
    ARM Evolution
2.5 Performance Assessment
    Clock Speed and Instructions per Second
    Benchmarks
    Amdahl’s Law
2.6 Recommended Reading and Web Sites
2.7 Key Terms, Review Questions, and Problems

KEY POINTS

◆ The evolution of computers has been characterized by increasing processor speed, decreasing component size, increasing memory size, and increasing I/O capacity and speed.

◆ One factor responsible for the great increase in processor speed is the shrinking size of microprocessor components; this reduces the distance between components and hence increases speed.
However, the true gains in speed in recent years have come from the organization of the processor, including heavy use of pipelining and parallel execution techniques and the use of speculative execution techniques (tentative execution of future instructions that might be needed). All of these techniques are designed to keep the processor busy as much of the time as possible.

◆ A critical issue in computer system design is balancing the performance of the various elements so that gains in performance in one area are not handicapped by a lag in other areas. In particular, processor speed has increased more rapidly than memory access time. A variety of techniques is used to compensate for this mismatch, including caches, wider data paths from memory to processor, and more intelligent memory chips.

We begin our study of computers with a brief history. This history is itself interesting and also serves the purpose of providing an overview of computer structure and function. Next, we address the issue of performance. A consideration of the need for balanced utilization of computer resources provides a context that is useful throughout the book. Finally, we look briefly at the evolution of the two systems that serve as key examples throughout the book: the Intel x86 and ARM processor families.

2.1 A BRIEF HISTORY OF COMPUTERS

The First Generation: Vacuum Tubes

ENIAC

The ENIAC (Electronic Numerical Integrator And Computer), designed and constructed at the University of Pennsylvania, was the world’s first general-purpose electronic digital computer. The project was a response to U.S. needs during World War II. The Army’s Ballistics Research Laboratory (BRL), an agency responsible for developing range and trajectory tables for new weapons, was having difficulty supplying these tables accurately and within a reasonable time frame. Without these firing tables, the new weapons and artillery were useless to gunners.
The BRL employed more than 200 people who, using desktop calculators, solved the necessary ballistics equations. Preparation of the tables for a single weapon would take one person many hours, even days.

John Mauchly, a professor of electrical engineering at the University of Pennsylvania, and John Eckert, one of his graduate students, proposed to build a general-purpose computer using vacuum tubes for the BRL’s application. In 1943, the Army accepted this proposal, and work began on the ENIAC. The resulting machine was enormous, weighing 30 tons, occupying 1500 square feet of floor space, and containing more than 18,000 vacuum tubes. When operating, it consumed 140 kilowatts of power. It was also substantially faster than any electromechanical computer, capable of 5000 additions per second.

The ENIAC was a decimal rather than a binary machine. That is, numbers were represented in decimal form, and arithmetic was performed in the decimal system. Its memory consisted of 20 “accumulators,” each capable of holding a 10-digit decimal number. A ring of 10 vacuum tubes represented each digit. At any time, only one vacuum tube was in the ON state, representing one of the 10 digits. The major drawback of the ENIAC was that it had to be programmed manually by setting switches and plugging and unplugging cables.

The ENIAC was completed in 1946, too late to be used in the war effort. Instead, its first task was to perform a series of complex calculations that were used to help determine the feasibility of the hydrogen bomb. The use of the ENIAC for a purpose other than that for which it was built demonstrated its general-purpose nature. The ENIAC continued to operate under BRL management until 1955, when it was disassembled.

THE VON NEUMANN MACHINE

The task of entering and altering programs for the ENIAC was extremely tedious.
The programming process could be facilitated if the program could be represented in a form suitable for storing in memory alongside the data. Then, a computer could get its instructions by reading them from memory, and a program could be set or altered by setting the values of a portion of memory.

This idea, known as the stored-program concept, is usually attributed to the ENIAC designers, most notably the mathematician John von Neumann, who was a consultant on the ENIAC project. Alan Turing developed the idea at about the same time. The first publication of the idea was in a 1945 proposal by von Neumann for a new computer, the EDVAC (Electronic Discrete Variable Automatic Computer).

In 1946, von Neumann and his colleagues began the design of a new stored-program computer, referred to as the IAS computer, at the Institute for Advanced Study in Princeton. The IAS computer, although not completed until 1952, is the prototype of all subsequent general-purpose computers.

Figure 2.1 shows the general structure of the IAS computer (compare to middle portion of Figure 1.4). It consists of

• A main memory, which stores both data and instructions (see footnote 1)
• An arithmetic and logic unit (ALU) capable of operating on binary data

Footnote 1: In this book, unless otherwise noted, the term instruction refers to a machine instruction that is directly interpreted and executed by the processor, in contrast to an instruction in a high-level language, such as Ada or C++, which must first be compiled into a series of machine instructions before being executed.
• A control unit, which interprets the instructions in memory and causes them to be executed
• Input and output (I/O) equipment operated by the control unit

[Figure 2.1 Structure of the IAS Computer: central processing unit (CPU) containing the arithmetic-logic unit (CA) and the program control unit (CC), together with main memory (M) and I/O equipment (I, O)]

This structure was outlined in von Neumann’s earlier proposal, which is worth quoting at this point [VONN45]:

    2.2 First: Because the device is primarily a computer, it will have to perform the elementary operations of arithmetic most frequently. These are addition, subtraction, multiplication and division. It is therefore reasonable that it should contain specialized organs for just these operations.

    It must be observed, however, that while this principle as such is probably sound, the specific way in which it is realized requires close scrutiny. At any rate a central arithmetical part of the device will probably have to exist and this constitutes the first specific part: CA.

    2.3 Second: The logical control of the device, that is, the proper sequencing of its operations, can be most efficiently carried out by a central control organ. If the device is to be elastic, that is, as nearly as possible all purpose, then a distinction must be made between the specific instructions given for and defining a particular problem, and the general control organs which see to it that these instructions—no matter what they are—are carried out. The former must be stored in some way; the latter are represented by definite operating parts of the device. By the central control we mean this latter function only, and the organs which perform it form the second specific part: CC.

    2.4 Third: Any device which is to carry out long and complicated sequences of operations (specifically of calculations) must have a considerable memory...
    (b) The instructions which govern a complicated problem may constitute considerable material, particularly so, if the code is circumstantial (which it is in most arrangements). This material must be remembered.

    At any rate, the total memory constitutes the third specific part of the device: M.

    2.6 The three specific parts CA, CC (together C), and M correspond to the associative neurons in the human nervous system. It remains to discuss the equivalents of the sensory or afferent and the motor or efferent neurons. These are the input and output organs of the device.

    The device must be endowed with the ability to maintain input and output (sensory and motor) contact with some specific medium of this type. The medium will be called the outside recording medium of the device: R.

    2.7 Fourth: The device must have organs to transfer... information from R into its specific parts C and M. These organs form its input, the fourth specific part: I. It will be seen that it is best to make all transfers from R (by I) into M and never directly from C.

    2.8 Fifth: The device must have organs to transfer... from its specific parts C and M into R. These organs form its output, the fifth specific part: O. It will be seen that it is again best to make all transfers from M (by O) into R, and never directly from C.

With rare exceptions, all of today’s computers have this same general structure and function and are thus referred to as von Neumann machines. Thus, it is worthwhile at this point to describe briefly the operation of the IAS computer [BURK46]. Following [HAYE98], the terminology and notation of von Neumann are changed in the following to conform more closely to modern usage; the examples and illustrations accompanying this discussion are based on that latter text.

The memory of the IAS consists of 1000 storage locations, called words, of 40 binary digits (bits) each (see footnote 2). Both data and instructions are stored there.
Numbers are represented in binary form, and each instruction is a binary code. Figure 2.2 illustrates these formats. Each number is represented by a sign bit and a 39-bit value. A word may also contain two 20-bit instructions, with each instruction consisting of an 8-bit operation code (opcode) specifying the operation to be performed and a 12-bit address designating one of the words in memory (numbered from 0 to 999).

[Figure 2.2 IAS Memory Formats: (a) number word, with the sign bit in bit 0 and the value in bits 1 to 39; (b) instruction word, with the left instruction holding an opcode in bits 0 to 7 and an address in bits 8 to 19, and the right instruction holding an opcode in bits 20 to 27 and an address in bits 28 to 39]

Footnote 2: There is no universal definition of the term word. In general, a word is an ordered set of bytes or bits that is the normal unit in which information may be stored, transmitted, or operated on within a given computer. Typically, if a processor has a fixed-length instruction set, then the instruction length equals the word length.

The control unit operates the IAS by fetching instructions from memory and executing them one at a time. To explain this, a more detailed structure diagram is needed, as indicated in Figure 2.3. This figure reveals that both the control unit and the ALU contain storage locations, called registers, defined as follows:

• Memory buffer register (MBR): Contains a word to be stored in memory or sent to the I/O unit, or is used to receive a word from memory or from the I/O unit.
• Memory address register (MAR): Specifies the address in memory of the word to be written from or read into the MBR.
• Instruction register (IR): Contains the 8-bit opcode instruction being executed.
• Instruction buffer register (IBR): Employed to hold temporarily the right-hand instruction from a word in memory.
• Program counter (PC): Contains the address of the next instruction-pair to be fetched from memory.
• Accumulator (AC) and multiplier quotient (MQ): Employed to hold temporarily operands and results of ALU operations.
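The instruction-word layout described above (two 20-bit instructions per 40-bit word, each an 8-bit opcode followed by a 12-bit address) can be illustrated with a few shifts and masks. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and it assumes that bit 0 in Figure 2.2 is the most significant bit of the word, so the left instruction occupies the upper 20 bits of a Python integer. The opcodes used in the example are the ones Table 2.1 lists for LOAD M(X) and ADD M(X).

```python
# Field widths taken from the text: a 40-bit word holds two 20-bit
# instructions, each an 8-bit opcode followed by a 12-bit address.
WORD_BITS = 40
INSTR_BITS = 20
OPCODE_BITS = 8
ADDR_BITS = 12


def split_instruction(instr):
    """Split one 20-bit instruction into (opcode, address)."""
    opcode = (instr >> ADDR_BITS) & ((1 << OPCODE_BITS) - 1)
    address = instr & ((1 << ADDR_BITS) - 1)
    return opcode, address


def decode_word(word):
    """Return ((left_opcode, left_addr), (right_opcode, right_addr)).

    Assumption: bit 0 of Figure 2.2 is the most significant bit, so the
    left instruction is the upper half of the integer.
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("an IAS word is 40 bits wide")
    left = (word >> INSTR_BITS) & ((1 << INSTR_BITS) - 1)
    right = word & ((1 << INSTR_BITS) - 1)
    return split_instruction(left), split_instruction(right)


def encode_word(left, right):
    """Pack two (opcode, address) pairs into one 40-bit word."""
    (left_op, left_addr), (right_op, right_addr) = left, right
    left_instr = (left_op << ADDR_BITS) | left_addr
    right_instr = (right_op << ADDR_BITS) | right_addr
    return (left_instr << INSTR_BITS) | right_instr


# Example: LOAD M(500) in the left half, ADD M(501) in the right half.
LOAD_MX = 0b00000001  # opcode from Table 2.1
ADD_MX = 0b00000101   # opcode from Table 2.1
word = encode_word((LOAD_MX, 500), (ADD_MX, 501))
print(decode_word(word))
```

Note that a 12-bit field can express addresses up to 4095, whereas the IAS memory described here has only 1000 words; a fuller model would presumably reject addresses above 999.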
As an example of how the AC and MQ work together, the result of multiplying two 40-bit numbers is an 80-bit number; the most significant 40 bits are stored in the AC and the least significant in the MQ.

The IAS operates by repetitively performing an instruction cycle, as shown in Figure 2.4. Each instruction cycle consists of two subcycles. During the fetch cycle, the opcode of the next instruction is loaded into the IR and the address portion is loaded into the MAR. This instruction may be taken from the IBR, or it can be obtained from memory by loading a word into the MBR, and then down to the IBR, IR, and MAR.

Why the indirection? These operations are controlled by electronic circuitry and result in the use of data paths. To simplify the electronics, there is only one register that is used to specify the address in memory for a read or write and only one register used for the source or destination.

[Figure 2.3 Expanded Structure of IAS Computer. Labels: arithmetic-logic unit (ALU), AC, MQ, arithmetic-logic circuits, MBR, IBR, PC, IR, MAR, control circuits, program control unit, main memory M, input–output equipment; paths for instructions and data, addresses, and control signals]

Once the opcode is in the IR, the execute cycle is performed. Control circuitry interprets the opcode and executes the instruction by sending out the appropriate control signals to cause data to be moved or an operation to be performed by the ALU.

The IAS computer had a total of 21 instructions, which are listed in Table 2.1. These can be grouped as follows:

• Data transfer: Move data between memory and ALU registers or between two ALU registers.
• Unconditional branch: Normally, the control unit executes instructions in sequence from memory. This sequence can be changed by a branch instruction, which facilitates repetitive operations.
• Conditional branch: The branch can be made dependent on a condition, thus allowing decision points.
• Arithmetic: Operations performed by the ALU.
• Address modify: Permits addresses to be computed in the ALU and then inserted into instructions stored in memory. This allows a program considerable addressing flexibility.

[Figure 2.4 Partial Flowchart of IAS Operation. Fetch cycle: if the next instruction is in the IBR, no memory access is required (IR ← IBR(0:7), MAR ← IBR(8:19)); otherwise MAR ← PC and MBR ← M(MAR), followed by IBR ← MBR(20:39), IR ← MBR(0:7), MAR ← MBR(8:19) when the left instruction is required, or IR ← MBR(20:27), MAR ← MBR(28:39) when it is not. The fetch cycle also includes PC ← PC + 1 and ends with "Decode instruction in IR". Execution cycle examples as labeled in the figure: AC ← M(X); Go to M(X, 0:19); If AC > 0 then go to M(X, 0:19); AC ← AC + M(X). Legend: M(X) = contents of memory location whose address is X; (i:j) = bits i through j]

Table 2.1 The IAS Instruction Set

Opcode    Symbolic representation  Description

Instruction type: Data transfer
00001010  LOAD MQ                  Transfer contents of register MQ to the accumulator AC
00001001  LOAD MQ,M(X)             Transfer contents of memory location X to MQ
00100001  STOR M(X)                Transfer contents of accumulator to memory location X
00000001  LOAD M(X)                Transfer M(X) to the accumulator
00000010  LOAD -M(X)               Transfer -M(X) to the accumulator
00000011  LOAD |M(X)|              Transfer absolute value of M(X) to the accumulator
00000100  LOAD -|M(X)|             Transfer -|M(X)| to the accumulator

Instruction type: Unconditional branch
00001101  JUMP M(X,0:19)           Take next instruction from left half of M(X)
00001110  JUMP M(X,20:39)          Take next instruction from right half of M(X)

Instruction type: Conditional branch
00001111  JUMP + M(X,0:19)         If number in the accumulator is nonnegative, take next instruction from left half of M(X)
00010000  JUMP + M(X,20:39)        If number in the accumulator is nonnegative, take next instruction from right half of M(X)

Instruction type: Arithmetic
00000101  ADD M(X)                 Add M(X) to AC; put the result in AC
00000111  ADD |M(X)|               Add |M(X)| to AC; put the result in AC
00000110  SUB M(X)                 Subtract M(X) from AC; put the result in AC
00001000  SUB |M(X)|               Subtract |M(X)| from AC; put the remainder in AC
00001011  MUL M(X)                 Multiply M(X) by MQ; put most significant bits of result in AC, put least significant bits in MQ
00001100  DIV M(X)                 Divide AC by M(X); put the quotient in MQ and the remainder in AC
00010100  LSH                      Multiply accumulator by 2; i.e., shift left one bit position
00010101  RSH                      Divide accumulator by 2; i.e., shift right one position

Instruction type: Address modify
00010010  STOR M(X,8:19)           Replace left address field at M(X) by 12 rightmost bits of AC
00010011  STOR M(X,28:39)          Replace right address field at M(X) by 12 rightmost bits of AC

Table 2.1 presents instructions in a symbolic, easy-to-read form. Actually, each instruction must conform to the format of Figure 2.2b. The opcode portion (first 8 bits) specifies which of the 21 instructions is to be executed. The address portion (remaining 12 bits) specifies which of the 1000 memory locations is to be involved in the execution of the instruction.

Figure 2.4 shows several examples of instruction execution by the control unit. Note that each operation requires several steps. Some of these are quite elaborate. The multiplication operation requires 39 suboperations, one for each bit position except that of the sign bit.

COMMERCIAL COMPUTERS

The 1950s saw the birth of the computer industry with two companies, Sperry and IBM, dominating the marketplace.

In 1947, Eckert and Mauchly formed the Eckert-Mauchly Computer Corporation to manufacture computers commercially. Their first successful machine was the UNIVAC I (Universal Automatic Computer), which was commissioned by the Bureau of the Census for the 1950 calculations. The Eckert-Mauchly Computer Corporation became part of the UNIVAC division of Sperry-Rand Corporation, which went on to build a series of successor machines.

The UNIVAC I was the first successful commercial computer.
It was intended for both scientific and commercial applications. The first paper describing the system listed matrix algebraic computations, statistical problems, premium billings for a life insurance company, and logistical problems as a sample of the tasks it could perform.

The UNIVAC II, which had greater memory capacity and higher performance than the UNIVAC I, was delivered in the late 1950s and illustrates several trends that have remained characteristic of the computer industry. First, advances in technology allow companies to continue to build larger, more powerful computers. Second, each company tries to make its new machines backward compatible³ with the older machines. This means that the programs written for the older machines can be executed on the new machine. This strategy is adopted in the hopes of retaining the customer base; that is, when a customer decides to buy a newer machine, he or she is likely to get it from the same company to avoid losing the investment in programs.

The UNIVAC division also began development of the 1100 series of computers, which was to be its major source of revenue. This series illustrates a distinction that existed at one time. The first model, the UNIVAC 1103, and its successors for many years were primarily intended for scientific applications, involving long and complex calculations. Other companies concentrated on business applications, which involved processing large amounts of text data. This split has largely disappeared, but it was evident for a number of years.

IBM, then the major manufacturer of punched-card processing equipment, delivered its first electronic stored-program computer, the 701, in 1953. The 701 was intended primarily for scientific applications [BASH81]. In 1955, IBM introduced the companion 702 product, which had a number of hardware features that suited it to business applications.
These were the first of a long series of 700/7000 computers that established IBM as the overwhelmingly dominant computer manufacturer.

The Second Generation: Transistors

The first major change in the electronic computer came with the replacement of the vacuum tube by the transistor. The transistor is smaller, cheaper, and dissipates less heat than a vacuum tube but can be used in the same way as a vacuum tube to construct computers. Unlike the vacuum tube, which requires wires, metal plates, a glass capsule, and a vacuum, the transistor is a solid-state device, made from silicon.

The transistor was invented at Bell Labs in 1947 and by the 1950s had launched an electronic revolution. It was not until the late 1950s, however, that fully transistorized computers were commercially available. IBM again was not the first company to deliver the new technology. NCR and, more successfully, RCA were the front-runners with some small transistor machines. IBM followed shortly with the 7000 series.

³ Also called downward compatible. The same concept, from the point of view of the older system, is referred to as upward compatible, or forward compatible.

Table 2.2 Computer Generations

Generation  Approximate Dates  Technology                          Typical Speed (operations per second)
1           1946–1957          Vacuum tube                         40,000
2           1958–1964          Transistor                          200,000
3           1965–1971          Small and medium scale integration  1,000,000
4           1972–1977          Large scale integration             10,000,000
5           1978–1991          Very large scale integration        100,000,000
6           1991–              Ultra large scale integration       1,000,000,000

The use of the transistor defines the second generation of computers. It has become widely accepted to classify computers into generations based on the fundamental hardware technology employed (Table 2.2). Each new generation is characterized by greater processing performance, larger memory capacity, and smaller size than the previous one. But there are other changes as well.
The second generation saw the introduction of more complex arithmetic and logic units and control units, the use of high-level programming languages, and the provision of system software with the computer.

The second generation is noteworthy also for the appearance of the Digital Equipment Corporation (DEC). DEC was founded in 1957 and, in that year, delivered its first computer, the PDP-1. This computer and this company began the minicomputer phenomenon that would become so prominent in the third generation.

THE IBM 7094

From the introduction of the 700 series in 1952 to the introduction of the last member of the 7000 series in 1964, this IBM product line underwent an evolution that is typical of computer products. Successive members of the product line show increased performance, increased capacity, and/or lower cost.

Table 2.3 illustrates this trend. The size of main memory, in multiples of 2^10 36-bit words, grew from 2K (1K = 2^10) to 32K words,⁴ while the time to access one word of memory, the memory cycle time, fell from 30 μs to 1.4 μs. The number of opcodes grew from a modest 24 to 185.

The final column indicates the relative execution speed of the central processing unit (CPU). Speed improvements are achieved by improved electronics (e.g., a transistor implementation is faster than a vacuum tube implementation) and more complex circuitry. For example, the IBM 7094 includes an Instruction Backup Register, used to buffer the next instruction. The control unit fetches two adjacent words from memory for an instruction fetch. Except for the occurrence of a branching instruction, which is typically infrequent, this means that the control unit has to access memory for an instruction on only half the instruction cycles. This prefetching significantly reduces the average instruction cycle time.

The remainder of the columns of Table 2.3 will become clear as the text proceeds.

⁴ A discussion of the uses of numerical prefixes, such as kilo and giga, is contained in a supporting document at the Computer Science Student Resource Site at WilliamStallings.com/StudentSupport.html.

Table 2.3 Example members of the IBM 700/7000 Series

Model    First     CPU           Memory               Cycle Time  Memory    Number of  Number of        Hardwired               I/O Overlap  Instruction    Speed
Number   Delivery  Technology    Technology           (μs)        Size (K)  Opcodes    Index Registers  Floating-Point          (Channels)   Fetch Overlap  (relative to 701)
701      1952      Vacuum tubes  Electrostatic tubes  30          2–4       24         0                no                      no           no             1
704      1955      Vacuum tubes  Core                 12          4–32      80         3                yes                     no           no             2.5
709      1958      Vacuum tubes  Core                 12          32        140        3                yes                     yes          no             4
7090     1960      Transistor    Core                 2.18        32        169        3                yes                     yes          no             25
7094 I   1962      Transistor    Core                 2           32        185        7                yes (double precision)  yes          yes            30
7094 II  1964      Transistor    Core                 1.4         32        185        7                yes (double precision)  yes          yes            50

Figure 2.5 An IBM 7094 Configuration (components shown: CPU, memory, multiplexor, four data channels, mag tape units, card punch, line printer, card reader, drum, disks, hypertapes, teleprocessing equipment)

Figure 2.5 shows a large (many peripherals) configuration for an IBM 7094, which is representative of second-generation computers [BELL71]. Several differences from the IAS computer are worth noting. The most important of these is the use of data channels. A data channel is an independent I/O module with its own processor and its own instruction set. In a computer system with such devices, the CPU does not execute detailed I/O instructions.
Such instructions are stored in main memory to be executed by a special-purpose processor in the data channel itself. The CPU initiates an I/O transfer by sending a control signal to the data channel, instructing it to execute a sequence of instructions in memory. The data channel performs its task independently of the CPU and signals the CPU when the operation is complete. This arrangement relieves the CPU of a considerable processing burden.

Another new feature is the multiplexor, which is the central termination point for data channels, the CPU, and memory. The multiplexor schedules access to the memory from the CPU and data channels, allowing these devices to act independently.

The Third Generation: Integrated Circuits

A single, self-contained transistor is called a discrete component. Throughout the 1950s and early 1960s, electronic equipment was composed largely of discrete components: transistors, resistors, capacitors, and so on. Discrete components were manufactured separately, packaged in their own containers, and soldered or wired together onto masonite-like circuit boards, which were then installed in computers, oscilloscopes, and other electronic equipment. Whenever an electronic device called for a transistor, a little tube of metal containing a pinhead-sized piece of silicon had to be soldered to a circuit board. The entire manufacturing process, from transistor to circuit board, was expensive and cumbersome.

These facts of life were beginning to create problems in the computer industry. Early second-generation computers contained about 10,000 transistors. This figure grew to the hundreds of thousands, making the manufacture of newer, more powerful machines increasingly difficult.

In 1958 came the achievement that revolutionized electronics and started the era of microelectronics: the invention of the integrated circuit. It is the integrated circuit that defines the third generation of computers.
In this section we provide a brief introduction to the technology of integrated circuits. Then we look at perhaps the two most important members of the third generation, both of which were introduced at the beginning of that era: the IBM System/360 and the DEC PDP-8.

MICROELECTRONICS

Microelectronics means, literally, “small electronics.” Since the beginnings of digital electronics and the computer industry, there has been a persistent and consistent trend toward the reduction in size of digital electronic circuits. Before examining the implications and benefits of this trend, we need to say something about the nature of digital electronics. A more detailed discussion is found in Chapter 20.

The basic elements of a digital computer, as we know, must perform storage, movement, processing, and control functions. Only two fundamental types of components are required (Figure 2.6): gates and memory cells. A gate is a device that implements a simple Boolean or logical function, such as IF A AND B ARE TRUE THEN C IS TRUE (AND gate). Such devices are called gates because they control data flow in much the same way that canal gates do. The memory cell is a device that can store one bit of data; that is, the device can be in one of two stable states at any time.

Figure 2.6 Fundamental Computer Elements ((a) Gate: input, Boolean logic function, output, activate signal; (b) Memory cell: input, binary storage cell, output, read and write signals)

By interconnecting large numbers of these fundamental devices, we can construct a computer. We can relate this to our four basic functions as follows:

Data storage: Provided by memory cells.
Data processing: Provided by gates.
Data movement: The paths among components are used to move data from memory to memory and from memory through gates to memory.
Control: The paths among components can carry control signals.
For example, a gate will have one or two data inputs plus a control signal input that activates the gate. When the control signal is ON, the gate performs its function on the data inputs and produces a data output. Similarly, the memory cell will store the bit that is on its input lead when the WRITE control signal is ON and will place the bit that is in the cell on its output lead when the READ control signal is ON.

Thus, a computer consists of gates, memory cells, and interconnections among these elements. The gates and memory cells are, in turn, constructed of simple digital electronic components.

The integrated circuit exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon. It is merely an extension of the solid-state art to fabricate an entire circuit in a tiny piece of silicon rather than assemble discrete components made from separate pieces of silicon into the same circuit. Many transistors can be produced at the same time on a single wafer of silicon. Equally important, these transistors can be connected with a process of metallization to form circuits.

Figure 2.7 depicts the key concepts in an integrated circuit. A thin wafer of silicon is divided into a matrix of small areas, each a few millimeters square. The identical circuit pattern is fabricated in each area, and the wafer is broken up into chips. Each chip consists of many gates and/or memory cells plus a number of input and output attachment points. This chip is then packaged in housing that protects it and provides pins for attachment to devices beyond the chip. A number of these packages can then be interconnected on a printed circuit board to produce larger and more complex circuits.

Initially, only a few gates or memory cells could be reliably manufactured and packaged together. These early integrated circuits are referred to as small-scale integration (SSI).
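The behavior of the two fundamental elements described earlier (a gate activated by a control signal, and a memory cell driven by READ and WRITE signals) can be made concrete with a small software model. The following is only an illustrative sketch in Python, not a description of any real circuit: the names and_gate and MemoryCell are hypothetical, the cell is assumed to start out holding 0, and "no output" is modeled as None.

```python
from typing import Optional


def and_gate(a: bool, b: bool, activate: bool) -> Optional[bool]:
    """AND gate with a control input (hypothetical model).

    When the control signal is ON, the gate applies its Boolean function
    to the data inputs. When it is OFF, we model "no output" as None;
    that choice is an assumption of this sketch.
    """
    if not activate:
        return None
    return a and b


class MemoryCell:
    """One-bit storage cell with WRITE and READ control signals."""

    def __init__(self) -> None:
        self._bit = False  # assumed initial state

    def write(self, data_in: bool, write_signal: bool) -> None:
        # The bit on the input lead is stored only while WRITE is ON.
        if write_signal:
            self._bit = data_in

    def read(self, read_signal: bool) -> Optional[bool]:
        # The stored bit is placed on the output lead only while READ is ON.
        return self._bit if read_signal else None


# Data processing through a gate, then data storage in a cell.
cell = MemoryCell()
result = and_gate(True, True, activate=True)
cell.write(bool(result), write_signal=True)
stored = cell.read(read_signal=True)
```

Here the gate output is moved into the cell and read back, which touches all four functions in miniature: processing (the gate), storage (the cell), movement (passing result to write), and control (the activate, write_signal, and read_signal arguments).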
As time went on, it became possible to pack more and more components on the same chip. This growth in density is illustrated in Figure 2.8; it is one of the most remarkable technological trends ever recorded.⁵ This figure reflects the famous Moore’s law, which was propounded by Gordon Moore, cofounder of Intel, in 1965 [MOOR65]. Moore observed that the number of transistors that could be put on a single chip was doubling every year and correctly predicted that this pace would continue into the near future. To the surprise of many, including Moore, the pace continued year after year and decade after decade. The pace slowed to a doubling every 18 months in the 1970s but has sustained that rate ever since.

The consequences of Moore’s law are profound:

1. The cost of a chip has remained virtually unchanged during this period of rapid growth in density. This means that the cost of computer logic and memory circuitry has fallen at a dramatic rate.
2. Because logic and memory elements are placed closer together on more densely packed chips, the electrical path length is shortened, increasing operating speed.
3. The computer becomes smaller, making it more convenient to place in a variety of environments.
4. There is a reduction in power and cooling requirements.
5. The interconnections on the integrated circuit are much more reliable than solder connections. With more circuitry on each chip, there are fewer interchip connections.

⁵ Note that the vertical axis uses a log scale. A basic review of log scales is in the math refresher document at the Computer Science Student Support Site at WilliamStallings.com/StudentSupport.html.

Figure 2.7 Relationship among Wafer, Chip, and Gate (elements shown: wafer, chip, gate, packaged chip)

IBM SYSTEM/360

By 1964, IBM had a firm grip on the computer market with its 7000 series of machines. In that year, IBM announced the System/360, a new family of computer products.
Although the announcement itself was no surprise, it contained some unpleasant news for current IBM customers: the 360 product line was incompatible with older IBM machines. Thus, the transition to the 360 would be difficult for the current customer base. This was a bold step by IBM, but one IBM felt was necessary to break out of some of the constraints of the 7000 architecture and to produce a system capable of evolving with the new integrated circuit technology [PADE81, GIFF87]. The strategy paid off both financially and technically. The 360 was the success of the decade and cemented IBM as the overwhelmingly dominant computer vendor, with a market share above 70%. And, with some modifications and extensions, the architecture of the 360 remains to this day the architecture of IBM’s mainframe⁶ computers. Examples using this architecture can be found throughout this text.

Figure 2.8 Growth in CPU Transistor Count [BOHR03] (plot of transistors per chip, log scale from 10^3 to 10^9, against year, 1970 to 2010; top marker: 1 billion transistor CPU)

⁶ The term mainframe is used for the larger, most powerful computers other than supercomputers. Typical characteristics of a mainframe are that it supports a large database, has elaborate I/O hardware, and is used in a central data processing facility.

The System/360 was the industry’s first planned family of computers. The family covered a wide range of performance and cost. Table 2.4 indicates some of the key characteristics of the various models in 1965 (each member of the family is distinguished by a model number). The models were compatible in the sense that a program written for one model should be capable of being executed by another model in the series, with only a difference in the time it takes to execute.

The concept of a family of compatible computers was both novel and extremely successful. A customer with modest requirements and a budget to match could start with the relatively inexpensive Model 30. Later, if the customer’s needs grew, it was possible to upgrade to a faster machine with more memory without