1.13 Historical Perspective and Further Reading

"An active field of science is like an immense anthill; the individual almost vanishes into the mass of minds tumbling over each other, carrying information from place to place, passing it around at the speed of light."
Lewis Thomas, "Natural Science," in The Lives of a Cell, 1974

For each chapter in the text, a section devoted to a historical perspective can be found online. We may trace the development of an idea through a series of machines or describe some important projects, and we provide references in case you are interested in probing further.

The historical perspective for this chapter provides a background for some of the key ideas presented therein. Its purpose is to give you the human story behind the technological advances and to place achievements in their historical context. By learning about the past, you may be better able to understand the forces that will shape computing in the future. Each historical perspective section ends with suggestions for additional reading, which are also collected separately in the online section "Further Reading."

The First Electronic Computers

J. Presper Eckert and John Mauchly at the Moore School of the University of Pennsylvania built what is widely accepted to be the world's first operational electronic, general-purpose computer. This machine, called ENIAC (Electronic Numerical Integrator and Calculator), was funded by the United States Army and started working during World War II but was not publicly disclosed until 1946. ENIAC was a general-purpose machine used for computing artillery-firing tables. Figure e1.13.1 shows the U-shaped computer, which was 80 feet long by 8.5 feet high and several feet wide. Each of the 20 10-digit registers was 2 feet long. In total, ENIAC used 18,000 vacuum tubes.

FIGURE e1.13.1 ENIAC, the world's first general-purpose electronic computer.

In size, ENIAC was two orders of magnitude bigger than machines built today, yet it was more than eight orders of magnitude slower, performing 1900 additions per second. ENIAC provided conditional jumps and was programmable, clearly distinguishing it from earlier calculators. Programming was done manually by plugging cables and setting switches, and data were entered on punched cards. Programming for typical calculations required from half an hour to a whole day. ENIAC was a general-purpose machine, limited primarily by a small amount of storage and tedious programming.

In 1944, John von Neumann was attracted to the ENIAC project. The group wanted to improve the way programs were entered and discussed storing programs as numbers; von Neumann helped crystallize the ideas and wrote a memo proposing a stored-program computer called EDVAC (Electronic Discrete Variable Automatic Computer). Herman Goldstine distributed the memo and put von Neumann's name on it, much to the dismay of Eckert and Mauchly, whose names were omitted. This memo has served as the basis for the commonly used term von Neumann computer. Several early pioneers in the computer field believe that this term gives too much credit to von Neumann, who wrote up the ideas, and too little to the engineers, Eckert and Mauchly, who worked on the machines. For this reason, the term does not appear elsewhere in this book or in the online sections.
In 1946, Maurice Wilkes of Cambridge University visited the Moore School to attend the latter part of a series of lectures on developments in electronic computers. When he returned to Cambridge, Wilkes decided to embark on a project to build a stored-program computer named EDSAC (Electronic Delay Storage Automatic Calculator). EDSAC started working in 1949 and was the world's first full-scale, operational, stored-program computer [Wilkes, 1985]. (A small prototype called the Mark-I, built at the University of Manchester in 1948, might be called the first operational stored-program machine.) Section 2.5 explains the stored-program concept.

In 1947, Eckert and Mauchly applied for a patent on electronic computers. The dean of the Moore School demanded that the patent be turned over to the university, which may have helped Eckert and Mauchly conclude that they should leave. Their departure crippled the EDVAC project, delaying its completion until 1952.

Goldstine left to join von Neumann at the Institute for Advanced Study (IAS) at Princeton in 1946. Together with Arthur Burks, they issued a report based on the memo written earlier [Burks et al., 1946]. The paper was incredible for the period; reading it today, you would never guess this landmark paper was written more than 70 years ago, because it discusses most of the architectural concepts seen in modern computers. (We quote from that text liberally in Chapter 2.) This paper led to the IAS machine built by Julian Bigelow. It had a total of 1024 40-bit words and was roughly 10 times faster than ENIAC. The group thought about uses for the machine, published a set of reports, and encouraged visitors. These reports and visitors inspired the development of a number of new computers.

Recently, there has been some controversy about the work of John Atanasoff, who built a small-scale electronic computer in the early 1940s. His machine, designed at Iowa State University, was a special-purpose computer that was never completely operational. Mauchly briefly visited Atanasoff before he built ENIAC. The presence of the Atanasoff machine, together with delays in filing the ENIAC patents (the work was classified, and patents could not be filed until after the war) and the distribution of von Neumann's EDVAC paper, was used to break the Eckert-Mauchly patent. Though controversy still rages over Atanasoff's role, Eckert and Mauchly are usually given credit for building the first working, general-purpose, electronic computer [Stern, 1980].

Another pioneering computer that deserves credit was a special-purpose machine built by Konrad Zuse in Germany in the late 1930s and early 1940s. Although Zuse had the design for a programmable computer ready, the German government decided not to fund scientific investigations taking more than two years because the bureaucrats expected the war would be won by that deadline.

Across the English Channel, during World War II, special-purpose electronic computers were built to decrypt intercepted German messages. A team at Bletchley Park, including Alan Turing, built the Colossus in 1943. The machines were kept secret until 1970; after the war, the group had little impact on commercial British computers.

While work on ENIAC went forward, Howard Aiken was building an electromechanical computer called the Mark-I at Harvard (a name that Manchester later adopted for its machine).
He followed the Mark-I with a relay machine, the Mark-II, and a pair of vacuum tube machines, the Mark-III and Mark-IV. In contrast to earlier machines like EDSAC, which used a single memory for instructions and data, the Mark-III and Mark-IV had separate memories for instructions and data. The machines were regarded as reactionary by the advocates of stored-program computers; the term Harvard architecture was coined to describe machines with distinct memories. Paying respect to history, this term is used today in a different sense to describe machines with a single main memory but with separate caches for instructions and data.

The Whirlwind project was begun at MIT in 1947 and was aimed at applications in real-time radar signal processing. Although it led to several inventions, its most important innovation was magnetic core memory. Whirlwind had 2048 16-bit words of magnetic core. Magnetic cores served as the main memory technology for nearly 30 years.

Commercial Developments

In December 1947, Eckert and Mauchly formed Eckert-Mauchly Computer Corporation. Their first machine, the BINAC, was built for Northrop and was shown in August 1949. After some financial difficulties, their firm was acquired by Remington-Rand, where they built the UNIVAC I (Universal Automatic Computer), designed to be sold as a general-purpose computer (Figure e1.13.2). Originally delivered in June 1951, UNIVAC I sold for about $1 million and was the first successful commercial computer; 48 systems were built! This early computer, along with many other fascinating pieces of computer lore, may be seen at the Computer History Museum in Mountain View, California.

FIGURE e1.13.2 UNIVAC I, the first commercial computer in the United States. It correctly predicted the outcome of the 1952 presidential election, but its initial forecast was withheld from broadcast because experts doubted the use of such early results.

IBM had been in the punched card and office automation business but didn't start building computers until 1950. The first IBM computer, the IBM 701, shipped in 1952, and eventually 19 units were sold. In the early 1950s, many people were pessimistic about the future of computers, believing that the market and opportunities for these "highly specialized" machines were quite limited.

In 1964, after investing $5 billion, IBM made a bold move with the announcement of the System/360. An IBM spokesman said the following at the time:

We are not at all humble in this announcement. This is the most important product announcement that this corporation has ever made in its history. It's not a computer in any previous sense. It's not a product, but a line of products … that spans in performance from the very low part of the computer line to the very high.

FIGURE e1.13.3 IBM System/360 computers: models 40, 50, 65, and 75 were all introduced in 1964. These four models varied in cost and performance by a factor of almost 10; the factor grows to 25 if we include models 20 and 30 (not shown). The clock rate, range of memory sizes, and approximate price for only the processor and memory of average size: (a) model 40, 1.6 MHz, 32 KB–256 KB, $225,000; (b) model 50, 2.0 MHz, 128 KB–256 KB, $550,000; (c) model 65, 5.0 MHz, 256 KB–1 MB, $1,200,000; and (d) model 75, 5.1 MHz, 256 KB–1 MB, $1,900,000. Adding I/O devices typically increased the price by factors of 1.8 to 3.5, with higher factors for cheaper models.
Moving the idea of the architecture abstraction into commercial reality, IBM announced six implementations of the System/360 architecture that varied in price and performance by a factor of 25. Figure e1.13.3 shows four of these models. IBM bet its company on the success of a computer family, and IBM won. The System/360 and its successors dominated the large computer market. Its descendants are still a $10 billion annual business for IBM, making the IBM System/360 the oldest surviving instruction set architecture. Fred Brooks, Jr., won the 1999 ACM A. M. Turing Award in part for leading the IBM System/360 Project.

About a year later, in 1965, Digital Equipment Corporation (DEC) unveiled the PDP-8, the first commercial minicomputer. This small machine was a breakthrough in low-cost design, allowing DEC to offer a computer for under $20,000. Minicomputers were the forerunners of microprocessors, with Intel inventing the first microprocessor, the Intel 4004, in 1971.

In 1963 came the announcement of the first supercomputer. This announcement came neither from the large companies nor even from the high-tech centers. Seymour Cray led the design of the Control Data Corporation CDC 6600 in Minnesota. This machine included many ideas that are beginning to be found in the latest microprocessors. Cray later left CDC to form Cray Research, Inc., in Wisconsin. In 1976, he announced the Cray-1 (Figure e1.13.4). This machine was simultaneously the fastest in the world, the most expensive, and the computer with the best cost/performance for scientific programs. The Cray-1 would be on any list of the greatest computers of all time, although its total lifetime sales were a little over 80 supercomputers.

FIGURE e1.13.4 Cray-1, the first commercial vector supercomputer, announced in 1976. This machine had the unusual distinction of being both the fastest computer for scientific applications and the computer with the best price/performance for those applications. Viewed from the top, the computer looks like the letter C. Seymour Cray passed away in 1996 because of injuries sustained in an automobile accident. At the time of his death, this 70-year-old computer pioneer was working on his vision of the next generation of supercomputers. (See www.cray.com for more details.)

While Seymour Cray was creating the world's most expensive computer, other designers around the world were looking at using the microprocessor to create a computer so cheap that you could have it at home. There is no single fountainhead for the personal computer, but in 1977, the Apple IIe (Figure e1.13.5) from Steve Jobs and Steve Wozniak set standards for low cost, high volume, and high reliability that defined the personal computer industry. Between 1977 and 1993, Apple produced about six million of these computers.

FIGURE e1.13.5 The Apple IIc Plus. Designed by Steve Wozniak, the Apple IIc set standards of cost and reliability for the industry.

However, even with a 4-year head start, Apple's personal computers finished second in popularity. The IBM Personal Computer, announced in 1981, became the best-selling computer of any kind; its success gave Intel the most popular microprocessor and Microsoft the most popular operating system. In the next decade, the most popular CD was the Microsoft operating system, even though it cost many times more than a music CD!
Of course, over the more than 30 years that the IBM-compatible personal computer has existed, it has evolved greatly. In fact, the first personal computers had 16-bit processors and 64 kilobytes of memory, and a low-density, slow floppy disk was the only nonvolatile storage! Floppy disks were originally developed by IBM for loading diagnostic programs in mainframes, but they were a major I/O device in personal computers for almost 20 years before the advent of CDs and networking made them obsolete as a method for exchanging data. Naturally, Intel microprocessors have also evolved since the first PC, which used a 16-bit processor with an 8-bit external interface! In Chapter 2, we write about the evolution of the Intel architecture.

The first personal computers were quite simple, with little or no graphics capability, no pointing devices, and primitive operating systems compared to those of today. The computer that inspired many of the architectural and software concepts that characterize the modern desktop machines was the Xerox Alto, shown in Figure e1.13.6. The Alto was created as an experimental prototype of a future computer; there were several hundred Altos built, including a significant number that were donated to universities.

FIGURE e1.13.6 The Xerox Alto was the primary inspiration for the modern desktop computer. It included a mouse, a bit-mapped scheme, a windows-based user interface, and a local network connection.

Among the technologies incorporated in the Alto were:

- a bit-mapped graphics display integrated with a computer (earlier graphics displays acted as terminals, usually connected to larger computers)
- a mouse, which was invented earlier, but included on every Alto and used extensively in the user interface
- a local area network (LAN), which became the precursor to the Ethernet
- a user interface based on windows and featuring a WYSIWYG (what you see is what you get) editor and interactive drawing programs

In addition, both file servers and print servers were developed and interfaced via the local area network, and connections between the local area network and the wide area ARPAnet produced the first versions of Internet-style networking. The Xerox Alto was incredibly influential and clearly affected the design of a wide variety of computers and software systems, including the Apple Macintosh, the IBM-compatible PC, MacOS and Windows, and Sun and other early workstations. Like the Cray-1, the Alto would be on any list of the greatest computers of all time, despite its total production being only about 2000. Chuck Thacker won the 2009 ACM A. M. Turing Award primarily for creating the Alto.

Measuring Performance

From the earliest days of computing, designers have specified performance goals: ENIAC was to be 1000 times faster than the Harvard Mark-I, and the IBM Stretch (7030) was to be 100 times faster than the fastest computer then in existence. What wasn't clear, though, was how this performance was to be measured.

The original measure of performance was the time required to perform an individual operation, such as addition. Since most instructions took the same execution time, the timing of one was the same as the others. As the execution times of instructions in a computer became more diverse, however, the time required for one operation was no longer useful for comparisons.

To take these differences into account, an instruction mix was calculated by measuring the relative frequency of instructions in a computer across many programs. Multiplying the time for each instruction by its weight in the mix gave the user the average instruction execution time. (If measured in clock cycles, average instruction execution time is the same as average CPI.) Since instruction sets were similar, this was a more precise comparison than add times. From average instruction execution time, it was only a small step to MIPS. MIPS had the virtue of being easy to understand; hence, it grew in popularity. A sketch of this calculation appears below.
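To make the instruction-mix arithmetic concrete, here is a minimal sketch in Python; the mix frequencies, cycle counts, and clock rate are invented for illustration and are not measurements of any real machine:

```python
# Hypothetical instruction mix and per-class cycle counts (illustrative only,
# not measurements of any real machine).
mix = {"load/store": 0.30, "ALU": 0.45, "branch": 0.15, "other": 0.10}
cpi = {"load/store": 2.0,  "ALU": 1.0,  "branch": 1.5,  "other": 2.5}

# Average CPI is the frequency-weighted sum of the per-class cycle counts.
avg_cpi = sum(mix[c] * cpi[c] for c in mix)

# Given a clock rate, average CPI converts directly to native MIPS:
# MIPS = clock rate / (average CPI * 10^6).
clock_hz = 100e6  # assume a 100 MHz clock for illustration
mips = clock_hz / (avg_cpi * 1e6)

print(f"average CPI: {avg_cpi:.3f}")  # 0.30*2.0 + 0.45*1.0 + 0.15*1.5 + 0.10*2.5 = 1.525
print(f"native MIPS: {mips:.1f}")     # 100 MHz / 1.525 CPI ≈ 65.6 MIPS
```

Note that native MIPS depends on the instruction set as well as the clock rate, which is one reason it eventually proved misleading for comparisons across architectures.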
The Quest for an Average Program

As processors were becoming more sophisticated and relied on memory hierarchies (the topic of Chapter 5) and pipelining (the topic of Chapter 4), a single execution time for each instruction no longer existed; neither execution time nor MIPS, therefore, could be calculated from the instruction mix and the manual.

Although it might seem obvious today that the right thing to do would have been to develop a set of real applications that could be used as standard benchmarks, this was a difficult task until the late 1980s. Variations in operating systems and language standards made it hard to create large programs that could be moved from computer to computer simply by recompiling.

Instead, the next step was benchmarking using synthetic programs. The Whetstone synthetic program was created by measuring scientific programs written in Algol-60 (see Curnow and Wichmann's description). This program was converted to Fortran and was widely used to characterize scientific program performance. Whetstone performance is typically quoted in Whetstones per second: the number of executions of a single iteration of the Whetstone benchmark! Dhrystone is another synthetic benchmark that is still used in some embedded computing circles (see Weicker's description and methodology).

About the same time Whetstone was developed, the concept of kernel benchmarks gained popularity. Kernels are small, time-intensive pieces from real programs that are extracted and then used as benchmarks. This approach was developed primarily for benchmarking high-end computers, especially supercomputers. Livermore Loops and Linpack are the best-known examples. The Livermore Loops consist of a series of 21 small loop fragments. Linpack consists of a portion of a linear algebra subroutine package. Kernels are best used to isolate the performance of individual features of a computer and to explain the reasons for differences in the performance of real programs. Because scientific applications often use small pieces of code that execute for a long time, characterizing performance with kernels is most popular in this application class. Although kernels help illuminate performance, they frequently overstate the performance on real applications. A sketch of the timing mechanics common to these benchmarks follows.
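To illustrate the mechanics shared by synthetic and kernel benchmarks, here is a minimal sketch with an invented loop standing in for the kernel; it is not the actual Whetstone, Dhrystone, or Linpack code. The benchmark repeats a small, time-intensive piece of work and reports a rate, much as Whetstone results are quoted in Whetstones per second:

```python
import time

def kernel(n: int) -> float:
    """An invented compute loop standing in for a benchmark kernel."""
    s = 0.0
    for i in range(1, n + 1):
        s += 1.0 / (i * i)  # arbitrary floating-point work
    return s

ITERATIONS = 1_000
start = time.perf_counter()
for _ in range(ITERATIONS):
    kernel(10_000)
elapsed = time.perf_counter() - start

# Synthetic benchmarks traditionally report a rate rather than a time.
print(f"{ITERATIONS / elapsed:.1f} kernel iterations per second")
```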
SPECulating about Performance

An important advance in performance evaluation was the formation of the System Performance Evaluation Cooperative (SPEC) group in 1988. SPEC comprises representatives of many computer companies (the founders being Apollo/Hewlett-Packard, DEC, MIPS, and Sun) who have agreed on a set of real programs and inputs that all will run. It is worth noting that SPEC couldn't have come into being before portable operating systems and the popularity of high-level languages. Now compilers, too, are accepted as a proper part of the performance of computer systems and must be measured in any evaluation.

History teaches us that while the SPEC effort may be useful with current computers, it will not meet the needs of the next generation without changing. In 1991, a throughput measure was added, based on running multiple versions of the benchmark. It is most useful for evaluating timeshared usage of a uniprocessor or a multiprocessor. Other system benchmarks that include OS-intensive and I/O-intensive activities have also been added. Another change was the decision to drop some benchmarks and add others.

One result of the difficulty in finding benchmarks was that the initial version of the SPEC benchmarks (called SPEC89) contained six floating-point benchmarks but only four integer benchmarks. Calculating a single summary measurement using the geometric mean of execution times normalized to a VAX-11/780 meant that this measure favored computers with strong floating-point performance. (A sketch of this geometric-mean calculation appears below.)

In 1992, a new benchmark set (called SPEC92) was introduced. It incorporated additional benchmarks, dropped matrix300, and provided separate means (SPECINT and SPECFP) for integer and floating-point programs. In addition, the SPECbase measure, which disallows program-specific optimization flags, was added to provide users with a performance measurement that would more closely match what they might experience on their own programs. The SPECFP numbers show the largest increase versus the base SPECFP measurement, typically ranging from 15% to 30% higher.

In 1995, the benchmark set was once again updated, adding some new integer and floating-point benchmarks, as well as removing some benchmarks that suffered from flaws or had running times that had become too small given the factor of 20 or more performance improvement since the first SPEC release. SPEC95 also changed the base computer for normalization to a Sun SPARCstation 10/40, since operating versions of the original base computer were becoming difficult to find!

The most recent version of SPEC is SPEC2017. SPEC CPU needed 82 programs over its six generations, with 47 (57%) used in just one generation and only 6 (7%) lasting three or more. The sole survivor from SPEC89 is the gcc compiler.

SPEC has also added benchmark suites beyond the original suites targeted at CPU performance. In 2008, SPEC provided benchmark sets for graphics, high-performance scientific computing, object-oriented computing, file systems, Web servers and clients, Java, engineering CAD applications, and power.
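The geometric-mean summary works as sketched below; the benchmark names and times are invented, not SPEC data. Each benchmark's execution time on the machine under test is normalized to the reference machine (as SPEC89 did with the VAX-11/780), and the geometric mean of those ratios yields the single summary number:

```python
import math

# Invented execution times in seconds (illustrative, not SPEC results).
ref_times  = {"bench_a": 100.0, "bench_b": 250.0, "bench_c": 40.0}  # reference machine
test_times = {"bench_a": 20.0,  "bench_b": 100.0, "bench_c": 5.0}   # machine under test

# Normalize each benchmark's time to the reference machine, then take
# the geometric mean of the resulting ratios.
ratios = [ref_times[b] / test_times[b] for b in ref_times]
summary = math.prod(ratios) ** (1 / len(ratios))

print("per-benchmark speedups:", [round(r, 2) for r in ratios])  # [5.0, 2.5, 8.0]
print(f"summary (geometric mean): {summary:.2f}")  # (5.0 * 2.5 * 8.0) ** (1/3) ≈ 4.64
```

A useful property of the geometric mean is that the ratio of two machines' summary numbers does not depend on which machine is chosen as the reference.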
The Growth of Embedded Computing

Embedded processors have been around for a very long time; in fact, the first minicomputers and the first microprocessors were originally developed for controlling functions in a laboratory or industrial application. For many years, the dominant use of embedded processors was for industrial control applications, and although this use continued to grow, the processors tended to be very cheap and the performance relatively low. For example, the best-selling processor in the world remains an 8-bit microcontroller used in cars, some home appliances, and other simple applications. The late 1980s and early 1990s saw the emergence of new opportunities for embedded processors, ranging from more advanced video games and set-top boxes to cell phones and personal digital assistants. The rapidly increasing number of information appliances and the growth of networking have driven dramatic surges in the number of embedded processors, as well as in their performance requirements.

To evaluate performance, the embedded community was inspired by SPEC to create the Embedded Microprocessor Benchmark Consortium (EEMBC). Started in 1997, it consists of a collection of kernels organized into suites that address different portions of the embedded industry. The consortium announced the second generation of these benchmarks in 2007. In 2019, a consortium of academics and practitioners developed a suite of free programs called Embench with the explicit goal of replacing widespread use of synthetic programs like Dhrystone and CoreMark as benchmarks for embedded computing.

A Half-Century of Progress

Since 1951, there have been thousands of new computers using a wide range of technologies and having widely varying capabilities. Figure e1.13.7 summarizes the key characteristics of some machines mentioned in this section and shows the dramatic changes that have occurred in just over 50 years. After adjusting for inflation, price/performance has improved by a factor of almost 100 billion in 55 years, or about 58% per year. Another way to say it is that we have seen a factor of 10,000 improvement in cost and a factor of 10,000,000 improvement in performance. (A quick arithmetic check of this rate appears at the end of this section.)

Year | Name | Size (cu. ft.) | Power (watts) | Performance (adds/sec) | Memory (KB) | Price | Price/perf. vs. UNIVAC | Adjusted price (2007 $) | Adjusted price/perf. vs. UNIVAC
1951 | UNIVAC I | 1000 | 125,000 | 2000 | 48 | $1,000,000 | 1 | $7,670,724 | 1
1964 | IBM S/360 model 50 | 60 | 10,000 | 500,000 | 64 | $1,000,000 | 263 | $6,018,798 | 319
1965 | PDP-8 | 8 | 500 | 330,000 | 4 | $16,000 | 10,855 | $94,685 | 13,367
1976 | Cray-1 | 58 | 60,000 | 166,000,000 | 32,000 | $4,000,000 | 21,842 | $13,509,798 | 47,127
1981 | IBM PC | 1 | 150 | 240,000 | 256 | $3000 | 42,105 | $6859 | 134,208
1991 | HP 9000/model 750 | 2 | 500 | 50,000,000 | 16,384 | $7400 | 3,556,188 | $11,807 | 16,241,889
1996 | Intel PPro PC (200 MHz) | 2 | 500 | 400,000,000 | 16,384 | $4400 | 47,846,890 | $6211 | 247,021,234
2003 | Intel Pentium 4 PC (3.0 GHz) | 2 | 500 | 6,000,000,000 | 262,144 | $1600 | 1,875,000,000 | $2009 | 11,451,750,000
2007 | AMD Barcelona PC (2.5 GHz) | 2 | 250 | 20,000,000,000 | 2,097,152 | $800 | 12,500,000,000 | $800 | 95,884,051,042

FIGURE e1.13.7 Characteristics of key commercial computers since 1950, in actual dollars and in 2007 dollars adjusted for inflation. The last row assumes we can fully utilize the potential performance of the four cores in Barcelona. In contrast to Figure e1.13.3, here the price of the IBM S/360 model 50 includes I/O devices. (Source: The Computer History Museum and Producer Price Index for Industrial Commodities.)

Readers interested in computer history should consult Annals of the History of Computing, a journal devoted to the history of computing. Several books describing the early days of computing have also appeared, many written by the pioneers, including Goldstine, Metropolis et al., and Wilkes.
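Finally, as a numeric footnote to Figure e1.13.7, the growth-rate claim above can be verified in a couple of lines; the numbers come straight from the text:

```python
# A factor-of-100-billion improvement over 55 years implies an annual growth
# factor equal to the 55th root of 1e11.
annual_factor = 100e9 ** (1 / 55)
print(f"annual improvement factor: {annual_factor:.3f}")  # ≈ 1.585, i.e., about 58% per year

# Consistency with the separate cost and performance factors quoted above:
print(f"{10_000 * 10_000_000:,}")  # 100,000,000,000 = 100 billion
```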