Computer Organization and Design RISC-V Edition PDF

Summary

This excerpt from Computer Organization and Design RISC-V Edition, by David A. Patterson and John L. Hennessy, covers the end of the discussion of high-level languages, compilers, and assemblers, and then Section 1.4, Under the Covers: the five classic components of a computer, I/O devices such as LCDs and touchscreens, the hardware inside the Apple iPhone XS Max, main and secondary memory, and computer networks.

Full Transcript


A compiler enables a programmer to write this high-level language expression:

A + B

The compiler would compile it into this assembly language statement:

add A, B

As shown above, the assembler would translate this statement into the binary instructions that tell the computer to add the two numbers A and B.

High-level programming languages offer several important benefits. First, they allow the programmer to think in a more natural language, using English words and algebraic notation, resulting in programs that look much more like text than like tables of cryptic symbols (see Figure 1.4). Moreover, they allow languages to be designed according to their intended use. Hence, Fortran was designed for scientific computation, Cobol for business data processing, Lisp for symbol manipulation, and so on. There are also domain-specific languages for even narrower groups of users, such as those interested in machine learning, for example.

The second advantage of programming languages is improved programmer productivity. One of the few areas of widespread agreement in software development is that it takes less time to develop programs when they are written in languages that require fewer lines to express an idea. Conciseness is a clear advantage of high-level languages over assembly language.

The final advantage is that programming languages allow programs to be independent of the computer on which they were developed, since compilers and assemblers can translate high-level language programs to the binary instructions of any computer. These three advantages are so strong that today little programming is done in assembly language.
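To make the translation chain concrete, here is a minimal sketch in C. The variable names and the RISC-V-flavored assembly in the comments are illustrative assumptions rather than the output of any particular compiler; a real compiler chooses its own registers and instructions.

#include <stdio.h>

int main(void) {
    int A = 3, B = 4;

    /* High-level language: the programmer writes an algebraic expression. */
    int sum = A + B;

    /* A compiler might translate the expression into assembly language
       statements roughly like these (illustrative RISC-V flavor only):
           lw   x5, A        # load A from memory into a register
           lw   x6, B        # load B from memory into a register
           add  x7, x5, x6   # x7 = x5 + x6
           sw   x7, sum      # store the result back to memory
       An assembler then encodes each statement as a binary machine
       instruction that the hardware executes directly. */

    printf("A + B = %d\n", sum);
    return 0;
}

The point of the sketch is only the chain of translation: a high-level expression becomes assembly language statements, which become binary machine instructions.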
1.4 Under the Covers

Now that we have looked below your program to uncover the underlying software, let's open the covers of your computer to learn about the underlying hardware. The underlying hardware in any computer performs the same basic functions: inputting data, outputting data, processing data, and storing data. How these functions are performed is the primary topic of this book, and subsequent chapters deal with different parts of these four tasks.

input device: A mechanism through which the computer is fed information, such as a keyboard.

output device: A mechanism that conveys the result of a computation to a user, such as a display, or to another computer.

When we come to an important point in this book, a point so significant that we hope you will remember it forever, we emphasize it by identifying it as a Big Picture item. We have about a dozen Big Pictures in this book, the first being the five components of a computer that perform the tasks of inputting, outputting, processing, and storing data.

Two key components of computers are input devices, such as the microphone, and output devices, such as the speaker. As the names suggest, input feeds the computer, and output is the result of computation sent to the user. Some devices, such as wireless networks, provide both input and output to the computer. Chapters 5 and 6 describe input/output (I/O) devices in more detail, but let's take an introductory tour through the computer hardware, starting with the external I/O devices.

The BIG Picture: The five classic components of a computer are input, output, memory, datapath, and control, with the last two sometimes combined and called the processor. Figure 1.5 shows the standard organization of a computer. This organization is independent of hardware technology: you can place every piece of every computer, past and present, into one of these five categories. To help you keep all this in perspective, the five components of a computer are shown on the front page of each of the following chapters, with the portion of interest to that chapter highlighted.

FIGURE 1.5 The organization of a computer, showing the five classic components. The processor gets instructions and data from memory. Input writes data to memory, and output reads data from memory. Control sends the signals that determine the operations of the datapath, memory, input, and output.

Through the Looking Glass

The most fascinating I/O device is probably the graphics display. Most personal mobile devices use liquid crystal displays (LCDs) to get a thin, low-power display. The LCD is not the source of light; instead, it controls the transmission of light. A typical LCD includes rod-shaped molecules in a liquid that form a twisting helix that bends light entering the display, from either a light source behind the display or, less often, from reflected light. The rods straighten out when a current is applied and no longer bend the light. Since the liquid crystal material is between two screens polarized at 90 degrees, the light cannot pass through unless it is bent.

liquid crystal display (LCD): A display technology using a thin layer of liquid polymers that can be used to transmit or block light according to whether a charge is applied.

Today, most LCDs use an active matrix that has a tiny transistor switch at each pixel to control current precisely and make sharper images. A red-green-blue mask associated with each dot on the display determines the intensity of the three color components in the final image; in a color active matrix LCD, there are three transistor switches at each point.

active matrix display: A liquid crystal display using a transistor to control the transmission of light at each individual pixel.

The image is composed of a matrix of picture elements, or pixels, which can be represented as a matrix of bits, called a bit map. Depending on the size of the screen and the resolution, the display matrix in a typical tablet ranges in size from 1024 × 768 to 2048 × 1536. A color display might use 8 bits for each of the three colors (red, blue, and green), for 24 bits per pixel, permitting millions of different colors to be displayed.

pixel: The smallest individual picture element. Screens are composed of hundreds of thousands to millions of pixels, organized in a matrix.

The computer hardware support for graphics consists mainly of a raster refresh buffer, or frame buffer, to store the bit map. The image to be represented onscreen is stored in the frame buffer, and the bit pattern per pixel is read out to the graphics display at the refresh rate. Figure 1.6 shows a frame buffer with a simplified design of just 4 bits per pixel. The goal of the bit map is to represent faithfully what is on the screen. The challenges in graphics systems arise because the human eye is very good at detecting even subtle changes on the screen.

FIGURE 1.6 Each coordinate in the frame buffer on the left determines the shade of the corresponding coordinate for the raster scan CRT display on the right. Pixel (X0, Y0) contains the bit pattern 0011, which is a lighter shade on the screen than the bit pattern 1101 in pixel (X1, Y1).

"Through computer graphics displays I have landed an airplane on the deck of a moving carrier, observed a nuclear particle hit a potential well, flown in a rocket at nearly the speed of light and watched a computer reveal its innermost workings." (Ivan Sutherland, the "father" of computer graphics, Scientific American, 1984)
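The frame buffer idea translates directly into code. The following is a minimal sketch in C assuming a 24-bit-per-pixel bit map as described above, with one byte each for red, green, and blue; the resolution constants and the set_pixel helper are hypothetical simplifications, and in real hardware a display controller, not software, reads the buffer out at the refresh rate.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WIDTH  1024          /* columns of pixels (one of the tablet sizes cited) */
#define HEIGHT 768           /* rows of pixels */

/* One pixel: 8 bits for each of red, green, and blue, i.e., 24 bits per pixel. */
typedef struct { uint8_t r, g, b; } pixel;

/* The frame buffer is simply a matrix of pixels stored in memory. */
static pixel framebuffer[HEIGHT][WIDTH];

static void set_pixel(int x, int y, uint8_t r, uint8_t g, uint8_t b) {
    framebuffer[y][x] = (pixel){ r, g, b };
}

int main(void) {
    memset(framebuffer, 0, sizeof framebuffer);   /* clear the screen to black */
    set_pixel(10, 20, 255, 255, 255);             /* write one white pixel */

    /* A real display controller would read every pixel out of the buffer
       at the refresh rate; here we just inspect the one we wrote. */
    pixel p = framebuffer[20][10];
    printf("pixel (10,20) = (%d, %d, %d)\n", p.r, p.g, p.b);
    return 0;
}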
Touchscreen

While PCs also use LCDs, the tablets and smartphones of the post-PC era have replaced the keyboard and mouse with touch-sensitive displays, which has the wonderful user interface advantage of users pointing directly at what they are interested in rather than indirectly with a mouse. While there are a variety of ways to implement a touch screen, many tablets today use capacitive sensing. Since people are electrical conductors, if an insulator like glass is covered with a transparent conductor, touching distorts the electrostatic field of the screen, which results in a change in capacitance. This technology can allow multiple touches simultaneously, which recognizes gestures that can lead to attractive user interfaces.

Opening the Box

Figure 1.7 shows the contents of the Apple iPhone XS Max smart phone. Unsurprisingly, of the five classic components of the computer, I/O dominates this device. The list of I/O devices includes a capacitive multitouch LCD display, front-facing camera, rear-facing camera, microphone, headphone jack, speakers, accelerometer, gyroscope, Wi-Fi network, and Bluetooth network. The datapath, control, and memory are a tiny portion of the components.

integrated circuit: Also called a chip. A device combining dozens to millions of transistors.

FIGURE 1.7 Components of the Apple iPhone XS Max cell phone. At the left is the capacitive multitouch screen and LCD display. Next to it is the battery. To the far right is the metal frame that attaches the LCD to the back of the iPhone. The small components in the center are what we think of as the computer; they are not simple rectangles, so they fit compactly inside the case next to the battery. Figure 1.8 shows a close-up of the board to the left of the metal case, which is the logic printed circuit board that contains the processor and memory. (Courtesy TechInsights, www.techinsights.com)

The small rectangles in Figure 1.8 contain the devices that drive our advancing technology, called integrated circuits and nicknamed chips. The A12 package seen in the middle of Figure 1.8 contains two large and four little ARM processors that operate with a clock rate of 2.5 GHz. The processor is the active part of the computer, following the instructions of a program to the letter. It adds numbers, tests numbers, signals I/O devices to activate, and so on. Occasionally, people call the processor the CPU, for the more bureaucratic-sounding central processor unit.

central processor unit (CPU): Also called processor. The active part of the computer, which contains the datapath and control and which adds numbers, tests numbers, signals I/O devices to activate, and so on.

[Figure 1.8 chip labels: Apple 338S00456 PMIC; Apple A12 APL1W81 package with Micron MT53D512M64D4SB-046 4 GB Mobile LPDDR4x SDRAM; STMicroelectronics STB601A0 PMIC; Apple 338S00375 PMIC; Texas Instruments SN2600B1 battery charger; Apple 338S00411 audio amplifier.]

FIGURE 1.8 The logic board of the Apple iPhone XS Max in Figure 1.7. The large integrated circuit in the middle is the Apple A12 chip, which contains two large and four small ARM processor cores that run at 2.5 GHz, as well as 4 GiB of main memory inside the package. Figure 1.9 shows a photograph of the processor chip inside the A12 package. A similar-sized chip on a symmetric board that attaches to the back is a 64 GiB flash memory chip for nonvolatile storage. The other chips on the board include the power management integrated controller and audio amplifier chips. (Courtesy TechInsights, www.techinsights.com)

Descending even lower into the hardware, Figure 1.9 reveals details of a microprocessor. The processor logically comprises two main components: datapath and control, the respective brawn and brain of the processor. The datapath performs the arithmetic operations, and control tells the datapath, memory, and I/O devices what to do according to the wishes of the instructions of the program. Chapter 4 explains the datapath and control for a higher-performance design.

datapath: The component of the processor that performs arithmetic operations.

control: The component of the processor that commands the datapath, memory, and I/O devices according to the instructions of the program.
The iPhone XS Max package in Figure 1.8 also includes a memory chip with 32 gibibits, or 4 GiB, of capacity. The memory is where the programs are kept when they are running; it also contains the data needed by running programs. The memory is a DRAM chip. DRAM stands for dynamic random-access memory. DRAMs are used together to contain the instructions and data of a program. In contrast to sequential-access memory, such as magnetic tapes, the RAM portion of the term DRAM means that memory accesses take basically the same amount of time no matter what portion of memory is read.

memory: The storage area in which programs are kept when they are running and that contains the data needed by the running programs.

dynamic random access memory (DRAM): Memory built as an integrated circuit; it provides random access to any location. Access times are 50 nanoseconds and cost per gigabyte in 2012 was $5 to $10.

Descending into the depths of any component of the hardware reveals insights into the computer. Inside the processor is another type of memory—cache memory. Cache memory consists of a small, fast memory that acts as a buffer for the DRAM memory. (The nontechnical definition of cache is a safe place for hiding things.) Cache is built using a different memory technology, static random-access memory (SRAM). SRAM is faster but less dense, and hence more expensive, than DRAM (see Chapter 5). SRAM and DRAM are two layers of the memory hierarchy.

cache memory: A small, fast memory that acts as a buffer for a slower, larger memory.

static random access memory (SRAM): Memory also built as an integrated circuit, but faster and less dense than DRAM.
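A rough calculation shows why a small SRAM cache in front of a large DRAM helps. In the sketch below, only the roughly 50-nanosecond DRAM figure comes from the margin note above; the SRAM latency and the hit rate are illustrative assumptions, and Chapter 5 develops the real models.

#include <stdio.h>

int main(void) {
    /* Assumed, illustrative latencies: a fast on-chip SRAM cache and a
       slower DRAM main memory (~50 ns per the margin note above). */
    double sram_ns  = 1.0;    /* hypothetical cache (SRAM) access time */
    double dram_ns  = 50.0;   /* DRAM access time */
    double hit_rate = 0.95;   /* hypothetical fraction of accesses found in the cache */

    /* Average time per access: every access checks the cache; only the
       misses pay the additional cost of going to DRAM. */
    double average_ns = sram_ns + (1.0 - hit_rate) * dram_ns;

    printf("average access time = %.2f ns (vs. %.1f ns with no cache)\n",
           average_ns, dram_ns);
    return 0;
}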
As mentioned above, one of the great ideas to improve design is abstraction. One of the most important abstractions is the interface between the hardware and the lowest-level software. Software communicates to hardware via a vocabulary. The words of the vocabulary are called instructions, and the vocabulary itself is called the instruction set architecture, or simply architecture, of a computer. The instruction set architecture includes anything programmers need to know to make a binary machine language program work correctly, including instructions, I/O devices, and so on. Typically, the operating system will encapsulate the details of doing I/O, allocating memory, and other low-level system functions so that application programmers do not need to worry about such details. The combination of the basic instruction set and the operating system interface provided for application programmers is called the application binary interface (ABI).

instruction set architecture: Also called architecture. An abstract interface between the hardware and the lowest-level software that encompasses all the information necessary to write a machine language program that will run correctly, including instructions, registers, memory access, I/O, and so on.

application binary interface (ABI): The user portion of the instruction set plus the operating system interfaces used by application programmers. It defines a standard for binary portability across computers.

An instruction set architecture allows computer designers to talk about functions independently from the hardware that performs them. For example, we can talk about the functions of a digital clock (keeping time, displaying the time, setting the alarm) separately from the clock hardware (quartz crystal, LED displays, plastic buttons). Computer designers distinguish architecture from an implementation of an architecture along the same lines: an implementation is hardware that obeys the architecture abstraction. These ideas bring us to another Big Picture.

implementation: Hardware that obeys the architecture abstraction.

FIGURE 1.9 The processor integrated circuit inside the A12 package. The size of the chip is 8.4 by 9.91 mm, and it was manufactured originally in a 7-nm process (see Section 1.5). It has two identical ARM processors or cores in the lower middle of the chip, four small cores on the lower right of the chip, a graphics processing unit (GPU) on the far right (see Section 6.6), and a domain-specific accelerator for neural networks (see Section 6.7) called the NPU on the far left. In the middle are second-level cache memory (L2) banks for the big and small cores (see Chapter 5). At the top and bottom of the chip are interfaces to the main memory (DDR DRAM). (Courtesy TechInsights, www.techinsights.com)

The BIG Picture: Both hardware and software consist of hierarchical layers using abstraction, with each lower layer hiding details from the level above. One key interface between the levels of abstraction is the instruction set architecture—the interface between the hardware and low-level software. This abstract interface enables many implementations of varying cost and performance to run identical software.
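The architecture/implementation split can be imitated in software. In this hypothetical C sketch, a struct of function pointers plays the role of the architecture (the agreed-upon interface), and two implementations obey it; a program written against the interface runs unchanged on either one, just as identical software runs on different hardware implementations of one instruction set architecture.

#include <stdio.h>

/* The "architecture": an abstract interface that callers program against. */
typedef struct {
    const char *name;
    int (*add)(int a, int b);   /* every implementation must provide add */
} machine;

/* Two "implementations" that obey the same abstraction.  The second one
   deliberately computes the sum a slower way; the observable behavior,
   which is all the interface promises, is identical. */
static int add_fast(int a, int b) { return a + b; }

static int add_slow(int a, int b) {
    while (b > 0) { a++; b--; }
    while (b < 0) { a--; b++; }
    return a;
}

static const machine impl_a = { "implementation A", add_fast };
static const machine impl_b = { "implementation B", add_slow };

/* "Software" written only against the architecture, not an implementation. */
static void run_program(const machine *m) {
    printf("%s: 2 + 3 = %d\n", m->name, m->add(2, 3));
}

int main(void) {
    run_program(&impl_a);   /* the same program ... */
    run_program(&impl_b);   /* ... runs on either implementation */
    return 0;
}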
A Safe Place for Data

Thus far, we have seen how to input data, compute using the data, and display data. If we were to lose power to the computer, however, everything would be lost because the memory inside the computer is volatile—that is, when it loses power, it forgets. In contrast, a DVD disk doesn't forget the movie when you turn off the power to the DVD player, and is therefore a nonvolatile memory technology.

volatile memory: Storage, such as DRAM, that retains data only if it is receiving power.

nonvolatile memory: A form of memory that retains data even in the absence of a power source and that is used to store programs between runs. A DVD disk is nonvolatile.

To distinguish between the volatile memory used to hold data and programs while they are running and this nonvolatile memory used to store data and programs between runs, the term main memory or primary memory is used for the former, and secondary memory for the latter. Secondary memory forms the next lower layer of the memory hierarchy. DRAMs have dominated main memory since 1975, but magnetic disks dominated secondary memory starting even earlier. Because of their size and form factor, personal mobile devices use flash memory, a nonvolatile semiconductor memory, instead of disks. Figure 1.8 shows the chip containing the 64 GiB flash memory of the iPhone XS. While slower than DRAM, it is much cheaper than DRAM in addition to being nonvolatile. Although costing more per bit than disks, it is smaller, it comes in much smaller capacities, it is more rugged, and it is more power efficient than disks. Hence, flash memory is the standard secondary memory for PMDs. Alas, unlike disks and DRAM, flash memory bits wear out after 100,000 to 1,000,000 writes. Thus, file systems must keep track of the number of writes and have a strategy to avoid wearing out storage, such as by moving popular data. Chapter 5 describes disks and flash memory in more detail.

main memory: Also called primary memory. Memory used to hold programs while they are running; typically consists of DRAM in today's computers.

secondary memory: Nonvolatile memory used to store programs and data between runs; typically consists of flash memory in PMDs and magnetic disks in servers.

magnetic disk: Also called hard disk. A form of nonvolatile secondary memory composed of rotating platters coated with a magnetic recording material. Because they are rotating mechanical devices, access times are about 5 to 20 milliseconds and cost per gigabyte in 2020 was $0.01 to $0.02.

flash memory: A nonvolatile semiconductor memory. It is cheaper and slower than DRAM but more expensive per bit and faster than magnetic disks. Access times are about 5 to 50 microseconds and cost per gigabyte in 2020 was $0.06 to $0.12.
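Because flash bits wear out after roughly 100,000 to 1,000,000 writes, file systems track write counts and spread writes around. The following is a minimal, hypothetical wear-leveling sketch in C, not how any particular flash translation layer actually works: each write is steered to the physical block that has absorbed the fewest writes so far.

#include <stdio.h>

#define NUM_BLOCKS 8
#define WEAR_LIMIT 100000        /* lower end of the cited endurance range */

/* Writes already absorbed by each physical flash block. */
static long write_count[NUM_BLOCKS];

/* Pick the least-worn block for the next write (a crude wear-leveling
   policy; real flash translation layers are far more sophisticated). */
static int pick_block(void) {
    int best = 0;
    for (int i = 1; i < NUM_BLOCKS; i++)
        if (write_count[i] < write_count[best])
            best = i;
    return best;
}

int main(void) {
    for (long w = 0; w < 400000; w++) {
        int b = pick_block();
        write_count[b]++;                 /* perform the (simulated) write */
    }

    for (int i = 0; i < NUM_BLOCKS; i++)
        printf("block %d: %ld writes (%.1f%% of wear limit)\n",
               i, write_count[i], 100.0 * write_count[i] / WEAR_LIMIT);
    return 0;
}

With this policy the 400,000 simulated writes spread evenly, about 50,000 per block, instead of exhausting a single block.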
Communicating with Other Computers

We've explained how we can input, compute, display, and save data, but there is still one missing item found in today's computers: computer networks. Just as the processor shown in Figure 1.5 is connected to memory and I/O devices, networks interconnect whole computers, allowing computer users to extend the power of computing by including communication. Networks have become so popular that they are the backbone of current computer systems; a new personal mobile device or server without a network interface would be ridiculed. Networked computers have several major advantages:

Communication: Information is exchanged between computers at high speeds.

Resource sharing: Rather than each computer having its own I/O devices, computers on the network can share I/O devices.

Nonlocal access: By connecting computers over long distances, users need not be near the computer they are using.

Networks vary in length and performance, with the cost of communication increasing according to both the speed of communication and the distance that information travels. Perhaps the most popular type of network is Ethernet. It can be up to a kilometer long and transfer at up to 100 gigabits per second. Its length and speed make Ethernet useful to connect computers on the same floor of a building; hence, it is an example of what is generically called a local area network. Local area networks are interconnected with switches that can also provide routing services and security. Wide area networks cross continents and are the backbone of the Internet, which supports the web. They are typically based on optical fibers and are leased from telecommunication companies.

local area network (LAN): A network designed to carry data within a geographically confined area, typically within a single building.

wide area network (WAN): A network extended over hundreds of kilometers that can span a continent.

Networks have changed the face of computing in the last 40 years, both by becoming much more ubiquitous and by making dramatic increases in performance. In the 1970s, very few individuals had access to electronic mail, the Internet and web did not exist, and physically mailing magnetic tapes was the primary way to transfer large amounts of data between two locations. Local area networks were almost nonexistent, and the few existing wide area networks had limited capacity and restricted access.

As networking technology improved, it became considerably cheaper and had a significantly higher capacity. For example, the first standardized local area network technology, developed about 40 years ago, was a version of Ethernet that had a maximum capacity (also called bandwidth) of 10 million bits per second, typically shared by tens of, if not a hundred, computers. Today, local area network technology offers a capacity of from 1 to 100 gigabits per second, usually shared by at most a few computers. Optical communications technology has allowed similar growth in the capacity of wide area networks, from hundreds of kilobits to gigabits, and from hundreds of computers connected to a worldwide network to millions of computers connected. This dramatic rise in the deployment of networking, combined with increases in capacity, has made network technology central to the information revolution of the last 30 years.

For the last 15 years, another innovation in networking has been reshaping the way computers communicate. Wireless technology is widespread, which enabled the post-PC era. The ability to make a radio in the same low-cost semiconductor technology (CMOS) used for memory and microprocessors enabled a significant improvement in price, leading to an explosion in deployment. Currently available wireless technologies, called by the IEEE standard name 802.11ac, allow for transmission rates from 1 to 1300 million bits per second. Wireless technology is quite a bit different from wire-based networks, since all users in an immediate area share the airwaves.

Check Yourself: Semiconductor DRAM memory, flash memory, and disk storage differ significantly. For each technology, list its volatility, approximate relative access time, and approximate relative cost compared to DRAM.
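One way to organize an answer to the Check Yourself question is to tabulate the figures quoted in the margin definitions above. The sketch below uses the midpoints of those ranges; note that the DRAM cost figure dates from 2012 while the flash and disk figures are from 2020, so the relative costs are only rough.

#include <stdio.h>

typedef struct {
    const char *name;
    const char *volatility;
    double access_ns;        /* approximate access time in nanoseconds */
    double cost_per_gb;      /* approximate cost per gigabyte in dollars */
} storage;

int main(void) {
    /* Midpoints of the ranges quoted in the margin definitions above. */
    storage techs[] = {
        { "DRAM",  "volatile",           50.0,  7.50  },  /* ~50 ns, $5-$10 per GB (2012) */
        { "flash", "nonvolatile",     27500.0,  0.09  },  /* 5-50 us, $0.06-$0.12 per GB  */
        { "disk",  "nonvolatile", 12500000.0,  0.015  },  /* 5-20 ms, $0.01-$0.02 per GB  */
    };
    storage dram = techs[0];

    for (int i = 0; i < 3; i++)
        printf("%-5s  %-11s  access ~%.0fx DRAM  cost ~%.3fx DRAM\n",
               techs[i].name, techs[i].volatility,
               techs[i].access_ns / dram.access_ns,
               techs[i].cost_per_gb / dram.cost_per_gb);
    return 0;
}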
