(The Morgan Kaufmann Series in Computer Architecture and Design) David A. Patterson, John L. Hennessy - Computer Organization and Design RISC-V Edition_ The Hardware Software Interface-Morgan Kaufmann-102-258-pages-3.pdf

2.3 Operands of the Computer Hardware 73 2.3 Operands of the Computer Hardware Unlike programs in high-level languages, the operands of arithmetic instructions are restricted; they must be from a limited number of special locations built directly in hardware called registers. Registers are primitives used in hardware design that are also visible to the programmer when the computer is completed, so you can think of registers as the bricks of computer construction. The size of a register in the RISC-V architecture is 32 bits; groups of 32 bits occur so frequently that they are given the name word in the RISC-V architecture. (Another popular size is a word A natural unit group of 64 bits, called a doubleword in the RISC-V architecture.) of access in a computer, One major difference between the variables of a programming language and registers usually a group of 32 bits; corresponds to the size of is the limited number of registers, typically 32 on current computers, like RISC-V. (See a register in the RISC-V Section 2.25 for the history of the number of registers.) Thus, continuing in our architecture. top-down, stepwise evolution of the symbolic representation of the RISC-V language, in this section we have added the restriction that the three operands of RISC-V doubleword Another arithmetic instructions must each be chosen from one of the 32-bit registers. natural unit of access in a The reason for the limit of 32 registers may be found in the second of our three computer, usually a group of 64 bits. underlying design principles of hardware technology: Design Principle 2: Smaller is faster. A very large number of registers may increase the clock cycle time simply because it takes electronic signals longer when they must travel farther. Guidelines such as “smaller is faster” are not absolutes; 31 registers may not be faster than 32. Even so, the truth behind such observations causes computer designers to take them seriously. In this case, the designer must balance the craving of programs for more registers with the designer’s desire to keep the clock cycle fast. Another reason for not using more than 32 is the number of bits it would take in the instruction format, as Section 2.5 demonstrates. Chapter 4 shows the central role that registers play in hardware construction; as we shall see in that chapter, effective use of registers is critical to program performance. Although we could simply write instructions using numbers for registers, from 0 to 31, the RISC-V convention is x followed by the number of the register, except for a few register names that we will cover later. Compiling a C Assignment Using Registers EXAMPLE It is the compiler’s job to associate program variables with registers. Take, for instance, the assignment statement from our earlier example: f = (g + h) − (i + j); 74 Chapter 2 Instructions: Language of the Computer The variables f, g, h, i, and j are assigned to the registers x19, x20, x21, x22, and x23, respectively. What is the compiled RISC-V code? The compiled program is very similar to the prior example, except we replace ANSWER the variables with the register names mentioned above plus two temporary registers, x5 and x6, which correspond to the temporary variables above: add x5, x20, x21 // register x5 contains g + h add x6, x22, x23 // register x6 contains i + j sub x19, x5, x6 f gets x5 – x6, which is (g + h) − (i + j) // Memory Operands Programming languages have simple variables that contain single data elements, as in these examples, but they also have more complex data structures—arrays and structures. These composite data structures can contain many more data elements than there are registers in a computer. How can a computer represent and access such large structures? Recall the five components of a computer introduced in Chapter 1 and repeated on page 67. The processor can keep only a small amount of data in registers, but computer memory contains billions of data elements. Hence, data structures data transfer instruction A command (arrays and structures) are kept in memory. that moves data between As explained above, arithmetic operations occur only on registers in RISC-V memory and registers. instructions; thus, RISC-V must include instructions that transfer data between memory and registers. Such instructions are called data transfer instructions. address A value used to To access a word in memory, the instruction must supply the memory address. delineate the location of Memory is just a large, single-dimensional array, with the address acting as the a specific data element index to that array, starting at 0. For example, in Figure 2.2, the address of the third within a memory array. data element is 2, and the value of memory is 10. 3 100 2 10 1 101 0 1 Address Data Processor Memory FIGURE 2.2 Memory addresses and contents of memory at those locations. If these elements were words, these addresses would be incorrect, since RISC-V actually uses byte addressing, with each word representing 4 bytes. Figure 2.3 shows the correct memory addressing for sequential word addresses. 2.3 Operands of the Computer Hardware 75 The data transfer instruction that copies data from memory to a register is traditionally called load. The format of the load instruction is the name of the operation followed by the register to be loaded, then register and a constant used to access memory. The sum of the constant portion of the instruction and the contents of the second register forms the memory address. The real RISC-V name for this instruction is lw, standing for load word. Compiling an Assignment When an Operand Is in Memory Let’s assume that A is an array of 100 words and that the compiler has associated the variables g and h with the registers x20 and x21 as before. Let’s also assume EXAMPLE that the starting address, or base address, of the array is in x22. Compile this C assignment statement: g = h + A; Although there is a single operation in this assignment statement, one of the ANSWER operands is in memory, so we must first transfer A to a register. The address of this array element is the sum of the base of the array A, found in register x22, plus the number to select element 8. The data should be placed in a temporary register for use in the next instruction. Based on Figure 2.2, the first compiled instruction is lw x9, 8(x22) // Temporary reg x9 gets A (We’ll be making a slight adjustment to this instruction, but we’ll use this simplified version for now.) The following instruction can operate on the value in x9 (which equals A) since it is in a register. The instruction must add h (contained in x21) to A (contained in x9) and put the sum in the register corresponding to g (associated with x20): add x20, x21, x9 // g = h + A The register added to form the address (x22) is called the base register, and the constant in a data transfer instruction (8) is called the offset. In addition to associating variables with registers, the compiler allocates data Hardware/ structures like arrays and structures to locations in memory. The compiler can then Software place the proper starting address into the data transfer instructions. Since 8-bit bytes are useful in many programs, virtually all architectures today Interface address individual bytes. Therefore, the address of a word matches the address of one of the 4 bytes within the word, and addresses of sequential words differ by 76 Chapter 2 Instructions: Language of the Computer 4. For example, Figure 2.3 shows the actual RISC-V addresses for the words in Figure 2.2; the byte address of the third word is 8. Computers divide into those that use the address of the leftmost or “big end” byte as the word address versus those that use the rightmost or “little end” byte. RISC-V belongs to the latter camp, referred to as little-endian. Since the order matters only if you access the identical data both as a word and as four individual bytes, few need to be aware of the “endianness.” Byte addressing also affects the array index. To get the proper byte address in the code above, the offset to be added to the base register x22 must be 8 × 4, or 32, so that the load address will select A and not A[8/4]. (See the related Pitfall on page 172 of Section 2.22.) The instruction complementary to load is traditionally called store; it copies data from a register to memory. The format of a store is similar to that of a load: the name of the operation, followed by the register to be stored, then the base register, and finally the offset to select the array element. Once again, the RISC-V address is specified in part by a constant and in part by the contents of a register. The actual RISC-V name is sw, standing for store word. 12 100 8 10 4 101 0 1 Byte Address Data Processor Memory FIGURE 2.3 Actual RISC-V memory addresses and contents of memory for those words. The changed addresses are highlighted to contrast with Figure 2.2. Since RISC-V addresses each byte, word addresses are multiples of 4: there are 4 bytes in a doubleword. alignment restriction Elaboration: In many architectures, words must start at addresses that are multiples of A requirement that data 4. This requirement is called an alignment restriction. (Chapter 4 suggests why alignment be aligned in memory on leads to faster data transfers.) RISC-V and Intel x86 do not have alignment restrictions, but natural boundaries. MIPS does. Hardware/ As the addresses in loads and stores are binary numbers, we can see why the DRAM for main memory comes in binary sizes rather than in decimal sizes. That is, in gibibytes Software (230) or tebibytes (240), not in gigabytes (109) or terabytes (1012); see Figure 1.1. Interface 2.3 Operands of the Computer Hardware 77 Compiling Using Load and Store Assume variable h is associated with register x21 and the base address of the array A is in x22. What is the RISC-V assembly code for the C assignment EXAMPLE statement below? A = h + A; Although there is a single operation in the C statement, now two of the operands are in memory, so we need even more RISC-V instructions. The first ANSWER two instructions are the same as in the prior example, except this time we use the proper offset of 32 for byte addressing in the load register instruction to select A, and the add instruction places the sum in x9: lw x9, 32(x22) // Temporary reg x9 gets A add x9, x21, x9 // Temporary reg x9 gets h + A The final instruction stores the sum into A, using 48 (4 × 12) as the offset and register x22 as the base register. sw x9, 48(x22) // Stores h + A back into A Load word and store word are the instructions that copy words between memory and registers in the RISC-V architecture. Some brands of computers use other instructions along with load and store to transfer data. An architecture with such alternatives is the Intel x86, described in Section 2.17. Many programs have more variables than computers have registers. Consequently, Hardware/ the compiler tries to keep the most frequently used variables in registers and places Software the rest in memory, using loads and stores to move variables between registers and memory. The process of putting less frequently used variables (or those needed Interface later) into memory is called spilling registers. The hardware principle relating size and speed suggests that memory must be slower than registers, since there are fewer registers. This suggestion is indeed the case; data accesses are faster if data are in registers instead of memory. Moreover, data are more useful when in a register. A RISC-V arithmetic instruction can read two registers, operate on them, and write the result. A RISC-V data transfer instruction only reads one operand or writes one operand, without operating on it. 78 Chapter 2 Instructions: Language of the Computer Thus, registers take less time to access and have higher throughput than memory, making data in registers both considerably faster to access and simpler to use. Accessing registers also uses much less energy than accessing memory. To achieve the highest performance and conserve energy, an instruction set architecture must have enough registers, and compilers must use registers efficiently. Elaboration: Let’s put the energy and performance of registers versus memory into perspective. Assuming 32-bit data, registers are roughly 200 times faster (0.25 vs. 50 nanoseconds) and are 10,000 times more energy efficient (0.1 vs. 1000 picoJoules) than DRAM in 2020. These large differences led to caches, which reduce the performance and energy penalties of going to memory (see Chapter 5). Constant or Immediate Operands Many times a program will use a constant in an operation—for example, incrementing an index to point to the next element of an array. In fact, more than half of the RISC-V arithmetic instructions have a constant as an operand when running the SPEC CPU2006 benchmarks. Using only the instructions we have seen so far, we would have to load a constant from memory to use one. (The constants would have been placed in memory when the program was loaded.) For example, to add the constant 4 to register x22, we could use the code lw x9, AddrConstant4(x3) // x9 = constant 4 add x22, x22, x9 // x22 = x22 + x9 (where x9 == 4) assuming that x3 + AddrConstant4 is the memory address of the constant 4. An alternative that avoids the load instruction is to offer versions of the arithmetic instructions in which one operand is a constant. This quick add instruction with one constant operand is called add immediate or addi. To add 4 to register x22, we just write addi x22, x22, 4 // x22 = x22 + 4 Constant operands occur frequently; indeed, addi is the most popular instruction in most RISC-V programs. By including constants inside arithmetic instructions, operations are much faster and use less energy than if constants were loaded from memory. The constant zero has another role, which is to simplify the instruction set by offering useful variations. For example, you can negate the value in a register by using the sub instruction with zero for the first operand. Hence, RISC-V dedicates register x0 to be hard-wired to the value zero. Using frequency to justify the inclusions of constants is another example of the great idea from Chapter 1 of making the common case fast. 2.3 Operands of the Computer Hardware 79 Given the importance of registers, what is the rate of increase in the number of Check registers in a chip over time? Yourself 1. Very fast: They increased as fast as Moore’s Law, which predicted doubling the number of transistors on a chip every 24 months. 2. Very slow: Since programs are usually distributed in the language of the computer, there is inertia in instruction set architecture, and so the number of registers increases only as fast as new instruction sets become viable. Elaboration: Although the RISC-V registers in this book are 32 bits wide, the RISC-V architects conceived multiple variants of the ISA. In addition to this variant, known as RV32, a variant named RV64 has 64-bit registers, whose larger addresses make RV64 better suited to processors for servers and smart phones. Elaboration: The RISC-V offset plus base register addressing is an excellent match to structures as well as arrays, since the register can point to the beginning of the structure and the offset can select the desired element. We’ll see such an example in Section 2.13. Elaboration: The register in the data transfer instructions was originally invented to hold an index of an array with the offset used for the starting address of an array. Thus, the base register is also called the index register. Today’s memories are much larger, and the software model of data allocation is more sophisticated, so the base address of the array is normally passed in a register since it won’t fit in the offset, as we shall see. Elaboration: The migration from 32-bit address computers to 64-bit address computers left compiler writers a choice of the size of data types in C. Clearly, pointers should be 64 bits, but what about integers? Moreover, C has the data types int, long int, and long long int. The problems come from converting from one data type to another and having an unexpected overflow in C code that is not fully standard compliant, which unfortunately is not rare code. The table below shows the two popular options: Operating System pointers int long int long long int Microsoft Windows 64 bits 32 bits 32 bits 64 bits Linux, Most Unix 64 bits 32 bits 64 bits 64 bits

(The Morgan Kaufmann Series in Computer Architecture and Design) David A. Patterson, John L. Hennessy - Computer Organization and Design RISC-V Edition_ The Hardware Software Interface-Morgan Kaufmann-102-258-pages-3.pdf

Document Details

Tags

Related

Full Transcript