(The Morgan Kaufmann Series in Computer Architecture and Design) David A. Patterson, John L. Hennessy - Computer Organization and Design RISC-V Edition_ The Hardware Software Interface-Morgan Kaufmann-102-258-pages-4.pdf
Document Details
Uploaded by MagnanimousCloisonnism
Vilnius University
Tags
Related
- (The Morgan Kaufmann Series in Computer Architecture and Design) David A. Patterson, John L. Hennessy - Computer Organization and Design RISC-V Edition_ The Hardware Software Interface-Morgan Kaufmann-24-101-1-9 copy.pdf
- (The Morgan Kaufmann Series in Computer Architecture and Design) David A. Patterson, John L. Hennessy - Computer Organization and Design RISC-V Edition_ The Hardware Software Interface-Morgan Kaufmann-24-101-9-11.pdf
- Digital Design and Computer Architecture RISC-V Edition PDF
- RISC-V Processor Design PDF
- Building a Datapath (RISC-V) PDF
- RISC-V Processor Implementation PDF
Full Transcript
80 Chapter 2 Instructions: Language of the Computer 2.4 Signed and Unsigned Numbers First, let’s quickly review how a computer represents numbers. Because people have 10 fing...
80 Chapter 2 Instructions: Language of the Computer 2.4 Signed and Unsigned Numbers First, let’s quickly review how a computer represents numbers. Because people have 10 fingers, we are taught to think in base 10, but numbers may be represented in any base. For example, 123 base 10 = 1111011 base 2. Numbers are kept in computer hardware as a series of high and low electronic signals, and so they are considered base 2 numbers. (Just as base 10 numbers are called decimal numbers, base 2 numbers are called binary numbers.) A single digit of a binary number is thus the “atom” of computing, since all binary digit Also information is composed of binary digits or bits. This fundamental building block called bit. One of can be one of two values, which can be thought of as several alternatives: high or the two numbers in low, on or off, true or false, or 1 or 0. base 2, 0 or 1, that are Generalizing the point, in any number base, the value of ith digit d is the components of information. d Basei where i starts at 0 and increases from right to left. This representation leads to an obvious way to number the bits in the doubleword: simply use the power of the base for that bit. We subscript decimal numbers with ten and binary numbers with two. For example, 1011two represents (1 × 23) + (0 × 22) + (1 × 21) + (1 × 20)ten = (1 × 8) + (0 × 4) + (1 × 2) + (1 × 1)ten = 8 + 0 + 2 + 1ten = 11ten We number the bits 0, 1, 2, 3, … from right to left in a doubleword. The drawing below shows the numbering of bits within a RISC-V word and the placement of the number 1011two: least significant bit The Since words are drawn vertically as well as horizontally, leftmost and rightmost rightmost bit in an may be unclear. Hence, the phrase least significant bit is used to refer to the RISC-V word. rightmost bit (bit 0 above) and most significant bit to the leftmost bit (bit 31). The RISC-V word is 32 bits long, so it can represent 232 different 32-bit patterns. most significant bit The leftmost bit in an RISC-V It is natural to let these combinations represent the numbers from 0 to 232 −1 word. (4,294,967,295ten): 2.4 Signed and Unsigned Numbers 81 00000000 00000000 00000000 00000000two = 0ten 00000000 00000000 00000000 00000001two = 1ten 00000000 00000000 00000000 00000010two = 2ten... ... 11111111 11111111 11111111 11111101two = 4,294,967,293ten 11111111 11111111 11111111 11111110two = 4,294,967,294ten 11111111 11111111 11111111 11111111two = 4,294,967,295ten That is, 32-bit binary numbers can be represented in terms of the bit value times a power of 2 (here xi means the ith bit of x): For reasons we will shortly see, these positive numbers are called unsigned numbers. Hardware/ Base 2 is not natural to human beings; we have 10 fingers and so find base 10 natural. Why didn’t computers use decimal? In fact, the first commercial Software computer did offer decimal arithmetic. The problem was that the computer still Interface used on and off signals, so a decimal digit was simply represented by several binary digits. Decimal proved so inefficient that subsequent computers reverted to all binary, converting to base 10 only for the relatively infrequent input/output events. Keep in mind that the binary bit patterns above are simply representatives of numbers. Numbers really have an infinite number of digits, with almost all being 0 except for a few of the rightmost digits. We just don’t normally show leading 0s. Hardware can be designed to add, subtract, multiply, and divide these binary bit patterns, as we’ll see in Chapter 3. If the number that is the proper result of such operations cannot be represented by these rightmost hardware bits, overflow is overflow when the said to have occurred. It’s up to the programming language, the operating system, results of an operation are and the program to determine what to do if overflow occurs. larger than be represented Computer programs calculate both positive and negative numbers, so we need a in a register representation that distinguishes the positive from the negative. The most obvious solution is to add a separate sign, which conveniently can be represented in a single bit; the name for this representation is sign and magnitude. Alas, sign and magnitude representation has several shortcomings. First, it’s not obvious where to put the sign bit. To the right? To the left? Early computers tried both. Second, adders for sign and magnitude may need an extra step to set the sign 82 Chapter 2 Instructions: Language of the Computer because we can’t know in advance what the proper sign of the sum will be. Finally, a separate sign bit means that sign and magnitude has both a positive and a negative zero, which can lead to problems for inattentive programmers. Because of these shortcomings, sign and magnitude representation was soon abandoned. In the search for a more attractive alternative, the question arose as to what would be the result for unsigned numbers if we tried to subtract a large number from a small one. The answer is that it would try to borrow from a string of leading 0s, so the result would have a string of leading 1s. Given that there was no obvious better alternative, the final solution was to pick the representation that made the hardware simple: leading 0s mean positive, and leading 1s mean negative. This convention for representing signed binary numbers is called two’s complement representation (an Elaboration on page 81 explains this unusual name): 00000000 00000000 00000000 00000000two = 0ten 00000000 00000000 00000000 00000001two = 1ten 00000000 00000000 00000000 00000010two = 2ten... ... 01111111 11111111 11111111 11111101two = 2,147,483,645ten 01111111 11111111 11111111 11111110two = 2,147,483,646ten 01111111 11111111 11111111 11111111two = 2,147,483,647ten 10000000 00000000 00000000 00000000two = − 2,147,483,648ten 10000000 00000000 00000000 00000001two = − 2,147,483,647ten 10000000 00000000 00000000 00000010two = − 2,147,483,646ten … ... 11111111 11111111 11111111 11111101two = − 3ten 11111111 11111111 11111111 11111110two = − 2ten 11111111 11111111 11111111 11111111two = − 1ten The positive half of the numbers, from 0 to 2,147,483,647ten (231−1), use the same representation as before. The following bit pattern (1000 … 0000two) represents the most negative number -2,147,483,648ten (−231). It is followed by a declining set of negative numbers: -2,147,483,647ten (1000 … 0001two) down to −1ten (1111 … 1111two). Two’s complement does have one negative number that has no corresponding positive number: -2,147,483,648ten. Such imbalance was also a worry to the inattentive programmer, but sign and magnitude had problems for both the programmer and the hardware designer. Consequently, every computer today uses two’s complement binary representations for signed numbers. 2.4 Signed and Unsigned Numbers 83 Two’s complement representation has the advantage that all negative numbers have a 1 in the most significant bit. Thus, hardware needs to test only this bit to see if a number is positive or negative (with the number 0 is considered positive). This bit is often called the sign bit. By recognizing the role of the sign bit, we can represent positive and negative 32-bit numbers in terms of the bit value times a power of 2: The sign bit is multiplied by −231, and the rest of the bits are then multiplied by positive versions of their respective base values. Binary to Decimal Conversion EXAMPLE What is the decimal value of this 32-bit two’s complement number? 11111111 11111111 11111111 11111100two Substituting the number’s bit values into the formula above: ANSWER We’ll see a shortcut to simplify conversion from negative to positive soon. Just as an operation on unsigned numbers can overflow the capacity of hardware to represent the result, so can an operation on two’s complement numbers. Overflow occurs when the leftmost retained bit of the binary bit pattern is not the same as the infinite number of digits to the left (the sign bit is incorrect): a 0 on the left of the bit pattern when the number is negative or a 1 when the number is positive. 84 Chapter 2 Instructions: Language of the Computer Hardware/ Signed versus unsigned applies to loads as well as to arithmetic. The function of a Software signed load is to copy the sign repeatedly to fill the rest of the register—called sign Interface extension—but its purpose is to place a correct representation of the number within that register. Unsigned loads simply fill with 0s to the left of the data, since the number represented by the bit pattern is unsigned. When loading a 32-bit word into a 32-bit register, the point is moot; signed and unsigned loads are identical. RISC-V does offer two flavors of byte loads: load byte unsigned (lbu) treats the byte as an unsigned number and thus zero-extends to fill the leftmost bits of the register, while load byte (lb) works with signed integers. Since C programs almost always use bytes to represent characters rather than consider bytes as very short signed integers, lbu is used practically exclusively for byte loads. Hardware/ Unlike the signed numbers discussed above, memory addresses naturally start Software at 0 and continue to the largest address. Put another way, negative addresses make no sense. Thus, programs want to deal sometimes with numbers that can Interface be positive or negative and sometimes with numbers that can be only positive. Some programming languages reflect this distinction. C, for example, names the former integers (declared as int in the program) and the latter unsigned integers (unsigned int). Some C style guides even recommend declaring the former as signed int to keep the distinction clear. Let’s examine two useful shortcuts when working with two’s complement numbers. The first shortcut is a quick way to negate a two’s complement binary number. Simply invert every 0 to 1 and every 1 to 0, then add one to the result. This shortcut is based on the observation that the sum of a number and its inverted representation must be 111 … 111two, which represents −1. Since x x 1 , therefore x x 1 0 or x 1 x. (We use the notation x to mean invert every bit in x from 0 to 1 and vice versa.) Negation Shortcut EXAMPLE Negate 2ten, and then check the result by negating −2ten. 2ten = 00000000 00000000 00000000 00000010two ANSWER 2.4 Signed and Unsigned Numbers 85 Negating this number by inverting the bits and adding one, Going the other direction, 11111111 11111111 11111111 11111110two is first inverted and then incremented: Our next shortcut tells us how to convert a binary number represented in n bits to a number represented with more than n bits. The shortcut is to take the most significant bit from the smaller quantity—the sign bit—and replicate it to fill the new bits of the larger quantity. The old nonsign bits are simply copied into the right portion of the new doubleword. This shortcut is called sign extension. Sign Extension Shortcut EXAMPLE Convert 16-bit binary versions of 2ten and −2ten to 32-bit binary numbers. The 16-bit binary version of the number 2 is ANSWER 00000000 00000010two = 2ten It is converted to a 32-bit number by making 16 copies of the value in the most significant bit (0) and placing that in the left of the word. The right part gets the old value: 00000000 00000000 00000000 00000010two = 2tenM 86 Chapter 2 Instructions: Language of the Computer Let’s negate the 16-bit version of 2 using the earlier shortcut. Thus, 0000 0000 0000 0010two becomes 1111 1111 1111 1101two + 1two = 1111 1111 1111 1110two Creating a 32-bit version of the negative number means copying the sign bit 16 times and placing it on the left: 11111111 11111111 11111111 11111110two = −2ten This trick works because positive two’s complement numbers really have an infinite number of 0s on the left and negative two’s complement numbers have an infinite number of 1s. The binary bit pattern representing a number hides leading bits to fit the width of the hardware; sign extension simply restores some of them. Summary The main point of this section is that we need to represent both positive and negative integers within a computer, and although there are pros and cons to any option, the unanimous choice since 1965 has been two’s complement. Elaboration: For signed decimal numbers, we used “−” to represent negative because there are no limits to the size of a decimal number. Given a fixed data size, binary and hexadecimal (see Figure 2.4) bit strings can encode the sign; therefore, we do not normally use “+” or “−” with binary or hexadecimal notation. Check What is the decimal value of this 64-bit two’s complement number? Yourself 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111000two 1) −4ten 2) −8ten 3) −16ten 4) 18,446,744,073,709,551,608ten What is the decimal value if it is instead a 64-bit unsigned number? 2.5 Representing Instructions in the Computer 87 Elaboration: Two’s complement gets its name from the rule that the unsigned sum of an n-bit number and its n-bit negative is 2n; hence, the negation or complement of a number x is 2n − x, or its “two’s complement.” one’s complement A A third alternative representation to two’s complement and sign and magnitude is called notation that represents one’s complement. The negative of a one’s complement is found by inverting each bit, from the most negative value by 10 … 000two and the 0 to 1 and from 1 to 0, or x. This relation helps explain its name since the complement of most positive value by x is 2n − x − 1. It was also an attempt to be a better solution than sign and magnitude, and 01 … 11two, leaving an several early scientific computers did use the notation. This representation is similar to equal number of negatives two’s complement except that it also has two 0s: 00 … 00two is positive 0 and 11 … 11two and positives but ending is negative 0. The most negative number, 10 … 000two, represents −2,147,483,647ten, up with two zeros, one and so the positives and negatives are balanced. One’s complement adders did positive (00 … 00two) and need an extra step to subtract a number, and hence two’s complement dominates today. one negative (11 … 11two). A final notation, which we will look at when we discuss floating point in Chapter 3, The term is also used to is to represent the most negative value by 00 … 000two and the most positive value by mean the inversion of 11 … 11two, with 0 typically having the value 10 … 00two. This representation is called a every bit in a pattern: 0 to biased notation, since it biases the number such that the number plus the bias has a 1 and 1 to 0. non-negative representation. biased notation A notation that represents the most negative value by 00 … 000two and the 2.5 Representing Instructions in the Computer most positive value by 11 … 11two, with 0 typically having the We are now ready to explain the difference between the way humans instruct value 10 … 00two, thereby computers and the way computers see instructions. biasing the number such Instructions are kept in the computer as a series of high and low electronic signals and that the number plus the bias has a non-negative may be represented as numbers. In fact, each piece of an instruction can be considered representation. as an individual number, and placing these numbers side by side forms the instruction. The 32 registers of RISC-V are just referred to by their number, from 0 to 31. EXAMPLE Translating a RISC-V Assembly Instruction into a Machine Instruction Let’s do the next step in the refinement of the RISC-V language as an example. We’ll show the real RISC-V language version of the instruction represented symbolically as add x9, x20, x21 first as a combination of decimal numbers and then of binary numbers. ANSWER The decimal representation is 0 21 20 0 9 51