Computer Organization and Architecture Lecture 2: Data Representation - Fall 2024 - Mansoura University PDF
Document Details
Uploaded by LavishMoonstone
Mansoura University
2024
DR. Muhammad Haggag Zayyan
Tags
Summary
This document is a lecture on computer architecture and data representation. It covers topics such as number systems, integer, and floating-point representations. The document was likely provided as a learning aid in the Fall 2024 semester at Mansoura University.
Full Transcript
Mansoura University Faculty of Computers and Information Department of Computer Science Fall 2024 Computer Organization and Architecture Lecture 2: Data Representation Programs: AI - 2nd Level , MI & SE – 3rd Level DR. Muhammad Haggag Zayyan CS De...
Mansoura University Faculty of Computers and Information Department of Computer Science Fall 2024 Computer Organization and Architecture Lecture 2: Data Representation Programs: AI - 2nd Level , MI & SE – 3rd Level DR. Muhammad Haggag Zayyan CS Department LECTURE TOPICS Number Systems Integer Representation Floating Representation Character Representation Endianess (or byte-order) 2 NUMBER SYSTEMS Human beings use decimal (base 10) number systems for counting and measurements. Computers use binary (base 2) number system Decimal (Base 10) Number System 735 = 7×102 + 3×101 + 5×100 Binary (Base 2) Number System 10110B = 1×24 + 0×23 + 1×22 + 1×21 + 0×20 Hexadecimal (Base 16) Number System A3EH = 10×162 + 3×161 + 14×160 3 CONVERSION-1 Conversion from Hexadecimal to Binary A3C5H = 1010 0011 1100 0101B Conversion from Binary to Hexadecimal 1001001010B = 0010 0100 1010B = 24AH Conversion from Base r to Decimal (Base 10) (multiplication rule) Given a n-digit base r number: dn-1 dn-2..d2 d1 d0 (base r), the decimal equivalent is given by dn-1 × rn-1 + dn-2 × rn-2 +... + d1 × r1 + d0 × r0 A1C2H = 10×163 + 1×162 + 12×161 + 2 = 41410 4 CONVERSION-2 Conversion from Decimal (Base 10) to Base r Use repeated division/remainder. For example, To convert 261D to hexadecimal: 261/16 => quotient=16 remainder=5 16/16 => quotient=1 remainder=0 1/16 => quotient=0 remainder=1 (quotient=0 stop) Hence, 261D = 105H 5 INTEGER REPRESENTATION Integers are fixed-point numbers with the radix point fixed after the least-significant bit. They are contrast to real numbers or floating-point numbers. The commonly-used bit-lengths for integers are 8-bit, 16-bit, 32-bit or 64-bit. Schemes for integers: Unsigned Integers: can represent zero and positive integers. Signed Integers: can represent zero, positive and negative integers. Sign-Magnitude representation 1's Complement representation 2's Complement representation 6 N-BIT SIGN INTEGERS IN SIGN-MAGNITUDE REPRESENTATION The most-significant bit (msb) is the sign bit, with value of 0 representing positive integer and 1 representing negative integer. The remaining n-1 bits represents the magnitude. Example 1: Suppose that n=8 and the binary representation is 0 100 0001B. Sign bit is 0 ⇒ positive Absolute value is 100 0001B = 65D Hence, the integer is +65D Example 2: Suppose that n=8 and the binary representation is 1 000 0001B. Sign bit is 1 ⇒ negative Absolute value is 000 0001B = 1D Hence, the integer is -1D 7 N-BIT SIGN INTEGERS IN 1'S COMPLEMENT REPRESENTATION For negative integers, the absolute value of the integer is equal to "the magnitude of the complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement). Example 1: Suppose that n=8 and the binary representation 0 100 0001B. Sign bit is 0 ⇒ positive Absolute value is 100 0001B = 65D Hence, the integer is +65D Example 2: Suppose that n=8 and the binary representation 1 000 0001B. Sign bit is 1 ⇒ negative Absolute value is the complement of 000 0001B, i.e., 111 1110B = 126D Hence, the integer is -126D 8 N-BIT SIGN INTEGERS IN 2'S COMPLEMENT REPRESENTATION For negative integers, the absolute value of the integer is equal to "the magnitude of the complement of the (n-1)-bit binary pattern plus one" (hence called 2's complement). Example 1: Suppose that n=8 and the binary representation 0 100 0001B. Sign bit is 0 ⇒ positive Absolute value is 100 0001B = 65D Hence, the integer is +65D Example 2: Suppose that n=8 and the binary representation 1 000 0001B. Sign bit is 1 ⇒ negative Absolute value is the complement of 000 0001B plus 1, i.e., 111 1110B + 1B = 127D Hence, the integer is -127D 9 PROS & CONS Drawbacks of sign-magnitude and 1's complement: 4-bit adder/subtractor There are two representations of zero: Adder When S = 0: A + B Substractor When S = 1: A - B Sign-magnitude: +ve: 0000 0000B , -ve: 1000 0000B = A + (2's B) = A + (1's of B) + 1 1’s complement: +ve: 0000 0000B , -ve: 1111 1111B The positive integers and negative integers need to be processed separately. Benefits of 2's complement There is only one representation for the number zero. Positive and negative integers can be treated together in addition and subtraction. Subtraction can be carried out using the "addition logic". 10 Cascaded full adders OVERFLOW OR UNDERFLOW Because of the fixed precision (i.e., fixed number of bits), an n-bit 2's complement signed integer has a certain range. Overflow happens if the carry in does not equal the carry out. XOR Example: Subtraction is treated as Addition of a Positive and a Negative Integers: Suppose that n=8, 5D - 5D = 65D + (-5D) = 60D 65D → 0100 0001B -5D → 1111 1011B(+ 0011 1100B → 60D (carry in =1, carry out=1: discard carry - OK) Example: Overflow: Suppose that n=8, 127D + 2D = 129D (overflow - beyond the range) 127D → 0111 1111B 2D → 0000 0010B(+ 1000 0001B → -127D (wrong) Example: Underflow: Suppose that n=8, -125D - 5D = -130D (underflow - below the range) -125D → 1000 0011B -5D → 1111 1011B(+ 0111 1110B → +126D (wrong) 11 BASE 2 SYSTEMS WITH FRACTIONAL PART 1. Separate the integral and the fractional parts. Real number can be represented in 2. For the integral part, divide by the target radix repeatedly, and collect the remainder in reverse order. Fixed point, 3. For the fractional part, multiply the fractional part by the target radix repeatedly, and collect the which has a specific integral part in the same order. number of digits reserved for the Convert 18.6875D to binary integer part and Fractional Part =.6875D fractional part. Integral Part = 18D.6875*2=1.375 => whole number is 1 18/2 => quotient=9 remainder=0.375*2=0.75 => whole number is 0 9/2 => quotient=4 remainder=1.75*2=1.5 => whole number is 1 4/2 => quotient=2 remainder=0.5*2=1.0 => whole number is 1 2/2 => quotient=1 remainder=0 (fraction=0 stop) 1/2 => quotient=0 remainder=1 Hence.6875D =.1011B (quotient=0 stop) Hence, 18D = 10010B Therefore, 18.6875D = 10010.1011B 12 Fixed-Point SCIENTIFIC NOTATION Assume you want to save this number: int.MaxValue.ToString().Length=10 long.MaxValue.ToString().Length=19 99999888887777766666555554444433333 Consists of 35 digits If “long” variable is initialized with this number, we got “Integral constant is too large”. As it exceeds the number of digits that can represented by “long”. Fix: utilize small number of digits (less space) using E notation Scientific notation is a method of expressing numbers that are too large or too small in a compact form. There is loss in precision (double) 99999888887777766666555554444433333.0 when getting the original 9.9999888887777759E+34 value. But Double is more = 99999888887777759000000000000000000 precise than Float (float) 99999888887777766666555554444433333.0 9.999989E+34 = 99999890000000000000000000000000000 As we notice (float) uses small number of “digits” (less precision) to represent the fraction part than (double). 13 Scientific Notation: has a single digit to the left of the decimal point. Computer arithmetic that supports such numbers is called Floating Point FLOATING POINT NUMBERS Real numbers can represent in: Floating point refers to a binary decimal representation Fixed point format where there is not a fixed number of digits before and Floating point format after the decimal place (contrast “fixed point”). Representation of Floating-Point numbers A Single-Precision floating-point number occupies 32-bits. ± 10-38... 1038 A Double-Precision floating-point number occupies 64-bits. ± 10-308... 10308 14 IEEE 754 FLOATING-POINT STANDARD Since the mantissa “significant digits” is always 1.xxxxxxxxx in the normalized form, no need to represent the leading 1. So, effectively: Single Precision: mantissa → 1 bit + 23 bits (fraction) mantissa is in the interval Double Precision: mantissa → 1 bit + 52 bits (fraction) 1 ≤ m < 2, so the leading digit doesn't need to be stored(since it is always 1) General format: -1S × (1 + Mantissa) × 2E Zero (0.0) = 0000...0000 Example (Normalization) : (-1.001)2 = -1.001 x 20 = -11 x (1 + 0.001) x 20 15 (0.01)2 = 1.0 x 2-2 = -10 x (1 + 0.0) x 2-2 WHY USE BIAS IN EXPONENT? Negative exponents could pose a problem in comparisons. With this representation, the first exponent (-1) shows a "larger" binary number, making direct comparison more difficult. To avoid this, Biased Notation is used for exponents. Bias benefit: If the real exponent of a number is X then it is represented as (X + bias) Simplified Comparisons, with a biased exponent, comparisons between Found higher E: start from Left, and check each corresponding bits till found a bit with floating-point numbers 1 and other with 0. +0.5=0.10 become more Due to -ve E has 1 in MSB. That’ s an issue +2 =10.0 straightforward.You can when comparing to +ve E. compare the exponent values directly as unsigned IEEE single-precision uses a bias of 127. Therefore, an exponent of integers. For example, if two -1 is represented as -1 + 127 = 126 = 011111102 Using Bias, we can easily find floating-point numbers have the higher E. 0 is represented as 0 + 127 = 127 = 011111112 the same mantissa, the +1 E has 1 in MSB, while -1 E number with the larger +1 is represented as +1 + 127 = 128 = 100000002 has 0 in MSB. Hence +1 > -1 biased exponent is the larger number, simplifying Actual exponent is found by subtracting the bias from the stored exponent sorting and comparison 16 operations CONVERTING A NUMBER TO FLOATING POINT Set the sign bit - 0 → +ve, 1 → -ve Divide your number into two sections – the integer part and the fraction part. Convert to binary - convert the two parts into binary then join them together with a binary point. Work out the exponent - how many spaces the binary point needs to be moved so that it is just after the first 1 in the result. If you move the binary point to the left then this number is positive. If you move it to the right then the number is negative. Add 127 to this number then convert to binary. Format the mantissa - This is done by dropping the first 1 in the number and recording the next 23 bits. Exponent Bias value: Single precision: 127 Double precision: 1023 General: 2exp bits -1 -1 17 EXAMPLE 1 Example: Convert decimal number 3.5 into IEEE-754 32-bit floating-point representation. Solution: 1. S =0 , positive number 2. Convert 3.5 to binary = 0011.1 3. Normalize, 001.11 x 21 M = 11 [The mantissa is taken from the normalized binary number, dropping the leading 1] E =1 + 127 = 128 [biased exponent] 4. Combine parts 0 10000000 110 0000 0000 0000 0000 0000 = 4060 0000H 18 EXAMPLE 2 Suppose that IEEE-754 32-bit floating-point is BEC0 0000H, find the corresponding decimal number. Solution: 1. Convert to binary: 1 01111101 100 0000 0000 0000 0000 0000 = BEC0 0000H 2. Identify components: Sign bit S = 1 ⇒ negative number E = 0111 1101B = 125D (biased), real E = 125-127 = -2 Mantissa is 0.1B = 2-1 = 0.5D (convert to decimal) Use the formula : -1S × (1 + Mantissa) × 2E 19 = -11 x (1 + 0.5) x 2-2 = -1.5/4 = -0.375D IEEE-754 FLOATING POINT CONVERTER - ONLINE You can validate your results using this online tool: https://www.h- schmidt.net/FloatConverter/IEE E754.html 20 SPECIAL CASES 1/0.0 → Infinity -1/0.0 → -Infinity 0/0.0 → NaN 21 FP - EFFICIENT MEMORY USAGE The maximum +ve in single precision. (1111111111111111111111100000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000)2 11111111111111111111111 → 23-bit mantissa, decimal point can move up to 28 = 128 digits. Conversion of any base to This number size is 128 bit binary base In the conventional way using fixed-point (integer), log(10d) = log(2b) This requires 16 bytes in memory (not efficient usage). d*log10 = b*log2 d = b*log2 Using floating point, or This requires 4 bytes, while the number can be rewritten as b = d/log2 1.111111111111111111111110 x 2127 (1 is implied for mantissa = 1.M) Note: Log10 = 1 (base 10) 111111111111111111111111.0 x 2104 = (224 -1) x 2104 ≈ = 3.4e+38 (float type) Another way: To get the corresponding decimal range of single-precision floating point, where b= 128, hence d = 38.5 22 namely, ±10-38 …. 1038 OTHER FLOATING-POINT BASIC FORMATS Quadruple precision (128-bit) floating-point numbers can store 113-bit fixed-point numbers or integers accurately without losing precision 10 Decimal E max = 2E max d = b / log2 Decimal E max = E max × log10 2. This gives an approximate value of the maximum decimal exponent. Significant bits including the implicit bit (which always equals 1). This implicit bit is not stored in memory), but not the sign bit. Decimal digits = digits × log10 2. This gives an approximate precision in number of decimal digits. float.MaxValue = 3.40282347E+38 23 double.MaxValue = 1.7976931348623157E+308 FLOATING POINT ADDITION Example : Add the following numbers: x = 9.75, y = 0.5625 Converting into floating point format General format: -1S × (1 + Mantissa) × 2E 9.75 = (1001.11)2 = Normalize = 1.00111 x 23 , S=0, M=00111, E=3+127 =130 0.5625 = (0.1001)2 = Normalize = 1.001 x 2-1 , S=0, M=001, E=-1+127=126 Align the exponent we get the difference of exponents to know how much shifting is required. (130-126) = 4 26 FP -ADDITION -CONTINUE Now, we shift the mantissa of lesser number right side by 4 units. (shift less exponent) (note that there is 1 (implied bit) before decimal point, then apply shifting) Mantissa of 0.5625 = 1.00100000000000000000000 Shifting right by 4 units Mantissa of 0.5625 = 0.00010010000000000000000 Mantissa of 9.75 = 1. 00111000000000000000000 (no change) Adding (significant) mantissa of both If addition result in carry to “hidden 1” (left 0.00010010000000000000000 of decimal point), then shift right “mantissa” +1.00111000000000000000000 and increment “Exponent” ————————————————————————— 1.01001010000000000000000 27 FP -ADDITION -CONTINUE In final answer, we take exponent of bigger number , So, final answer consists of: Sign bit = 0 Exponent of bigger number = 130 = 10000010 Mantissa = 01001010000000000000000 (ignore 1 before decimal point) 32 bit representation of answer = x + y = 0 10000010 01001010000000000000000 Verification step = Interpret the result E = 10000010 – bias = 130 – 127 = 3 Checking for overflow/underflow: -126