08 Representation of numbers.pptx

Document Details

Uploaded by PlentifulMonkey

Universidad Autónoma de Nuevo León

Tags

floating point numbers, binary numbers, number representation

Full Transcript

Understanding Base-N, Binary, and Floating Point Numbers

Decimal System
◦ Representation of numbers: a list of digits from 0 to 9
◦ Each digit represents a coefficient for a power of 10

Base10 and Base3
◦ Decimal system (base10): based on 10 digits, 0 to 9
◦ Base3 system: uses the digits 0, 1, and 2
◦ Example: 121 (base3) = 1 · 3^2 + 2 · 3^1 + 1 · 3^0 = 16 (base10)

Binary Numbers
◦ Binary number representation: numbers are represented in base 2; only the digits 0 and 1 are used
◦ Each digit is a coefficient of a power of 2
◦ Binary digits: the digits in binary numbers are called bits

Operations on Binary Numbers
◦ Adding and multiplying binary numbers follow the same processes as in grade school

Binary Addition

32-bit Computers
◦ Fixed number of bits: computers have a fixed number of bits for storage; 32-bit computers can represent 32-digit binary numbers
◦ Limited number representation: 32-bit computers can represent 4,294,967,296 (2^32) numbers, which is insufficient for complex calculations
◦ Integer-only representation: when all bits are dedicated to integers, sums like 0.5 + 1.25 cannot be computed

Binary Numbers and Operations in Computers
◦ Binary numbers in computing: represented using the digits 0 and 1; arithmetic operations can be performed
◦ Logical operations: AND operation, OR operation, NOT operation
◦ Quick computation: operations can be computed quickly

Introduction to Floating Point Numbers
◦ Fixed number of bits in computers: binary representation gives insufficient range and precision
◦ Floating point numbers are used to achieve the needed range
◦ Components of floating point numbers: sign indicator (s), positive or negative; exponent (e), power of 2; fraction (f), coefficient of the exponent
◦ IEEE754 double precision: Python floats are mapped to IEEE754 double precision, a total of 64 bits

Representation of Floating Point Numbers
◦ Representation of a float, formula for 64-bit: n = (−1)^s · 2^(e−1023) · (1 + f)
◦ Illustration of −12.0 in 64-bit: each square represents one bit; a green square represents 1 and a grey square represents 0
◦ Gap between numbers: the distance from one number to the next; the gap grows as the number represented grows

Special Cases in Floating Point Numbers
◦ Exponent is 0: the leading 1 in the fraction takes the value 0; the result is a subnormal number, computed by n = (−1)^s · 2^(−1022) · (0 + f)
◦ Exponent is 2047 and f is nonzero: the result is "Not a Number"; the number is undefined
◦ Exponent is 2047, f = 0, and s = 0: the result is positive infinity
◦ Exponent is 2047, f = 0, and s = 1: the result is minus infinity

Overflow and Underflow
◦ Overflow in Python: occurs when numbers are larger than the largest floating point number; the result is assigned to inf
◦ Underflow in Python: occurs when numbers are smaller than the smallest subnormal number; the result is assigned to zero

In real life, overflow can cause significant problems in various fields:
1. Financial Systems: Overflow can lead to incorrect calculations, resulting in financial losses or errors in transactions.
2. Medical Devices: Inaccurate data due to overflow can lead to incorrect diagnoses or treatment plans.
3. Engineering: Overflow in calculations can result in structural failures or design flaws.
4. Software Development: Overflow can cause software crashes or unexpected behavior, leading to security vulnerabilities.

In computing, overflow occurs when a calculation exceeds the maximum limit that can be represented within a given number of bits. This can lead to incorrect results or system crashes. For example, the largest unsigned 32-bit integer is 4,294,967,295. A calculation that exceeds this value wraps around to zero or a small number, while a signed 32-bit integer (maximum 2,147,483,647) wraps around to a negative number, causing errors.
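The 64-bit layout, the special cases, and the overflow and underflow behaviour described above can be checked directly in Python. The following is a minimal sketch, not part of the original slides: the helper names decode_double and rebuild are hypothetical, and the split into 1 sign bit, 11 exponent bits, and 52 fraction bits is the standard IEEE754 double-precision layout. It assumes Python 3.9 or later for math.ulp.

```python
import math
import struct
import sys


def decode_double(x: float) -> tuple[int, int, float]:
    """Split a float into sign bit s, stored exponent e, and fraction f."""
    # Reinterpret the 8 bytes of the double as one 64-bit unsigned integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = bits >> 63                          # 1 sign bit
    e = (bits >> 52) & 0x7FF                # 11 exponent bits (0 to 2047)
    f = (bits & ((1 << 52) - 1)) / 2**52    # 52 fraction bits scaled into [0, 1)
    return s, e, f


def rebuild(s: int, e: int, f: float) -> float:
    """Evaluate the slide formulas; only valid for finite numbers (e < 2047)."""
    if e == 0:
        # Subnormal case: the leading 1 is replaced by 0.
        return (-1) ** s * 2.0 ** -1022 * (0 + f)
    return (-1) ** s * 2.0 ** (e - 1023) * (1 + f)


# -12.0 = (-1)^1 * 2^(1026 - 1023) * (1 + 0.5)
print(decode_double(-12.0))             # (1, 1026, 0.5)
print(rebuild(*decode_double(-12.0)))   # -12.0

# Special cases: the stored exponent is 2047.
print(decode_double(math.inf))          # (0, 2047, 0.0)
print(decode_double(-math.inf))         # (1, 2047, 0.0)
s, e, f = decode_double(math.nan)
print(e == 2047 and f != 0)             # True

# Overflow and underflow through multiplication and division.
print(sys.float_info.max * 2)           # inf
print(5e-324 / 2)                       # 0.0 (5e-324 is the smallest subnormal)

# The gap to the next representable number grows with the magnitude.
print(math.ulp(1.0), math.ulp(1e9))
```

Multiplication is used for the overflow check because some Python operations, such as the float power operator, raise OverflowError instead of returning inf.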
IEEE754 vs Binary Number Representation
◦ Both IEEE754 and binary use a 64-bit representation
◦ Constant spacing in binary: binary numbers have constant spacing, so they cannot achieve both range and precision simultaneously
◦ IEEE754 precision: high precision at small numbers, low precision at large numbers
◦ Acceptable limitation: the gap at large numbers is small relative to the number size, and is irrelevant to normal calculations for very large numbers

Introduction to Round-Off Errors
◦ Floating-point representation in computers: numbers are represented as base2 fractions and stored as approximations
◦ Round-off error: the difference between an approximation and the true value; common in numerical calculations
◦ Truncation error: the error made by truncating an infinite sum and approximating it by a finite sum

Representation Error
◦ Representation error in floating-point numbers
◦ Example: π has infinitely many digits and is typically approximated as 3.14159265
◦ Example: 1/3 is approximated as 0.333333333...

Accumulation of Errors
◦ Multiple rounds of rounding increase the error
◦ Example: 4.845 rounded to two decimal places is 4.85, then to one decimal place is 4.9
◦ Total error in this case is 0.055
◦ Single rounding to one decimal place results in 4.8 with an error of 0.045

Round-Off Error by Floating-Point Arithmetic
◦ Floating-point arithmetic errors: errors occur due to approximation in floating-point representation
◦ Example: 4.9 − 4.845. Expected result: 0.055. Actual result: 0.055000000000000604
◦ Example: 0.1 + 0.2 + 0.3. Expected result: 0.6. Actual result: not equal to 0.6
◦ Solution: use the round function; rounding afterwards makes results with inexact values comparable

Accumulation of Round-Off Errors
◦ Round-off errors in calculations: errors are due to inexact representation and can be magnified or accumulated
◦ Example of adding and subtracting 1/3: the initial number is 1; add and subtract 1/3 multiple times; the result deviates from 1
◦ Accumulation of errors: more iterations lead to greater error accumulation
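The round-off examples above can be reproduced with a short script. This is a sketch under stated assumptions rather than the slides' own code: the function name add_and_subtract is hypothetical, and the decimal module with ROUND_HALF_UP is used for the 4.845 example only so that the rounding happens in exact decimal arithmetic, as in the hand calculation.

```python
from decimal import ROUND_HALF_UP, Decimal

# Repeated rounding, done in exact decimal arithmetic.
x = Decimal("4.845")
twice = x.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
twice = twice.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
once = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
print(twice, abs(twice - x))    # 4.9 0.055
print(once, abs(once - x))      # 4.8 0.045

# Floating-point arithmetic on inexactly stored values leaves a small residue.
diff = 4.9 - 4.845
print(diff)                     # close to, but not exactly, 0.055

total = 0.1 + 0.2 + 0.3
print(total == 0.6)             # False

# Rounding both sides makes the inexact values comparable.
print(round(total, 5) == round(0.6, 5))   # True


def add_and_subtract(iterations: int) -> float:
    """Start at 1, add 1/3 `iterations` times, then subtract it as often."""
    result = 1.0
    for _ in range(iterations):
        result += 1 / 3
    for _ in range(iterations):
        result -= 1 / 3
    return result


# The result drifts away from 1 as the number of iterations grows.
for n in (100, 1000, 10000):
    print(n, add_and_subtract(n))
```

The standard library function math.isclose is an alternative to round when two floats need to be compared within a tolerance.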
