Boolean, Character, and Real Data (PDF)

Summary

These lecture notes provide an overview of Boolean, character, and real data types in programming. The document covers concepts such as logical data, type specifications, representation, and operations on these data types, presented in the context of common programming languages.

Full Transcript

Boolean Data Character Data Named Constants Real Data Khaled Nagi Department of Computer and Systems Engineering, Faculty of Engineering, Alexandria Univ...

Boolean Data Character Data Named Constants Real Data Khaled Nagi Department of Computer and Systems Engineering, Faculty of Engineering, Alexandria University, Egypt. Boolean _Char_Constants [email protected] 1 Boolean Data Boolean-Char_Constants [email protected] 2 Boolean Specification  This type is named Boolean after the originator of logical calculus George Boole.  Logical data is a primitive data type that assumes two values; true or false Boolean-Char_Constants [email protected] 3 Operations on Logical Data Boolean_Char_Constants [email protected] 4 Operations on Logical Data Boolean-Char_Constants [email protected] 5 Relations Boolean_Char_Constants [email protected] 6 Relations (ǀ)  A relation between two data objects of the same type yields logical value. = , ≠ are valid for all data types , ≤, ≥ are valid iff values in the domain are ordered.  In some programming languages, short-circuit forms of the operations and and or are provided: the right operand is evaluated only if it has to be. In C, the operation && (and), and | | (or) are short circuit operations PΛQ PQ knowing P is sometimes sufficient to know the results  AND: False yields short circuit operation  OR: True yields short circuit operation Boolean_Char_Constants [email protected] 7 Relations (ǁ)  The short-circuit operations are useful in simplifying coding in situations where two conditions are to be examined whereas, the second is undefined if the first is not satisfied:  Example 1: if has a job and his salary is > 500 then ….  Example 2: if there is next item in the list and next item ≠ X then … Does JAVA provide short circuit operations? Boolean_Char_Constants [email protected] 8 Logical data type in C  C has no separate data type for logical data. Type integer is overloaded to represent logical data too  zero value represents false  nonzero value represents true  example: if (n%d)... // true if n is not multiple of d.  Pitfalls in C:  writing, if (x = 0) …… instead of if (x == 0) ……  the statement is syntactically correct and have the meaning "assign zero to x, then use this value zero, which represents the logical value false, in the conditional" !!  The statement: if (X ≤ Y ≤ Z) …… is syntactically correct, but have the less apparent meaning:  evaluate (X ≤ Y), yielding a value 1 if condition is satisfied and 0 if not ,  then evaluate ( 0 or 1 ≤ Z ) , as the second condition to examine !! Boolean_Char_Constants [email protected] 9 Priority of Operators  Programming languages, either keep same precedence levels as in mathematics, or use less number of levels:  C-Language  Arithmetic operators  relational operators  logical operators Boolean_Char_Constants [email protected] 10 Priority of Operators  In Java Boolean_Char_Constants [email protected] 11 Realization: External Representation of Logical Values Boolean_Char_Constants [email protected] 12 Realization: Internal Representation of Logical Values  Use of single bit  A Boolean data object is represented by a single bit (1 for true, 0 for false).  Most machines do not have instructions for direct access of a given bit in memory.  So, the gain in space is offset by the loss of speed and simplicity.  Consequently, other storage representations have been adopted.  Use of entire word  Such representation facilitates fast access of logical information  but is extremely wasteful of memory space: For 1 word=n bits, only 1/n is used  Use of a byte  This storage representation has the advantages of quick access and relatively small storage requirements. Boolean_Char_Constants [email protected] 13 Realization: Internal Representation of Logical Values  When bytes or words are used as storage structure representation of the values, true and false can be performed in two ways:  A particular bit is used for representation of the value;  often the sign-bit of number representation, with 0 = false, and 1= true in that bit and the rest of the word or byte ignored.  A zero value in the entire storage unit represents false and any other nonzero value represents true. Boolean_Char_Constants [email protected] 14 Boolean_Char_Constants [email protected] 15 Character Data Boolean_Char_Constants [email protected] 16 Type Specifications: Domain of Values  Character data denotes a finite, ordered set of characters.  Wide variety of character sets are used by various computers.  A character set is specified by  The characters included  The order of characters in the set, (the collating sequence of characters)  The code for each character Boolean_Char_Constants [email protected] 17 Type Specifications: Domain of Values (ǀ) Boolean_Char_Constants [email protected] 18 Type Specifications: Domain of Values (ǁ)  Control characters are classified into  Communication control characters (used for data transmission)  Layout characters (used for physical organization of data  Logical separators (used in data storage)  Ignore characters  Escape characters  Medium control characters Boolean_Char_Constants 2016 [email protected] 19 Control Characters in the ASCII code Boolean_Char_Constants [email protected] 20 Type Specifications: Operations on Characters Data  Assignment: character × character → void  comparison: character × character → Boolean  concatenation: character × character → string character × string → string string × character → string  Input / Output: character → void  Conversion functions :  ORD (c) ⇒ ordinal number of character c  CHR (n) ⇒ character with ordinal number n  Notes:  C1 < C2 ↔ ORD (C1) < ORD (C2)  CHR (ORD (C)) = C Reflexive relation  ORD (CHR (n) ) = n Boolean_Char_Constants [email protected] 21 Type Realization: External Representation  Character data in programs are represented by either of the following:  enclosing the character value in ´ , or ˝ ,  use of the CHR function , i.e., CHR (n). This representation is suitable for use with control characters. Boolean_Char_Constants [email protected] 22 Type Realization: External Representation  The storage structure for a character is the byte (one byte for each character ),  The most widely used character sets are :  EBCDIC Character set (8–bit code ): (256 char)  EBCDIC stands for; Extended Binary - Coded – Decimal Interchange Code.  used primarily on the IBM computers (PC is an exception),  IBM began with the BCD code (which in turn originated from the Hollerith code for punched cards)  ASCII character set (7–bit code): (128 char)  ASCII stands for, American Standard Code for Information Interchange  Formally known as the ISO code.  Developed as a standard coding scheme, and is used on many non-IBM machines.  Extended ASCII character set (8–bit code): (256 char)  This character set is used on personal computers.  Unicode (16–bit code):  Used in Java, the common language for use on networks, to allow for characters in alphabets of many natural languages. Boolean_Char_Constants [email protected] 23 Character data in Java Boolean_Char_Constants [email protected] 24 Problems due to Different Character Sets  Compatibility of devices:  all input / output devices are character oriented  all input and output devices must be based on same character set used in the computer system. input memory output  Portability of data files or programs stored on external storage media (magnetic media), regarding code used and density of recording  Provision for new characters: e.g.,  Graphic characters,  Arabic characters,  Special characters, e.g. Boolean_Char_Constants [email protected] 25 Problems due to Different Character Sets: Processing Problems (ǀ)  Letters form a contiguous block in ASCII, but not in EBCDIC ,  Thus, if one has to test whether a character is a letter, he has to modify his program, when run on machine with different character set.  Example: test if ch is a letter or not  ASCII if A ≤ ch ≤ Z  EBCDIC if (A ≤ ch ≤ I) OR (if J ≤ ch ≤ R) OR (S ≤ ch ≤ Z)  Sorting of data: Upper case letters come before lower case letters ASCII, the reverse in EBCDIC. Same problem happens with decimal digit characters and the letters.  To get the numeric value of a numeral, different code is required according to character set used:  value ( C ) = ORD ( C ) - 48 with ASCII , but  value ( C ) = ORD ( C ) - 240 with EBCDIC Boolean_Char_Constants [email protected] 26 Problems due to Different Character Sets: Processing Problems (ǁ)  Note: a better way is to use the code:  value( C ) = ORD( C ) – ORD ('0') which gives the right answer irrespective of the character set used Boolean_Char_Constants [email protected] 27 Named Constants Boolean_Char_Constants [email protected] 28 Named Constants (ǀ)  Use of magic numbers (numbers to represent values of some parameters of the problem) in a program is a bad habit.  On the other hand, use of a variable that is assigned the required value, may lead to illegal assignment to a "constant“ by changing its value later in the program  "Modern" programming languages allow us to name constants in the same way as we name variables.  Constant definition in Java has the form  final type constant_identifier = value ;  Constant definition in C has the form  const type constant_identifier = value Boolean_Char_Constants [email protected] 29 Named Constants (ǁ)  Concept of named constants can be used for:  definition of universal constants, e.g. ;  const PI = 3.14159  production of parameterized programs, e.g.,  const number_of_items = 20 ;  array_size = 100 ;  Advantages of Named Constants:  Leads to more readable programs,  Ease of maintenance of programs (since there is no use of magic numbers)  Illegal updates can be detected. Boolean_Char_Constants [email protected] 30 User-Defined Types Boolean_Char_Constants [email protected] 31 Enumerated Types (ǀ) Boolean_Char_Constants [email protected] 32 Enumerated Types (ǀI) Boolean_Char_Constants [email protected] 33 Enumerated Types (III) Boolean_Char_Constants [email protected] 34 Operations on Enumeration (ǀ)  assignment : e.g., Day d = Day.MON;  comparison : e.g., if d = MON then...  Built-in functions : valueOf, which returns the enum value that is the same as a given string. public class DayTripper { public enum Day {MON, TUE, WED, THU, FRI, SAT, SUN} public static void main(String[] args) { Day d = Day.MON; System.out.println("Initially d is " + d); d = Day.WED; Day t = Day.valueOf("WED"); System.out.println("I say d and t are the same: " + (d == t)); } }  The output from this program is:  Initially d is MON  I say d and t are the same: true Boolean_Char_Constants [email protected] 35 Operations on Enumeration (ǀI)  NO input or output: enumerated types can not be input or output.  To input an enumerated type, we may enter a numeric code (e.g., its ordinal number), then use a part of code to assign to the enumerated variable the corresponding value.  Instead of outputting a variable of enumerated type , we can output Its ordinal number or a meaningful string  Bounds of an array, control variables in for loops, selectors of a switch statement may be an enumerated expression. Boolean_Char_Constants [email protected] 36 Real Data Boolean_Char_Constants [email protected] 37 Domain (ǀ)  The real axis forms a continuum of values (every arbitrary small interval on the axis of real numbers contains infinitely many values).  In computers, the type real is a finite set of representatives of intervals on the real axis  The infinite uncountable set of real numbers is represented by a finite set of representative of intervals called model numbers. Real Numbers [email protected] 38 Domain (ǁ)  The values of model numbers form an ordered sequence from some hardware-determined minimum negative value to a maximum  Each model number is a representation of infinite number of real numbers.  The replacement of the real continuum by a finite set of representatives causes errors in computation with real values.  approximations instead of true values Real Numbers [email protected] 39 Errors due to Finite Representation of Real Numbers  The real number system consists of a continuum of numbers, whereas, the numbers handled by a computer form a discrete number system.  There are two mappings commonly used to map real numbers into the discrete number system (system of model numbers):  Chopping  Rounding  The difference between a number, x, and its mapping, x', is called the representation error. Real Numbers [email protected] 40 Chopping  A real number is represented by the first model number obtained from truncation towards zero.  The magnitude of this model number never exceeds the magnitude of the real number. Real Numbers [email protected] 41 Rounding  A real number is represented by the nearest model number.  The magnitude of this model number may be ≤ or > that of the real number. Real Numbers [email protected] 42 Operations on Real Data  Arithmetic Operations  Binary arithmetic operations have the specification : binary operation : real × real → real  where binary operation may be: addition + subtraction - multiplication * division / exponentiation (raising to power) ^ ↑ **  Unary arithmetic operations : have the specification : unary operation : real → real  where unary operation may be: identity + negation - 43 Real Numbers [email protected] Operations on Real Data  Relational Operations The relational operations have the specification: relational operation : real × real → Boolean where relational operation may be : = , ≠ , > , < , ≥ , or ≤  Assignment Operations Other operations (built-in library function): function f : real → real e.g., sin , cos , sqrt, abs , … Real Numbers [email protected] 44 Operations on Real Data  Conversion functions  Map a real number into an integer  Truncation towards zero:  Trunc (x) = integer obtained by truncating the fractional part of the number  rounding: getting the nearest integer,  Round (x) = trunc ( x + 0.5 ) if x ≥ 0 trunc ( x - 0.5 ) if x < 0  Truncation towards − ∞  Entier (x) = largest integer not larger than x  Examples:  Trunc (5.9) = 5 , trunc (- 4.2) = - 4  Round (5.9) = 6 , round (- 4.2) = - 4  Entier (5.9) = 5 , entier (- 4.2) = - 5 Real Numbers [email protected] 45 Realization of Real Data  Fixed-point representation  Floating-point representation Real Numbers [email protected] 46 Fixed-point Representation  A number is represented using  Fixed number of places (digits) for the integer part, say n, and  Fixed number of places for the fraction, say m.  where  R is the radix ( R ≥ 2 )  m and n are integers ≥ 0 and  0 ≤ dı ≤ R - 1 Real Numbers [email protected] 47 Characteristics of Fixed-point System  The resolution  The interval between two adjacent model numbers = R⁻ᵐ  The precision  The number of significant radix-R digits used in the representation of a number = (m + n) radix-R digits  Least positive number = R⁻ᵐ  Largest positive value:  = (R-1)(R-1)... (R-1) (R-1) (R-1)... (R-1)  = Rⁿ - R⁻ᵐ = Rⁿ (1.0 – R⁻ⁿ⁻ᵐ)  Example: for R=10, n=2, m=2 Largest positive value = 99.99 = 100 – 0.01 = 10² – 10ˉ² Real Numbers [email protected] 48 Characteristics of Fixed-point System  The absolute chopping error Consider a number  Represented as:  The absolute chopping error (N – N’)  There is a bound on the absolute chopping error in representation of a number (R⁻ᵐ) Real Numbers [email protected] 49 Characteristics of Fixed-point System Real Numbers [email protected] 50 Floating-point Representation  This method of representation can handle very small or very large real numbers; with small number of significant digits.  The general form of a real number in radix R expressed in floating-point notation is:  The digits d1 d2.... dn are called the fractional part or the mantissa.  e which is an integer, is called the exponent  Any given number, can be denoted by many pairs (mantissa& exponent).  Example Π = 3.14 = 0.314 * 10¹ = 0.0314 * 10² = 0.00314 * 10³  Π = 0.314 * 10¹ with no zero after dot. This is a normalized form Real Numbers [email protected] 51 Floating-point Representation  A normalized form is defined by the additional relation Rˉ¹ ≤ fraction < 1.0 , i.e., d1 ≥ 1 give max precision valid for all numbers except zero Thus, a real number in normalized form is represented as: Real Numbers [email protected] 52 Characteristics of Floating-Point System  The resolution: the interval between two adjacent model numbers  The resolution Rᵉ⁻ⁿ is not fixed but increases as the exponent, е, increases.  The precision: the number of significant radix-R digits used in the representation of a number  n radix-R digits  Least positive number =  Largest positive value: maximum fraction * maximum exponent Real Numbers [email protected] 53 Characteristics of Floating-Point System Real Numbers [email protected] 54 Equivalent Floating-Point systems Real Numbers [email protected] 55 Overflow and Underflow Real Numbers [email protected] 56 Storage Structure for Real Data  Floating-point representation is the most common storage structure used for real data.  The radix and number of digit positions represented within floating point format may vary from one computer to another.  Floating-point representation consists of :  The sign  0 denotes positive number  1 denotes negative number  The characteristic part (the biased exponent)  Two's complement notation is not good representation for the exponent  The exponent is expressed as a non-zero integer using excess notation, the biased-exponent , where Real Numbers [email protected] 57 Implementation of Operations  Most machines have double–precision floating-point representation.  Both single and double precision (if available) are generally supported by the hardware- arithmetic operations.  Exponentiation operation and the built-in functions are usually software simulated. Real Numbers [email protected] 58 Effect of Radix on Bound of Error  Consider a machine that stores real numbers in 32-bits :  1-bit for the sign  7-bits for the characteristic  24-bits for the normalized mantissa  There are 7-bits for the characteristic 0 ≤ biased exponent ≤ 127  The bias is - 64 - 64 ≤ true exponent ≤ 63  What happens to the characteristics (error, smallest number, largest number, etc), if the radix is changed, while keeping the number of bits unchanged? Real Numbers [email protected] 59 Real Numbers [email protected] 60 Floating-Point Operations  Floating – point operations are different from normal arithmetic operations.  Results of operations undergo post-normalization and chopping in order to be represented as a floating-point value.  Floating-point operation: floating-point Χ floating-point → floating- point  Thus, all axioms for arithmetic operations studied in mathematics are not valid for computation with floating-point data  For example ; (A+B)+C may not equal A+(B+C) (A*B)*C may not equal A*(B*C) A*(B+C) may not equal A*B+A* A+B-B may not equal A A*B=A*C does not imply B =C Real Numbers [email protected] 61 Computational Errors Real Numbers [email protected] 62 Sources of Round off Errors  There are Three sources of errors  Representation errors: error due to representation of a real value as a floating-point number  Error due to performing floating–point operations instead of normal arithmetic operations  Errors in (1) & (2) are not serious since, the relative error is bounded by R-(n -1), which is acceptable.  Propagation of errors Real Numbers [email protected] 63 Propagation of Errors  Relative error in performing multiplications and /or divisions with imprecise operands is acceptable since it is due to  The addition or subtraction of the relative error of the operands  Performing a floating–point operation  All of which is bounded to a very small value.  Relative error in performing additions or subtractions with imprecise operands may be unbounded in case the magnitude of two operands is to be subtracted, and these magnitudes are very near to each other.  Bound of relative error in X ± Y is  The relative error is not bounded, if X ± Y tends to zero, and all digits of the result of operation are in error. Such problem is known as the problem of “cancellation”, “loss of all significance” or “catastrophic subtraction ” Real Numbers [email protected] 64 Catastrophic Subtraction (ǀ)  Consider the subtraction of two real numbers X and Y, with values nearly equal.  Some of the rightmost digits of each are not correct due to computation errors.  When we subtract these numbers, the correct digits in both will cancel each other, since the two numbers are nearly equal  And the result of the operation is actually the result of subtracting the erroneous digits from both Real Numbers [email protected] 65 Catastrophic Subtraction (ǁ)  Therefore, when the two numbers are very near to each other, none of the exact digits from the two numbers contribute to the result  Post-normalization occurs after performing the operation, removing the leading zeros in the result, leaving there only the erroneous digits Real Numbers [email protected] 66 Cases to consider in Manipulation of Real Data  Avoid catastrophic subtraction  Replace the expression to be evaluated by another equivalent expression that does not lead to this problem.  For example √(1+x ) - √(1 -x ), for very small x , can replaced by the equivalent expression 2 x / [ √(1+x) +√(1-x)]  Never test equality of two real numbers:  Testing equality of two real numbers in a statement to control flow in a program, may lead to logical errors in the program.  Instead, one can reformulate the test condition. For example,  if | X-Y | ≤ 10⁻⁸ do something  if P ≤ 2.5 do something  Avoid addition or subtraction of terms of different magnitudes  Some of the least significant digits from the smaller number may not contribute to the result Real Numbers [email protected] 67 Example  Program to demonstrate CPU computation may be using precision higher than that used in representation of values. The same expression is evaluated in 3 different ways:  z = value when two intermediate results are to be stored  b = value when one intermediate result is to be stored  c = computation without storage of intermediate results  c is most accurate, z is least accurate. Why? Real Numbers [email protected] 68 Example Real Numbers [email protected] 69 Example Real Numbers [email protected] 70 Example Real Numbers [email protected] 71 Example Real Numbers [email protected] 72

Use Quizgecko on...
Browser
Browser