Summary

This document surveys coding theory for an undergraduate-level course in electronic engineering. It contains information and worked examples on Shannon coding, Fano coding, Huffman coding, and arithmetic coding, and briefly covers channel coding (trellis coded modulation and Viterbi decoding) and cryptography.

Full Transcript

MODULE #9 CODING THEORY

Coding theory is the study of the properties of "codes" and their respective fitness for specific applications. Codes are studied by various scientific disciplines, but mostly for the purpose of designing efficient and reliable data transmission methods. This typically involves the removal of redundancy and the correction or detection of errors in the transmitted data.

Applications of Coding Theory:
- Data compression
- Cryptography
- Error detection and correction
- Data transmission
- Data storage
- Etc.

Alan M. Turing (1912–1954), mathematician and cryptanalyst, "Father of Modern Computing" and one of the most influential codebreakers in World War II.

TYPES OF CODING

1. Data compression (or source coding)
   - Shannon-Fano Coding
   - Huffman Coding
   - Arithmetic Coding
2. Error control / ECC-FEC (or channel coding)
   - Block codes (Reed-Solomon codes, Hamming codes, cyclic redundancy codes, etc.)
   - Convolutional codes (Trellis, Viterbi, etc.)
3. Line coding
   - Unipolar / polar / bipolar (ex. NRZ, AMI, etc.)
   - Multilevel / multi-transition (ex. 2B1Q, MLT-3, etc.)
4. Cryptographic coding
   - Hash codes (ex. MD5, SHA, etc.)
   - Cipher codes (ex. AES, DES, RSA, etc.)

SHANNON CODING

Shannon coding is a lossless data compression technique for constructing a prefix code based on a set of symbols and their probabilities (estimated or measured). The method was the first of its type; the technique was used to prove Shannon's noiseless coding theorem in his 1948 article "A Mathematical Theory of Communication".

Claude Elwood Shannon (1916–2001), "the father of information theory".

Sample (code length li = ⌈-log2 pi⌉):
li = ⌈-log2 0.36⌉ = ⌈1.474⌉ = 2
li = ⌈-log2 0.18⌉ = ⌈2.474⌉ = 3

FANO CODING

In Fano's method, the symbols are arranged in order of decreasing probability, and then divided into two sets whose total probabilities are as close as possible to being equal. All symbols then have the first digits of their codes assigned: symbols in the first set receive "0" and symbols in the second set receive "1".
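The Shannon code-length rule in the sample above can be checked in a couple of lines. This is a minimal sketch in Python (the module itself uses no code), applying l_i = ⌈-log2 p_i⌉ to the slide's sample probabilities:

```python
import math

# Shannon code length: l_i = ceil(-log2(p_i)) for a symbol with probability p_i.
def shannon_length(p):
    return math.ceil(-math.log2(p))

print(shannon_length(0.36))  # 2  (ceiling of 1.474)
print(shannon_length(0.18))  # 3  (ceiling of 2.474)
```

A less probable symbol carries more information (-log2 p is larger), so it is assigned a longer code word.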
As long as any sets with more than one member remain, the same process is repeated on those sets to determine successive digits of their codes. When a set has been reduced to one symbol, the symbol's code is complete and will not form the prefix of any other symbol's code.

Robert Mario Fano (1917–2016), 1976 Shannon Award recipient.

HUFFMAN CODING

In 1952, David A. Huffman published a paper entitled "A Method for the Construction of Minimum-Redundancy Codes". The paper describes his algorithm as a variable-length code for encoding a source symbol. The algorithm derives the code table from the estimated probability or frequency of occurrence (weight) for each possible value of the source symbol.

David A. Huffman (1925–1999), electrical engineer, inventor of the Huffman code, a minimum-length lossless data compression code. ("This is an example of a Huffman tree.")

HUFFMAN ENCODING EXAMPLE

Find the Huffman tree, code words, code string, and average bits/symbol for the text string:
"A DEAD DAD CEDED A BAD BABE A BEADED ABACA BED"

Step 1: Find the frequency (probability) of occurrence for each symbol in the text string and list the symbols in descending order of occurrence. (Total = 46 symbols.)
Steps 2–5: Repeatedly combine the symbols, starting with the symbols with the lowest frequency/occurrence, until a single tree remains.
Final step: Read the code words off the tree.

FINAL ANSWERS (Huffman tree and code-word table shown on the slides):
Total # of symbols: 46
Total # of bits: 115 bits
Average: 2.5 bits per symbol

For comparison, Hartley's equation gives the fixed-length information measure for M = 6 distinct symbols:
H = log2 M = log2 6 = 2.584963 bits

HUFFMAN DECODING EXAMPLE

Decode the Huffman code using the given code words (shown on the slide).

ARITHMETIC CODING

Arithmetic coding is a form of entropy encoding used in lossless data compression.
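The Huffman construction from the encoding example above (combine the two least-frequent symbols until one tree remains, then read code words off the tree) can be sketched with a priority queue. This is an illustrative sketch, not part of the module; heap tie-breaking may yield different but equally optimal code words than the slide's tree, and the total bit count is the same either way:

```python
import heapq
from collections import Counter

text = "A DEAD DAD CEDED A BAD BABE A BEADED ABACA BED"

def huffman_codes(s):
    # Heap entries are (weight, tiebreak_id, tree); a tree is a symbol
    # or a (left, right) pair of subtrees.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(Counter(s).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Combine the two lowest-weight trees (Steps 2-5 of the example).
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")   # left branch -> append "0"
            walk(tree[1], prefix + "1")   # right branch -> append "1"
        else:
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

codes = huffman_codes(text)
total_bits = sum(len(codes[c]) for c in text)
print(total_bits, total_bits / len(text))  # 115 2.5
```

Running this reproduces the slide's summary: 46 symbols, 115 bits total, 2.5 bits per symbol on average.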
Arithmetic coding differs from other forms of entropy encoding, such as Huffman coding, in that rather than separating the input into component symbols and replacing each with a code, arithmetic coding encodes the entire message into a single number, an arbitrary-precision fraction q where 0.0 ≤ q < 1.0. It represents the current information as a range, defined by two numbers.

Example:
"ISAT!" → interval 0.52962 (base 10) to 0.52980 (base 10)
median = 0.52971 (base 10) = 0.10000111100110110001 (base 2)
ISAT!code = 10000111100110110001

(Entropy encoding = data compression encoding schemes that are considered "lossless".)

ARITHMETIC CODING EXAMPLE

Given information/message = "ISAT!", where "!" signifies the end of message (EOM) / end of transmission (EOT) / end of file (EOF).

Find the arithmetic code string and average bits/symbol for the text string. The probabilities for each symbol are given in the table below for our convenience:

Symbol | Assumed probability
A      | 0.1
I      | 0.3
S      | 0.2
T      | 0.3
!      | 0.1
TOTAL  | 1.00

ARITHMETIC ENCODING EXAMPLE

Upper bound:  0.52980
Middle bound: 0.52971
Lower bound:  0.52962

Convert the middle bound to binary:
0.52971 (base 10) ≈ 0.10000111100110110001 (base 2)
ISAT!code = 10000111100110110001

How to check: convert back to decimal:
0.10000111100110110001 (base 2) = 0.52970981597900390625 (base 10)
(The decimal value must range between the upper-bound and lower-bound values.)

Ave. bits/symbol = 20 bits / 5 symbols = 4 bits/symbol

ARITHMETIC DECODING EXAMPLE

Received code = "10000111100110110001" → decimal 0.52970981597900390625

Limitation: each code word must have a symbol to mark the end of transmission ("!"). "End of file" (EOF) marks the end of data transmission.

TRELLIS CODING (REVIEW)

Trellis coding is a combination of coding and modulation. It uses concepts of modulation, coding, dynamic programming, lattice structure, and matrices in mathematics. Trellis Coded Modulation (TCM) uses a convolutional code. (A code tree is shown on the slide; see the tutorial video on TCM encoding using a code tree.)
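The encoding example above can be reproduced by interval narrowing. One assumption is needed: the slide gives the probabilities but not the cumulative ordering of the symbols on [0, 1); the ordering T, I, S, A, ! used below is the one that reproduces the slide's interval [0.52962, 0.52980) exactly:

```python
# Arithmetic encoding of "ISAT!" by interval narrowing.
probs = {"T": 0.3, "I": 0.3, "S": 0.2, "A": 0.1, "!": 0.1}

# Cumulative lower bounds in the assumed order T, I, S, A, !.
cum, start = {}, 0.0
for sym in "TISA!":
    cum[sym] = start
    start += probs[sym]

low, width = 0.0, 1.0
for sym in "ISAT!":          # narrow the interval one symbol at a time
    low += width * cum[sym]
    width *= probs[sym]
high = low + width
mid = (low + high) / 2       # the slide's "middle bound"

# Expand the middle bound to 20 binary fraction digits.
bits, x = "", mid
for _ in range(20):
    x *= 2
    bits += str(int(x))
    x -= int(x)

print(round(low, 5), round(mid, 5), round(high, 5))  # 0.52962 0.52971 0.5298
print(bits)  # 10000111100110110001
```

The final interval width 0.00018 is the product of the five symbol probabilities (0.3 × 0.2 × 0.1 × 0.3 × 0.1), which is why less probable messages need more bits to pin down.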
VITERBI ALGORITHM

The Viterbi algorithm is a dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states, called the Viterbi path. The algorithm was proposed by Andrew J. Viterbi in 1967, and has been applied in decoding TCM and convolutional codes used in:
- CDMA and GSM digital cellular
- Dial-up modems
- Satellite and deep-space communications
- Wi-Fi 802.11 wireless LAN
- Speech recognition and synthesis
- Computational linguistics
- Bioinformatics
- Etc.

Andrew J. Viterbi (1935–present), American electrical engineer and businessman who co-founded Qualcomm Inc. and invented the Viterbi algorithm for decoding convolutionally encoded data.

(The slides show the state machine diagram, the trellis diagram, and a blank Viterbi diagram.)

VITERBI DECODING ALGORITHM

Decode the received TCM code word: "1101001111"

Solution:
1. Determine the Hamming distance for every state transition.
2. Write the Hamming distance associated with each path. (Branch metrics for the received pairs 11 01 00 11 11 are written on the trellis diagram on the slide.)
3. Accumulate the Hamming distance for every state transition.
4. Find the path with the least accumulated Hamming distance.

FINAL ANSWER: the decoded code and message are read off the surviving least-distance path on the trellis diagram (accumulated-distance values shown on the slide).

CRYPTOGRAPHY

Cryptography is the study of secure communications techniques that allow only the sender and intended recipient of the "data" or "message" to view its contents. The two main processes are encryption and decryption. Encryption is the process of converting information or data into a code (cipher), especially to prevent unauthorized access.
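The four Viterbi decoding steps above can be sketched in Python. The slide's exact state machine is not reproduced in this transcript, so a standard rate-1/2, constraint-length-3 convolutional code with generator polynomials (7, 5) in octal is assumed here for illustration; the procedure (branch Hamming distances, accumulation, least-distance survivor) is the same:

```python
# Hard-decision Viterbi decoding for an assumed rate-1/2, K=3 convolutional
# code with generators (7, 5) octal: g1 = 111, g2 = 101.

def encode(bits):
    s1 = s2 = 0                        # shift-register state (two previous bits)
    out = []
    for b in bits:
        out += [b ^ s1 ^ s2, b ^ s2]   # two output bits per input bit
        s1, s2 = b, s1
    return out

def viterbi_decode(received):
    # Survivor per state: (accumulated Hamming distance, input path so far).
    states = {(0, 0): (0, [])}
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        nxt = {}
        for (s1, s2), (dist, path) in states.items():
            for b in (0, 1):
                expected = [b ^ s1 ^ s2, b ^ s2]
                # Step 1-2: branch Hamming distance; Step 3: accumulate it.
                d = dist + sum(x != y for x, y in zip(expected, r))
                ns = (b, s1)
                if ns not in nxt or d < nxt[ns][0]:
                    nxt[ns] = (d, path + [b])
        states = nxt
    # Step 4: the path with the least accumulated Hamming distance.
    return min(states.values())[1]

msg = [1, 0, 1, 1, 0]
code = encode(msg)
code[3] ^= 1                           # inject a single channel error
print(viterbi_decode(code))            # [1, 0, 1, 1, 0] -- error corrected
```

Even with one received bit flipped, the least-distance path through the trellis recovers the original message, which is exactly what the accumulated-distance table on the slide demonstrates.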
Encryption/decryption is performed using an encryption algorithm. The most common symmetric algorithms:
- AES (Advanced Encryption Standard): a specification for the encryption of electronic data established by the U.S. National Institute of Standards and Technology (NIST) in 2001.
- DES (Data Encryption Standard): a symmetric-key algorithm for the encryption of digital data, developed in the early 1970s at IBM. Although its short key length of 56 bits makes it too insecure for modern applications, it has been highly influential in the advancement of cryptography.

ENCRYPTION AND DECRYPTION

Symmetric Encryption

Using the same "key" for encryption and decryption is called symmetric encryption. Only the two parties (sender and recipient) can read and access the data. The encrypted message is called "ciphertext", or "cipher" for short. Other terms used are:
- secret key encryption
- secret key cryptography
- private key cryptography
- symmetric cryptography
- symmetric key encryption

One main advantage of symmetric encryption is its speed: keys are much shorter, and the overall process is quicker.

Asymmetric Encryption

Using a different "key" for encryption and decryption is called asymmetric encryption. Asymmetric encryption, also known as public-key cryptography, uses the concept of a key pair: the sender and the recipient use a pair of keys, i.e. a private key and a public key, during the encryption and decryption process. It works like this: private keys are kept secret by the senders and recipients to encrypt/decrypt messages, while only public keys are shared across the public at large. The private key and public key pairs are mathematically related to each other, but you cannot generate the private key using a public key.
Common asymmetric encryption algorithm examples include:
- RSA: named after those who invented it in 1978: Ron Rivest, Adi Shamir, and Leonard Adleman.
- DSA (Digital Signature Algorithm): endorsed by NIST, based on the mathematical concepts of modular exponentiation and the discrete logarithm problem.

APPLICATIONS OF CRYPTOGRAPHY

Cryptography is not only about encrypting and decrypting messages; it is also about solving real-world problems that require information security:
1. Confidentiality: encryption and decryption
2. Data integrity: detection of corruption/alteration/tampering
3. Authentication: security and authenticity
4. Non-repudiation: the sender cannot deny sending the message
5. Secret sharing: using multiple keys to open a single message

Example: GSM (2G) security management. The slide diagram shows the MSC-AC, VLR, BTS, and mobile equipment (ME) across the air interface: the network sends a random number (RAND); the SIM's A3 algorithm combines RAND with the authentication key Ki to produce a signed response (SRES), which the network compares against its own computation for authentication; the IMEI is requested from the ME and checked against the EIR; and traffic is ciphered with the A5 algorithm using the cipher key Kc derived by A8.

Legend: Kc = cipher key; Ki = authentication key; RAND = random number; SRES = signed response; A3, A5, A8 = GSM encryption algorithms.

END OF MODULE #9
