Final Exam of "Introduction to Information Theory" 2024 PDF
Document Details
Uploaded by MagicHawk3023
Jacobs University Bremen
2024
Tags
Summary
This is an exam paper for "Introduction to Information Theory" from 2024. It covers topics such as entropy, quantization, security, and Gaussian distributions. The questions assess the students' understanding of these principles.
Full Transcript
Final Exam of “Introduction to Information Theory” 30.05.2024 Prof. Dr.-Ing. W. Henkel................................................................................................................................................... Name Matricul...
Final Exam of “Introduction to Information Theory” 30.05.2024 Prof. Dr.-Ing. W. Henkel................................................................................................................................................... Name Matriculation No. Signature individual scores sum final grade 1 2 3 4 5 6 7 13 9 8 12 4 9 8 63 1.) Entropy, quantization, security a) Let us assume a 2-dimensional circular symmetric Gaussian density with one- dimensional variance of σ2. Determine the differential entropy. b) You are asked to quantize the value s with that 2-D Gaussian distribution to be binary. How would you ideally do this for security purposes and what would you expect from a Lloyd-Max quantizer? You are now selecting a 4-region quantization suitable for security purposes. Will it be the one from Lloyd-Max quantization? If so, sketch the quantization bound- aries. If you are in favor of different regions, draw them, as well. c) If the two quantization solutions would not be used for security purposes, but just as a usual source quantizer with a corresponding code-book representation. What would be the code-book entries for both (2 and 4 regions, your region choices!) cases? Write the mathematical starting expression for one region, only! You also do not need to determine the final solution! d) What will be the entropies after the chosen quantizations (2 and 4 regions). e) Which trivial operations (at least, two) would not change the entropies after quantization? 1 Solution: a) H(X) = 2 · 21 log2 (2πeσ2 ) (1) b) 2 regions: You would split in halves through the origin and also Lloyd-Max alg. would lead to this result. (2) 4 regions: For security purposes, one needs equal probability for all regions, e.g., split in quadrants or in a “cake” shape with a round piece in the middle and 3 cake pieces split in radiants starting from the inner circle. Lloyd-Max would not lead to a uniform distribution! (2) R ∞ x − x2 c) 2 regions with split in the middle: c+ 2 = 2· 0 2πσ e 2σ2 (1) 2 R∞ − r2 For quadrants: c2 (r) = 4 · r=0 r2πre 2σ dr, (2) angle of codebook entry in the middle of the quadrants. (1) d) 1 bit for 2 regions, 2 bit for 4 regions (2) e) rotation, scaling, also translation, but quantization grid has to be shifted with it (2) 2 2.) Gaussian multi-user interference channel, secrecy capacity Imagine to have N users each with signal power spectral density Si with inter- ference relations given by a transfer function Hi j from the jth towards the ith terminal. Each terminal also experiences thermal noise with AWGN PSD Ni. a) Determine the overall capacity from jx to ix when no interference cancellation in place. b) Since one would now have multiple eavesdroppers to listen to a legitimate conversation, how would you formulate the secrecy capacity for the same pairing, if the eavesdroppers are independent, not cooperating. c) Why can you hope that despite of a diminishing secrecy capacity on average, to still get secret messages across in the case of a mobile wireless channel. Why? What is the practical requirement to make use of this possibility? Solution: a) ! 1 S jx |Hix jx |2 Cix jx = log 1 + (3) 2 ∑ j\ jx S j |Hix , j |2 + Nix b) ! 1 |2 S jx |Hix jx Cix jx = log 1 + 2 ∑ j\ jx S j |Hix , j |2 + Nix ( !) S |H |2 1 jx i jx − max log 1 + (4) i\ix 2 ∑ j\ jx S j |Hx, j |2 + Ni c) Although not on average, but at times, the secrecy capacity may be non-zero. Both sides have to be able to adjust to the time variation (known CSI, channel state information, also knowing the transmission quality to the eavesdropper!). (2) 3 3.) Markov model and LDPC codes a) Imagine variable nodes of an LDPC code to be fed with components that show a dependency from a Markov model of first order, i.e., having memory one, only. Through our decoding process, we get estimates for the probability of a certain variable node to be +1 or −1, which we label P+ and P− , respectively. Assuming to have +1 and −1 at one variable node, what would be the log- likelihood ratios for the following variable node given a Markov dependency ac- cording to the following model. 1-P1 P1 1 1 P2 1-P2 Now, weight those expressions with P+ and P− to obtain the overall LLR. b) What is the entropy conditioned on being in a certain state “+1” and “-1”? What is the entropy averaging over the two states? If we do not consider support by the decoding to determine P+ and P− , how big would they be, if we just use the Markov model itself. c) When would you have P+ = P− ? Then you can study some effect on the entropy separately. How would you describe this property that one may like to concentrate on, only. Solution: P1 a) L+ = ln 1−P 1 , L− = ln 1−P 2 P2 , (2) P1 Laver = P+ ln 1−P 1 + P− ln 1−P P2 2 (1) b) HX|S = +1) = h(P1 ) , HX|S = −1) = h(P2 ) (1) + − H(X) = P · h(P1 ) + P · h(P2 ) (1) 1−P2 1−P1 From Markov model: P+ = 2−P 1 −P2 , P− = 2−P1 −P2 (1) c) P1 = P2 , allows for studying memory alone, without having an entropy effect caused by unequal probabilities of the two values. (2) 4 4.) Coding, Hamming weight, Hamming distance, serial concatenation, Union bound Let us consider a simple product code (array code), a 3 × 3 matrix where the rows and columns both are even parity-check codes. a) Is this a linear or a non-linear code? What is the necessary condition to be linear and is it fulfilled? b) What can you say about the relation of minimum Hamming weight and mini- mum Hamming distance? c) What is the minimum Hamming distance of this product code? d) How many errors and erasures could you correct? Note: in the case of erasures, the result depends on the locations of the erasures. How many can you definitely correct, how many at max? e) What are the dimensions of the H-matrix of the whole scheme? f) Determine the H-matrix, seeing the codeword as a concatenation of the rows of the 3 × 3 matrix. g) Determine the first term of the Union Bound assume a binary symmetric chan- nel with a BER of p. Solution: a) Linear, contains the all-zero word (2) b) wH min = dH min (1) c) wH min = dH min = 4 (1) Just check the smallest set of ones, e.g., in the upper left square, i.e., all informa- tion components to be one. This is weight 4! d) terror = 1, 3 ≤ terasure ≤ 5 (2) j k −1 terror = dH min 2 , terasure ≥ dH min − 1, e.g., having erasures at positions (1,1); (1,2); (2,1) can be corrected. Having erasures at all parity positions can also be corrected. This represents the upper limit of 5. e) 5 × 9 (1) The number of elements in the 3 × 3 matrix is, of course, N = 9. The number of parities in rows and columns is 5. 5 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 1 0 0 1 0 0 f) (3) 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 Note: there are, of course, other equivalent H-matrices. This one checks 1. 1st row 2. 2nd row 3. 1st col. 4. 2nd col. 5. 3rd col. of the 3 × 3 array code. g) pW / 12 Admin p2 (1 − p)7 , Admin = 4 + 4 + 1 (2) Explanation: Pairwise error probability for minimum distance 4. For even dis- tance, at half distance with 2 errors, one could flip a coin, explaining the 1/2 factor. Admin = [single ones in the four info positions] + [neighboring pairs of ones in rows or columns of the 2 × 2 info field] + [all ones in the info field] 6 5.) Galois fields Let the primitive polynomial of GF(26 ) be α6 + α + 1. a) Do a first simplification step for (1 + α)70 = b) What is the order of β = α3 ? c) You would ideally like to use some kind of FFT for this Galois field GF(26 ). Is this possible in principle? Explain! d) What will be the length N of the DFT/FFT using the element β of order N? Solution: a) (1 + α)70 = (1 + α)7 (1) b) 21 (1) c) Prime factorization: 63 = 7 · 3 · 3. In principle, FFT with 3 stages would be possible, but not too useful! (1) d) 21 (1) 7 6.) Capacity, Multi-user communication, Rate-Distortion theory a) Let us consider the following channel with two input and three output values. We first assume ε = 0. First add the missing probability. Then, what is the channel capacity? b) Now, you slightly lower the added probability by some ε > 0. Without com- puting anything, will the capacity increase or decrease? Explain! c) Now, you are cascading two of those channels, again with ε = 0. What will be the capacity? d) Now, you take a binary symmetric channel with BER of q in front of the channel with ε = 0 or behind. What will be both capacities? e) The concatenation of two channels can be seen as a special broadcast channel. Which one? How can you introduce one of the two user streams for capacity calculations? At first sight, one would otherwise only see one input. f) Again consider ε = 0 and see the figure as rate-distortion arrangement. Let us assume identical probabilities of the “upper” and “lower” source values. Deter- mine the rate-distortion function in this case. Solution: a) Added: 1 − p, C = 1 − p (2) b) Capacity will decrease, since flipping output will not solve the issue completely, hence, this has a BSC component. (1) c) C = (1 − p) 2 (1) d) C = (1 − p) · h(q) (1) e) Degraded broadcast channel, user can be introduces as BSC (2) 8 f) I(X; X̂) = H(X) − H(X|X̂) H(X) = ∑i P(xi ) log2 P(xi ) = 2 · 12 log2 [(1 − p)/2] + p log2 p (1) H(X|X̂) = h(p) (1) Directly: I(X; X̂) = 1 − p (same number of points) 7.) Some mixed questions a) Classify the source coding schemes into two different classes. Specify what the classes stand for. List Huffman, LZ77, LZ78, LZW84, Shannon-Fano-Elias, Arithmetic Coding. (2) b) Assume to have a density over a voltage. What is the unit of the density? How could you avoid units and what would then happen when applying a non-linear function to the random variable? (2) c) How can you use RSA pubic key encryption for authentication? (1) d) Using physical-layer key generation, Alice (A) and Bob (B) get keys that are closely matching, however, still having slight deviations. What is the amount of redundancy to provide to make them fully match, ignoring transmission errors? What is the coding scheme to align them? (2) e) How would you explain the Maximum-Entropy approach to determine un- known parameters? (1) Solution: a) Entropy codes based on symbol probabilities: Huffman, Shannon-Fano-Elias, Arithmetic coding; dictionary-based: LZ77, LZ78, LZW84. b) Unit: 1/V; subdivide the range into small intervals and determine the probabil- ities inside the intervals by integration. Probability mass functions do not carry a unit. Functions lead to a move of the interval limits, keeping the probabilities within the intervals given by the limits. c) One can send authentication data by using the public key, which can be checked by the private one. This works, of course, also in the opposite direction. d) H(A|B) or H(B|A), Slepian-Wolf bound. Slepian-Wolf coding to align the keys. e) Keeping the maximum uncertainty to parameters that are not known. 9