Hash Functions and Authentication Codes PDF

Summary

This document explores hash functions, discussing their properties like preimage and collision resistance, and examining the Merkle-Damgard construction for handling arbitrary length inputs. It's relevant for understanding cryptographic principles and data integrity.

Full Transcript

```markdown $\int \frac{x^4}{x^{10}+2x^5+2}dx$ $z = u(x) = x^5 +1$ ### Hash functions, Message Authentication codes - A cryptographic hash function H is a function which takes arbitrary length bit strings as inputs and produces a fixed-length bit string as output. The output is often called a diges...

```markdown $\int \frac{x^4}{x^{10}+2x^5+2}dx$ $z = u(x) = x^5 +1$ ### Hash functions, Message Authentication codes - A cryptographic hash function H is a function which takes arbitrary length bit strings as inputs and produces a fixed-length bit string as output. The output is often called a digest or a hash value. - A crucial difference between a standard hash function and a cryptographic hash function should at least have the property of being one-way. In other words, given any string y from the codomain of H, it should be computationally infeasible to find any value $x$ in the domain of $H$ such that $H(x) = y$ - Another way to describe a hash function which has the one-way property is that it is preimage resistant. Given a hash function as: - Like a function for which produces outputs of t bits, we would like a function for which finding preimages requires $O(2^t)$ time. - A cryptographic hash function should also be second preimage resistant. This is the property that given $m$ it shoul be hard to find an $m'\neq m$ with $H(m') = H(m)$. In particular a cryptographic hash function with $t$ bit outputs should require about $2^t$ queries before one can find a second preimage. - In practice we need in addition a property called collision resistance. First consider a function $H$ mapping elements in a domain $D$ to a codomain $C$. For collision resistance to make any sense we assume that the domain $D$ is much larger than the codomain $C$. A function is called collision resistant if it is infeasible to find 2 distinct values $m$ and $m'$ such that $H(m) = H(m')$. **Definition** A function $H$ is said to be collision resistant or HI-CR secure if it is believed to be infeasible to write down a collision for the function (two elements in the domain mapping to the same element in the codomain). - It is harder to construct collision resistant hash functions than one-way hash function due to the birthday paradox. To find a collision of a hash function $H$, we can keep computing $H(m_1), H(m_2), H(m_3)$ until we get a collision. - If the function has an output size of $t$ bits then the probability of obtaining a collision after $q$ queries to $H$ is $\frac{q^2}{2t}$. So we expect to obtain a collision after about $2^{t/2}$ queries. This should be compared with the number of steps needed to find a preimage, which should be about $2^t$ for a well designed hash function. Hence to achieve a security level of 128 bits for a collision resistant hash function we need roughly 256 bits of output. - In summary cryptographic hash functions needs to satisfy the following 3 properties: 1. It should be hard to find a message with a given hash value. (preimage resistant) 2. Second preimage resistant: Given one message it should be hard to find another message with the same hash value. 3. Collision Resistant: It should be hard to find 2 messages with the same hash value. **Lemma 14.2** Assuming a function $H$ is preimage resistant for every element of the range of $H$ is a weaker assumption than assuming it is either collision resistant or second preimage resistant. **Lemma 14.3** Assuming a function is second preimage resistant is a weaker assumption than assuming it is collision resistant. **Padding** Now given an input message $m$ which is lbits in lingth we will want to make it a message of $b$ bits in length. This is done by padding, or extending, the message of adding extra bits onto the message until it is of the required length. ### Merkle-Damgard Construction - In this section we describes the basic Merkle-Damgard construction of a hash function taking inputs of ARBITRARY LENGTH from a hash function which takes inputs of a fisted length. - The construction is based on a family of compression functions $f$, for example $f:\{0,1\}^{\ell + n}\rightarrow \{0,1\}^{n}$ - The construction also makes use of an internal state variable $s_i$ which is updated by the application of the compression **Algorithm 14.1** - Pad the input messagem using a padding scheme, so that the output is a multiple of $l$ bits in length. - Divide the input m into r blocks of length $l$ bits $(m_1 \dots m_t)$ **Algorithm** ``` s₀ <- iV for i = 1 to t do sᵢ = f(sᵢ₋₁, mᵢ) return sₜ ``` The problem with the MD[f,s] construction is that collision resistance depends on the padding scheme being used. ```