FFT Theory PDF

The Fast Fourier transform (FFT) The FFT is a computationally efficient algorithm for computing the DFT. Note that the output of the FFT is exactly equal to the output of the DFT. The difference is that the FFT gives the same result with much less multiplications. In fact for large sequences (say N > 1000 which is not unusual in practice) the FFT is about 100 times more efficient than the DFT. For this reason it is always used in practice. Recall the DFT (see DFT & IDFT in the EE301 notes): 2𝜋𝜋 2𝜋𝜋 −𝑗𝑗( )𝑘𝑘𝑘𝑘 𝑆𝑆(𝑘𝑘) = ∑𝑁𝑁−1 𝑛𝑛=0 𝑠𝑠(𝑛𝑛)𝑒𝑒 𝑁𝑁 = ∑𝑁𝑁−1 𝑘𝑘𝑘𝑘 𝑛𝑛=0 𝑠𝑠(𝑛𝑛)𝑊𝑊𝑁𝑁 where 𝑊𝑊𝑁𝑁 =𝑒𝑒 −𝑗𝑗( 𝑁𝑁 ) To calculate one term of the output sequence S(k) requires N multiplications (each of these multiplications actually involves two real multiplications as it is one real number multiplied by one complex number). Therefore to calculate all N terms of S(k) requires N2 multiplications. If we can somehow split an N-point DFT into two (N/2)-point DFT’s then the total number of multiplications required would be (N/2)2 + (N/2)2 = N2/2 which represents a potential halving in computatonal complexity. The FFT works on this principle. In fact, it further splits each of the (N/2)-point DFTs into two (N/4)-point DFTs and so on until its left with N/2 2-point DFTs. We will now show how to split (or decompose or decimate) the N-point DFT into two (N/2)- point DFT’s and still generate the same result. This is often called one level of decomposition. Starting with an N-point DFT of an input sample sequence [s(n)]: N −1 S ( k ) = ∑ s( n )W Nkn n =0 Decompose the sequence [s(n)] into odd- and even-INDEXED terms i.e. [s(n)] = s(0), s(1), s(2), s(3), …, s(N-1) so we’re decomposing it into the two sub-sequences: [s1(n)] = s(2n) = s(0), s(2), …, s(N-2) and [s2(n)] = s(2n+1) = s(1), s(3), …, s(N-1) then S(k) can be rewritten as: N −1 N −1 2 2 =n 0= 2 kn N S(= k) n 0 ∑ s( 2n )W + ∑ s( 2n + 1 )WNk( 2 n+1 ) N −1 N −1 2 2 = 2 kn N n 0= k N n 0 = ∑ s( 2n )W +W ∑ s( 2n + 1 )W 2 kn N 1 But W N2 kn = W( knN / 2 ) (verify this!) And writing the summations as (N/2)-point DFT’s we get: 𝑁𝑁 𝑁𝑁 −1 −1 2 2 𝑘𝑘𝑘𝑘 𝑘𝑘𝑘𝑘 𝑆𝑆(𝑘𝑘) = 𝑠𝑠1 (𝑛𝑛)𝑊𝑊(𝑁𝑁/2) + 𝑊𝑊𝑁𝑁𝑘𝑘 𝑠𝑠2 (𝑛𝑛)𝑊𝑊(𝑁𝑁/2) 𝑛𝑛=0 𝑛𝑛=0 = 𝑆𝑆1 (𝑘𝑘) + 𝑊𝑊𝑁𝑁𝑘𝑘 𝑆𝑆2 (𝑘𝑘) (1) Note that this is true for the full range of k i.e. k = [0, 1, 2, … , N-1] 𝑁𝑁 (𝑘𝑘+ ) But 𝑊𝑊𝑁𝑁 2 = −𝑊𝑊𝑁𝑁𝑘𝑘 (verify this!). Note – this property of WN is the key to the computational saving. Therefore we use 𝑆𝑆(𝑘𝑘) = 𝑆𝑆1 (𝑘𝑘) + 𝑊𝑊𝑁𝑁𝑘𝑘 𝑆𝑆2 (𝑘𝑘) for k in the range [0, 1, 2, … , (N/2)-1] and because 𝑆𝑆(𝑘𝑘 + (𝑁𝑁/2)) = 𝑆𝑆1 (𝑘𝑘) − 𝑊𝑊𝑁𝑁𝑘𝑘 𝑆𝑆2 (𝑘𝑘) then by simply changing the sign of the W Nk term we automatically get the S(k) terms for k in the range [(N/2), (N/2)+1, (N/2)+2, … , N-1]. So to evaluate the DFT using one level of decomposition as shown above actually requires two (N/2)-point DFTs AND an additional N/2 multiplications for the W Nk terms i.e. a total of 𝑁𝑁 𝑁𝑁 𝑁𝑁 2 2. ( 2 )2 + 2 ≈ for large N. 2 If instead of computing the two (N/2)-point DFTs directly, we further decompose them and continue to decompose all the sub-sequences until we’re left with N/2 2-point DFTs then we can maximize the computational load saving associated with decomposition as shown above. Its also worth noting that for a 2-point DFT, the twiddle factors (complex exponential multiplier terms) become: 𝑊𝑊2𝑘𝑘.𝑛𝑛 = 𝑒𝑒 −𝑗𝑗𝑗𝑗.𝑘𝑘.𝑛𝑛 for 𝑘𝑘 and 𝑛𝑛 = [0,1] ⇒ 𝑊𝑊2𝑘𝑘.𝑛𝑛 = +1 or − 1 This means that only trivial products are required to calculate all of the above 2-point DFTs. This means that the only non-trivial products required in calculating equation (1) above are the products associated with the 𝑊𝑊𝑁𝑁𝑘𝑘 weighting factor which is multiplied by 𝑆𝑆2 (𝑘𝑘). Keep in mind, however, that at each level of decomposition there will be N/2 such products to be computed even if we are further decomposing the sub-sequences 𝑆𝑆1 (𝑘𝑘) and 𝑆𝑆2 (𝑘𝑘) and their sub-sequences all the way down to N/2 2-point DFTs. Also, at these earlier levels of decomposition (which involve greater than 2-point DFTs), the 𝑊𝑊𝑁𝑁𝑘𝑘 weighting factors will be non-trivial complex numbers which means that at each level of decomposition we’re looking at N/2 complex times real number products which is equal to N real number products. Ex 1. Determine an expression for the total number of multiplications required to compute an N-point FFT. Write your answer in terms of N. 2 Ans 1. First work out the number of levels of decomposition needed to get from one N-point DFT to N/2 2-point DFTs. Then multiply this number of levels by the number of products at each level associated with the 𝑊𝑊𝑁𝑁𝑘𝑘 weighting factors. As noted above, this number of products is N/2 complex times real number products at each level. One way to work out the number of levels of decomposition needed to get from one N-point DFT to N/2 2-point DFTs is to recognice that if we increase N by an integer power of 2, this adds an extra level of decomposition e.g. if we start with N = 4, then one level of decomposition will get us to two 2-point DFTs. Similarly, if we start with N = 2 x 4 = 8, then two levels of decomposition will get us to four 2-point DFTs etc. N = 16 ⇒ 3 levels, N = 32 ⇒ 4 levels etc and you can see the trend emerging. So it becomes apparent that the number of 𝑁𝑁 levels of decomposition is equal to 𝑙𝑙𝑙𝑙𝑙𝑙2 2. And as noted above, at each level there are the N/2 products associated with the 𝑊𝑊𝑁𝑁𝑘𝑘 weighting factors. 𝑁𝑁 𝑁𝑁 So our FFT computational load estimate becomes: 2 𝑙𝑙𝑙𝑙𝑙𝑙2 2 𝑙𝑙𝑙𝑙(𝑘𝑘) Recall from log maths that if 𝑥𝑥 = 𝑙𝑙𝑙𝑙𝑙𝑙2 𝑘𝑘 ⇒ 2𝑥𝑥 = 𝑘𝑘 ⇒ 𝑥𝑥. 𝑙𝑙𝑙𝑙(2) = 𝑙𝑙𝑙𝑙(𝑘𝑘) ⇒ 𝑥𝑥 = 𝑙𝑙𝑙𝑙(2) Where ln is the natural log or 𝑙𝑙𝑙𝑙𝑙𝑙𝑒𝑒. So to get a feel for the level of computational load saving by using the FFT to compute the DFT, let’s say for example N = 10,000 samples of some signal. Then we want to compare 𝑁𝑁 2 𝑁𝑁 𝑁𝑁 for the straight DFT with 2 𝑙𝑙𝑙𝑙𝑙𝑙2 2 for the FFT. 𝑁𝑁 𝑁𝑁 𝑙𝑙𝑙𝑙(5000) So for N = 10,000, 𝑁𝑁 2 = 108 and 2 𝑙𝑙𝑙𝑙𝑙𝑙2 2 = 5000. ≈ 61,438 𝑙𝑙𝑙𝑙(2) 61,438 = 0.00061438 = 0.06% 108 Or a computational load saving of 99.94%. Let us verify the above computational load saving associated with one level of decomposition by implememting it in Matlab and estimating the reduction in computational load. Recall the DFT (see DFT & IDFT): DFT: 𝑆𝑆(𝑘𝑘) = ∑𝑁𝑁−1 𝑛𝑛=0 𝑠𝑠(𝑛𝑛)𝑒𝑒 -j(2𝜋𝜋/𝑁𝑁)𝑘𝑘𝑘𝑘 Note that this summation can be computed as a matrix product i.e. 0.0 0.(𝑁𝑁−1) ⎡ 𝑊𝑊𝑁𝑁 𝑊𝑊𝑁𝑁0.1.... 𝑊𝑊𝑁𝑁 ⎤ 𝑠𝑠(0) 1.(𝑁𝑁−1) ⎡ ⎤ ⎢ 𝑊𝑊𝑁𝑁1.0 𝑊𝑊𝑁𝑁1.1.... 𝑊𝑊𝑁𝑁 ⎥ 𝑠𝑠(1) 𝑆𝑆(𝑘𝑘) = ⎢ 𝑊𝑊 2.0 ⎥ ⎢ 𝑠𝑠(2) ⎥⎥ ⎢ ⎢ 𝑁𝑁 :.... : ⎥⎢ : : :. : : ⎥ ⎢ (𝑁𝑁−1).0 (𝑁𝑁−1).1 ⎥ (𝑁𝑁−1).(𝑁𝑁−1) ⎣𝑠𝑠(𝑁𝑁 − 1)⎦ ⎣𝑊𝑊𝑁𝑁 𝑊𝑊𝑁𝑁.... 𝑊𝑊𝑁𝑁 ⎦ Recall the steps involved in coding the DFT as the above matrix product in Matlab are as follows: - Start your function m-file as follows: function op = dft(ip) 3 - To determine the length of the input sampled signal use: N = length(ip); - Generate the N-by-N matrix [ kn ] where [k] and [n] are both equal to {0,1,2,…N-1}.  2π  - Multiply the matrix [ kn ] by − j   and raise e to the power of the entire matrix using.^  N  - Finally multiply the resulting N-by-N matrix by the column vector form of ip. If ip is a row vector then it can be transposed using: ip' Now we saw above that by decomposing the original time-domain sequence into two sub- sequences and computing their DFTs separetly we can approximately halve the required number of multiplications. However, we also saw that the second of these (N/2)-point DFTs has a frequency-dependent weighting factor 𝑾𝑾𝒌𝒌𝑵𝑵 𝑵𝑵 𝑵𝑵 −𝟏𝟏 −𝟏𝟏 𝟐𝟐 𝟐𝟐 𝑺𝑺(𝒌𝒌) = 𝒔𝒔𝟏𝟏 (𝒏𝒏)𝑾𝑾𝒌𝒌𝒌𝒌 𝒌𝒌 𝒌𝒌𝒌𝒌 (𝑵𝑵/𝟐𝟐) + 𝑾𝑾𝑵𝑵 𝒔𝒔𝟐𝟐 (𝒏𝒏)𝑾𝑾(𝑵𝑵/𝟐𝟐) 𝒏𝒏=𝟎𝟎 𝒏𝒏=𝟎𝟎 Nonetheless we can still implement the two (N/2)-point DFT summations in the same way as above i.e. a matrix product i.e. 𝑁𝑁 0. −1 ⎡ 𝑊𝑊𝑁𝑁0.0 𝑊𝑊𝑁𝑁0.1.... 𝑊𝑊𝑁𝑁 2 ⎤ ⎢ 2 2 2 ⎥ 𝑆𝑆(0) ⎡ ⎤ ⎢ 1. −1 ⎥ 𝑁𝑁 𝑠𝑠(0) 𝑆𝑆(1) ⎥ ⎡ 𝑠𝑠(2) ⎤ 1.0 ⎢ ⎥ ⎢ 𝑊𝑊𝑁𝑁 𝑊𝑊𝑁𝑁1.1.... 𝑊𝑊𝑁𝑁 2 [𝑆𝑆𝐿𝐿𝐿𝐿𝐿𝐿𝐿𝐿𝐿𝐿 ] = ⎢ 𝑆𝑆(2) ⎥=⎢ 2 2 2 ⎥ ⎢ 𝑠𝑠(4) ⎥ ⎢ ⋮ ⎥ ⎢ 𝑊𝑊𝑁𝑁2.0 :.... : ⎢ ⎥ ⎥⎢ : ⎥ ⎢ 𝑁𝑁 ⎥ ⎢ 2 ⎥ ⎣𝑠𝑠(𝑁𝑁 − 1)⎦ ⎣𝑆𝑆 2 − 1 ⎦ ⎢ 𝑁𝑁: : 𝑁𝑁 :. 𝑁𝑁 : 𝑁𝑁 ⎥ ⎢𝑊𝑊 2 −1.0 −1.1 𝑊𝑊𝑁𝑁 2.... −1. −1 𝑊𝑊𝑁𝑁 2 2 ⎥ 𝑁𝑁 ⎣ 2 2 2 ⎦ 𝑁𝑁 0. −1 ⎡ 𝑊𝑊𝑁𝑁0.0 𝑊𝑊𝑁𝑁0.1.... 𝑊𝑊𝑁𝑁 2 ⎤ ⎢ 2 2 2 ⎥ 𝑊𝑊𝑁𝑁0 0 0.. 0 𝑠𝑠(1) ⎡ ⎤⎢ 1. −1 ⎥ 𝑁𝑁 ⎢ 0 𝑊𝑊𝑁𝑁1 0.. 0 ⎥ ⎢ 𝑊𝑊𝑁𝑁1.0 𝑊𝑊𝑁𝑁1.1.... 𝑊𝑊𝑁𝑁 2 ⎥ ⎡ 𝑠𝑠(3) ⎤ +⎢ 0 0 𝑊𝑊𝑁𝑁2.. 0 ⎥⎢ 2 2 2 ⎥ ⎢ 𝑠𝑠(5) ⎥ 𝑊𝑊 2.0 :.... : ⎢ ⎥ ⎢ : : :. : ⎥ ⎢ 𝑁𝑁 ⎥⎢ : ⎥ ⎢ −1 ⎥ ⎢ 𝑁𝑁 2 ⎥ ⎣𝑠𝑠(𝑁𝑁 − 1)⎦ ⎣ 0 0 0.. 𝑊𝑊𝑁𝑁 2 ⎦ ⎢ 𝑁𝑁: 𝑁𝑁 : :. 𝑁𝑁 : 𝑁𝑁 ⎥ ⎢𝑊𝑊 2 −1.0 −1.1 𝑊𝑊𝑁𝑁 2.... −1. −1 𝑊𝑊𝑁𝑁 2 2 ⎥ 𝑁𝑁 ⎣ 2 2 2 ⎦ And 4 𝑁𝑁 0. −1 𝑁𝑁 ⎡ 𝑊𝑊𝑁𝑁 0.0 𝑊𝑊𝑁𝑁0.1.... 𝑊𝑊𝑁𝑁 2 ⎤ ⎡𝑆𝑆 + 1 ⎤ ⎢ 2 2 2 ⎥ ⎢ 2 ⎥ ⎢ 1. −1 ⎥ 𝑁𝑁 𝑠𝑠(0) 𝑁𝑁 ⎥ ⎡ 𝑠𝑠(2) ⎤ 1.0 ⎢𝑆𝑆 + 2 ⎥ ⎢ 𝑊𝑊𝑁𝑁 𝑊𝑊𝑁𝑁1.1.... 𝑊𝑊𝑁𝑁 2 [𝑆𝑆𝑈𝑈𝑈𝑈𝑈𝑈𝑈𝑈𝑈𝑈 ] = ⎢ 2 ⎥=⎢ 2 2 2 ⎥ ⎢ 𝑠𝑠(4) ⎥ ⎢ ⎥ ⎢ 𝑁𝑁 2.0 ⎥ ⎢ 𝑊𝑊𝑁𝑁 :.... : ⎥⎢ : ⎥ 𝑆𝑆 + 3 ⎢ 2 ⎥ ⎢ 2 ⎥ ⎣𝑠𝑠(𝑁𝑁 − 1)⎦ : : :. : ⎢ ⋮ ⎥ ⎢ 𝑁𝑁 𝑁𝑁 𝑁𝑁 𝑁𝑁 ⎥ −1.0 −1.1 −1. −1 ⎣ 𝑆𝑆(𝑁𝑁 − 1) ⎦ ⎢𝑊𝑊𝑁𝑁 2 𝑊𝑊𝑁𝑁 2.... 𝑊𝑊𝑁𝑁 2 2 ⎥ ⎣ 2 2 2 ⎦ 𝑁𝑁 0. −1 ⎡ 𝑊𝑊𝑁𝑁0.0 𝑊𝑊𝑁𝑁0.1.... 𝑊𝑊𝑁𝑁 2 ⎤ ⎢ 2 2 2 ⎥ 𝑊𝑊𝑁𝑁0 0 0.. 0 Note ⎡ ⎤⎢ 1. −1 ⎥ 𝑁𝑁 𝑠𝑠(1) change ⎢ 0 𝑊𝑊𝑁𝑁1 0.. 0 ⎥ ⎢ 𝑊𝑊𝑁𝑁1.0 𝑊𝑊𝑁𝑁1.1.... 𝑊𝑊𝑁𝑁 2 ⎥ ⎡ 𝑠𝑠(3) ⎤ −⎢ 0 0 𝑊𝑊𝑁𝑁2.. 0 ⎥⎢ 2 2 2 ⎥ ⎢ 𝑠𝑠(5) ⎥ of sign ⎢ : : :. : ⎥ ⎢ 𝑊𝑊𝑁𝑁 2.0 :.... : ⎥⎢ ⎢ ⎥ : ⎥ ⎢ −1 ⎥ ⎢ 𝑁𝑁 2 ⎥ ⎣𝑠𝑠(𝑁𝑁 − 1)⎦ ⎣ 0 0 0.. 𝑊𝑊𝑁𝑁 2 ⎦ ⎢ 𝑁𝑁: 𝑁𝑁 : :. 𝑁𝑁 : 𝑁𝑁 ⎥ ⎢𝑊𝑊 2 −1.0 −1.1 𝑊𝑊𝑁𝑁 2.... −1. −1 𝑊𝑊𝑁𝑁 2 2 ⎥ 𝑁𝑁 ⎣ 2 2 2 ⎦ DFT Computation with one level of decomposition Ex 2. Write an m-file to investigate the computational load saving associated with decomposition when computing the DFT. Use the tic and toc commands (see help tic in Matlab) in your m-file to first estimate the computational load associated with the straight DFT i.e. with no decomposition. Then use the tic and toc commands again to estimate the computational load associated with computing the DFT using one level of decomposition. Finally, use the tic and toc commands again, this time to estimate the computational load associated with the inbuilt Matlab fft function (see help fft) which repeatedly decomposes the original input sequence for maximum computational load saving. Ans 2. See below demo m-file dftc1.m and test2.m (script m-file for setting up a suitable test signal). Note that the above explanation only showed the computational saving associated with one level of decomposition of the DFT. Further computational load saving must be available if we apply the same decomposition process again to any interim DFTs. This is the basis upon which the FFT can achieve computational saving in excess of 99%. Another popular way to visualise how this computational load saving is achieved is via the so-called FFT butterfly diagram shown below. The large rectangle on the left represents an N-point DFT. If we computed this as a straight matrix product it would involve a computational load of the order of N2 multiplications. However, instead of computing this matrix product, we decompose the input sequence into two (N/2)-point sequences as indicated by the many arrows. So now we have two (N/2)-point DFTs one of which is multiplied by the frequency-dependent weighting factor WNk as discussed above. We have to multiply this weighting factor by one of the (N/2)-point DFTs even if we’re not going to 5 compute that DFT at this level as a matrix product. So the weighting factor requires N/2 complex products as indicated on the diagram. With the WNk taken account of we know that further computational load saving is still available if we decompose further so that’s what we do i.e. we decompose the two (N/2)- point DFTs each into two (N/4)-point DFTs as indicated in the diagram. Now two of these (N/4)-point DFTs will have similar frequency-dependent weighting factors which will need to be multiplied by their respective (N/4)-point DFTs at this second level of decomposition. These weighting factors will require 2 x N/4 = N/2 complex products as indicated on the diagram. So in order to maximize the computational load saving we keep decomposing and weighting in the above manner until we’re left with N/2 2-point DFTs. Note that the W-terms or twiddle factors associated with a 2-point DFT are +1 and -1 so they actually involve trivial products as indicated. The FFT Butterfly Diagram One N-point DFT Two N/2-pt DFTs Four N/4-pt DFTs N/2 2-pt DFTs N/2 complex products N/4 + N/4 N/2 complex complex products products N/2 complex (trivial) products Note – The above process is known as a radix-2 decimation in time (DIT) FFT Note also that the above massive computational load saving associated with the FFT is not limited to spectral analysis. In the next section we’ll see how this computational load saving can also be extended to other widespread applications such as digital filtering (fast convolution – there are several digital filters which run in realtime on your smart phone thanks to the FFT) and pattern matching (fast correlation) which is used in image analysis. 6 test2.m: % script m-file to generate a test signal fs = 10000; % Pick a sample rate N = 16*1024; % Pick a number of samples to display t = (0:N-1)/N; % set up a 1 sec time vector f = 100; % sine wave signal frequency - make sure it < Nyquist i.e. fs/2 s = sin(2*pi*f*t); plot(t/N,s), xlabel('time (sec)'), grid dftc1.m: function X=dftc1(x); % function X=dftc1(x) % demo m-file to investigate the computational load saving associated with the straight % DFT, the DFT with one level of decomosition and the full FFT. % % Note - In order to run this demo function m-file, you need to set up a % sufficiently long input sample sequence x. See for example test2.m hold off N=length(x); % First compute the straight DFT % Generate the W matrix. n=0:(N-1); k = n; w=k'*n;%, pause%; W=exp(w*(-2*pi*j/N)); % Calculate the straight N-point DFT disp('Straight DFT: ') tstart = tic; Xs=W*x'; t_elapsed = toc(tstart) % N-point FFT disp('FFT: ') tstart = tic; Xf=fft(x); t_elapsed = toc(tstart) figure(1) stem(abs(Xs)) xlabel('Frequency by index') ylabel('mag(X)') title('Straight DFT Spectrum') figure(2) stem(abs(Xf)) xlabel('Frequency by index') ylabel('mag(X)') title('FFT Spectrum') % Next compute the DFT with 1 level of decomposition % Split input into two interleaved halves. xd1=x(1:2:(N-1)); 7 xd2=x(2:2:N); No2=N/2; % Generate the first part of decomposed W matrix n=0:(No2-1); w=n'*n; Wa=exp(w*(-2*pi*j/No2)); % Generate the N/2-point weighting matrix Ww=zeros(No2:No2); for k=1:No2 Ww(k,k)=exp(-(k-1)*2*pi*j/N); end Wb = Ww*Wa; disp('DFT with one level of decomposition: ') tstart = tic; a=Wa*xd1'; b=Wb*xd2'; t_elapsed = toc(tstart) % Perform "butterfly". X1=a+b; X2=a-b; % Concatenate X1 and X2 to form complete DFT. X=[X1' X2']; figure(3) stem(abs(X)) xlabel('Frequency by index') ylabel('mag(X)') title('One level of decomposition DFT') 8

Document Details

Tags

Related

Summary

Full Transcript