Multimedia: Fourier Series and Transforms

What is the frequency range of the human auditory system?

20 Hz to 20 kHz

What is the purpose of the filter bank in MPEG-1 Audio Encoding?

To divide the input into multiple sub-bands

What is the target bitrate for Layer 3 of MPEG-1 Audio Encoding?

64 kbps per channel

What is the dynamic range of the human auditory system?

96 dB

What is the primary purpose of psycho-acoustic characteristics in audio compression?

To take advantage of the limitations of human hearing

What is the primary advantage of Joint-stereo coding in MPEG-1 Audio Encoding?

It takes advantage of the correlations between stereo channels

What is the purpose of Huffman coding in MPEG-1 Audio Encoding?

To assign shorter codes to more frequent samples

What is the sampling frequency of Layer 1 of MPEG-1 Audio Encoding?

Any of 32 kHz, 44.1 kHz, or 48 kHz

What is the primary application of Layer 2 of MPEG-1 Audio Encoding?

Digital Audio and Digital Video Broadcasting

When was MPEG-1 Audio standardized?

1992

Study Notes

Fourier Series and Transform

  • Any reasonably well-behaved periodic function can be expressed as a sum of sines and cosines of different frequencies and amplitudes.
  • The Fourier Transform maps a time series (e.g., audio samples) into the series of frequencies (their amplitudes and phases) that compose the time series.
  • The Inverse Fourier Transform maps the series of frequencies (their amplitudes and phases) back into the corresponding time series.
  • The two functions are inverses of each other.

Discrete Fourier Transform (DFT)

  • The DFT takes a discrete signal in the time domain and transforms it into its discrete frequency domain representation.
  • The DFT is extremely important in the area of frequency (spectrum) analysis.
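
The definition above can be sketched directly in code. This is a minimal, unoptimized illustration using only the Python standard library; the function names `dft` and `idft` and the 16-sample cosine test signal are assumptions made for the example, not part of the original notes.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT: X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]

def idft(X):
    """Inverse DFT: same sum with the opposite sign in the exponent, scaled by 1/N."""
    n_samples = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_samples)
            for k in range(n_samples)) / n_samples
        for n in range(n_samples)
    ]

# Hypothetical test signal: exactly 3 cycles of a cosine across 16 samples.
N = 16
signal = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]

spectrum = dft(signal)
magnitudes = [abs(c) for c in spectrum]

# Search only the first half: a real signal has a mirrored spectrum.
peak_bin = max(range(N // 2 + 1), key=lambda k: magnitudes[k])

# Transforming back should recover the time-domain samples.
recovered = [c.real for c in idft(spectrum)]
```

The strongest bin in the first half of the spectrum corresponds to the number of cycles in the analysis window, and the inverse transform recovers the samples up to floating-point rounding.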

Fast Fourier Transform (FFT)

  • The FFT is an efficient algorithm for computing the DFT, not a different transform.
  • It produces the same result as a direct DFT but needs on the order of N log N operations instead of N² for N samples.
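
One common FFT variant is the recursive radix-2 Cooley-Tukey algorithm, sketched below under the assumption that the input length is a power of two. The naive `dft` is included only to check that both give the same result; all names and the sample values are illustrative.

```python
import cmath

def dft(x):
    """Naive O(N^2) reference DFT."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n == 0 or n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])  # DFT of the even-indexed samples
    odd = fft(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

# Hypothetical 8-sample input.
samples = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.25, -0.5]
max_diff = max(abs(a - b) for a, b in zip(fft(samples), dft(samples)))
```

Each level of recursion halves the problem, which is where the N log N operation count comes from.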

Discrete Cosine Transform (DCT)

  • The DCT is closely related to the DFT but uses only real-valued cosine terms.
  • The DCT can often reconstruct a sequence very accurately from only a few DCT coefficients, a useful property for applications requiring data reduction.
  • The inverse DCT reconstructs a sequence from its DCT coefficients.
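
The data-reduction property can be illustrated with a small sketch. It assumes the orthonormal DCT-II (one of several DCT variants) and a hypothetical 8-sample ramp as input; half of the coefficients are discarded before reconstruction.

```python
import math

def dct(x):
    """Orthonormal DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)))
    return out

def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    return [
        sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
            * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
        for i in range(n)
    ]

# Hypothetical smooth input: a ramp from 0 to 7.
ramp = [float(v) for v in range(8)]
coeffs = dct(ramp)

# Keep only the first 4 coefficients and zero the rest.
kept = coeffs[:4] + [0.0] * 4
approx = idct(kept)

max_err_full = max(abs(a - b) for a, b in zip(ramp, idct(coeffs)))
max_err_truncated = max(abs(a - b) for a, b in zip(ramp, approx))
```

For a smooth input like this ramp, most of the energy sits in the first few coefficients, so the truncated reconstruction stays close to the original.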

Audio Coding

  • Pulse Code Modulation (PCM): sends every sample.
  • Differential PCM (DPCM): sends differences between samples.
  • Adaptive Differential PCM (ADPCM): sends differences, but adapts how they are coded.
  • Sub-band ADPCM: uses ADPCM twice, once for lower frequencies, and again at a lower bitrate for upper frequencies.
  • MP3 (MPEG-1 Audio Layer 3): a compressed audio format.

Why Compression is Needed

  • Data rate = sampling rate * quantization bits * channels (+ control information).
  • Compression is necessary to reduce the large amount of data generated by audio samples.
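
As a worked example of the data-rate formula, the sketch below plugs in CD-style parameters (44.1 kHz, 16 bits, 2 channels). These values are an assumption chosen for illustration, and control information is ignored.

```python
def pcm_bit_rate(sampling_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second (no control information)."""
    return sampling_rate_hz * bits_per_sample * channels

# Assumed CD-style parameters: 44.1 kHz, 16 bits, stereo.
cd_rate = pcm_bit_rate(44_100, 16, 2)   # 1_411_200 bit/s

# Storage needed for one minute of audio, in bytes.
one_minute_bytes = cd_rate * 60 // 8    # 10_584_000 bytes
```

At roughly 10 MB per minute, the need for compression is clear.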

Compression Ratio

  • Compression Ratio = (Original Data) / (Compressed Data).
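
A quick numeric illustration of the ratio, assuming an uncompressed stereo stream at 1,411,200 bit/s and a hypothetical compressed stream at 128,000 bit/s (both figures are assumptions for the example):

```python
def compression_ratio(original_size, compressed_size):
    """Ratio of original to compressed size; both in the same unit."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size

# Assumed rates in bits per second.
ratio = compression_ratio(1_411_200, 128_000)  # about 11:1
```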

Lossless and Lossy Compression

  • Lossless compression: the decoded audio is bit-for-bit identical to the original.
  • Lossy compression: the decoded audio differs from the original because some information is discarded, ideally information the listener cannot hear.

Pulse Code Modulation (PCM)

  • Each sample's amplitude is represented by an integer code-word.
  • Quantization error ("noise") occurs due to the limited number of code-words.

Linear PCM

  • Uses evenly spaced quantization levels.
  • Typically uses 16 bits per sample.
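
The effect of the number of code-words on quantization noise can be sketched as follows. The helper names, the full-scale sine test tone, and the signal-to-noise measurement are assumptions made for this illustration.

```python
import math

def quantize_linear(x, bits):
    """Map x in [-1.0, 1.0] to a signed integer code-word on evenly spaced levels."""
    max_code = 2 ** (bits - 1) - 1
    clipped = max(-1.0, min(1.0, x))
    return round(clipped * max_code)

def dequantize_linear(code, bits):
    """Map an integer code-word back to an amplitude in [-1.0, 1.0]."""
    return code / (2 ** (bits - 1) - 1)

def snr_db(original, decoded):
    """Signal-to-quantization-noise ratio in decibels."""
    signal_power = sum(s * s for s in original)
    noise_power = sum((s - d) ** 2 for s, d in zip(original, decoded))
    return 10 * math.log10(signal_power / noise_power)

# Hypothetical test tone: 7 cycles of a full-scale sine over 997 samples.
samples = [math.sin(2 * math.pi * 7 * n / 997) for n in range(997)]

decoded_8 = [dequantize_linear(quantize_linear(s, 8), 8) for s in samples]
decoded_16 = [dequantize_linear(quantize_linear(s, 16), 16) for s in samples]

snr_8 = snr_db(samples, decoded_8)
snr_16 = snr_db(samples, decoded_16)
max_err_8 = max(abs(s - d) for s, d in zip(samples, decoded_8))
```

Each extra bit halves the step size, which improves the signal-to-noise ratio by roughly 6 dB per bit.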

Telephony

  • 8-bit linear encoding gives poor quality.
  • Solution: use 8 bits with a "logarithmic" encoding (non-linear quantization).

Non-linear Sampling

  • With only 8 linear bits per sample, the dynamic range is reduced significantly and quantization noise becomes audible.
  • Solution: place the quantization levels more densely at low amplitudes and less densely at high amplitudes.

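μ-law and A-law

  • This non-uniform quantization is called "companding" (compressing, then expanding).
  • 8-bit companded coding provides a dynamic range roughly equivalent to 12-bit linear coding.
  • μ-law (North America, Japan) and A-law (Europe and elsewhere) are the two companding standards, both defined in ITU-T G.711.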
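
A minimal sketch of companding, assuming the continuous μ-law curve with μ = 255. Real G.711 codecs use a segmented piecewise-linear approximation of this curve, so this illustrates the idea rather than the standard's bit-exact encoding. The helper names and the set of quiet test amplitudes are assumptions.

```python
import math

MU = 255  # value used by 8-bit mu-law

def mu_compress(x):
    """Continuous mu-law curve: maps [-1, 1] to [-1, 1], stretching small amplitudes."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize(v, bits=8):
    """Uniform quantization of a value in [-1, 1]."""
    levels = 2 ** (bits - 1) - 1
    return round(v * levels) / levels

# Hypothetical quiet samples: amplitudes from 0.001 to 0.050 of full scale.
quiet = [i / 1000 for i in range(1, 51)]

# Worst-case error with plain 8-bit linear quantization.
max_linear_err = max(abs(x - quantize(x)) for x in quiet)

# Worst-case error when quantizing in the companded domain instead.
max_companded_err = max(
    abs(x - mu_expand(quantize(mu_compress(x)))) for x in quiet)
```

For quiet samples the companded path gives a noticeably smaller worst-case error with the same 8 bits, at the cost of coarser steps near full scale.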

Differential PCM

  • Based on the fact that neighboring samples in a discrete audio sequence change slowly in many cases.
  • Normally, the difference between neighboring samples is relatively small and can be coded with fewer than 8 bits.
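
The idea can be sketched with a lossless difference coder. The sample values are hypothetical, and the first sample is coded as a difference from an assumed initial value of 0.

```python
def dpcm_encode(samples):
    """Replace each sample with its difference from the previous one."""
    previous = 0  # assumed initial predictor value
    diffs = []
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    out = []
    for d in diffs:
        previous += d
        out.append(previous)
    return out

# Hypothetical slowly varying samples.
samples = [100, 102, 105, 107, 106, 103, 99, 96]
diffs = dpcm_encode(samples)   # [100, 2, 3, 2, -1, -3, -4, -3]
decoded = dpcm_decode(diffs)
```

After the first value, every difference fits in a few bits, while the raw samples need the full word length.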

ADPCM (Adaptive Differential PCM)

  • Makes a simple prediction of the next sample from a weighted sum of the previous n samples.
  • Codes the difference between the actual sample and the prediction with a lossy quantizer whose step size adapts to the signal.
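
The following toy coder illustrates the adaptive idea. It is not IMA ADPCM or G.726: it assumes the simplest possible predictor (the previous reconstructed sample, i.e. n = 1 with weight 1), a 4-bit code, and a hypothetical step-size rule that doubles the step after large codes and halves it after small ones.

```python
def _update(predicted, step, code, max_code):
    """State update shared by encoder and decoder so both stay in sync."""
    predicted += code * step
    if abs(code) >= max_code - 1:
        step *= 2                   # signal moving fast: coarser steps
    elif abs(code) <= 1:
        step = max(1, step // 2)    # signal moving slowly: finer steps
    return predicted, step

def adpcm_encode(samples, bits=4, initial_step=4):
    max_code = 2 ** (bits - 1) - 1
    predicted, step = 0, initial_step
    codes = []
    for s in samples:
        # Quantize the prediction error with the current step size.
        code = round((s - predicted) / step)
        code = max(-max_code, min(max_code, code))
        codes.append(code)
        predicted, step = _update(predicted, step, code, max_code)
    return codes

def adpcm_decode(codes, bits=4, initial_step=4):
    max_code = 2 ** (bits - 1) - 1
    predicted, step = 0, initial_step
    out = []
    for code in codes:
        predicted, step = _update(predicted, step, code, max_code)
        out.append(predicted)
    return out

# Hypothetical input samples.
samples = [0, 10, 25, 40, 50, 52, 50, 45, 44, 44]
codes = adpcm_encode(samples)
decoded = adpcm_decode(codes)
max_error = max(abs(a - b) for a, b in zip(samples, decoded))
```

The encoder tracks the decoder's reconstruction rather than the true previous sample, so quantization errors do not accumulate over time.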

Sub-band ADPCM

  • Splits the signal into two frequency bands and codes each one separately with ADPCM.
  • The bands are 50 Hz to 4 kHz (encoded at 48 kbit/s) and 4 kHz to 7 kHz (encoded at 16 kbit/s), for 64 kbit/s in total.

Human Auditory System

  • Human auditory system has limitations.
  • Frequency range: 20 Hz to 20 kHz, most sensitive at 2 to 4 kHz.
  • Dynamic range (quietest to loudest) is about 96 dB.

MPEG-1 Audio

  • Lossy compression of audio.
  • In the late 1980s, ISO's MPEG group started to standardize audio compression for TV broadcasting and CD-ROM (later DVD).

MPEG-1 Audio Encoding

  • Characteristics: 16-bit precision; sampling frequencies of 32 kHz, 44.1 kHz, and 48 kHz.
  • 3 compression layers: Layer 1, Layer 2, Layer 3.
  • Supports one or two audio channels in one of four modes: monophonic, dual-monophonic, stereo, and joint-stereo.

Huffman Coding

  • Variable-length coding: the most frequent symbols get the shortest codes, and less frequent symbols get longer codes.
  • Encoding done by building an encoding tree.
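
The tree-building step can be sketched with a priority queue. This is a generic Huffman construction over characters, not the fixed code tables an MPEG-1 encoder actually uses; the function name and the input string are assumptions for the example.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a prefix code: frequent symbols get short codes."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees into one.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical input: 'a' x8, 'b' x4, 'c' x2, 'd' x1.
text = "aaaaaaaabbbbccd"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)

# A fixed-length code for 4 symbols would need 2 bits each.
fixed_length_bits = 2 * len(text)
```

With these frequencies the variable-length encoding needs 25 bits, compared with 30 bits for a fixed 2-bit code.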

Explore the principles of Fourier Series and Transforms in the context of multimedia, including the representation of periodic functions and time series analysis.
