Questions and Answers
What is the frequency range of the human auditory system?
What is the purpose of the filter bank in MPEG-1 Audio Encoding?
What is the target bitrate for Layer 3 of MPEG-1 Audio Encoding?
What is the dynamic range of the human auditory system?
What is the primary purpose of psycho-acoustic characteristics in audio compression?
What is the primary advantage of Joint-stereo coding in MPEG-1 Audio Encoding?
What is the purpose of Huffman coding in MPEG-1 Audio Encoding?
What is the sampling frequency of Layer 1 of MPEG-1 Audio Encoding?
What is the primary application of Layer 2 of MPEG-1 Audio Encoding?
When was MPEG-1 Audio standardized?
Study Notes
Fourier Series and Transform
- Any periodic function can be expressed as the sum of a series of sines and cosines of varying amplitudes.
- The Fourier Transform maps a time series (e.g., audio samples) into the series of frequencies (their amplitudes and phases) that compose the time series.
- The Inverse Fourier Transform maps the series of frequencies (their amplitudes and phases) back into the corresponding time series.
- The two functions are inverses of each other.
Discrete Fourier Transform (DFT)
- The DFT takes a discrete signal in the time domain and transforms it into its discrete frequency domain representation.
- The DFT is extremely important in the area of frequency (spectrum) analysis.
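As a minimal sketch of the transform pair described above, the following computes a DFT and its inverse directly from the definition. The function names `dft` and `idft` and the 8-sample test signal are illustrative assumptions, not part of the original notes.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT: X[k] = sum over n of x[n] * exp(-2j*pi*k*n/N)."""
    size = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]

def idft(spectrum):
    """Inverse DFT: same sum with the opposite sign in the exponent, divided by N."""
    size = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / size) for k in range(size)) / size
        for n in range(size)
    ]

# Hypothetical test signal: 8 samples of a cosine completing exactly one cycle.
signal = [math.cos(2 * math.pi * n / 8) for n in range(8)]
spectrum = dft(signal)
magnitudes = [abs(c) for c in spectrum]      # energy concentrates in bins 1 and 7
recovered = [c.real for c in idft(spectrum)]  # matches `signal` up to rounding error
```

For a real-valued input the spectrum is symmetric, which is why the single cosine shows up in both bin 1 and bin 7.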
Fast Fourier Transform (FFT)
- The FFT is an efficient algorithm for computing the DFT.
- It produces the same result as the direct DFT in far less time: on the order of N log N operations instead of N^2 for N samples.
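One common way to get this speed-up is the recursive radix-2 Cooley-Tukey scheme, sketched below. This is only one of several FFT algorithms, and the names `fft` and `naive_dft` are assumptions made for illustration; the sketch requires the input length to be a power of two.

```python
import cmath

def naive_dft(x):
    """Direct O(N^2) DFT, included only as a reference for comparison."""
    size = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    size = len(x)
    if size == 0 or size & (size - 1):
        raise ValueError("length must be a non-zero power of two")
    if size == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # DFT of the even-indexed samples
    odd = fft(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * size
    for k in range(size // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / size) * odd[k]
        out[k] = even[k] + twiddle
        out[k + size // 2] = even[k] - twiddle
    return out

samples = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
fast = fft(samples)
slow = naive_dft(samples)  # same values as `fast`, up to rounding error
```

Each level of recursion halves the problem, which is where the N log N cost comes from.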
Discrete Cosine Transform (DCT)
- The DCT is closely related to the DFT.
- The DCT can often reconstruct a sequence very accurately from only a few DCT coefficients, a useful property for applications requiring data reduction.
- The inverse DCT reconstructs a sequence from its DCT coefficients.
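The data-reduction property can be illustrated with a small sketch: transform a smooth sequence, keep only the first few coefficients, and invert. The orthonormal DCT-II variant, the function names, and the 8-point ramp are assumptions chosen for illustration.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a real sequence."""
    size = len(x)
    out = []
    for k in range(size):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / size) for i in range(size))
        scale = math.sqrt(1 / size) if k == 0 else math.sqrt(2 / size)
        out.append(scale * total)
    return out

def idct(coeffs):
    """Inverse of `dct` (an orthonormal DCT-III)."""
    size = len(coeffs)
    out = []
    for i in range(size):
        total = coeffs[0] * math.sqrt(1 / size)
        total += sum(
            coeffs[k] * math.sqrt(2 / size) * math.cos(math.pi * (i + 0.5) * k / size)
            for k in range(1, size)
        )
        out.append(total)
    return out

ramp = [float(i) for i in range(8)]     # a smooth, slowly varying sequence
coeffs = dct(ramp)
truncated = coeffs[:3] + [0.0] * 5      # keep only the first 3 of 8 coefficients
approx = idct(truncated)
max_error = max(abs(a - b) for a, b in zip(ramp, approx))  # small relative to the 0..7 range
```

For smooth inputs like this ramp, most of the energy lands in the low-index coefficients, so discarding the rest changes the reconstruction only slightly.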
Audio Coding
- Pulse Code Modulation (PCM): sends every sample.
- Differential PCM (DPCM): sends differences between samples.
- Adaptive Differential PCM (ADPCM): sends differences, but adapts how they are coded.
- Sub-band ADPCM: uses ADPCM twice, once for lower frequencies, and again at a lower bitrate for upper frequencies.
- MP3 (MPEG-1 Audio Layer 3): a compressed audio format.
Why Compression is Needed
- Data rate = sampling rate * quantization bits * channels (+ control information).
- Compression is necessary to reduce the large amount of data generated by audio samples.
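As a worked example of the data-rate formula, the sketch below evaluates it for CD-style audio (44.1 kHz, 16 bits, 2 channels), ignoring control information. The function name `pcm_data_rate` is an assumption for illustration.

```python
def pcm_data_rate(sampling_rate_hz, bits_per_sample, channels):
    """Raw PCM data rate in bits per second, ignoring control information."""
    return sampling_rate_hz * bits_per_sample * channels

cd_rate = pcm_data_rate(44_100, 16, 2)   # 1_411_200 bit/s
bytes_per_minute = cd_rate * 60 // 8     # 10_584_000 bytes, roughly 10 MB per minute
```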
Compression Ratio
- Compression Ratio = (Original Data) / (Compressed Data).
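A quick numerical illustration of the ratio, assuming uncompressed CD-style audio at 1,411,200 bit/s and a compressed stream at 128 kbit/s. The 128 kbit/s figure is a hypothetical value chosen for the example, not one stated in these notes.

```python
def compression_ratio(original_size, compressed_size):
    """Compression ratio = original data / compressed data (same units for both)."""
    return original_size / compressed_size

# Sizes per second of audio, in bits; the compressed rate is an assumed example value.
ratio = compression_ratio(1_411_200, 128_000)  # about 11, i.e. roughly 11:1
```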
Lossless and Lossy Compression
- Lossless compression: the decoded audio is bit-for-bit identical to the original.
- Lossy compression: the decoded audio differs from the original because some information is discarded, ideally in ways listeners cannot hear.
Pulse Code Modulation (PCM)
- Each sample's amplitude is represented by an integer code-word.
- Quantization error ("noise") occurs due to the limited number of code-words.
Linear PCM
- Uses evenly spaced quantization levels.
- Typically uses 16 bits per sample.
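The effect of evenly spaced levels can be sketched as follows, assuming samples normalized to the range [-1, 1). The function names and the normalization are illustrative assumptions.

```python
def quantize(sample, bits):
    """Map a sample in [-1, 1) to a signed integer code-word with evenly spaced levels."""
    levels = 2 ** (bits - 1)
    code = round(sample * levels)
    return max(-levels, min(levels - 1, code))  # clip to the representable range

def dequantize(code, bits):
    """Map a code-word back to an amplitude in [-1, 1)."""
    return code / 2 ** (bits - 1)

sample = 0.3
error_8 = abs(dequantize(quantize(sample, 8), 8) - sample)     # coarse: 256 levels
error_16 = abs(dequantize(quantize(sample, 16), 16) - sample)  # fine: 65,536 levels
```

The difference between the original and the dequantized value is the quantization error; it shrinks as the number of bits grows.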
Telephony
- 8-bit linear encoding gives poor quality.
- Solution: use 8 bits with a "logarithmic" encoding (non-linear sampling, i.e. unevenly spaced quantization levels).
Non-linear Sampling
- With only 8 linearly encoded bits per sample, dynamic range is reduced significantly and quantization noise becomes audible.
- Solution: space the quantization levels more densely at low amplitudes and less densely at high amplitudes.
μ-law and A-law
- This non-linear encoding is called "companding" (compressing, then expanding).
- 8-bit companded encoding provides a dynamic range roughly equivalent to 12-bit linear encoding.
- μ-law and A-law are the companding standards used in telephony.
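A minimal sketch of companding, assuming the continuous μ-law curve with μ = 255 and samples normalized to [-1, 1]. This is the textbook formula followed by uniform 8-bit quantization, not the bit-exact segmented encoding used in telephony standards, and all function names are illustrative.

```python
import math

MU = 255  # assumed value of the μ parameter

def mu_compress(x):
    """Continuous μ-law compression: sgn(x) * ln(1 + μ|x|) / ln(1 + μ)."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_uniform(value, bits=8):
    """Round to evenly spaced levels in [-1, 1) and return the decoded amplitude."""
    levels = 2 ** (bits - 1)
    code = max(-levels, min(levels - 1, round(value * levels)))
    return code / levels

quiet = 0.002  # a low-amplitude sample
linear_error = abs(quantize_uniform(quiet) - quiet)
companded_error = abs(mu_expand(quantize_uniform(mu_compress(quiet))) - quiet)
# companded_error is much smaller than linear_error for quiet samples
```

With 8 linear bits the quiet sample rounds to zero, whereas the companded path keeps it because the compressed curve devotes more levels to small amplitudes.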
Differential PCM
- Based on the fact that neighboring samples in a discrete audio sequence change slowly in many cases.
- Normally, the difference between neighboring samples is relatively small and can be coded with fewer than 8 bits.
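The idea can be sketched with a lossless difference coder that sends the first sample followed by sample-to-sample differences. The function names and the sample values are hypothetical.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous one (first from 0)."""
    previous = 0
    diffs = []
    for sample in samples:
        diffs.append(sample - previous)
        previous = sample
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for diff in diffs:
        previous += diff
        samples.append(previous)
    return samples

samples = [100, 102, 105, 104, 101, 99]   # slowly changing signal
diffs = dpcm_encode(samples)              # [100, 2, 3, -1, -3, -2]
```

After the first value, every difference here fits in a few bits, whereas the raw samples need at least 7.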
ADPCM (Adaptive Differential PCM)
- Makes a simple prediction of the next sample from a weighted sum of the previous n samples.
- Lossy coding of the difference between the actual sample and the prediction.
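The sketch below illustrates the principle only. It assumes the simplest possible predictor (n = 1, the previous reconstructed sample with weight 1), a 4-bit code for the difference, and a made-up step-size adaptation rule. It is not any standardized ADPCM codec, and every name and constant is an assumption.

```python
def _adapt(step, code):
    """Illustrative rule: widen the step after large codes, narrow it after small ones."""
    if abs(code) >= 6:
        step *= 2
    elif abs(code) <= 1:
        step //= 2
    return min(max(step, 1), 2048)

def adpcm_encode(samples, step=4):
    prediction = 0
    codes = []
    for sample in samples:
        # Quantize the prediction error to a 4-bit signed code (lossy step).
        code = max(-8, min(7, round((sample - prediction) / step)))
        codes.append(code)
        prediction += code * step   # track what the decoder will reconstruct
        step = _adapt(step, code)
    return codes

def adpcm_decode(codes, step=4):
    prediction = 0
    samples = []
    for code in codes:
        prediction += code * step
        samples.append(prediction)
        step = _adapt(step, code)
    return samples

original = [0, 3, 8, 15, 20, 22, 21, 18]
codes = adpcm_encode(original)
decoded = adpcm_decode(codes)   # close to `original`, but not necessarily identical
```

The encoder predicts from its own reconstructed output rather than from the true samples, so encoder and decoder stay in step even though the coding is lossy.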
Sub-band ADPCM
- Splits the signal into two frequency bands and codes each one separately.
- The two bands are 50 Hz to 4 kHz (encoded at 48 kbit/s) and 4 kHz to 7 kHz (encoded at 16 kbit/s).
Human Auditory System
- The human auditory system has limitations.
- Frequency range: 20 Hz to 20 kHz, most sensitive at 2 to 4 kHz.
- Dynamic range (quietest to loudest) is about 96 dB.
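For comparison, the theoretical dynamic range of linear PCM can be computed from the number of bits, at roughly 6 dB per bit. The sketch below shows that 16 bits gives about 96 dB; reading the figure above as matched to 16-bit audio is an interpretation, not something the notes state, and the function name is an assumption.

```python
import math

def pcm_dynamic_range_db(bits):
    """Ratio of largest to smallest representable amplitude step, in decibels."""
    return 20 * math.log10(2 ** bits)

range_16 = pcm_dynamic_range_db(16)  # about 96.3 dB
range_8 = pcm_dynamic_range_db(8)    # about 48.2 dB
```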
MPEG-1 Audio
- Lossy compression of audio.
- In the late 1980s, ISO's MPEG group started to standardize audio compression for TV broadcasting and CD-ROM (later DVD).
MPEG-1 Audio Encoding
- Characteristics: 16-bit precision; sampling frequencies of 32 kHz, 44.1 kHz, and 48 kHz.
- 3 compression layers: Layer 1, Layer 2, Layer 3.
- Supports one or two audio channels in one of four modes: monophonic, dual-monophonic, stereo, and joint-stereo.
Huffman Coding
- Variable-length coding: the most frequent symbols get the shortest codes, and less frequent symbols get longer ones.
- The codes are assigned by building an encoding tree from the symbol frequencies.
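A minimal sketch of the tree construction, using a priority queue to merge the two least frequent subtrees repeatedly. It works on characters of a string for illustration; the actual Huffman tables used in MPEG-1 audio are predefined and are not reproduced here. The function name `huffman_codes` is an assumption.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol in `text` to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}  # a lone symbol still needs one bit
    # Heap entries: (frequency, tie-breaker, {symbol: code so far}).
    heap = [(count, index, {symbol: ""}) for index, (symbol, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count_a, _, codes_a = heapq.heappop(heap)  # least frequent subtree
        count_b, _, codes_b = heapq.heappop(heap)  # next least frequent subtree
        merged = {symbol: "0" + code for symbol, code in codes_a.items()}
        merged.update({symbol: "1" + code for symbol, code in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]

message = "aaaabbc"
codes = huffman_codes(message)
encoded_bits = sum(len(codes[symbol]) for symbol in message)  # 10 bits, versus 14 at a fixed 2 bits per symbol
```

The most frequent symbol, `a`, ends up with a 1-bit code while `b` and `c` get 2 bits each, and no code is a prefix of another.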
Description
Explore the principles of Fourier Series and Transforms in the context of multimedia, including the representation of periodic functions and time series analysis.