Digital Audio - 2 - Miguel Negrão PDF
Miguel Negrão
Summary
This document is a presentation on digital audio, covering topics like analog vs. digital signals, sampling, frequencies, quantization, and the Nyquist-Shannon sampling theorem. The presentation also explores digital audio formats (WAV, MP3, etc.) and the reasons for choosing specific sampling rates. Different aspects such as noise, dynamic range and bitrate are also analyzed.
Full Transcript
2 - Digital Audio
Sound Design - Games and Multimedia
Miguel Negrão ©2024 CC BY-NC-ND 4.0

???????
16 bit vs 24 bit vs 32 bit sample resolution
44.1 kHz vs 48 kHz vs 192 kHz sample frequency
Float vs Int sample format
???????

Signals
Signal: any time-varying quantity.
Analog signal: a continuous signal which represents a physical quantity, such as pressure.

Electric signal
An electric current is a flow of electric charge, usually electrons.
Voltage is the difference in electric potential between two points.
An electric signal is a variation of voltage.

Electric signal - water analogy
Charge is water.
Voltage is the pressure difference in the water.
Current is the flow of water (liters per second; either more water, or the same amount of water moving faster, produces more flow).
A battery is like a water pump.
A resistor is like a very narrow pipe.
image source: http://hyperphysics.phy-astr.gsu.edu/hbase/electric/watcir2.html

Audio signal: a signal with frequencies between 20 Hz and 20,000 Hz (the range humans can hear).

Analog Audio Signals
The signal at the output of a microphone.
The signal at the output of a cable connected to an electric guitar.
The signal sent to a loudspeaker.
Analog signals: more about this when we talk about electroacoustics (microphones, loudspeakers).

Digital Audio
Why use digital audio?
In order to store and manipulate an audio signal in a computer, it must be converted into a series of numbers.
Sampling = converting a continuous signal to a series of numbers.

Sampling
[Figure: a signal x[t] plotted against time t, with the sampled points marked.]
Sampling interval or sampling period: the amount of time between each sample (unit: s).
Sampling frequency: the number of times the signal is sampled each second (unit: Hz).
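As a minimal sketch of the two definitions above, the Python snippet below samples a sine wave at regular intervals. The function name `sample_sine` and the symbols T (sampling period) and fs (sampling frequency) are assumptions made for this illustration; they are not taken from the slides. The only relationship used is that the sampling period is the inverse of the sampling frequency.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Return samples of a sine wave taken every 1/sample_rate_hz seconds."""
    period_s = 1.0 / sample_rate_hz                 # sampling period T = 1 / fs
    n_samples = round(duration_s * sample_rate_hz)  # how many samples fit in the duration
    return [math.sin(2 * math.pi * freq_hz * n * period_s) for n in range(n_samples)]

# One millisecond of a 440 Hz tone sampled at 48 kHz.
samples = sample_sine(freq_hz=440.0, sample_rate_hz=48_000, duration_s=0.001)
print(len(samples))   # number of samples in that millisecond
print(1.0 / 48_000)   # sampling period in seconds
```

The list `samples` is the "series of numbers" that a computer stores in place of the continuous signal.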
To convert an analog signal to digital we use an analog-to-digital converter (A/D converter, or ADC).
To convert a digital signal to analog we use a digital-to-analog converter (D/A converter, or DAC).

[Figure: digital-to-analog conversion, top to bottom: digital signal, staircase signal, output of sound card (after the low-pass filter); each panel plots f(t) against t.]

How many times per second do we need to sample a signal?
If we sample more times per second it must be better, right?
The science says NO!

Nyquist-Shannon sampling theorem
If all frequencies of the signal are below a given maximum frequency, the sampling frequency only needs to be twice that maximum.

bandwidth     sampling freq. needed
500 Hz        1,000 Hz
10,000 Hz     20,000 Hz
20,000 Hz     40,000 Hz

bandwidth: the difference between the highest and lowest frequency. In this case the lowest is zero.

Nyquist-Shannon sampling theorem
When the condition is respected, (in theory) an analog signal converted to digital and then back to analog will be exactly the same signal, without any loss.
In reality it is not entirely so... (there are no band-limited signals in the real world; quantization, the anti-aliasing filter, the limits of electronics, etc.)

If the signal contains frequencies equal to or higher than half the sampling frequency → aliasing!
[Figure: plot illustrating aliasing.]
Usually a signal has many frequencies. Those that are above half the sampling frequency will fold back (aliasing). When that happens, it sounds bad! Let's listen!

Is analog better than digital?
Some people think that analog is better than digital, because analog has smooth curves and digital has staircase signals.
[Figure: staircase signal, f(t) against t.]
Wrong! A digital signal is not defined between samples!
[Figure: sample points with a grey curve passing through them, f(t) against t.]
The grey curve is the only curve that fits those points and has all its frequencies below half the sampling frequency.

For humans, how high should the sampling frequency be? 20 kHz → 40 kHz.

Sampling frequencies for audio:
22,050 Hz
44,100 Hz (CD)
48,000 Hz (video)
96,000 Hz
192,000 Hz

Which sampling frequency to use?
There is no evidence that most humans can hear frequencies above 20,000 Hz.
For recording or consumer formats (e.g. downloads) there is no reason to use anything higher than 44.1 kHz or 48 kHz. More detail:

Higher sampling rates
Higher sampling rates give more headroom for the DAC analog filter... but these days most converters use oversampling reconstruction, which makes that irrelevant.

Higher sampling rates
Can be useful for certain types of audio processing:
Pitch-shifting
Any process which introduces new frequencies: digital distortion, ring modulation, audio synthesis, etc.
Also, if effects plug-ins don't use oversampling internally.
Higher sampling rate → higher CPU load.

My advice:
Record at 44.1 kHz or 48 kHz.
DAW at 44.1 kHz, 48 kHz (or 96 kHz).
Export at the appropriate sampling rate (44.1 kHz, 48 kHz or 22,050 Hz).
If you are doing work for video, record and export at 48 kHz.

Sampling resolution
Sampling resolution: 16-bit integer vs 24-bit integer vs 32-bit float.
Pulse-code modulation (PCM): the sampling method described before, with each sample quantized to a grid.
[Figure: sampled signal x[t] against t, with the samples snapped to a grid of levels.]
PCM is the standard format of digital audio in computers, CDs, digital telephony, etc.
Linear pulse-code modulation (LPCM) is a specific type of PCM where the quantization levels are linearly uniform.
The quantized samples can be stored as integer numbers or floating-point numbers.

Recap on digital representation of numbers: N bits → 2^N values

2 bits     4 values             00, 01, 10, 11
8 bits     256 values           00000000, 00000001, ..., 11111111
16 bits    65,536 values        0000000000000000, ...
24 bits    16,777,216 values    ...

Higher number of bits → higher precision when recording the samples.
The quantization process adds noise. (source)
Higher number of bits → less quantization noise.
If you use more bits, then the difference between the original value and the sampled value is smaller, therefore the noise has less amplitude.
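The link between bit count and quantization noise can be checked numerically. The sketch below is an illustration under stated assumptions, not the method used in the slides: the helper names `quantize` and `snr_db`, the test-tone frequency and the number of samples are all arbitrary choices. It rounds a full-scale sine to a uniform grid and measures how strong the signal is relative to the rounding error, in decibels. Each extra bit should lower the noise by roughly 6 dB.

```python
import math

def quantize(x, bits):
    """Round x (in -1.0..1.0) to the nearest level of a uniform grid with step 2 / 2**bits."""
    step = 2.0 / (2 ** bits)      # spacing between adjacent quantization levels
    return round(x / step) * step

def snr_db(bits, n=10_000):
    """Ratio of signal power to quantization-noise power for a full-scale sine, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = math.sin(2 * math.pi * 0.01234 * i)  # arbitrary test tone (cycles per sample)
        err = quantize(x, bits) - x              # quantization error for this sample
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

# Print the measured ratio for a few bit depths.
for bits in (8, 12, 16):
    print(bits, "bits:", round(snr_db(bits), 1), "dB")
```

The exact figures depend on the test signal, so the printed values are best read as approximate.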
Dynamic range: the difference between the smallest and largest value.
The dynamic range of human hearing is between 120 dB and 140 dB.
Dynamic range in audio is the loudest sound relative to the noise floor. This is also called the signal-to-noise ratio (SNR).
Quantization noise limits the dynamic range of digital audio.
The dynamic range of a digital system is around 6 dB per bit.

Dynamic range
16 bits    96 dB
24 bits    144 dB
Comparison to analog
Home-made cassette: ~6 bits
Vinyl: ~10 bits
Professional studio tape: ~13 bits

There is a trick: dithering! Dithering (with noise shaping) pushes the noise into frequency regions we are less sensitive to.
Perceived dynamic range using dithering: 16 bits → around 120 dB. That is good enough for reproduction (CDs, etc.).
Converting 24 bits to 16 bits → use dithering to have higher dynamic range!

Dynamic range
16 bits                          96 dB
16 bits with dithering           120 dB
24 bits                          144 dB
Comparison to analog
Consumer cassette                60 to 70 dB
Vinyl                            60 to 70 dB
Best professional studio tape    78 dB

When to use 24 bits?
Recording: more headroom.
Applying additional processing (mixing and mastering): each processing step can increase the overall noise level by amplifying the quantization noise.
When recording, use 24 bits.
DAWs internally use 32-bit, 48-bit or 64-bit floating point.
The final audio file can be converted down to 16 bits (using dithering) or 24 bits.

Audio can also be saved using floating-point numbers. The advantage is that clipping is almost impossible, so it can be useful when recording synthesized sound in real time from software.
Recently, sound cards which record at 32-bit float have appeared (Sound Devices MixPre-10 II).
32-bit floating point → 1528 dB of dynamic range (source)

Internally in audio software, samples are usually processed as floating-point numbers in the range -1.0 to 1.0.
When converting to integers, values below -1.0 or above 1.0 → clipping.
When using ratios of amplitudes in digital signals we use decibels relative to full scale, or dBFS.
0 dBFS = 1.0 = highest possible value.
All other values are negative. E.g. a signal that reaches half the maximum value (0.5) is at -6 dBFS.

Other modulation techniques, different from PCM:
Pulse-width modulation (PWM)
Delta-sigma modulation
Super Audio CD - 1 bit, 2.8224 MHz

Non-linear PCM
With linear PCM all steps have the same size. But the quantization error affects the quieter sounds more. A 1 cm error is more problematic when measuring a 10 cm object than when measuring a 10 m object. Using a non-linear scale can improve the situation.

Non-linear PCM
Example: G.711 (audio used on ISDN phone connections)
Sample rate: 8 kHz
Non-uniform (logarithmic) quantization with 8 bits (μ-law or A-law)
Bit rate: 64 kbit/s

Audio files
Audio files are stored on a computer disk. They consist of:
A header containing information about the file
A long list of numbers, the samples for each channel

Audio files - Formats
Uncompressed: WAV, AIFF, CAF
Lossless compression: FLAC
Lossy compression: MP3, AAC, Ogg (Vorbis codec)

Audio files - Formats
Use only uncompressed formats for recording and processing.
In most situations the final format will be: (16-bit or 24-bit integer) and (44.1 kHz or 48 kHz), in an uncompressed format (WAV, AIFF).
The final format can be compressed if that is required (e.g. digital game).

Recap of digital units
A bit can be either 1 or 0
1 byte = 8 bits
1 KiB = 2^10 bytes (kibibyte)
1 MiB = 2^20 bytes (mebibyte)
1 kB = 10^3 bytes (kilobyte)
1 MB = 10^6 bytes (megabyte)
1 GB = 10^9 bytes (gigabyte)
Unfortunately most software still uses MB for MiB, etc... :-(

How large will a sound file be?
10 minutes at 44,100 Hz at 16 bits in stereo:
In bits: 10 × 60 s × 44,100 samples/s × 16 bits × 2 channels = 846,720,000 bits
In mebibytes: 846,720,000 / 8 / 2^20 ≈ 100.9 MiB

Bit rate = number of bits that are sent or processed per unit of time.
Bit rate unit: bits per second (symbol: "bit/s")

unit                   symbol
kilobit per second     kbit/s
megabit per second     Mbit/s
gigabit per second     Gbit/s
terabit per second     Tbit/s

Example: bit rate of CD (44.1 kHz, 16 bit, stereo): 44,100 × 16 × 2 = 1,411,200 bit/s ≈ 1,411 kbit/s
For comparison, MP3: 96 to 320 kbit/s

References:
https://people.xiph.org/~xiphmont/demo/neil-young.html
https://wiki.xiph.org/Videos/Digital_Show_and_Tell
Note: Images without attribution are from Wikipedia and have their own different CC licenses.

Study materials:
Main: Digital Sound and Music - Chapter 5 - Digitization (free, online and PDF in Moodle)
Additional:
Portuguese: "Introdução à Engenharia de Som"; Nuno Fonseca; FCA; 2012 - Chapter 6 - Áudio Digital
English: "Modern Recording Techniques"; David Miles Huber, Robert E. Runstein; Focal Press; 8th edition; 2013 - Chapter 6 - Digital Audio Technology
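Worked example: the file-size and bit-rate questions from the slides can be computed with a few lines of Python. This is a sketch for uncompressed PCM only. The function names `pcm_bit_rate` and `pcm_size_bytes` are hypothetical helpers introduced here, and the small file header is ignored.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bits per second of an uncompressed PCM stream."""
    return sample_rate_hz * bits_per_sample * channels

def pcm_size_bytes(duration_s, sample_rate_hz, bits_per_sample, channels):
    """Size in bytes of the sample data, ignoring the file header."""
    return duration_s * pcm_bit_rate(sample_rate_hz, bits_per_sample, channels) // 8

# CD-quality audio: 44,100 Hz, 16 bits, stereo, 10 minutes long.
rate = pcm_bit_rate(44_100, 16, 2)
size = pcm_size_bytes(10 * 60, 44_100, 16, 2)

print(rate / 1000, "kbit/s")   # bit rate in kilobits per second
print(size / 2 ** 20, "MiB")   # size in mebibytes (2^20 bytes)
print(size / 10 ** 6, "MB")    # size in megabytes (10^6 bytes)
```

Printing the size in both MiB and MB shows why the unit confusion mentioned in the recap matters: the same file reads as about 100.9 MiB but about 105.8 MB.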