Sound and Recording: An Introduction
Summary
This is the fifth edition (2006) of a textbook on the principles of sound and recording by Francis Rumsey and Tim McCormick. It introduces the subject for those new to the field and is focused on an understanding of how sound and audio systems work.
Full Transcript
Sound and Recording: An Introduction

Fifth edition

Francis Rumsey BMus (Tonmeister), PhD
Tim McCormick BA, LTCL

Focal Press is an imprint of Elsevier
Linacre House, Jordan Hill, Oxford OX2 8DP
30 Corporate Drive, Suite 400, Burlington MA 01803

First published 1992. Reprinted 1994. Second edition 1994. Reprinted 1995, 1996. Third edition 1997. Reprinted 1998 (twice), 1999, 2000, 2001. Fourth edition 2002. Reprinted 2003, 2004. Fifth edition 2006.

Copyright © 2006 Francis Rumsey and Tim McCormick. All rights reserved.

The right of Francis Rumsey and Tim McCormick to be identified as the authors of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

No part of this publication may be reproduced in any material form (including photocopying or storing in any medium by electronic means and whether or not transiently or incidentally to some other use of this publication) without the written permission of the copyright holder except in accordance with the provisions of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London, England W1T 4LP. Applications for the copyright holder's written permission to reproduce any part of this publication should be addressed to the publisher.

Permissions may be sought directly from Elsevier's Science and Technology Rights Department in Oxford, UK: phone: (+44) (0) 1865 843830; fax: (+44) (0) 1865 853333; e-mail: [email protected]. You may also complete your request on-line via the Elsevier homepage (http://www.elsevier.com), by selecting 'Customer Support' and then 'Obtaining Permissions'.

British Library Cataloguing in Publication Data: a catalogue record for this book is available from the British Library.
Library of Congress Cataloguing in Publication Data: a catalogue record for this book is available from the Library of Congress.

ISBN-13: 978-0-240-51996-8
ISBN-10: 0-240-51996-5

For information on all Focal Press publications visit our website at: www.focalpress.com

Printed and bound in Great Britain

Contents

Fact File Directory xi
Preface to the Second Edition xv
Preface to the Third Edition xvii
Preface to the Fourth Edition xix
Preface to the Fifth Edition xxi
Chapter 1 What is sound? 1
  A vibrating source 1
  Characteristics of a sound wave 1
  How sound travels in air 3
  Simple and complex sounds 4
  Frequency spectra of repetitive sounds 5
  Frequency spectra of non-repetitive sounds 7
  Phase 8
  Sound in electrical form 11
  Displaying the characteristics of a sound wave 13
  The decibel 14
  Sound power and sound pressure 16
  Free and reverberant fields 18
  Standing waves 21
  Recommended further reading 24

Chapter 2 Auditory perception 25
  The hearing mechanism 25
  Frequency perception 26
  Loudness perception 28
  Practical implications of equal-loudness contours 30
  Spatial perception 32
  Recommended further reading 40

Chapter 3 Microphones 41
  The moving-coil or dynamic microphone 41
  The ribbon microphone 42
  The capacitor or condenser microphone 45
  Directional responses and polar diagrams 46
  Specialised microphone types 54
  Switchable polar patterns 56
  Stereo microphones 57
  Microphone performance 59
  Microphone powering options 62
  Radio microphones 65
  Recommended further reading 73

Chapter 4 Loudspeakers 74
  The moving-coil loudspeaker 74
  Other loudspeaker types 76
  Mounting and loading drive units 79
  Complete loudspeaker systems 82
  Active loudspeakers 85
  Subwoofers 86
  Loudspeaker performance 87
  Setting up loudspeakers 93
  Recommended further reading 95

Chapter 5 Mixers 96
  A simple six-channel mixer 96
  A multitrack mixer 102
  Channel grouping 106
  An overview of typical mixer facilities 107
  Digital mixers 120
  EQ explained 121
  Stereo line input modules 127
  Dedicated monitor mixer 127
  Introduction to mixing approaches 128
  Basic operational techniques 129
  Technical specifications 131
  Metering systems 135
  Automation 140
  Recommended further reading 153

Chapter 6 Analogue recording 154
  A short history of analogue recording 154
  Magnetic tape 156
  The magnetic recording process 159
  The tape recorder 164
  Track formats 169
  Magnetic recording levels 170
  What are test tapes for? 171
  Tape machine alignment 172
  Mechanical transport functions 176
  The Compact Cassette 178
  Recommended further reading 181

Chapter 7 Noise reduction 182
  Why is noise reduction required? 182
  Methods of reducing noise 182
  Line-up of noise reduction systems 188
  Operational considerations 190
  Single-ended noise reduction 190
  Recommended further reading 192

Chapter 8 Digital audio principles 193
  Digital and analogue recording contrasted 193
  Binary for beginners 195
  The digital audio signal chain 199
  Analogue to digital conversion 200
  D/A conversion 221
  Direct Stream Digital (DSD) 222
  Changing the resolution of an audio signal (requantisation) 224
  Introduction to digital signal processing 227
  Audio data reduction 235
  Recommended further reading 242

Chapter 9 Digital recording and editing systems 243
  Digital tape recording 243
  Disk-based systems 254
  Sound file formats 264
  Consumer digital formats 271
  Solid state recording formats 276
  Audio processing for computer workstations 276
  Disk-based editing system principles 278
  Recommended further reading 286

Chapter 10 Digital audio applications 287
  Editing software 287
  Plug-in architectures 290
  Advanced audio processing software and development tools 292
  Mastering and restoration 294
  Preparing for and understanding release media 300
  Interconnecting digital audio devices 306
  Recommended further reading 324
  Websites 324

Chapter 11 Power amplifiers 325
  Domestic power amplifiers 325
  Professional amplifier facilities 327
  Specifications 328
  Coupling 333

Chapter 12 Lines and interconnection 334
  Transformers 334
  Unbalanced lines 337
  Cable effects with unbalanced lines 337
  Balanced lines 341
  Working with balanced lines 342
  Star-quad cable 343
  Electronic balancing 344
  100 volt lines 345
  600 ohms 347
  DI boxes 350
  Splitter boxes 352
  Jackfields (patchbays) 354
  Distribution amplifiers 358

Chapter 13 Outboard equipment 359
  The graphic equaliser 359
  The compressor/limiter 362
  Echo and reverb devices 363
  Multi-effects processors 367
  Frequency shifter 368
  Digital delay 368
  Miscellaneous devices 369
  Connection of outboard devices 370
  Recommended further reading 372

Chapter 14 MIDI and synthetic audio control 373
  Background 373
  What is MIDI? 375
  MIDI and digital audio contrasted 375
  Basic principles 377
  Interfacing a computer to a MIDI system 380
  How MIDI control works 384
  MIDI control of sound generators 397
  General MIDI 406
  Scalable polyphonic MIDI (SPMIDI) 408
  RMID and XMF files 409
  SAOL and SASL in MPEG 4 Structured Audio 409
  MIDI and synchronisation 411
  MIDI over USB 415
  MIDI over IEEE 1394 416
  After MIDI? 417
  Recommended further reading 418
  Websites 418

Chapter 15 Timecode and synchronisation 419
  SMPTE/EBU timecode 419
  Recording timecode 422
  Synchronisers 424
  Recommended further reading 428

Chapter 16 Two-channel stereo 429
  Principles of loudspeaker stereo 430
  Principles of binaural or headphone stereo 438
  Loudspeaker stereo over headphones and vice versa 442
  Two-channel signal formats 445
  Two-channel microphone techniques 447
  Binaural recording and 'dummy head' techniques 465
  Spot microphones and two-channel panning laws 467
  Recommended further reading 468

Chapter 17 Surround sound 469
  Three-channel (3-0) stereo 469
  Four-channel surround (3-1 stereo) 470
  5.1 channel surround (3-2 stereo) 472
  Other multichannel configurations 478
  Surround sound systems 480
  Matrixed surround sound systems 480
  Digital surround sound formats 486
  Ambisonics 492
  Surround sound monitoring 497
  Surround sound recording techniques 501
  Multichannel panning techniques 516
  Recommended further reading 520

Glossary of terms 521

Appendix 1 Understanding basic equipment specifications 529
  Frequency response – technical 529
  Frequency response – practical examples 531
  Harmonic distortion – technical 533
  Harmonic distortion – practical examples 535
  Dynamic range and signal-to-noise ratio 536
  Wow and flutter 537
  Intermodulation (IM) distortion 538
  Crosstalk 539

Appendix 2 Record players 541
  Pickup mechanics 541
  RIAA equalisation 545
  Cartridge types 546
  Connecting leads 547
  Arm considerations 548
  Laser pickups 549
  Recommended further reading 549

General further reading 551
Index 553

Fact File Directory

  1.1 Ohm's law 13
  1.2 The decibel 15
  1.3 The inverse-square law 17
  1.4 Measuring SPLs 19
  1.5 Absorption, reflection and RT 20
  2.1 Critical bandwidth 28
  2.2 Equal-loudness contours 29
  2.3 Masking 31
  2.4 The precedence effect 34
  2.5 Reflections affect spaciousness 37
  3.1 Electromagnetic transducers 42
  3.2 Dynamic microphone – principles 43
  3.3 Ribbon microphone – principles 43
  3.4 Capacitor microphone – principles 44
  3.5 Bass tip-up 45
  3.6 Sum and difference processing 59
  3.7 Microphone sensitivity 60
  3.8 Microphone noise specifications 61
  3.9 Phantom powering 63
  3.10 Frequency modulation 67
  4.1 Electrostatic loudspeaker – principles 76
  4.2 Transmission line system 80
  4.3 Horn loudspeaker – principles 81
  4.4 A basic crossover network 84
  4.5 Loudspeaker sensitivity 89
  5.1 Fader facts 99
  5.2 Pan control 100
  5.3 Pre-fade listen (PFL) 101
  5.4 Audio groups 108
  5.5 Control groups 109
  5.6 Variable Q 114
  5.7 Common mode rejection 132
  5.8 Clipping 135
  5.9 Metering and distortion 137
  6.1 A magnetic recording head 159
  6.2 Replay head effects 162
  6.3 Sync replay 169
  6.4 NAB and DIN formats 170
  6.5 Magnetic reference levels 172
  6.6 Azimuth alignment 174
  6.7 Bias adjustment 175
  7.1 Pre-emphasis 183
  8.1 Analogue and digital information 194
  8.2 Negative numbers 196
  8.3 Logical operations 198
  8.4 Sampling – frequency domain 203
  8.5 Audio sampling frequencies 207
  8.6 Parallel and serial representation 212
  8.7 Dynamic range and perception 214
  8.8 Quantising resolutions 215
  8.9 Types of dither 217
  8.10 Dynamic range enhancement 224
  8.11 Crossfading 229
  9.1 Rotary and stationary heads 244
  9.2 Data recovery 246
  9.3 Error handling 249
  9.4 RAID arrays 257
  9.5 Storage requirements of digital audio 259
  9.6 Peripheral interfaces 262
  9.7 Broadcast WAVE format 269
  9.8 Recordable DVD formats 273
  9.9 Audio processing latency 277
  10.1 Plug-in examples 292
  10.2 DVD discs and players 302
  10.3 Computer networks vs digital audio interfaces 308
  10.4 Carrying data-reduced audio 312
  10.5 Extending a network 318
  11.1 Amplifier classes 326
  11.2 Power bandwidth 330
  11.3 Slew rate 331
  12.1 The transformer 335
  12.2 Earth loops 338
  12.3 XLR-3 connectors 343
  13.1 Compression and limiting 362
  13.2 Simulating reflections 364
  14.1 MIDI hardware interface 378
  14.2 MIDI connectors and cables 379
  14.3 MIDI message format 384
  14.4 Registered and non-registered parameter numbers 405
  14.5 Standard MIDI files (SMF) 407
  14.6 Downloadable sounds and SoundFonts 410
  14.7 Quarter-frame MTC messages 414
  15.1 Drop-frame timecode 420
  15.2 Types of lock 427
  15.3 Synchroniser terminology 428
  16.1 Binaural versus 'stereophonic' localisation 432
  16.2 Stereo vector summation 437
  16.3 The 'Williams curves' 439
  16.4 Transaural stereo 443
  16.5 Stereo misalignment effects 446
  16.6 Stereo width issues 449
  16.7 End-fire and side-fire configurations 457
  17.1 Track allocations in 5.1 475
  17.2 Bass management in 5.1 477
  17.3 What is THX? 484
  17.4 Higher-order Ambisonics 496
  17.5 Loudspeaker mounting 498
  17.6 Surround imaging 503
  A1.1 Frequency response – subjective 532
  A1.2 Harmonic distortion – subjective 535
  A1.3 Noise weighting curves 537
  A2.1 Stylus profile 542
  A2.2 Tracking weight 545

Preface to the Second Edition

One of the greatest dangers in writing a book at an introductory level is to sacrifice technical accuracy for the sake of simplicity. In writing Sound and Recording: An Introduction we have gone to great lengths not to fall into this trap, and have produced a comprehensive introduction to the field of audio, intended principally for the newcomer to the subject, which is both easy to understand and technically precise. We have written the book that we would have valued when we first entered the industry, and as such it represents a readable reference, packed with information. Many books stop after a vague overview, just when the reader wants some clear facts about a subject, or perhaps assume too much knowledge on the reader's behalf. Books by contributed authors often suffer from a lack of consistency in style, coverage and technical level. Furthermore, there is a tendency for books on audio to be either too technical for the beginner or, alternatively, subjectively biased towards specific products or operations. There are also quite a number of American books on sound recording which, although good, tend to ignore European trends and practices. We hope that we have steered a balanced course between these extremes, and have deliberately avoided any attempt to dictate operational practice.

Sound and Recording: An Introduction is definitely biased towards an understanding of 'how it works', as opposed to 'how to work it', although technology is never discussed in an abstract manner but related to operational reality. Although we have included a basic introduction to acoustics and the nature of sound perception, this is not a book on acoustics or musical acoustics (there are plenty of those around). It is concerned with the principles of audio recording and reproduction, and has a distinct bias towards the professional rather than the consumer end of the market.
The coverage of subject matter is broad, including chapters on digital audio, timecode synchronisation and MIDI, amongst other more conventional subjects, and there is comprehensive coverage of commonly misunderstood subjects such as the decibel, balanced lines, reference levels and metering systems.

This second edition of the book has been published only two years after the first, and the subject matter has not changed significantly enough in the interim to warrant major modifications to the existing chapters. The key difference between the second and first editions is the addition of a long chapter on stereo recording and reproduction. This important topic is covered in considerable detail, including historical developments, principles of stereo reproduction, surround sound and stereo microphone techniques. Virtually every recording or broadcast happening today is made in stereo, and although surround sound has had a number of notable 'flops' in the past it is likely to become considerably more important in the next ten years. Stereo and surround sound are used extensively in film, video and television production, and any new audio engineer should be familiar with the principles.

Since this is an introductory book, it will be of greatest value to the student of sound recording or music technology, and to the person starting out on a career in sound engineering or broadcasting. The technical level has deliberately been kept reasonably low for this reason, and those who find this frustrating probably do not need the book! Nonetheless, it is often valuable for the seasoned audio engineer to go back to basics. Further reading suggestions have been made in order that the reader may go on to a more in-depth coverage of the fields introduced here, and some of the references are considerably more technical than this book. Students will find these suggestions valuable when planning a course of study.

Francis Rumsey
Tim McCormick

Preface to the Third Edition

Since the first edition of Sound and Recording some of the topics have advanced quite considerably, particularly the areas dependent on digital and computer technology. Consequently I have rewritten the chapters on digital recording and MIDI (Chapters 10 and 15), and have added a larger section on mixer automation in Chapter 7. Whereas the first edition of the book was quite 'analogue', I think that there is now a more appropriate balance between analogue and digital topics. Although analogue audio is by no means dead (sound will remain analogue for ever!), most technological developments are now digital.

I make no apologies for leaving in the chapter on record players, although some readers have commented that they think it is a waste of space. People still use record players, and there is a vast store of valuable material on LP record. I see no problem with keeping a bit of history in the book – you never know, it might come in useful one day when everyone has forgotten (and some may never have known) what to do with vinyl discs. It might even appease the faction of our industry that continues to insist that vinyl records are the highest fidelity storage medium ever invented.

Francis Rumsey
Guildford

Preface to the Fourth Edition
The fourth edition is published ten years after Sound and Recording was first published, which is hard to believe. The book has been adopted widely by students and tutors on audio courses around the world. In that time audio technology and techniques have changed in some domains but not in others. All the original principles still apply but the emphasis has gradually changed from predominantly analogue to quite strongly digital, although many studios still use analogue mixers and multitrack tape recorders for a range of purposes and we do not feel that the death-knell of analogue recording has yet been sounded. Readers of earlier editions will notice that the chapter on record players has finally been reduced in size and relegated to an appendix. While we continue to believe that information about the LP should remain in the literature as the format lingers on, it is perhaps time to remove it from the main part of the book.

In this edition a new chapter on surround sound has been added, complemented by a reworked chapter preceding it that is now called 'Two-channel stereo'. Surround sound was touched upon in the previous edition but a complete chapter reflects the increased activity in this field with the coming of new multichannel consumer replay formats. The chapter on auditory perception has been reworked to include greater detail on spatial perception and the digital audio chapter has been updated to include DVD-A and SACD, with information about Direct Stream Digital (DSD), the MiniDisc, computer-based editing systems and their operation. Chapter 5 on loudspeakers now includes information about distributed-mode loudspeakers (DML) and a substantial section on directivity and the various techniques used to control it. Finally a glossary of terms has now been provided, with some additional material that supports the main text.

Francis Rumsey
Tim McCormick

Preface to the Fifth Edition

The fifth edition of Sound and Recording includes far greater detail on digital audio than the previous editions, reflecting the growing 'all-digital' trend in audio equipment and techniques. In place of the previous single chapter on the topic there are now three chapters (Chapters 8, 9 and 10) covering principles, recording and editing systems, and applications. This provides a depth of coverage of digital audio in the fifth edition that should enable the reader to get a really detailed understanding of the principles of current audio systems. We believe, however, that the detailed coverage of analogue recording should remain in its current form, at least for this iteration of the book. We have continued the trend, begun in previous new editions, of going into topics in reasonable technical depth but without using unnecessary mathematics. It is intended that this will place Sound and Recording slightly above the introductory level of the many broad-ranging textbooks on recording techniques and audio, so that those who want to understand how it works a bit better will find something to satisfy them here.

The chapter previously called 'A guide to the audio signal chain' has been removed from this new edition, and parts of that material have now found their way into other chapters, where appropriate. For example, the part dealing with the history of analogue recording has been added to the start of Chapter 6.
Next, the material dealing with mixers has been combined into a single chapter (it is hard to remember why we ever divided it into two) and now addresses both analogue and digital systems more equally than before. Some small additions have been made to Chapters 12 and 13, and Chapter 14 has been completely revised and extended, now being entitled 'MIDI and synthetic audio control'.

Francis Rumsey
Tim McCormick

Chapter 1 What is sound?

A vibrating source

Sound is produced when an object (the source) vibrates and causes the air around it to move. Consider the sphere shown in Figure 1.1. It is a pulsating sphere which could be imagined as something like a squash ball, and it is pulsating regularly so that its size oscillates between being slightly larger than normal and then slightly smaller than normal. As it pulsates it will alternately compress and then rarefy the surrounding air, resulting in a series of compressions and rarefactions travelling away from the sphere, rather like a three-dimensional version of the ripples which travel away from a stone dropped into a pond. These are known as longitudinal waves since the air particles move in the same dimension as the direction of wave travel. The alternative to longitudinal wave motion is transverse wave motion (see Figure 1.2), such as is found in vibrating strings, where the motion of the string is at right angles to the direction of apparent wave travel.

Figure 1.1 (a) A simple sound source can be imagined as like a pulsating sphere radiating spherical waves. (b) The longitudinal wave thus created is a succession of compressions and rarefactions of the air

Characteristics of a sound wave

The rate at which the source oscillates is the frequency of the sound wave it produces, and is quoted in hertz (Hz) or cycles per second (cps). 1000 hertz is termed 1 kilohertz (1 kHz). The amount of compression and rarefaction of the air which results from the sphere's motion is the amplitude of the sound wave, and is related to the loudness of the sound when it is finally perceived by the ear (see Chapter 2). The distance between two adjacent peaks of compression or rarefaction as the wave travels through the air is the wavelength of the sound wave, and is often represented by the Greek letter lambda (λ). The wavelength depends on how fast the sound wave travels, since a fast-travelling wave would result in a greater distance between peaks than a slow-travelling wave, given a fixed time between compression peaks (i.e.: a fixed frequency of oscillation of the source).

As shown in Figure 1.3, the sound wave's characteristics can be represented on a graph, with amplitude plotted on the vertical axis and time plotted on the horizontal axis. It will be seen that both positive and negative ranges are shown on the vertical axis: these represent compressions (+) and rarefactions (−) of the air. This graph represents the waveform of the sound. For a moment, a source vibrating in a very simple and regular manner is assumed, in so-called simple harmonic motion, the result of which is a simple sound wave known as a sine wave. The most simple vibrating systems oscillate in this way, such as a mass suspended from a spring, or a swinging pendulum (see also 'Phase' below). It will be seen that the frequency (f) is the inverse of the time between peaks or troughs of the wave (f = 1/t). So the shorter the time between oscillations of the source, the higher the frequency. The human ear is capable of perceiving sounds with frequencies between approximately 20 Hz and 20 kHz (see 'Frequency perception', Chapter 2); this is known as the audio frequency range or audio spectrum.
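As a small numerical illustration of the reciprocal relationship f = 1/t (our sketch, not part of the original text; the figures are arbitrary), in Python:

    # f = 1/t: a wave whose cycle repeats every 2 ms has a frequency of 500 Hz
    t = 0.002        # period of one cycle, in seconds
    f = 1.0 / t      # frequency in hertz
    print(f)         # 500.0 -- halving the period would double the frequency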
Figure 1.2 In a transverse wave the motion of any point on the wave is at right angles to the apparent direction of motion of the wave

Figure 1.3 A graphical representation of a sinusoidal sound waveform. The period of the wave is represented by t, and its frequency by 1/t

How sound travels in air

Air is made up of gas molecules and has an elastic property (imagine putting a thumb over the end of a bicycle pump and compressing the air inside – the air is springy). Longitudinal sound waves travel in air in somewhat the same fashion as a wave travels down a row of up-ended dominoes after the first one is pushed over. The half-cycle of compression created by the vibrating source causes successive air particles to be moved in a knock-on effect, and this is normally followed by a balancing rarefaction which causes a similar motion of particles in the opposite direction. It may be appreciated that the net effect of this is that individual air particles do not actually travel – they oscillate about a fixed point – but the result is that a wave is formed which appears to move away from the source. The speed at which it moves away from the source depends on the density and elasticity of the substance through which it passes, and in air the speed is relatively slow compared with the speed at which sound travels through most solids. In air the speed of sound is approximately 340 metres per second (m s⁻¹), although this depends on the temperature of the air. At freezing point the speed is reduced to nearer 330 m s⁻¹. In steel, to give an example of a solid, the speed of sound is approximately 5100 m s⁻¹.

The frequency and wavelength of a sound wave are related very simply if the speed of the wave (usually denoted by the letter c) is known:

  c = fλ  or  λ = c/f

To show some examples, the wavelength of sound in air at 20 Hz (the low-frequency or LF end of the audio spectrum), assuming normal room temperature, would be:

  λ = 340/20 = 17 metres

whereas the wavelength of 20 kHz (at the high-frequency or HF end of the audio spectrum) would be 1.7 cm. Thus it is apparent that the wavelength of sound ranges from being very long in relation to most natural objects at low frequencies, to quite short at high frequencies. This is important when considering how sound behaves when it encounters objects – whether the object acts as a barrier or whether the sound bends around it (see Fact File 1.5).

Simple and complex sounds

In the foregoing example, the sound had a simple waveform – it was a sine wave or sinusoidal waveform – the type which might result from a very simple vibrating system such as a weight suspended on a spring. Sine waves have a very pure sound because they consist of energy at only one frequency, and are often called pure tones. They are not heard very commonly in real life (although they can be generated electrically) since most sound sources do not vibrate in such a simple manner. A person whistling or a recorder (a simple wind instrument) produces a sound which approaches a sinusoidal waveform. Most real sounds are made up of a combination of vibration patterns which result in a more complex waveform.
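The λ = c/f relationship is easy to check numerically. The following sketch (ours, not the book's) reproduces the two examples just given and adds a mid-band value:

    # Wavelength = speed of sound / frequency
    # (c is about 340 m/s in air at room temperature, as stated above)
    c = 340.0
    for f in (20.0, 1000.0, 20000.0):
        print(f, c / f)
    # 20 Hz -> 17 m, 1 kHz -> 0.34 m, 20 kHz -> 0.017 m (1.7 cm)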
The more complex the waveform, the more like noise the sound becomes, and when the waveform has a highly random pattern the sound is said to be noise (see 'Frequency spectra of non-repetitive sounds', below). The important characteristic of sounds which have a definite pitch is that they are repetitive: that is, the waveform, no matter how complex, repeats its pattern in the same way at regular intervals. All such waveforms can be broken down into a series of components known as harmonics, using a mathematical process called Fourier analysis (after the mathematician Joseph Fourier). Some examples of equivalent line spectra for different waveforms are given in Figure 1.4. This figure shows another way of depicting the characteristics of the sound graphically – that is, by drawing a so-called line spectrum which shows frequency along the horizontal axis and amplitude up the vertical axis. The line spectrum shows the relative strengths of different frequency components which make up a sound. Where there is a line there is a frequency component. It will be noticed that the more complex the waveform the more complex the corresponding line spectrum. For every waveform, such as that shown in Figure 1.3, there is a corresponding line spectrum: waveforms and line spectra are simply two different ways of showing the characteristics of the sound. Figure 1.3 is called a time-domain plot, whilst the line spectrum is called a frequency-domain plot. Unless otherwise stated, such frequency-domain graphs in this book will cover the audio-frequency range, from 20 Hz at the lower end to 20 kHz at the upper end. In a reversal of the above breaking-down of waveforms into their component frequencies it is also possible to construct or synthesise waveforms by adding together the relevant components.

Figure 1.4 Equivalent line spectra for a selection of simple waveforms. (a) The sine wave consists of only one component at the fundamental frequency f. (b) The sawtooth wave consists of components at the fundamental and its integer multiples, with amplitudes steadily decreasing. (c) The square wave consists of components at odd multiples of the fundamental frequency

Frequency spectra of repetitive sounds

As will be seen in Figure 1.4, the simple sine wave has a line spectrum consisting of only one component at the frequency of the sine wave. This is known as the fundamental frequency of oscillation. The other repetitive waveforms, such as the square wave, have a fundamental frequency as well as a number of additional components above the fundamental. These are known as harmonics, but may also be referred to as overtones or partials. Harmonics are frequency components of a sound which occur at integer multiples of the fundamental frequency, that is at twice, three times, four times and so on. Thus a sound with a fundamental of 100 Hz might also contain harmonics at 200 Hz, 300 Hz, 400 Hz and so on.
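The synthesis idea mentioned above can be demonstrated numerically. This sketch (ours; the sampling rate and number of harmonics are arbitrary choices) builds an approximation to the square wave of Figure 1.4(c) by adding odd harmonics of a 100 Hz fundamental with amplitudes falling as 1/n:

    import numpy as np

    f0 = 100.0                        # fundamental frequency in Hz
    fs = 48000.0                      # sampling rate used for the sketch
    t = np.arange(0, 0.02, 1.0 / fs)  # 20 ms of time axis (two cycles)
    # Sum components at f, 3f, 5f, ... with amplitude 1/n (cf. Figure 1.4(c))
    square = sum((1.0 / n) * np.sin(2 * np.pi * n * f0 * t)
                 for n in range(1, 20, 2))
    # The more odd harmonics are included, the squarer the waveform becomes.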
The reason for the existence of these harmonics is that most simple vibrating sources are capable of vibrating in a number of harmonic modes at the same time. Consider a stretched string, as shown in Figure 1.5. It may be made to vibrate in any of a number of modes, corresponding to integer multiples of the fundamental frequency of vibration of the string (the concept of 'standing waves' is introduced below). The fundamental corresponds to the mode in which the string moves up and down as a whole, whereas the harmonics correspond to modes in which the vibration pattern is divided into points of maximum and minimum motion along the string (these are called antinodes and nodes). It will be seen that the second mode involves two peaks of vibration, the third mode three peaks, and so on. In accepted terminology, the fundamental is also the first harmonic, and thus the next component is the second harmonic, and so on. Confusingly, the second harmonic is also known as the first overtone. For the waveforms shown in Figure 1.4, the fundamental has the highest amplitude, and the amplitudes of the harmonics decrease with increasing frequency, but this will not always be the case with real sounds since many waveforms have line spectra which show the harmonics to be higher in amplitude than the fundamental. It is also quite feasible for there to be harmonics missing in the line spectrum, and this depends entirely on the waveform in question.

Figure 1.5 Modes of vibration of a stretched string. (a) Fundamental. (b) Second harmonic. (c) Third harmonic

It is also possible for there to be overtones in the frequency spectrum of a sound which are not related in a simple integer-multiple fashion to the fundamental. These cannot correctly be termed harmonics, and they are more correctly referred to as overtones or inharmonic partials. They tend to arise in vibrating sources which have a complicated shape, and which do not vibrate in simple harmonic motion but have a number of repetitive modes of vibration. Their patterns of oscillation are often unusual, such as might be observed in a bell or a percussion instrument. It is still possible for such sounds to have a recognisable pitch, but this depends on the strength of the fundamental. In bells and other such sources, one often hears the presence of several strong inharmonic overtones.

Frequency spectra of non-repetitive sounds

Non-repetitive waveforms do not have a recognisable pitch and sound noise-like. Their frequency spectra are likely to consist of a collection of components at unrelated frequencies, although some frequencies may be more dominant than others. The analysis of such waves to show their frequency spectra is more complicated than with repetitive waves, but is still possible using a mathematical technique called Fourier transformation, the result of which is a frequency-domain plot of a time-domain waveform. Single, short pulses can be shown to have continuous frequency spectra which extend over quite a wide frequency range, and the shorter the pulse the wider its frequency spectrum but usually the lower its total energy (see Figure 1.6). Random waveforms will tend to sound like hiss, and a completely random waveform in which the frequency, amplitude and phase of components are equally probable and constantly varying is called white noise. A white noise signal's spectrum is flat, when averaged over a period of time, right across the audio-frequency range (and theoretically above it). White noise has equal energy for a given bandwidth, whereas another type of noise, known as pink noise, has equal energy per octave. For this reason white noise sounds subjectively to have more high-frequency energy than pink noise.
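The difference between equal energy per unit bandwidth and equal energy per octave can also be shown numerically. In this sketch (ours, not the book's), the octave band from 1 to 2 kHz contains roughly ten times the energy of the octave from 100 to 200 Hz, simply because it is ten times as wide; pink noise would put equal energy in the two bands:

    import numpy as np

    rng = np.random.default_rng(0)
    fs = 48000
    x = rng.standard_normal(10 * fs)       # ten seconds of white noise
    energy = np.abs(np.fft.rfft(x)) ** 2   # energy spectrum
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    low = energy[(freqs >= 100) & (freqs < 200)].sum()     # octave, 100 Hz wide
    high = energy[(freqs >= 1000) & (freqs < 2000)].sum()  # octave, 1000 Hz wide
    print(high / low)                      # close to 10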
Figure 1.6 Frequency spectra of non-repetitive waveforms. (a) Pulse. (b) Noise

Phase

Two waves of the same frequency are said to be 'in phase' when their compression (positive) and rarefaction (negative) half-cycles coincide exactly in time and space (see Figure 1.7). If two in-phase signals of equal amplitude are added together, or superimposed, they will sum to produce another signal of the same frequency but twice the amplitude. Signals are said to be out of phase when the positive half-cycle of one coincides with the negative half-cycle of the other. If these two signals are added together they will cancel each other out, and the result will be no signal. Clearly these are two extreme cases, and it is entirely possible to superimpose two sounds of the same frequency which are only partially in phase with each other. The resultant wave in this case will be a partial addition or partial cancellation, and the phase of the resulting wave will lie somewhere between that of the two components (see Figure 1.7(c)).

Phase differences between signals can be the result of time delays between them. If two identical signals start out at sources equidistant from a listener at the same time as each other then they will be in phase by the time they arrive at the listener. If one source is more distant than the other then it will be delayed, and the phase relationship between the two will depend upon the amount of delay (see Figure 1.8). A useful rule-of-thumb is that sound travels about 30 cm (1 foot) per millisecond, so if the second source in the above example were 1 metre (just over 3 ft) more distant than the first it would be delayed by just over 3 ms. The resulting phase relationship between the two signals, it may be appreciated, would depend on the frequency of the sound, since at a frequency of around 330 Hz the 3 ms delay would correspond to one wavelength and thus the delayed signal would be in phase with the undelayed signal. If the delay had been half this (1.5 ms) then the two signals would have been out of phase at 330 Hz.

Phase is often quoted as a number of degrees relative to some reference, and this must be related back to the nature of a sine wave. A diagram is the best way to illustrate this point, and looking at Figure 1.9 it will be seen that a sine wave may be considered as a graph of the vertical position of a rotating spot on the outer rim of a disc (the amplitude of the wave), plotted against time. The height of the spot rises and falls regularly as the circle rotates at a constant speed. The sine wave is so called because the spot's height is directly proportional to the mathematical sine of the angle of rotation of the disc, with zero degrees occurring at the origin of the graph and at the point shown on the disc's rotation in the diagram. The vertical amplitude scale on the graph goes from minus one (maximum negative amplitude) to plus one (maximum positive amplitude), passing through zero at the halfway point. At 90° of rotation the amplitude of the sine wave is maximum positive (the sine of 90° is +1), and at 180° it is zero (sin 180° = 0). At 270° it is maximum negative (sin 270° = −1), and at 360° it is zero again. Thus in one cycle of the sine wave the circle has passed through 360° of rotation.
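The link between time delay, frequency and phase can be put in one line: a delay of Δt seconds corresponds to 360 × f × Δt degrees of phase shift. A small sketch (ours, not the book's) using the 3 ms example above:

    # Phase shift in degrees produced by a time delay at a given frequency
    def phase_shift_deg(f_hz, delay_s):
        return 360.0 * f_hz * delay_s

    print(phase_shift_deg(330.0, 0.003))   # ~356 deg: almost one whole cycle,
                                           # i.e. very nearly back in phase
    print(phase_shift_deg(330.0, 0.0015))  # ~178 deg: close to out of phase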
Figure 1.7 (a) When two identical in-phase waves are added together, the result is a wave of the same frequency and phase but twice the amplitude. (b) Two identical out-of-phase waves add to give nothing. (c) Two identical waves partially out of phase add to give a resultant wave with a phase and amplitude which is the point-by-point sum of the two

Figure 1.8 If the two loudspeakers in the drawing emit the same wave at the same time, the phase difference between the waves at the listener's ear will be directly related to the delay t2 − t1

Figure 1.9 The height of the spot varies sinusoidally with the angle of rotation of the wheel. The phase angle of a sine wave can be understood in terms of the number of degrees of rotation of the wheel

Figure 1.10 The lower wave is 90° out of phase with the upper wave

It is now possible to go back to the phase relationship between two waves of the same frequency. If each cycle is considered as corresponding to 360°, then one can say just how many degrees one wave is ahead of or behind another by comparing the 0° point on one wave with the 0° point on the other (see Figure 1.10). In the example wave 1 is 90° out of phase with wave 2. It is important to realise that phase is only a relevant concept in the case of continuous repetitive waveforms, and has little meaning in the case of impulsive or transient sounds where time difference is the more relevant quantity. It can be deduced from the foregoing discussion that (a) the higher the frequency, the greater the phase difference which would result from a given time delay between two signals, and (b) it is possible for there to be more than 360° of phase difference between two signals if the delay is great enough to delay the second signal by more than one cycle. In the latter case it becomes difficult to tell how many cycles of delay have elapsed unless a discontinuity arises in the signal, since a phase difference of 360° is indistinguishable from a phase difference of 0°.

Sound in electrical form

Although the sound that one hears is due to compression and rarefaction of the air, it is often necessary to convert sound into an electrical form in order to perform operations on it such as amplification, recording and mixing. As detailed in Fact File 3.1 and Chapter 3, it is the job of the microphone to convert sound from an acoustical form into an electrical form. The process of conversion will not be described here, but the result is important because if it can be assumed for a moment that the microphone is perfect then the resulting electrical waveform will be exactly the same shape as the acoustical waveform which caused it. The equivalent of the amplitude of the acoustical signal in electrical terms is the voltage of the electrical signal. If the voltage at the output of a microphone were to be measured whilst the microphone was picking up an acoustical sine wave, one would measure a voltage which changed sinusoidally as well. Figure 1.11 shows this situation, and it may be seen that an acoustical compression of the air corresponds to a positive-going voltage, whilst an acoustical rarefaction of the air corresponds to a negative-going voltage. (This is the norm, although some sound reproduction systems introduce an absolute phase reversal in the relationship between acoustical phase and electrical phase, such that an acoustical compression becomes equivalent to a negative voltage. Some people claim to be able to hear the difference.)
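Relating this back to the superposition cases of Figure 1.7, the following sketch (ours; the frequency chosen is arbitrary) adds two equal-amplitude sine waves at various phase offsets and prints the amplitude of the resultant:

    import numpy as np

    f = 500.0
    t = np.linspace(0.0, 0.01, 10000)   # 10 ms, i.e. five cycles at 500 Hz
    for offset_deg in (0.0, 90.0, 180.0):
        y = (np.sin(2 * np.pi * f * t)
             + np.sin(2 * np.pi * f * t + np.radians(offset_deg)))
        print(offset_deg, round(y.max(), 2))
    # 0 deg -> 2.0 (doubled), 90 deg -> ~1.41 (partial), 180 deg -> ~0 (cancelled)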
Figure 1.11 A microphone converts variations in acoustical sound pressure into variations in electrical voltage. Normally a compression of the air results in a positive voltage and a rarefaction results in a negative voltage

The other important quantity in electrical terms is the current flowing down the wire from the microphone. Current is the electrical equivalent of the air particle motion discussed in 'How sound travels in air', above. Just as the acoustical sound wave was carried in the motion of the air particles, so the electrical sound wave is carried in the motion of tiny charge carriers which reside in the metal of a wire (these are called electrons). When the voltage is positive the current moves in one direction, and when it is negative the current moves in the other direction. Since the voltage generated by a microphone is repeatedly alternating between positive and negative, in sympathy with the sound wave's compression and rarefaction cycles, the current similarly changes direction each half cycle. Just as the air particles in 'Characteristics of a sound wave', above, did not actually go anywhere in the long term, so the electrons carrying the current do not go anywhere either – they simply oscillate about a fixed point. This is known as alternating current or AC.

A useful analogy to the above (both electrical and acoustical) exists in plumbing. If one considers water in a pipe fed from a header tank, as shown in Figure 1.12, the voltage is equivalent to the pressure of water which results from the header tank, and the current is equivalent to the rate of flow of water through the pipe. The only difference is that the diagram is concerned with a direct current situation in which the direction of flow is not repeatedly changing.

Figure 1.12 There are parallels between the flow of water in a pipe and the flow of electricity in a wire, as shown in this drawing: water pressure from the header tank is equivalent to electrical voltage, the diameter of the pipe to electrical resistance, and the rate of flow through the outlet pipe to electrical current

The quantity of resistance should be introduced here, and is analogous to the diameter of the pipe. Resistance impedes the flow of water through the pipe, as it does the flow of electrons through a wire and the flow of acoustical sound energy through a substance. For a fixed voltage (or water pressure in this analogy), a high resistance (narrow pipe) will result in a small current (a trickle of water), whilst a low resistance (wide pipe) will result in a large current.

Fact File 1.1 Ohm's law

Ohm's law states that there is a fixed and simple relationship between the current flowing through a device (I), the voltage across it (V), and its resistance (R), as shown in the diagram:

  V = IR  or:  I = V/R  or:  R = V/I

Thus if the resistance of a device is known, and the voltage dropped across it can be measured, then the current flow may be calculated, for example. There is also a relationship between the parameters above and the power in watts (W) dissipated in a device:

  W = I²R = V²/R
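The Fact File 1.1 relationships can be applied directly; the numbers below are ours, for illustration, using a nominal 8 ohm loudspeaker load:

    # Ohm's law: V = I * R, and power W = I^2 * R = V^2 / R
    V = 2.0          # volts across the device
    R = 8.0          # resistance in ohms (a nominal loudspeaker load)
    I = V / R        # current: 0.25 A
    W = V ** 2 / R   # power dissipated: 0.5 W
    print(I, W)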
The relationship between voltage, current and resistance was established by Ohm, in the form of Ohm's law, as described in Fact File 1.1. There is also a relationship between power and voltage, current and resistance. In AC systems, resistance is replaced by impedance, a complex term which contains both resistance and reactance components. The reactance part varies with the frequency of the signal; thus the impedance of an electrical device also varies with the frequency of a signal. Capacitors (basically two conductive plates separated by an insulator) are electrical devices which present a high impedance to low-frequency signals and a low impedance to high-frequency signals. They will not pass direct current. Inductors (basically coils of wire) are electrical devices which present a high impedance to high-frequency signals and a low impedance to low-frequency signals. Capacitance is measured in farads, inductance in henrys.

Displaying the characteristics of a sound wave

Two devices can be introduced at this point which illustrate graphically the various characteristics of sound signals so far described. It would be useful to (a) display the waveform of the sound, and (b) display the frequency spectrum of the sound. In other words (a) the time-domain signal and (b) the frequency-domain signal.

Figure 1.13 (a) An oscilloscope displays the waveform of an electric signal by means of a moving spot which is deflected up by a positive signal and down by a negative signal. (b) A spectrum analyser displays the frequency spectrum of an electrical waveform in the form of lines representing the amplitudes of different spectral components of the signal

An oscilloscope is used for displaying the waveform of a sound, and a spectrum analyser is used for showing which frequencies are contained in the signal and their amplitudes. Examples of such devices are pictured in Figure 1.13. Both devices accept sound signals in electrical form and display their analyses of the sound on a screen. The oscilloscope displays a moving spot which scans horizontally at one of a number of fixed speeds from left to right and whose vertical deflection is controlled by the voltage of the sound signal (up for positive, down for negative). In this way it plots the waveform of the sound as it varies with time. Many oscilloscopes have two inputs and can plot two waveforms at the same time, and this can be useful for comparing the relative phases of two signals (see 'Phase', above). The spectrum analyser works in different ways depending on the method of spectrum analysis. A real-time analyser displays a constantly updating line spectrum, similar to those depicted earlier in this chapter, and shows the frequency components of the input signal on the horizontal scale together with their amplitudes on the vertical scale.
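By way of illustration (our sketch, not the book's; the test signal is arbitrary), a few lines of Python can perform the same arithmetic as a crude spectrum analyser, turning a time-domain waveform into its frequency-domain line spectrum:

    import numpy as np

    fs = 8000                                # sampling rate for the sketch
    t = np.arange(fs) / fs                   # one second of time axis
    # Time domain: a 440 Hz sine plus a weaker component at 880 Hz
    x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
    # Frequency domain: normalised amplitudes of the spectral components
    mag = np.abs(np.fft.rfft(x)) / (fs / 2)
    freqs = np.fft.rfftfreq(fs, 1.0 / fs)
    for i in np.where(mag > 0.1)[0]:
        print(freqs[i], round(mag[i], 2))    # 440 Hz at 1.0, 880 Hz at 0.5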
The decibel

The unit of the decibel is used widely in sound engineering, often in preference to other units such as volts, watts, or other such absolute units, since it is a convenient way of representing the ratio of one signal's amplitude to another's. It also results in numbers of a convenient size which approximate more closely to one's subjective impression of changes in the amplitude of a signal, and it helps to compress the range of values between the maximum and minimum sound levels encountered in real signals. For example, the range of sound intensities (see next section) which can be handled by the human ear covers about fourteen powers of ten, from 0.000 000 000 001 W m⁻² to around 100 W m⁻², but the equivalent range in decibels is only from 0 to 140 dB. Some examples of the use of the decibel are given in Fact File 1.2.

Fact File 1.2 The decibel

Basic decibels
The decibel is based on the logarithm of the ratio between two numbers. It describes how much larger or smaller one value is than the other. It can also be used as an absolute unit of measurement if the reference value is fixed and known. Some standardised references have been established for decibel scales in different fields of sound engineering (see below). The decibel is strictly ten times the logarithm to the base ten of the ratio between the powers of two signals:

  dB = 10 log₁₀ (P₁/P₂)

For example, the difference in decibels between a signal with a power of 1 watt and one of 2 watts is 10 log (2/1) = 3 dB. If the decibel is used to compare values other than signal powers, the relationship to signal power must be taken into account. Voltage has a square relationship to power (from Ohm's law: W = V²/R); thus to compare two voltages:

  dB = 10 log (V₁²/V₂²), or 10 log (V₁/V₂)², or 20 log (V₁/V₂)

For example, the difference in decibels between a signal with a voltage of 1 volt and one of 2 volts is 20 log (2/1) = 6 dB. So a doubling in voltage gives rise to an increase of 6 dB, and a doubling in power gives rise to an increase of 3 dB. A similar relationship applies to acoustical sound pressure (analogous to electrical voltage) and sound power (analogous to electrical power). It is more common to deal in terms of voltage or SPL ratios than power ratios in audio systems.

Decibels with a reference
If a signal level is quoted in decibels, then a reference must normally be given, otherwise the figure means nothing; e.g.: 'Signal level = 47 dB' cannot have a meaning unless one knows that the signal is 47 dB above a known point. '+8 dB ref. 1 volt' has a meaning since one now knows that the level is 8 dB higher than 1 volt, and thus one could calculate the voltage of the signal. There are exceptions in practice, since in some fields a reference level is accepted as implicit. Sound pressure levels (SPLs) are an example, since the reference level is defined worldwide as 2 × 10⁻⁵ N m⁻² (20 µPa). Thus to state 'SPL = 77 dB' is probably acceptable, although confusion can still arise due to misunderstandings over such things as weighting curves (see Fact File 1.4). In sound recording, 0 dB or 'zero level' is a nominal reference level used for aligning equipment and setting recording levels, often corresponding to 0.775 volts (0 dBu) although this is subject to variations in studio centres in different locations. (Some studios use +4 dBu as their reference level, for example.) '0 dB' does not mean 'no signal', it means that the signal concerned is at the same level as the reference.

Often a letter is placed after 'dB' to denote the reference standard in use (e.g.: 'dBm'), and a number of standard abbreviations are in use, some examples of which are given below. Sometimes the suffix denotes a particular frequency weighting characteristic used in the measurement of noise (e.g.: 'dBA').

  Abbrev.  Reference level
  dBV      1 volt
  dBu      0.775 volt (Europe)
  dBv      0.775 volt (USA)
  dBm      1 milliwatt (see Chapter 12)
  dBA      dB SPL, A-weighted response

A full listing of suffixes is given in CCIR Recommendation 574-1, 1982.

Useful decibel ratios to remember (voltages or SPLs)
Here are some useful dB equivalents of different voltage or SPL relationships and multiplication factors:

  dB    Multiplication factor
  0     1
  +3    √2
  +6    2
  +20   10
  +60   1000
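The arithmetic in Fact File 1.2 can be wrapped in two small helper functions (our sketch, not the book's):

    import math

    def db_power(p1, p2):
        # Decibel difference between two powers
        return 10.0 * math.log10(p1 / p2)

    def db_voltage(v1, v2):
        # Decibel difference between two voltages (or SPLs)
        return 20.0 * math.log10(v1 / v2)

    print(db_power(2.0, 1.0))     # ~3 dB for a doubling of power
    print(db_voltage(2.0, 1.0))   # ~6 dB for a doubling of voltage
    print(db_voltage(10.0, 1.0))  # 20 dB for a tenfold voltage ratio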
The relationship between the decibel and human sound perception is discussed in more detail in Chapter 2. Operating levels in recording equipment are discussed further in 'Metering systems', Chapter 5 and 'Magnetic recording levels', Chapter 6. Decibels are not only used to describe the ratio between two signals, or the level of a signal above a reference, but they are also used to describe the voltage gain of a device. For example, a microphone amplifier may have a gain of 60 dB, which is the equivalent of multiplying the input voltage by a factor of 1000, as shown in the example below:

  20 log (1000/1) = 60 dB

Sound power and sound pressure

A simple sound source, such as the pulsating sphere used at the start of this chapter, radiates sound power omnidirectionally – that is, equally in all directions, rather like a three-dimensional version of the ripples moving away from a stone dropped in a pond. The sound source generates a certain amount of power, measured in watts, which is gradually distributed over an increasingly large area as the wavefront travels further from the source; thus the amount of power per square metre passing through the surface of the imaginary sphere surrounding the source gets smaller with increasing distance (see Fact File 1.3). For practical purposes the intensity of the direct sound from a source drops by 6 dB for every doubling in distance from the source (see Figure 1.14).

The amount of acoustical power generated by real sound sources is surprisingly small, compared with the number of watts of electrical power involved in lighting a light bulb, for example. An acoustical source radiating 20 watts would produce a sound pressure level close to the threshold of pain if a listener was close to the source. Most everyday sources generate fractions of a watt of sound power, and this energy is eventually dissipated into heat by absorption (see below). The amount of heat produced by the dissipation of acoustic energy is relatively insignificant – the chances of increasing the temperature of a room by shouting are slight, at least in the physical sense.

Acoustical power is sometimes confused with the power output of an amplifier used to drive a loudspeaker, and audio engineers will be familiar with power outputs from amplifiers of many hundreds of watts. It is important to realise that loudspeakers are very inefficient devices – that is, they only convert a small proportion of their electrical input power into acoustical power. Thus, even if the input to a loudspeaker was to be, say, 100 watts electrically, the acoustical output power might only be perhaps 1 watt, suggesting a loudspeaker that is only 1 per cent efficient. The remaining power would be dissipated as heat in the voice coil.

Sound pressure is the effect of sound power on its surroundings. To use a central heating analogy, sound power is analogous to the heat energy generated by a radiator into a room, whilst sound pressure is analogous to the temperature of the air in the room. The temperature is what a person entering the room would feel, but the heat-generating radiator is the source of power.
Fact File 1.3 The inverse-square law

The law of decreasing power per unit area (intensity) of a wavefront with increasing distance from the source is known as the inverse-square law, because intensity drops in proportion to the inverse square of the distance from the source. Why is this? It is because the sound power from a point source is spread over the surface area of a sphere (S), which from elementary maths is given by:

  S = 4πr²

where r is the distance from the source or the radius of the sphere, as shown in the diagram. If the original power of the source is W watts, then the intensity, or power per unit area (I), at distance r is:

  I = W/4πr²

For example, if the power of a source was 0.1 watt, the intensity at 4 m distance would be:

  I = 0.1 ÷ (4 × 3.14 × 16) ≈ 0.0005 W m⁻²

The sound intensity level (SIL) of this signal in decibels can be calculated by comparing it with the accepted reference level of 10⁻¹² W m⁻²:

  SIL (dB) = 10 log ((5 × 10⁻⁴) ÷ (10⁻¹²)) ≈ 87 dB

Sound pressure level (SPL) is measured in newtons per square metre (N m⁻²). A convenient reference level is set for sound pressure and intensity measurements, this being referred to as 0 dB. This level of 0 dB is approximately equivalent to the threshold of hearing (the quietest sound perceivable by an average person) at a frequency of 1 kHz, and corresponds to an SPL of 2 × 10⁻⁵ N m⁻², which in turn is equivalent to an intensity of approximately 10⁻¹² W m⁻² in the free field (see below). Sound pressure levels are often quoted in dB (e.g.: SPL = 63 dB means that the SPL is 63 dB above 2 × 10⁻⁵ N m⁻²).

Figure 1.14 The sound power which had passed through 1 m² of space at distance r from the source will pass through 4 m² at distance 2r, and thus will have one quarter of the intensity

The SPL in dB may not accurately represent the loudness of a sound, and thus a subjective unit of loudness has been derived from research data, called the phon. This is discussed further in Chapter 2. Some methods of measuring sound pressure levels are discussed in Fact File 1.4.

Free and reverberant fields

The free field in acoustic terms is an acoustical area in which there are no reflections. Truly free fields are rarely encountered in reality, because there are nearly always reflections of some kind, even if at a very low level. If the reader can imagine the sensation of being suspended out-of-doors, way above the ground, away from any buildings or other surfaces, then he or she will have an idea of the experience of a free-field condition. The result is an acoustically 'dead' environment. Acoustic experiments are sometimes performed in anechoic chambers, which are rooms specially treated so as to produce almost no reflections at any frequency – the surfaces are totally absorptive – and these attempt to create near free-field conditions. In the free field all the sound energy from a source is radiated away from the source and none is reflected; thus the inverse-square law (Fact File 1.3) entirely dictates the level of sound at any distance from the source. Of course the source may be directional, in which case its directivity factor must be taken into account. A source with a directivity factor of 2 on its axis of maximum radiation radiates twice as much power in this direction as it would have if it had been radiating omnidirectionally. The directivity index of a source is measured in dB, giving the above example a directivity index of 3 dB. If calculating the intensity at a given distance from a directional source (as shown in Fact File 1.3), one must take into account its directivity factor on the axis concerned by multiplying the power of the source by the directivity factor before dividing by 4πr².
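The worked example in Fact File 1.3, the 6 dB-per-doubling rule, and the directivity factor just described can all be checked with a few lines of Python (our sketch, not the book's):

    import math

    def intensity(w, r, q=1.0):
        # Power per unit area at distance r from a source of w watts.
        # q is the directivity factor (1 = omnidirectional); as in the text,
        # the power is multiplied by q before dividing by 4*pi*r^2.
        return (q * w) / (4.0 * math.pi * r ** 2)

    def sil_db(i):
        # Sound intensity level re the 10^-12 W/m^2 reference
        return 10.0 * math.log10(i / 1e-12)

    print(intensity(0.1, 4.0))          # ~0.0005 W/m^2, as in Fact File 1.3
    print(sil_db(intensity(0.1, 4.0)))  # ~87 dB
    print(sil_db(intensity(0.1, 8.0)))  # ~81 dB: 6 dB less at twice the distance
    print(10 * math.log10(2.0))         # directivity index of ~3 dB for Q = 2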
A source with a directivity factor of 2 on its axis of maximum radiation radiates twice as much power in that direction as it would if it were radiating omnidirectionally. The directivity index of a source is measured in dB, giving the above example a directivity index of 3 dB. If calculating the intensity at a given distance from a directional source (as shown in Fact File 1.3), one must take into account its directivity factor on the axis concerned by multiplying the power of the source by the directivity factor before dividing by 4πr².

In a room there is both direct and reflected sound. At a certain distance from a source contained within a room the acoustic field is said to be diffuse or reverberant, since reflected sound energy predominates over direct sound. A short time after the source has begun to generate sound a diffuse pattern of reflections will have built up throughout the room, and the reflected sound energy will become roughly constant at any point in the room. Close to the source the direct sound energy is still at quite a high level, and thus the reflected sound makes a smaller contribution to the total. This region is called the near field. (It is popular in sound recording to make use of so-called ‘near-field monitors’, which are loudspeakers mounted quite close to the listener, such that the direct sound predominates over the effects of the room.)

The exact distance from a source at which a sound field becomes dominated by reverberant energy depends on the reverberation time of the room, and this in turn depends on the amount of absorption in the room and the room’s volume (see Fact File 1.5). Figure 1.15 shows how the SPL changes as distance increases from a source in three different rooms.

Fact file 1.4 Measuring SPLs

Typically a sound pressure level (SPL) meter is used to measure the level of sound at a particular point. It is a device that houses a high quality omnidirectional (pressure) microphone (see ‘Omnidirectional pattern’, Chapter 3) connected to amplifiers, filters and a meter (see diagram).

Weighting filters

The microphone’s output voltage is proportional to the SPL incident upon it, and the weighting filters may be used to attenuate low and high frequencies according to a standard curve such as the ‘A’-weighting curve, which corresponds closely to the sensitivity of human hearing at low levels (see Chapter 2). SPLs quoted simply in dB are usually unweighted – in other words all frequencies are treated equally – but SPLs quoted in dBA will have been A-weighted and will correspond more closely to the perceived loudness of the signal. A-weighting was originally designed to be valid up to a loudness of 55 phons, since the ear’s frequency response becomes flatter at higher levels; between 55 and 85 phons the ‘B’ curve was intended to be used; above 85 phons the ‘C’ curve was used. The ‘D’ curve was devised particularly for measuring aircraft engine noise at very high level.

Now most standards suggest that the ‘A’ curve may be used for measuring noise at any SPL, principally for ease of comparability of measurements, but there is still disagreement in the industry about the relative merits of different curves. The ‘A’ curve attenuates low and high frequencies and will therefore under-read quite substantially for signals at these frequencies. This is an advantage in some circumstances and a disadvantage in others. The ‘C’ curve is recommended in the USA and Japan for aligning sound levels using noise signals in movie theatres, for example. This only rolls off the very extremes of the audio spectrum and is therefore quite close to an unweighted reading. Some researchers have found that the ‘B’ curve produces results that more closely relate measured sound signal levels to subjective loudness of those signals.
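To make the weighting idea concrete, the magnitude response of the standard A-weighting curve can be computed from its defining equation. The pole frequencies below are those of the IEC 61672 analogue definition; treat this as an illustrative sketch rather than a certified meter implementation:

import math

def a_weighting_db(f):
    # A-weighting relative response in dB at frequency f (Hz),
    # normalised to 0 dB at 1 kHz
    f2 = f * f
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * math.log10(ra) + 2.00

for f in (31.5, 100, 1000, 10000):
    print(f"{f:>7} Hz: {a_weighting_db(f):+.1f} dB")
# roughly -39.4 dB at 31.5 Hz, -19.1 dB at 100 Hz, 0 dB at 1 kHz, -2.5 dB at 10 kHz

An unweighted reading of 63 dB at 100 Hz would therefore register only about 44 dB(A), which illustrates how far A-weighted figures under-read low-frequency content.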
Noise criterion or rating (NC or NR)

Noise levels are often measured in rooms by comparing the level of the noise across the audible range with a standard set of curves called the noise criteria (NC) or noise rating (NR) curves. These curves set out how much noise is acceptable in each of a number of narrow frequency bands for the noise to meet a certain criterion. The noise criterion is then that of the nearest curve above which none of the measured results rises. NC curves are used principally in the USA, whereas NR curves are used principally in Europe. They allow considerably higher levels in low-frequency bands than in middle- and high-frequency bands, since the ear is less sensitive at low frequencies.

In order to measure the NC or NR of a location it is necessary to connect the measuring microphone to a set of filters or a spectrum analyser which is capable of displaying the SPL in one octave or one-third octave bands.

Further reading
British Standard 5969. Specification for sound level meters.
British Standard 6402. Sound exposure meters.

[Diagram: an SPL meter signal chain – microphone feeding amplifier stages and weighting filters, then a meter]

Fact file 1.5 Absorption, reflection and RT

Absorption

When a sound wave encounters a surface some of its energy is absorbed and some reflected. The absorption coefficient of a substance describes, on a scale from 0 to 1, how much energy is absorbed. An absorption coefficient of 1 indicates total absorption, whereas 0 represents total reflection. The absorption coefficient of substances varies with frequency.

The total amount of absorption present in a room can be calculated by multiplying the absorption coefficient of each surface by its area and then adding the products together. All of the room’s surfaces must be taken into account, as must people, chairs and other furnishings. Tables of the performance of different substances are available in acoustics references (see Recommended further reading). Porous materials tend to absorb high frequencies more effectively than low frequencies, whereas resonant membrane- or panel-type absorbers tend to be better at low frequencies. Highly tuned artificial absorbers (Helmholtz absorbers) can be used to remove energy in a room at specific frequencies. The trends in absorption coefficient are shown in the diagram below.

[Diagram: absorption coefficient (0 to 1) against frequency – porous absorbers rise towards high frequencies, membrane absorbers peak at low frequencies, and Helmholtz absorbers show a sharp peak at one frequency]

Reflection

The size of an object in relation to the wavelength of a sound is important in determining whether the sound wave will bend round it or be reflected by it. When an object is large in relation to the wavelength the object will act as a partial barrier to the sound, whereas when it is small the sound will bend or diffract around it. Since sound wavelengths in air range from approximately 18 metres at low frequencies to just over 1 cm at high frequencies, most commonly encountered objects will tend to act as barriers to sound at high frequencies but will have little effect at low frequencies.

Reverberation time

W. C. Sabine developed a simple and fairly reliable formula for calculating the reverberation time (RT60) of a room, assuming that absorptive material is distributed evenly around the surfaces. It relates the volume of the room (V) and its total absorption (A) to the time taken for the sound pressure level to decay by 60 dB after a sound source is turned off:

RT60 = (0.16V)/A seconds

In a large room where a considerable volume of air is present, and where the distance between surfaces is large, the absorption of the air becomes more important, in which case an additional component must be added to the above formula:

RT60 = (0.16V)/(A + xV) seconds

where x is the absorption factor of air, given at various temperatures and humidities in acoustics references. The Sabine formula has been subject to modifications by such people as Eyring, in an attempt to make it more reliable in extreme cases of high absorption, and it should be realised that it can only be a guide.
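Fact File 1.5’s formulas translate directly into code. In this minimal sketch the room dimensions and absorption coefficients are invented for illustration (real values are tabulated, per frequency band, in the acoustics references listed at the end of the chapter):

def total_absorption(surfaces):
    # Sum of (area x absorption coefficient) over all surfaces, in m^2
    return sum(area * alpha for area, alpha in surfaces)

def rt60_sabine(volume_m3, absorption_m2, air_factor=0.0):
    # Sabine reverberation time: RT60 = (0.16 V)/(A + xV) seconds
    return 0.16 * volume_m3 / (absorption_m2 + air_factor * volume_m3)

# hypothetical 7 m x 5 m x 3 m room (V = 105 m^3)
room = [
    (35.0, 0.30),   # floor: carpet
    (35.0, 0.04),   # ceiling: plaster
    (72.0, 0.05),   # walls: plaster
]
A = total_absorption(room)                       # 15.5 m^2
print(f"RT60 = {rt60_sabine(105.0, A):.2f} s")   # about 1.08 s

Halving the reverberation time of this room would require roughly doubling its total absorption, which gives a feel for how much treatment such a change implies.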
Figure 1.15 (sound intensity level in dB against distance from source, with direct, reverberant and combined field levels plotted for a reverberant room, an average room and a ‘dead’ or ‘dry’ room) As the distance from a source increases, direct sound level drops but reverberant sound level remains roughly constant. The resultant sound level experienced at different distances from the source depends on the reverberation time of the room, since in a reverberant room the level of reflected sound is higher than in a ‘dead’ room

Clearly, in the acoustically ‘dead’ room, the conditions approach those of the free field (with sound intensity dropping at close to the expected 6 dB per doubling in distance), since the amount of reverberant energy is very small. The critical distance, at which the contribution from direct sound equals that from reflected sound, is further from the source than when the room is very reverberant. In the reverberant room the sound pressure level does not change much with distance from the source, because reflected sound energy predominates after only a short distance. This is important in room design, since although a short reverberation time may be desirable in a recording control room, for example, it has the disadvantage that the change in SPL with distance from the speakers will be quite severe, requiring very highly powered amplifiers and heavy-duty speakers to provide the necessary level. A slightly longer reverberation time makes the room less disconcerting to work in, and relieves the requirement on loudspeaker power.

Standing waves

The wavelength of sound varies considerably over the audible frequency range, as indicated in Fact File 1.5. At high frequencies, where the wavelength is small, it is appropriate to consider a sound wavefront rather like light – as a ray. Similar rules apply, such as that the angle of incidence of a sound wave at a wall is the same as the angle of reflection. At low frequencies, where the wavelength is comparable with the dimensions of the room, it is necessary to consider other factors, since the room behaves more as a complex resonator, having certain frequencies at which strong pressure peaks and dips are set up in various locations.

Figure 1.16 When a standing wave is set up between two walls of a room (spaced λ/2 apart, with maximum sound pressure at the walls and minimum between them) there arise points of maximum and minimum pressure. The first simple mode or eigentone occurs when half the wavelength of the sound equals the distance between the boundaries, as illustrated, with pressure maxima at the boundaries and a minimum in the centre
Standing waves or eigentones (sometimes also called room modes) may be set up when half the wavelength of the sound, or a multiple of it, is equal to one of the dimensions of the room (length, width or height). In such a case (see Figure 1.16) the reflected wave from the two surfaces involved is in phase with the incident wave, and a pattern of summations and cancellations is set up, giving rise to points in the room at which the sound pressure is very high and other points where it is very low. For the first mode (pictured), there is a peak at the two walls and a trough in the centre of the room. It is easy to experience such modes by generating a low-frequency sine tone into a room from an oscillator connected to an amplifier and loudspeaker placed in a corner. At selected low frequencies the room will resonate strongly and the pressure peaks may be experienced by walking around the room. There are always peaks towards the boundaries of the room, with troughs distributed at regular intervals between them. The positions of these depend on whether the mode has been created between the walls or between the floor and ceiling. The frequencies (f) at which the strongest modes will occur are given by:

f = (c/2) × (n/d)

where c is the speed of sound, d is the dimension involved (distance between walls, or floor and ceiling), and n is the number of the mode.

A more complex formula can be used to predict the frequencies of all the modes in a room, including those secondary modes formed by reflections between four or six surfaces (tangential and oblique modes respectively). The secondary modes typically have lower amplitudes than the primary modes (the axial modes) since they experience greater absorption. The formula is:

f = (c/2)√((p/L)² + (q/W)² + (r/H)²)

where p, q and r are the mode numbers for each dimension (1, 2, 3 ...) and L, W and H are the length, width and height of the room. For example, to calculate the first axial mode involving only the length, make p = 1, q = 0 and r = 0. To calculate the first tangential mode involving all four walls, make p = 1, q = 1, r = 0, and so on.

Some quick sums will show, for a given room, that the modes are widely spaced at low frequencies and become more closely spaced at high frequencies. Above a certain frequency there arise so many modes per octave that it is hard to identify them separately. As a rule of thumb, modes tend only to be particularly problematical up to about 200 Hz. The larger the room, the more closely spaced the modes. Rooms with more than one dimension equal will experience so-called degenerate modes, in which modes between two dimensions occur at the same frequency, resulting in an even stronger resonance at that frequency than would otherwise arise. This is to be avoided. Since low-frequency room modes cannot be avoided, except by introducing total absorption, the aim in room design is to reduce their effect by adjusting the ratios between dimensions to achieve an even spacing.
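The mode formula is easy to explore numerically. The following sketch (an illustrative script; the 5 × 4 × 2.5 m room is an invented example) lists a rectangular room’s modes below 80 Hz and classifies each by counting its non-zero mode numbers:

import itertools, math

C = 344.0  # approximate speed of sound in air, m/s

def mode_freq(p, q, r, L, W, H):
    # f = (c/2) * sqrt((p/L)^2 + (q/W)^2 + (r/H)^2)
    return (C / 2) * math.sqrt((p / L) ** 2 + (q / W) ** 2 + (r / H) ** 2)

def modes_below(fmax, L, W, H, nmax=8):
    kinds = {1: 'axial', 2: 'tangential', 3: 'oblique'}
    result = []
    for p, q, r in itertools.product(range(nmax), repeat=3):
        if (p, q, r) == (0, 0, 0):
            continue
        f = mode_freq(p, q, r, L, W, H)
        if f <= fmax:
            result.append((f, (p, q, r), kinds[sum(n > 0 for n in (p, q, r))]))
    return sorted(result)

for f, pqr, kind in modes_below(80.0, 5.0, 4.0, 2.5):
    print(f"{f:5.1f} Hz  {pqr}  {kind}")
# 34.4 Hz (1,0,0) and 43.0 Hz (0,1,0) axial; 55.1 Hz (1,1,0) tangential;
# then (0,0,1) and (2,0,0) coincide at 68.8 Hz because the length is exactly
# twice the height – a degenerate pair of the kind the text warns against

Sorting the (frequency, mode, kind) tuples orders the list by frequency, which makes clusters of closely spaced modes – and degenerate coincidences – easy to spot.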
A number of ‘ideal’ mode-spacing criteria have been developed by acousticians, but there is not the space to go into these in detail here. Larger rooms are generally more pleasing than small rooms, since the mode spacing is closer at low frequencies and individual modes tend not to stick out so prominently, but room size has to be traded off against the target reverberation time. Making walls non-parallel does not prevent modes from forming (since oblique and tangential modes are still possible); it simply makes their frequencies more difficult to predict.

The practical difficulty with room modes results from the unevenness in sound pressure throughout the room at mode frequencies. Thus a person sitting in one position might experience a very high level at a particular frequency whilst other listeners might hear very little. A room with prominent LF modes will ‘boom’ at certain frequencies, and this is unpleasant and undesirable for critical listening. The response of the room modifies the perceived frequency response of a loudspeaker, for example, such that even if the loudspeaker’s own frequency response is acceptable it may become unacceptable when modified by the resonant characteristics of the room. Room modes are not the only results of reflections in enclosed spaces, and some other examples are given in Fact File 1.6.

Fact file 1.6 Echoes and reflections

Early reflections

Early reflections are those echoes from nearby surfaces in a room which arise within the first few milliseconds (up to about 50 ms) of the direct sound arriving at a listener from a source (see the diagram). It is these reflections which give the listener the greatest clue as to the size of a room, since the delay between the direct sound and the first few reflections is related to the distance of the major surfaces in the room from the listener. Artificial reverberation devices allow for the simulation of a number of early reflections before the main body of the reverberant decay, and this gives different reverberation programs the characteristic of different room sizes.

[Diagram: a source and a listener, showing the direct sound path and two early reflection paths via nearby surfaces]

Echoes

Echoes may be considered as discrete reflections of sound arriving at the listener after about 50 ms from the direct sound. These are perceived as separate arrivals, whereas those up to around 50 ms are normally integrated by the brain with the first arrival, not being perceived consciously as echoes. Such echoes are normally caused by more distant surfaces which are strongly reflective, such as a high ceiling or distant rear wall. Strong echoes are usually annoying in critical listening situations and should be suppressed by dispersion and absorption.

Flutter echoes

A flutter echo is sometimes set up when two parallel reflective surfaces face each other in a room, whilst the other surfaces are absorbent. It is possible for a wavefront to become ‘trapped’ into bouncing back and forth between these two surfaces until it decays, and this can result in a ‘buzzing’ or ‘ringing’ effect on transients (at the starts and ends of impulsive sounds such as hand claps).
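The 50 ms boundary in Fact File 1.6 corresponds to a path-length difference: sound travels roughly 17 m in 50 ms. A small sketch (the geometry is invented for illustration) converts the extra distance travelled by a reflection into its arrival delay behind the direct sound:

C = 344.0  # approximate speed of sound, m/s

def reflection_delay_ms(direct_m, reflected_m):
    # delay of a reflection behind the direct sound, in milliseconds
    return (reflected_m - direct_m) / C * 1000.0

# direct path 3 m; a side-wall reflection travelling 5 m in total
print(f"{reflection_delay_ms(3.0, 5.0):.1f} ms")    # 5.8 ms: an early reflection

# a rear wall 12 m behind the listener adds 24 m of path
print(f"{reflection_delay_ms(3.0, 27.0):.1f} ms")   # 69.8 ms: heard as a discrete echo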
Recommended further reading

General acoustics
Alton Everest, F. (2000) The Master Handbook of Acoustics, 4th edn. McGraw-Hill
Benade, A. H. (1991) Fundamentals of Musical Acoustics. Oxford University Press
Campbell, M. and Greated, C. (2001) The Musician’s Guide to Acoustics. Oxford University Press
Eargle, J. (1995) Music, Sound, Technology, 2nd edn. Van Nostrand Reinhold
Egan, M. D. (1988) Architectural Acoustics. McGraw-Hill
Hall, D. E. (2001) Musical Acoustics, 3rd edn. Brooks/Cole Publishing Co.
Howard, D. and Angus, J. (2000) Acoustics and Psychoacoustics, 2nd edn. Focal Press
Rettinger, M. (1988) Handbook of Architectural Acoustics and Noise Control. TAB Books
Rossing, T. D. (2001) The Science of Sound, 3rd edn. Addison-Wesley

Chapter 2 Auditory perception

In this chapter the mechanisms by which sound is perceived will be introduced. The human ear often modifies the sounds presented to it before they are passed to the brain, and the brain’s interpretation of what it receives from the ears will vary depending on the information contained in the nervous signals. An understanding of loudness perception is important when considering such factors as the perceived frequency balance of a reproduced signal, and an understanding of directional perception is relevant to the study of stereo recording techniques. Below, a number of aspects of the hearing process will be related to the practical world of sound recording and reproduction.

The hearing mechanism

Although this is not intended to be a lesson in physiology, it is necessary to investigate the basic components of the ear, and to look at how information about sound signals is communicated to the brain. Figure 2.1 shows a diagram of the ear mechanism, not anatomically accurate but showing the key mechanical components. The outer ear consists of the pinna (the visible skin and cartilage structure) and the auditory canal, and is terminated by the tympanic membrane or ‘ear drum’. The middle ear consists of a three-bone lever structure which connects the tympanic membrane to the inner ear via the oval window (another membrane). The inner ear is a fluid-filled bony spiral device known as the cochlea, down the centre of which runs a flexible membrane known as the basilar membrane. The cochlea is shown here as if ‘unwound’ into a straight chamber for the purposes of description. At the end of the basilar membrane, furthest from the middle ear, there is a small gap called the helicotrema which allows fluid to pass from the upper to the lower chamber. There are other components in the inner ear, but those noted above are the most significant. The ear drum is caused to vibrate in sympathy wi