Speech Science and Phonetics PDF
Document Details
Uploaded by ConsummateSocialRealism
Kuwait University
Summary
This document explains the respiratory system, focusing on its role in speech production. It details the pulmonary apparatus, including the trachea, bronchi, and alveoli. The chapter also touches on how speech breathing differs from normal breathing.
Full Transcript
Chapter 2: The Respiratory System

LEARNING OUTCOMES
After reading this chapter you will be able to
- Identify the structures of the pulmonary apparatus and the chest wall system.
- Name the primary muscles of respiration and distinguish them from the accessory muscles.
- Describe how the lungs move due to pleural linkage.
- Contrast the way in which air is moved into and out of the lungs.
- List the lung volumes and capacities and explain their relationship to speech breathing.
- Identify the differences between breathing for life and breathing for speech.
- Describe patterns of speech breathing, and identify how speech breathing patterns develop and change over the lifespan.

The primary purpose of respiration is ventilation. Ventilation is the process of moving air into and out of the airways and lungs in order to exchange oxygen (O2) entering the lungs and carbon dioxide (CO2) leaving the lungs. Adequate respiratory function is also essential for voice production. The ability to speak depends on a steady outflow of air that is vibrated by the vocal folds to produce sound. The sound is then further modified by the articulators to generate the specific phonemes of whatever language is being spoken. Without this outflow of air, there would be no speech.

This chapter addresses the subject of air: how it flows into and out of the lungs; how breathing patterns change when breathing for speech and for singing, rather than for purely vegetative, life-sustaining purposes; and the pressures and flows of air in different locations of the body during breathing and speaking. The chapter begins with a description of the pulmonary airways. This provides a framework for further exploration of the mechanics of respiration in relation to normal and disordered speech production, and singing. Keep in mind that many structures of the respiratory system, including the larynx, the oral cavity, and nasal cavities, are involved not only in breathing but in phonation and articulation, as well as swallowing. This multipurpose organization of structures and systems is extremely efficient and is made possible by an extraordinary degree of control and coordination by the nervous system.

The respiratory system is divided into the pulmonary apparatus and the chest wall.

Pulmonary Apparatus

The pulmonary apparatus includes the lungs and airways and is the means whereby air containing oxygen (O2) is conducted to the lungs and transported to all the cells of the body, and carbon dioxide (CO2) is transported from the body back to the lungs and exhaled. The pulmonary apparatus is made up of the trachea, bronchi, bronchioles, alveoli, and lungs. The trachea, bronchi, and bronchioles are often referred to as the bronchial tree.

[Figure 2.1: Bronchial tree (labels: tracheal cartilage rings, primary bronchi, bronchioles, respiratory bronchioles)]

Bronchial Tree

The bronchial tree consists of a branching system of hollow tubes that conduct air to and from the lungs. Similar to a tree, the bronchial system begins with a larger tube (the trunk of the tree, in the analogy), which then divides into smaller and smaller tubes (the branches and twigs) (Figure 2.1). Directly beneath the larynx lies the trachea, corresponding to the trunk of the tree in the analogy. The trachea is a hollow tube, about 10 to 16 cm long in adults and approximately 2.0 to 2.5 cm in diameter. The trachea is made up of 16 to 20 rings of cartilage that are closed in the front and open in the back.
Between the cartilages and forming the back wall of the trachea is smooth muscle, and overlying the cartilages and muscle is a mucous membrane. The combination of cartilage and smooth muscle provides both great flexibility and support for the trachea. The support prevents the trachea from collapsing when negative pressures generated during inhalation occur within it.

[Figure 2.2: Trachea. Sectional view through the trachea and esophagus (labels: longitudinal esophageal muscle, esophageal lumen, posterior wall, tracheal cartilage ring, tracheal lumen, connective tissue sheath, anterior wall, elastin fibers)]

On the inside surface, the trachea is lined with a layer of ciliated pseudostratified columnar epithelium. The epithelium contains mucus-producing cells and millions of tiny hairlike projections called cilia. The cilia continuously move in a wavelike motion. As they slowly move downward, they pick up particles of dust, pollution, bacteria, and viruses. As they quickly move upward, this material, held together in a sticky blanket of mucus, is forcefully expelled into the throat and swallowed or coughed out. Thus, the cilia act as a filtering system to clean the air going into the lungs. See Figure 2.2.

It is well known that cigarette smoke paralyzes the cilia and prevents them from carrying out their filtering function. "Smokers' cough" occurs in the morning because the cilia begin to recover some function during sleep and try to expel some of the accumulated mucus and toxins when the individual wakes. The good news is that cilia can recover their ability to function after a person stops smoking.

The trachea splits into two branches called primary or mainstem bronchi (singular: bronchus). Each mainstem bronchus is slightly less than half the diameter of the trachea. The right bronchus enters the right lung, and the left bronchus enters the left lung.
Each primary bronchus divides into secondary bronchi that supply the lobes of the lung (two lobes in the left lung, three lobes in the right lung). Each secondary bronchus, in turn, subdivides into tertiary (segmental) bronchi that go into the small segments of the lungs (8 segments in the left lung, 10 segments in the right lung). The structure of the bronchi is very similar to that of the trachea, being composed of rings of cartilage, as well as membrane and smooth muscle. The segmental bronchi continue to branch into smaller and smaller tubes. Eventually, the bronchi divide into microscopic bronchioles, which at this point lose their cartilage and are composed of only smooth muscle and membrane. The bronchioles further branch into terminal bronchioles and then respiratory bronchioles. In total, the bronchial tree contains 20 to 28 subdivisions (Hixon & Hoit, 2005; Seikel, King, & Drumright, 1997; Zemlin, 1998). With each successive subdivision, the branches become smaller and narrower but increase in number. Consequently, the total surface area of the smaller branches is larger than that of the next larger branch. This provides an enormous amount of surface area for respiration.

The respiratory bronchioles open into alveolar ducts (Figure 2.3). Each alveolar duct leads to an alveolus, which is a microscopic, thin-walled, air-filled structure. There are millions of alveoli in the lungs, with an average of 480 million, and larger lungs have up to 900 million (Ochs et al., 2004). Each alveolus is surrounded by a dense network of tiny blood capillaries. Within each alveolus is a substance called surfactant. Surfactant keeps the alveoli inflated by lowering the surface tension of their walls, thus preventing them from being pulled inward during inspiration. The alveoli and the capillaries are extremely thin walled, allowing for the easy exchange of gases between them.

[Figure 2.3: Alveoli. The bronchioles terminate in clusters of tiny balloon-like alveoli that inflate and deflate during breathing. The alveoli are covered by a network of capillaries. The exchange of inhaled oxygen and waste carbon dioxide takes place across the extremely thin walls of these pulmonary capillaries and alveoli. Inhaled O2 passes through the thin alveolar and pulmonary capillary walls and attaches to the hemoglobin of the red blood cells. CO2 (produced as waste by cells) is carried by the red blood cells, passes through the capillary and alveolar walls, and is exhaled.]

The bronchi, bronchioles, alveoli, and blood vessels make up the interior structure of the lungs. The lungs are very porous and elastic, which allows them to change their size and shape. The lungs are not symmetrical. The right lung is larger than the left and is composed of three lobes that are separated by grooves. The left lung is smaller than the right, to make space for the heart on the left side. The left lung is therefore composed of only two lobes (Figure 2.4). The lungs in a newborn infant are a pinkish color, while the effects of environmental pollution and other toxins, such as cigarette smoke, contribute to a grayish or blackish color in adults.

[Figure 2.4: Lungs (labels: upper lobe, middle lobe, lower lobe; right lung, left lung)]

Chest Wall

The pulmonary apparatus is enclosed within the chest wall. The chest wall is made up of the rib cage, abdominal wall, abdominal contents, and diaphragm. The rib cage and diaphragm make up the thoracic cavity, which houses the lungs (Figure 2.5).

[Figure 2.5: Thoracic cavity]

The thoracic cavity is bounded by the sternum (breastbone) and rib cage on the front and sides, the spinal column and vertebrae at the back, and the diaphragm muscle at the bottom.
The rib cage is composed of 12 ribs on either side that are attached by means of cartilage (costal cartilage) to the sternum. The pectoral girdle also forms part of the rib cage. This structure includes the two collar bones (clavicles) in front and two shoulder blades (scapulae) at the back. Many muscles attach to the rib cage and to the pectoral girdle. The lungs are thus well protected and are maintained in an airtight fashion. In a healthy, uninjured individual, the only way that air can get into and out of the lungs is through the bronchial tree.

The abdominal wall forms the boundaries of the abdominal cavity and is divided into the posterior, lateral, and anterior walls. The abdominal wall is composed primarily of muscles and sheets of tendons that wrap around the front, back, and sides. The shape of the abdominal wall depends on the person's age, muscle mass, muscle tone, weight, and posture. The abdominal contents include all organs within the abdominal cavity, such as the stomach, intestines, lower esophagus, colon, appendix, liver, kidneys, pancreas, and spleen.

[Figure 2.6: Diaphragm (label: central tendon). When the diaphragm is relaxed, the lungs deflate. When the diaphragm is contracted downward, the lungs expand and inflate.]

The diaphragm muscle forms the roof of the abdominal cavity and the floor of the thoracic cavity. This is a large, dome-shaped muscle that stretches from one side of the rib cage to the other. It attaches along the lower margins of the rib cage, the sternum, and the vertebral column. The center portion of the diaphragm is composed of a flat sheet of tendon called the central tendon (Figure 2.6). The diaphragm is extremely sensitive to the shifting around of the abdominal contents, which plays an important role in breathing, particularly for speech and singing. In its relaxed state, the diaphragm is shaped like an inverted bowl, in which position the middle portion extends somewhat upward.
When it contracts, the diaphragm flattens out, with the middle portion lowering. When the diaphragm contracts, the volume of the thoracic cavity is increased in a vertical direction. In addition, because the diaphragm is attached to the lower rib cage, contraction raises the lower rib cage, and the thoracic cavity is thereby increased circumferentially as well (Hixon & Hoit, 2005).

Table 2.1 Accessory Muscles of the Neck, Thorax, and Abdomen and Their Functions in Respiration

Muscle | Function
Neck |
Scalenes | Elevate ribs 1 and 2
Sternocleidomastoid | Elevate rib cage
Thorax |
Costal levators | Elevate rib cage
Pectoralis major | Elevate rib cage
Pectoralis minor | Depress ribs 3 to 5
Serratus anterior | Elevate ribs 1 to 9
Serratus posterior inferior | Depress ribs 9 to 12
Serratus posterior superior | Elevate rib cage
Subclavius | Elevate rib cage
Subcostals | Lower rib cage
Transverse thoracic | Depress rib cage
Abdomen |
External oblique | Compress abdomen
Internal oblique | Compress abdomen
Rectus abdominis | Compress abdomen
Transverse abdominis | Compress abdomen

Pleural Linkage

For respiration to occur, the lungs must increase and decrease their volume by expanding and contracting. However, the lungs contain very little muscle, so the only way that they can move is by some external force. The external force is generated through the structure and linkage of the lungs and thorax, called pleural linkage (Figure 2.8). Each lung is covered on the outside by a thin sheet of membrane, the visceral pleura. The inner surface of the thorax is lined with another layer of membrane, the parietal pleura. These two pleural layers are really one continuous membrane folded back on itself rather than two separate membranes. Between these two pleurae (singular: pleura) is a very small potential space, the pleural space, that contains pleural fluid. The pressure inside this space (intrapleural pressure [Ppl]) is negative.
The difference between the pressure inside the lungs (alveolar pressure [Palv]) and the intrapleural pressure is called transpulmonary pressure. Under normal conditions, the transpulmonary pressure is always positive; intrapleural pressure is always negative; and alveolar pressure changes from slightly positive to slightly negative as a person breathes (West, 2005) (Table 2.2).

[Figure 2.8: Pleural linkage (label: visceral pleura)]

Table 2.2 Lung Pressures

Pressure | Location | Positive/Negative
Intrapleural pressure (Ppl) | Between visceral and parietal pleurae | Negative
Alveolar pressure (Palv) | Within the lungs | Changes from positive to negative
Transpulmonary pressure | Difference between Ppl and Palv | Positive

The pleural space is negative in pressure due to the opposing recoil forces of the lung and chest wall (Lai-Fook, 2004). That is, the elastic recoil of the lungs pulls the visceral layer inward, while the elastic tendency of the thorax pulls the parietal layer outward. The space between the pleurae is therefore permanently slightly expanded, which lowers the intrapleural pressure. When two structures

Chapter 4: The Phonatory/Laryngeal System

Muscles of the Larynx

[Figure 4.11: Glottal shapes (adducted; quiet breathing; forced abduction during forced inhalation)]

Intrinsic Muscles

There are five major intrinsic muscles of the larynx. These muscles have both their origin and insertion within the larynx itself. Two of the five muscles function to adduct the vocal folds, one muscle is the vocal fold abductor, one elongates and tenses the folds, and one forms the main body of the vocal folds. The intrinsic muscles of the larynx are shown in Figures 4.13, 4.14, and 4.15 and listed in Table 4.3.

[Figure 4.12: Extrinsic laryngeal muscles (labels: mastoid process; styloid process; 1 stylohyoid; 2 mylohyoid; 3 digastric muscle, anterior belly; 4 digastric muscle, posterior belly; 5 omohyoid; 6 sternohyoid; 7 cricothyroid (intrinsic muscle); 8 thyrohyoid; 9 sternothyroid; 10 sternocleidomastoid)]
Table 4.2 Extrinsic Muscles of the Larynx

Muscle | Attachments | Function
Infrahyoids | |
Sternohyoid | Clavicle and sternum to body of hyoid | Depresses hyoid bone and larynx
Sternothyroid | First costal cartilage and sternum to oblique line of thyroid lamina | Depresses hyoid bone and larynx
Omohyoid | Scapula to inferior border of hyoid bone | Depresses and retracts hyoid
Thyrohyoid | Oblique line of thyroid lamina to major horn of hyoid | Draws hyoid and thyroid closer to each other
Suprahyoids | |
Digastric, posterior belly | Mastoid process of temporal bone to hyoid bone | Elevates hyoid bone
Digastric, anterior belly | Mandible to intermediate tendon of digastric muscle | Elevates hyoid bone
Stylohyoid | Styloid process of temporal bone to body of hyoid | Elevates and retracts hyoid bone
Mylohyoid | Body of mandible to hyoid | Elevates hyoid bone
Geniohyoid | Mental symphysis of mandible to body of hyoid bone | Pulls hyoid bone anteriorly and superiorly

The first adductor is a paired muscle, the lateral cricoarytenoid (LCA). This muscle arises from the lateral border of the cricoid and inserts into the muscular process of the arytenoids. When it contracts, it pulls the muscular process in an anterior and medial direction. This has the effect of pulling the vocal processes toward each other in an inward and downward movement. The vocal folds, attached to the vocal processes, are also brought toward each other, closing the membranous glottis.

The interarytenoid (IA) muscle is the second adductor. This is an unpaired muscle consisting of two bundles of muscle fibers. The transverse fibers run horizontally across the posterior portions of the two arytenoid cartilages. The oblique fibers course from the base of one arytenoid to the apex of the other arytenoid, creating a cross between the posterior surfaces of the arytenoids. When the interarytenoid muscle contracts, it glides the arytenoid cartilages medially toward each other, closing the posterior portion of the glottis.
Some fibers of the oblique IA extend superiorly, forming the aryepiglottic muscle located within the aryepiglottic folds.

The paired posterior cricoarytenoid (PCA) muscle is the only one that abducts the vocal folds to open the glottis. The PCA is a large, fan-shaped muscle that originates on the posterior aspect of the cricoid cartilage and inserts into the muscular process of each arytenoid cartilage. Upon contraction, the muscular processes are rotated posteriorly, which has the effect of pulling the vocal processes and vocal folds away from each other and opening the glottis.

[Figure 4.15: Actions of the intrinsic muscles (cricothyroid muscles, lateral cricoarytenoid muscles, posterior cricoarytenoid muscles, interarytenoid muscle)]

The paired cricothyroid (CT) muscle is also composed of two sets of muscle fibers. The pars recta portion originates at the lateral surface of the cricoid cartilage. Its fibers run at an almost upright angle to insert into the inferior border of the thyroid cartilage. The pars oblique portion originates at the same place as the pars recta, but its fibers run in a more angled direction and insert at the anterior surface of the inferior horn of the thyroid cartilage. This muscle functions as a pitch changer. When it contracts, the thyroid cartilage is tilted downward toward the cricoid, thus increasing the distance between the anterior commissure of the thyroid and the arytenoid cartilages. The vocal folds are attached anteriorly at the anterior commissure and posteriorly at the arytenoids. Therefore, increasing the distance between these points stretches and elongates the vocal folds, decreasing their mass per unit of area and increasing the longitudinal tension placed on them. This increases their rate of vibration, resulting in a higher frequency, perceived as a higher pitch.
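The link between stretching, tension, and vibration rate can be illustrated with a deliberately simplified model. The sketch below treats a vocal fold as an ideal string fixed at both ends, which is an assumption made here for illustration and not a claim from the chapter; real vocal folds are layered tissue and behave far less simply. The function name and all numeric values are hypothetical.

```python
import math

def string_f0(length_m, tension_n, mass_per_length_kg_m):
    """Fundamental frequency (Hz) of an ideal string fixed at both ends.

    f0 = (1 / (2 * L)) * sqrt(T / mu)
    This is a textbook physics idealization, not a vocal fold model.
    """
    return math.sqrt(tension_n / mass_per_length_kg_m) / (2.0 * length_m)

# Hypothetical illustrative values, not measurements of real vocal folds.
relaxed = string_f0(length_m=0.015, tension_n=0.5, mass_per_length_kg_m=0.005)

# Stretching: slightly longer, less mass per unit length, more tension.
stretched = string_f0(length_m=0.017, tension_n=1.0, mass_per_length_kg_m=0.0045)

print(f"relaxed:   {relaxed:.1f} Hz")
print(f"stretched: {stretched:.1f} Hz")
```

In this idealization the added length alone would lower the frequency, but the gain in tension and the loss in mass per unit length outweigh it, so the stretched configuration vibrates faster.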
The fifth muscle, the thyroarytenoid (TA), forms the main mass of the true vocal folds and comprises the body in the cover-body model. The TA is a paired muscle, coursing from the anterior commissure to the arytenoids. The more lateral fibers, sometimes known as the thyromuscularis, insert into the muscular process of each arytenoid. The more medial fibers, sometimes referred to as the thyrovocalis (or just vocalis), insert into each vocal process. In addition to forming the bulk of the vocal folds, the TA can exert internal tension that stiffens it and helps to increase the rate of vibration of the vocal folds.

Chapter 5: Clinical Application: Evaluation and Treatment of Phonatory Disorders

LEARNING OUTCOMES
After reading this chapter you will be able to
- Describe acoustic measures of phonatory function including those related to frequency, intensity, perturbation, and noise.
- Compare and contrast laryngeal visualization methods, including electroglottography, endoscopy, videostroboscopy, high-speed digital imaging, and videokymography.
- Discuss the uses of acoustic and visual information with regard to neurological disorders, benign and cancerous lesions of the vocal folds, hearing impairment, and treatment of voice in transsexual individuals.
- Describe measures of phonation that are used in intervention approaches to stuttering.

Measurement of Phonatory Variables

Phonatory and laryngeal function can be measured acoustically and/or visually. The chapter begins with a discussion of the most commonly used acoustic measures, including frequency and intensity variables; perturbation measures, including jitter and shimmer; and measures of noise in the voice, such as harmonics-to-noise ratio, noise-to-harmonics ratio, and normalized noise energy. Discussion then focuses on indirect and direct laryngeal visualization techniques, including electroglottography, endoscopy, videostroboscopy, high-speed digital imaging, and videokymography.
The next section highlights evaluation and treatment of communication disorders that are often characterized by impaired laryngeal structure and/or function, such as certain neurological disorders, hearing impairment, laryngeal cancer, benign tumors of the vocal folds, muscle tension dysphonia, gastroesophageal reflux disease/laryngopharyngeal reflux, transsexual voice, and stuttering.

Acoustic Analysis

As described in Chapter 4, the vibration of the vocal folds generates a nearly periodic, complex sound wave with a fundamental frequency (F0) and harmonics. The F0 is the rate at which the vocal folds vibrate and corresponds to the perceived pitch of the voice. The vocal folds also generate a certain degree of amplitude during vibration, corresponding to the loudness of the speaker's voice. Measurements of vocal frequency and amplitude that are commonly used in clinical practice include average F0, frequency variability, maximum phonational frequency range, average amplitude level, amplitude variability, and dynamic range.

Average fundamental frequency

The rate of vibration of the vocal folds depends on their length, tissue density, and tension. The greater the length and tissue density and the less the tension and stiffness, the slower the vocal folds will vibrate and the lower the person's pitch. Vocal folds that are shorter, less dense, and stiffer will vibrate more quickly and give rise to the perception of a higher pitch. Vocal F0 is determined primarily by the tension of the vocal fold cover rather than by the actual length of the vocal folds (Kent, 1997b). As described in Chapter 4, F0s differ depending on age and gender (see Tables 4.4 and 4.5). Average F0 is derived by measuring F0 in a particular task, such as sustaining a vowel, reading aloud, or conversational speech, and averaging over the speaking time of that task.
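As a rough illustration of that averaging step, the sketch below computes a mean F0 from a list of frame-by-frame F0 estimates. It assumes, hypothetically, that the estimates are evenly spaced in time and that unvoiced or silent frames are marked as 0 or None; the function name and sample values are invented for the example and do not come from any particular analysis program.

```python
from statistics import mean

def average_f0(f0_track_hz):
    """Mean F0 (Hz) over voiced frames only.

    Assumes evenly spaced frames; unvoiced frames are 0 or None and are skipped.
    """
    voiced = [f for f in f0_track_hz if f]
    if not voiced:
        raise ValueError("no voiced frames in the sample")
    return mean(voiced)

# Hypothetical F0 track from a short stretch of speech.
track = [0, 118.0, 121.0, 125.0, None, 119.0, 117.0, 0]
avg = average_f0(track)
print(f"average F0: {avg:.1f} Hz")
```

Skipping unvoiced frames matters: including the zeros would pull the average far below the speaker's actual vibration rate.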
When average F0 is measured in an oral reading or conversational speech task, it is often referred to as the person's speaking fundamental frequency (SFF). Infants in the first several years of life have a very high average F0, from around 350 to almost 500 Hz. From about age 3 to 10 years, both boys and girls have an average F0 of approximately 270 to 300 Hz. After puberty the average F0 for males drops markedly to around 120 Hz, corresponding to an octave decrease; that for females decreases less dramatically to about 220 Hz (2-3 semitones). As we saw in Chapter 4, average F0 for older men may increase, while that for older women may decrease.

Frequency variability

People constantly change their F0 levels as they speak to reflect different emotions, different types of accenting and stress of syllables, and different grammatical constructions. The F0 changes contribute to the overall melody, or prosody, of speech. For instance, the sentence "Peter's going home" can be said either as a declarative statement or as a question. As a declarative, the F0 level drops at the end of the utterance, whereas for a question, the F0 level rises at the end of the utterance. The prosody of a sentence also is influenced by the mood of the speaker. There are likely to be many more F0 changes, and more extensive changes, when the individual is excited that Peter's going home than when the speaker is depressed by Peter's plans. Acoustically, these F0 changes correspond to frequency variability.

A certain amount of frequency variability is desirable in a speaker's voice, depending on the individual's age, sex, culture, social context, and mood. The variability is something that speakers of a particular language in a particular culture intuitively recognize. Too much or too little frequency variability sounds wrong and can indicate a functional, organic, or neurogenic voice problem.
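One way to put a number on this variability is the spread of F0 values around their mean, expressed either in hertz or in semitones (12 semitones make one octave, which is a doubling or halving of frequency). The sketch below is a minimal illustration under stated assumptions: it uses the sample standard deviation, converts to semitones relative to the mean F0, and works on invented F0 values. The function names are hypothetical.

```python
import math
from statistics import mean, stdev

def semitone_interval(f_ref_hz, f_hz):
    """Signed interval in semitones from f_ref_hz to f_hz."""
    return 12.0 * math.log2(f_hz / f_ref_hz)

def f0_variability(f0_values_hz):
    """Return (SD in Hz, SD in semitones relative to the mean F0).

    Uses the sample standard deviation; that choice is an assumption here.
    """
    avg = mean(f0_values_hz)
    sd_hz = stdev(f0_values_hz)
    sd_st = stdev([semitone_interval(avg, f) for f in f0_values_hz])
    return sd_hz, sd_st

# Hypothetical voiced-frame F0 values (Hz) from two speaking styles.
lively = [95.0, 110.0, 140.0, 160.0, 120.0, 100.0, 150.0]
flat = [118.0, 120.0, 122.0, 119.0, 121.0, 120.0, 120.0]

for label, values in (("lively", lively), ("flat", flat)):
    sd_hz, sd_st = f0_variability(values)
    print(f"{label}: SD = {sd_hz:.1f} Hz, {sd_st:.2f} semitones")
```

Expressing the spread in semitones makes voices with different average F0 easier to compare, because a semitone is a frequency ratio rather than a fixed number of hertz.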
F0 variability is measured in terms of standard deviation (SD) from the average F0. Standard deviation is a statistical measure that reflects the spread of scores around the average score; thus standard deviation of F0 reflects the spread of F0 values around the average F0. When the variability is measured in hertz, it is called F0SD. F0SD in normal conversational speech is around 20 to 35 Hz or higher. F0SD is likely to increase when the speaker is excited or agitated. In an isolated vowel F0SD is typically much smaller as the speaker attempts to maintain a relatively steady F0 for the duration of the vowel. Sometimes F0SD is converted to semitones. When the frequency variability is measured in semitones rather than hertz, it is called pitch sigma. Pitch sigma for normal speakers during conversation should be around 2 to 4 semitones for both males and females (Colton & Casper, 1996).

Another measure of F0 variability is the range, which is the difference between the highest and lowest F0 in a particular sample of speech. The range can

turbulence results in lower-than-normal HNR or higher-than-normal NHR values. For example, if a person has some kind of problem in vocal fold vibration resulting from a growth (e.g., polyp, nodule), paralysis of one or both vocal folds, or other kind of laryngeal problem, a larger amount of air than normal escapes during vibration, creating turbulent noise and decreasing the harmonic components of the voice (Pabon & Plomp, 1988).

Table 5.4 Average Values for Harmonics-to-Noise Ratios (HNRs) Reported for Normally Speaking Adults (A) and Children (C)

Authors | A/C | HNR
Awan & Frenkel (1994) | A | 15.63 (males), 15.38 (females)
Bertino et al. (1996) | A | 7.23
Dehqan et al. (2010) | A | 18.42 (males), 18.81 (females)
Ferrand (2000) | C | 2.346 (girls), 2.368 (boys)
Ferrand (2002) | A | 7.82 (young women), 7.86 (middle-aged women), 5.54 (elderly women)
Horii & Fuller (1990) | A | 17.3 (males), 19.1 (females)
Yumoto et al. (1982) | A | 7.3

HNRs show a high correlation with perceptual judgments of voice quality, so this measure can be used to make objective, quantitative assessments of breathiness or roughness. HNRs for normally speaking individuals have been reported by various researchers as shown in Table 5.4. HNRs for children and for elderly adults have been reported to be lower than for young and middle-aged adults (Ferrand, 2000; 2002).

Laryngeal Visualization Methods

Techniques to visualize the larynx have become valuable tools in the assessment and treatment of vocal fold function and laryngeal pathologies. Electroglottography is an indirect measure, while endoscopy, videostroboscopy, high-speed digital imaging, and videokymography provide direct imaging of laryngeal structure and function.

Electroglottography

Electroglottography (EGG), also sometimes called laryngography, has become a popular tool for evaluating vocal fold function safely and noninvasively. Originally

Evaluation and Treatment of Communication Disorders Involving the Phonatory System

Patients with disorders that affect phonation are very commonly treated in hospitals, clinics, private practices, nursing homes, preschools, and elementary and secondary schools. Acoustic and visual information is helpful in the clinical management of neurological disorders, hearing impairment, laryngeal cancer, benign tumors of the vocal folds, disorders resulting from laryngeal muscle tension, transsexual speech, and stuttering.

Neurological Disorders

Many different types of neurological disorders can affect phonation, including amyotrophic lateral sclerosis, Parkinson's disease, unilateral vocal fold paralysis, and spasmodic dysphonia.

Amyotrophic lateral sclerosis

Amyotrophic lateral sclerosis (ALS) is a progressive neurological disease in which all the motor functions of the body deteriorate.
The degeneration is due to damage to the motor nerves that supply the voluntary muscles of the body, including those involved in speech production.

Acoustic features

Speakers with ALS demonstrate a variety of acoustic changes including increased levels of jitter, smaller maximum phonational frequency ranges, abnormal F0 levels, and reduced ranges of frequency during connected speech (Silbergleit et al., 1997; Strand, Buder, Yorkston, & Olson Ramig, 1994). Some of these changes may be detected acoustically even when the speakers perceptually sound normal. Acoustic analysis may therefore provide a means by which early oral-facial and laryngeal signs of the disease can be detected.

Clinical application

Knowing that neuromuscular weakness is present despite the normal-sounding voice, clinicians can offer intervention at early stages of the disease in order to maintain the patient's vocal function for as long as possible. Acoustic analysis has also indicated that the vocal characteristics of patients with ALS are not uniform but vary greatly from individual to individual. While changes in F0 seem to be present consistently in patients with ALS, some speakers have lower-than-normal F0 levels, while others demonstrate higher-than-normal F0 levels. In addition, some but not all individuals with ALS exhibit a reduced range of F0 during connected speech. This type of information is critical in planning effective therapy tailored to each individual's particular voice and speech deficits.

Parkinson's disease

The underlying pathophysiology in Parkinson's disease (PD) is the reduction in the neurotransmitter dopamine. The depletion of dopamine results in the characteristic signs and symptoms of PD, including muscle rigidity, bradykinesia (difficulty and/or slowness in initiating movements), and tremor.
Patients with PD often suffer from voice difficulties, including hoarseness, reduced loudness, and reductions in pitch range. In many cases, the voice difficulties are the first sign of PD (Logemann, Fisher, Boshes, & Blonsky, 1978).

Acoustic features

Acoustic characteristics that have been reported for individuals with PD include higher F0, higher levels of jitter, lower intensity levels during connected speech, decreased frequency variability, and decreased dynamic range (Gamboa et al., 1997; Midi et al., 2008). The higher F0 and decreased frequency variability may indicate increased laryngeal tension in some patients (Goberman & Blomgren, 2008; Harel, Cannizzaro, & Snyder, 2004). These acoustic data support the perceptual impressions of restricted pitch and loudness ranges, which are the common complaint of speakers with PD. Furthermore, these measures have the added advantage of quantifying the precise degree of loss of frequency and intensity ranges compared to normal.

Acoustic manifestations of loss of laryngeal control may be present years before the individual demonstrates clinical signs of the disease. Harel et al. (2004) analyzed F0 variability in conversational speech in one speaker with PD and one healthy speaker. Speech samples were collected from more than five years before the speaker with PD was diagnosed, as well as several years following diagnosis and treatment with anti-Parkinson medications. The authors reported a decrease in F0 variability starting several years prior to diagnosis and extending until the start of medical treatment. Once the speaker began taking the medication, the F0 variability in conversational speech normalized. The authors suggested that acoustic analysis may be helpful to track the progression of the disease as well as the patient's response to treatment.
Clinical application

Acoustic measures have been used to evaluate the effectiveness of pharmacological and surgical treatments for PD. The most commonly used drug for PD is levodopa (L-dopa). L-dopa is converted into dopamine in the brain by means of "converter" cells, thus replacing the dopamine that is lacking. This reduces the primary problems of rigidity and tremor. However, acoustic measures of voice in patients with early-stage PD taking L-dopa show increased levels of jitter, shimmer, and NHR, indicating reductions in vocal stability despite the medication (Midi et al., 2008).

Surgical treatments for PD include fetal cell transplantation (FCT) and deep brain stimulation (DBS). In FCT, suitable fetal cells are transplanted into the appropriate site in the patient's brain. The aim of this procedure is to replenish the missing dopamine and thus alleviate the symptoms. Baker, Ramig, Johnson, and Freed (1997) reported that, although FCT surgery was effective in improving overall motor performance of the patients studied, acoustic analysis failed to demonstrate improvements in the patients' phonation capacities. DBS involves the insertion of implant wires into a specific brain area and implantation of a pulse generator in the patient's chest. When activated by the patient, the device transmits electrical impulses to the brain, which has the effect of reducing the rigidity and blocking the tremors (Benabid, 2003; Trepanier, Kumar, Lozano, Lang, & Saint-Cyr, 2000). However, acoustic analysis has demonstrated mixed results in terms of patients' voices. Some researchers have reported no improvements in voice (e.g., Valalik, Smehak, Bognar, & Csokay, 2011; Xie et al., 2011); others have reported reduced jitter and increased vocal stability (Mate, Cobeta, Jimenez-Jimenez, & Figueiras, 2012).

EGG has been used to compare different types of treatment for patients with PD.
Reduction in vocal intensity is a common complaint of speakers with PD. Ramig and Dromey (1996) compared two treatments designed to increase vocal loudness. One treatment, the respiratory treatment (RT), was designed to increase the activity of the respiratory musculature to generate increased volumes and subglottal air pressure for speech. The other treatment, known as the Lee Silverman Voice Treatment (LSVT), targeted increased vocal intensity through improved vocal fold adduction. EGG analysis demonstrated that the patients who received LSVT increased their vocal fold adduction and corresponding loudness levels, whereas the group that received only RT showed no improvements in vocal fold adduction and loudness. This objective finding allowed the authors to suggest that treatment approaches that focus solely on respiration to increase loudness may be counterproductive in patients with PD.

Dr. Martinez, a 68-year-old college professor, referred herself to the university speech-language-hearing clinic due to concerns about her voice. She reported that she had been having difficulty raising her voice during class, and also noticed that her voice sounded "flat" and "monotonous." She was worried that these vocal issues had been negatively influencing her effectiveness as a teacher. Acoustic analysis using the clinic's Visi-Pitch indicated severely reduced frequency variability (8.3 Hz) during connected speech. Average amplitude during connected speech was 49.6 dB. Maximum phonational frequency range was less than one octave, and the maximum amplitude Dr. Martinez was able to generate was 53.7 dB. Based on these values, the clinician suspected the presence of a neurological problem and referred Dr. Martinez to a neurologist. The neurologist confirmed a diagnosis of Parkinson's disease and put Dr. Martinez on a pharmacological regimen. Dr. Martinez also began receiving speech services at the university clinic using the LSVT.
Re-evaluation after six months of speech therapy demonstrated that Dr. Martinez's frequency variability during conversation had increased to 18.4 Hz, and average amplitude during conversation had increased to 57.3 dB. Dr. Martinez reported that she felt much more confident about her teaching.

The effectiveness of FCT in patients with PD has also been examined by means of EGG measures. Baker et al. (1997) obtained acoustic and EGG recordings of patients undergoing this procedure for three consecutive days before surgery and some months postsurgery. The patients' limb movements did show notable improvements after surgery. However, the EGG measures did not differ much before and after surgery. This was consistent with perceptual measures that indicated that listener ratings of speech presurgery were not remarkably different from ratings after FCT surgery.

Unilateral vocal fold paresis/paralysis

Paresis refers to a slight or partial paralysis, resulting in partial loss of muscle strength and movement. The intrinsic laryngeal muscles may be paretic if either the recurrent laryngeal nerve or superior laryngeal nerve is damaged. Although the vocal folds are able to move, their ability to adduct, abduct, or regulate their tension is reduced, particularly when the individual performs repetitive phonatory tasks (Heman-Ackah & Batory, 2003; Rubin et al., 2005). The condition may be unilateral or bilateral. The degree of loss of mobility usually depends on the severity of the injury and can range from mild to severe. Unilateral vocal fold paralysis (UVFP) results from damage to the vagus nerve and its laryngeal branches (recurrent laryngeal nerve and superior laryngeal nerve).

patients demonstrated the reappearance of vocal fold vibration and improvement in vocal quality (Szkielkowska, Miaskiewicz, Remacle, Krasnodebska, & Skarzynski, 2013).
In addition to visual inspection of vibratory parameters, researchers have reported several methods of analyzing open and closed quotients of the vibratory cycle during phonation by counting individual frames of stroboscopy recordings. Frame-by-frame analysis of vocal fold opening, open, closing, and closed phases can provide information regarding problems in vocal fold closure (glottic insufficiency). One such method involves calculating the proportion of frames in which the glottis is closed in relation to the total number of frames per glottic cycle. Vocally normal speakers have been shown to have 50 percent closed duration, whereas speakers with glottic insufficiency were shown to have closure durations of less than 40 percent (Carroll, Wu, McRay, & Gherson, 2012).

Another way in which glottal closure can be evaluated with stroboscopy is by counting the number of pixels in specific video frames and dividing this number by the square of the average length of the right and left vocal folds. Kimura, Nito, Imagawa, Tayama, and Chan (2008) used this method in combination with perceptual and aerodynamic measures to evaluate patients' voices before and after vocal fold medialization surgery and/or vocal fold injection laryngoplasty. The authors reported that both techniques were successful in increasing relative glottal area, demonstrating more complete vocal fold closure following treatment.

Videokymography (VKG) has been used to examine fine details of vocal fold vibratory patterns in patients with UVFP. Kimura et al. (2010) used kymography plus acoustic analysis to compare patients' voices before and after arytenoid adduction surgery. Prior to surgery, jitter and shimmer measures were increased. VKG showed that the paralyzed vocal fold did not reach the midline, resulting in incomplete glottal closure for all participants. VKG also revealed different vibration frequencies in the left and right vocal folds.
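The two stroboscopy-based measures described earlier (the proportion of closed frames per glottic cycle, and pixel count divided by the squared average vocal fold length) reduce to simple arithmetic. The sketch below is a minimal illustration only. The function names and the sample numbers are hypothetical, and it assumes that the pixels being counted are those of the visible glottal gap and that fold lengths are measured in pixels in the same frame.

```python
def closed_proportion(closed_frames, total_frames):
    """Proportion of frames in one glottic cycle in which the glottis is closed."""
    if total_frames <= 0 or not 0 <= closed_frames <= total_frames:
        raise ValueError("frame counts must satisfy 0 <= closed <= total, total > 0")
    return closed_frames / total_frames


def normalized_glottal_area(gap_pixels, left_fold_px, right_fold_px):
    """Glottal gap pixel count divided by the squared mean vocal fold length.

    Dividing by a squared length makes the value unitless, so recordings
    made at different magnifications can be compared.
    """
    mean_length = (left_fold_px + right_fold_px) / 2
    if mean_length <= 0:
        raise ValueError("fold lengths must be positive")
    return gap_pixels / mean_length ** 2


# Hypothetical cycle of 20 frames, 10 of them closed: 50 percent closed.
print(closed_proportion(10, 20))
# Hypothetical frame: 400 gap pixels, both folds 100 pixels long.
print(normalized_glottal_area(400, 100, 100))
```

A closed proportion near 0.5 would match the value reported for vocally normal speakers, and a value below 0.4 would match the range reported for glottic insufficiency (Carroll et al., 2012).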
After surgery, VKG showed marked improvements in phonation, including reductions in the size of the glottal gap, improved glottal closure, and identical or nearly identical frequencies in the left and right vocal folds for all patients.

Spasmodic dysphonia

Spasmodic dysphonia (SD) is a neurological voice disorder in which the individual suffers spasms of the vocal folds. Depending on which intrinsic laryngeal muscles are involved, the spasms may cause the vocal folds to either adduct (adductor spasmodic dysphonia [ADSD]) or abduct (abductor spasmodic dysphonia [ABSD]) inappropriately during phonation. In ADSD, the laryngospasms cause the vocal folds to adduct so tightly that they are very difficult to set into vibration. The voice quality of individuals with this disorder is called strained-strangled because of the severely strained and tense quality.

Acoustic features

The most common acoustic characteristics of ADSD are voice breaks, aperiodicity, erratic frequency shifts, higher standard deviation of F0, higher jitter and shimmer values, and lower signal-to-noise ratios (e.g., Sapienza, Walton, & Murry, 2000; Zwirner, Murry, Swenson, & Woodson, 1991).

Clinical application

Botox injection has become the gold standard for treatment and is used to alleviate the laryngospasms. The toxin is injected into the affected muscle, which in the case of ADSD is usually the thyroarytenoid muscle. The toxin temporarily weakens or even paralyzes the muscle. Therefore, during vibration, the vocal folds are unable

Clinical application

The goal of voice treatment for MTF speakers is to emphasize and highlight the markers of female speech.
Markers include a higher pitch, greater intonational range and pitch variability, increased vocal expression, rising intonation on statements, breathier voice quality, feminine patterns of phrasing, and nonverbal visual markers, such as increased eye contact, increased hand/arm gestures, and increased use of touch (Parker, 2008).

F0 is the most salient cue to gender identification for trans speakers as well as for biological men and women (Gelfer & Mikos, 2005). Therefore, raising an individual's pitch has typically been targeted as the most important goal of treatment. Research has established a cutoff F0 of around 150 to 173 Hz, below which speakers are perceived as male and above which speakers are perceived as female (e.g., Brown, Perry, Cheesman, & Pring, 2000; Gelfer & Schofield, 2000; Wolfe, Ratusnik, Smith, & Northrop, 1990). The range between approximately 145 and 165 Hz forms a gender-ambiguous F0 zone in which the speaker's gender is not identifiable (Adler, Hirsch, & Mordaunt, 2006; Thornton, 2008). This zone forms the initial target for pitch-raising exercises.

Pitch range and variability are aspects of intonation that are strong markers of gender identity. Female speakers typically use a wider range of F0s and a more varied pattern of pitch inflections, while male speakers tend to use a more restricted F0 range and fewer inflectional patterns (Ferrand & Bloom, 1996). Studies have demonstrated that trans speakers' voices identified as female are characterized by more pitch inflections both upward and downward, less extensive downward shifts, a greater proportion of upward shifts, and fewer level intonation patterns (e.g., Gelfer & Schofield, 2000; Wolfe et al., 1990). Women's intonational patterns differ from men's not only in range but in phrase endings. Women tend to use more rising inflections at the end of utterances, often giving their speech a somewhat tentative sound.
In fact, a speaker who uses a lower pitch but a more feminine intonational pattern and style tends to sound more feminine than one who uses a higher pitch but fewer feminine patterns.

Palmer, Dietsch, and Searl (2012) used endoscopy and stroboscopy to examine laryngeal aspects of voice in MTF speakers when they were using their feminine voice. They reported that rather than complete glottal closure during phonation, as expected for a male larynx, glottal closure was incomplete in most of the speakers. This resulted in a more extended open phase during vibration. In addition, most of the speakers showed vocal hyperfunction. Because female speakers typically show a slight degree of posterior opening during vibration, which gives the female voice a slight degree of breathiness, the authors suggested that incomplete closure may work well for a trans speaker to create a more female-sounding voice. In order to facilitate this laryngeal position, visual feedback using laryngeal imaging procedures may be beneficial.

Stuttering

Difficulties with control of the phonatory system are very common in people who stutter, and these may be evident even when the individual sounds perceptually fluent. Research has shown that people who stutter are slower to initiate phonation during reaction time experiments even during perceptually fluent utterances (e.g., Adams & Hayden, 1976; Max & Gracco, 2005). Watson, Pool, Devous, Freeman, and Finitzo (1992) examined laryngeal reaction times in response to simple and more complex utterances in normally fluent speakers and individuals who stuttered. Participants were required to produce an utterance (either /a/, the word "Oscar," or the phrase "Oscar took the dog out") as quickly as possible following a visual stimulus.
Some of the participants who stuttered (those who demonstrated a certain pattern of cerebral blood flow) showed a longer response time for the phrase than for the isolated vowel and single word conditions. Fluent speakers, and those individuals who stuttered but demonstrated patterns of cerebral blood flow similar to those of fluent speakers, did not exhibit longer response times depending on the type of utterance.

Laryngeal and acoustic characteristics

Individuals who stutter often show differences in patterns of vocal fold vibration. Sebastian, Benedict, and Balraj (2013) used EGG to examine opening time (time taken for the vocal folds to separate), open time (the interval of time during which the vocal folds remain in the lateral position), closing time (time taken for the folds to close the glottis), and closed time (the interval of time during which the glottis is closed). The participants who stuttered differed significantly from the fluent subjects, demonstrating shorter opening and open durations and longer closing and closed durations. The men who stuttered also demonstrated significantly higher F0s than the fluent controls (210.8 Hz compared to 127.4 Hz). The authors speculated that this was due to increased laryngeal tension.

Another laryngeal measure that differs between individuals who stutter and normally fluent speakers is seen in phonated intervals (PIs). PIs refer to the amount of time that the vocal folds vibrate during selected utterances (Davidow, Bothe, Richardson, & Andreatta, 2010). Speakers produce a number of PIs of varying duration in a specified amount of speaking time (Davidow, Bothe, Andreatta, & Ye, 2009). Research has shown that people who stutter tend to produce very short PIs compared to fluent speakers. However, when individuals who stutter consciously reduce the number of short PIs, the stuttering is greatly reduced (e.g., Davidow, Bothe, Andreatta, & Ye, 2009; Ingham, Ingham, Bothe, Wang, & Kilgo, 2015).
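To make the idea of phonated intervals concrete, the sketch below derives PI durations from a frame-by-frame voicing track and counts the short ones. It is a simplified illustration, not the procedure used in the cited studies: the function names, the 10 ms frame size, and the cutoff used for a "short" PI are all hypothetical.

```python
from itertools import groupby


def phonated_intervals(voiced_flags, frame_ms):
    """Return the duration (ms) of each unbroken run of voiced frames.

    voiced_flags: sequence of truthy/falsy values, one per analysis frame.
    frame_ms: duration of one frame in milliseconds (assumed constant).
    """
    return [
        sum(1 for _ in run) * frame_ms
        for is_voiced, run in groupby(voiced_flags, key=bool)
        if is_voiced
    ]


def count_short_intervals(intervals_ms, max_ms):
    """Count PIs no longer than max_ms. The cutoff is a hypothetical parameter."""
    return sum(1 for d in intervals_ms if d <= max_ms)


# Hypothetical voicing track sampled every 10 ms (1 = vocal folds vibrating).
track = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
pis = phonated_intervals(track, frame_ms=10)
print(pis)                                    # [30, 20, 10]
print(count_short_intervals(pis, max_ms=20))  # 2
```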
Suprasegmental aspects of speech (i.e., prosody, stress, intonation) may also differ between people who stutter and fluent speakers. Linguistic stress is achieved by increasing the F0, amplitude, and duration of the target syllable or word in relation to the surrounding nonstressed portions of the utterance. To achieve this, speakers must constantly make coordinated adjustments to the respiratory/vocal and articulatory systems from syllable to syllable (Packman, Onslow, & Menzies, 2000). Bosshardt, Sappok, Knipschild, and Holscher (1997) examined the way in which participants imitated F0 in sentences with varied patterns of stress. Only the fluent productions of the subjects who stuttered were analyzed and compared with those of the nonstuttering subjects. The individuals who stuttered produced a much smaller increase in F0 than the controls and marked the stressed portions of the sentence by increasing duration more than did the fluent speakers.

Clinical application

It has been known for centuries, and even millennia, that stuttering is eliminated or reduced during some types of activities, such as singing, slowed or prolonged speech, reading in chorus with another person, and speaking in time to a rhythmic beat, such as a metronome. These conditions have been termed "fluency inducing conditions" (FICs). Wingate (1979) proposed that the basic mechanism underlying the success of FICs is the establishment of continuous vocalization in which phonation is sustained during the entire utterance. The continuity may lessen the motoric demands on the larynx and thus reduce the stuttering.

Two of the most popular and most effective current treatment approaches focus on teaching people who stutter to use a different mode of phonation based on the use of FICs. One is called prolonged speech and the other is known as rhythmic speech.
Prolonged speech

The term prolonged speech is an umbrella term that refers to a group of speech patterns that are also called smooth speech, easy speech, rate control/breath stream management, and precision fluency shaping (Cream, Onslow, Packman, & Llewellyn, 2003). There are several different variations of prolonged speech, which are all designed to eliminate or greatly reduce stuttering by having the individual talk at an extremely slow rate using continuous vocalization across syllables and words. Most of the programs also focus on light articulatory contacts, gentle onset of voice, and continuous flow of air throughout the utterance (Davidow, Bothe, Richardson, & Andreatta, 2010; Packman & Onslow, 1994).

Most current prolonged speech programs establish the new pattern by having the client imitate the clinician or a tape-recorded model. The goal is for the individual to begin at a very slow rate of speech and then systematically increase the rate while maintaining the stutter-free speech. However, often the individual's speech, although stutter-free, does not sound natural, so establishing and maintaining natural-sounding speech has also become an important treatment goal. Block, Onslow, Packman, Gray, and Dacakis (2005) described a prolonged speech treatment program in which clients are taught to use stutter-free speech by imitating a clinician's model of prolonged speech that progresses from 60 to 120 syllables per minute. The authors reported that stuttering frequency was substantially reduced immediately following the intensive 5-day treatment period, and the improvement was maintained for 3 to 5 years after the termination of treatment.

Cotton Mather was a Puritan minister and student of medicine in colonial North America. In 1724 he wrote a long article on medicine entitled "The Angel of Bethesda." Included in the article is a section called "Ephphatha, or some Advice to STAMMERERS."
The section is subtitled "How to gett Good by, and how to gett rid of, their grievous Infirmity." Mather was the first American to write about stuttering. He himself stuttered severely, and he told an anecdote about how an aged "Schole-Master" gave him advice on how to control his stuttering. The advice is not too different from current prolonged speech programs.

"My Friend, I now visit you for Nothing, but only to talk with you about the Infirmity in your Speech, and offer you my Advice about it; because I suppose tis a Thing that greatly troubles you. What I advise you to, is, to seek a Cure for it, in the Method of Deliberation. Did you ever know any one stammer in singing of the Psalms?...While you go to snatch at Words, and are too quick at bringing of them out, you'l be stop'd a thousand Times in a Day. But first use yourself to a very deliberate Way of Speaking; a Drawling that shall be little short of Singing. Even this Drawling will be better than Stammering; especially if what you speak, be well worth our waiting for. This deliberate Way of Speaking will also give you a great Command of pertinent Thoughts; yea, and if you find a Word likely to be too hard for you, there will be Time for you to think of substituting another that won't be so. By this Deliberation you will be accustomed anon to speak so much without the indecent Haesitations, that you'l always be in a Way of it; yea, the Organs of your Speech will be so habituated unto Right-Speaking, that you will by Degrees, and sooner than you imagine, grow able to speak as fast again, as you did when the Law of Deliberation first of all began to govern you. Tho' my Advice is, beware of speaking too fast, as long as you live." (Bormann, 1969)

Figure 6.20 Vowel quadrilateral

The vowels are schematized along the two major dimensions in a format called the vowel quadrilateral (Figure 6.20).
The vowel quadrilateral loosely represents the oral cavity. The relative vertical position of any phonetic symbol in the quadrilateral represents the position of the highest point of the tongue in the articulation of the vowel. The relative horizontal position of a phonetic symbol represents the degree of tongue advancement within the oral cavity. In the vertical plane, vowels are classified as high, mid, or low; in the horizontal plane, vowels are considered to be front, central, or back. Thus, vowels are classified as high front, or high back, or mid front, or low back, or central, and so on. The highest front vowel is /i/, and the lowest is /æ/. In the case of the back vowels, /u/ is the highest and /ɑ/ the lowest. The back vowels, in English, tend to be produced with lip rounding, whereas the front vowels are not. The central vowels, such as the schwa, are made with the tongue in a neutral position within the vocal tract.

You should be able to feel the differences in positions between high and low, and back and front vowels. If you produce each front vowel from highest to lowest, your tongue should progressively drop with each sound. The same differences should be evident when you produce the back vowels from highest to lowest. If you switch between /i/ and /u/, the highest front vowel and the highest back vowel, the body of your tongue should move from a relatively anterior position for the /i/ to a more posterior position for the /u/, and your lips should round for the back vowel and spread for the front vowel.

Vocal Tract Resonance

Human anatomy is uniquely suited to producing a wide variety of different sounds. The amazing capacity of our species to produce a range of sounds completely outside the scope of any other species is due to the distinct and unique evolution of the human vocal tract. Our ancestral species, such as Homo erectus, and other species, such as Neanderthals, did not possess a vocal tract like ours.
The vocal tracts of these species were much shorter, with the larynx positioned much higher in the neck than ours, between the first and third cervical vertebrae (Figure 6.21). Thus, such species could produce only a limited range of sounds. Only with our own species, Homo sapiens, do we find a larynx that is located much lower in the neck (between the fourth and seventh cervical vertebrae). This drop of the larynx resulted in a considerably longer vocal tract. An adult male's vocal tract is approximately 17 to 18 cm from the vocal folds to the lips. An adult female's is about 14 or 15 cm. A very young child's vocal tract is 6 to 8 cm in length and does not have the distinctive right-angled structure seen in adults (Vorperian & Kent, 2007).

Figure 6.21 Neanderthal versus modern human vocal tract

Because the vocal tract contains the structures of the pharynx and the oral and nasal cavities, it has an extraordinary capacity to change and vary its shape. Every time one moves the tongue to a different position, or raises or lowers the velum, or opens or closes the lips and jaw, the shape of the vocal tract is changed. This ability to vary the shape of the vocal tract is what allows the wide range of different speech sounds to be generated.

Characteristics of the Vocal Tract Resonator

The vocal tract is a tube filled with air and is thus an acoustic resonator. As with all other tube resonators, it acts as a filter by selectively responding with greater or lesser amplitude to specific driving frequencies. The frequencies are produced either by vocal fold vibration or generated within the vocal tract itself. As an acoustic resonator, the vocal tract has certain characteristics. Four characteristics are of prime importance (Table 6.9). First, the vocal tract can be thought of as a tube that is closed at one end (at the glottis) and open at the other end (the lips).
The vocal tract is therefore a quarter-wave resonator. As described in Chapter 1, in a quarter-wave resonator, the lowest resonant frequency has a wavelength that is four times the length of the resonator, and the higher resonant frequencies are odd-number multiples of the lowest.

Table 6.9 Characteristics of the Vocal Tract Resonator
Quarter-wave resonator
Series of air-filled containers that are connected to each other
Broadly tuned resonator
Variable resonator

Second, because the vocal tract consists of the pharynx, oral cavity, and nasal cavities, it can be thought of as a series of air-filled containers that are connected to each other. Each container has its own resonance frequency (RF), and the overall RF of all these hooked-up containers together is different from each of the separate containers. Each cavity in the series acts as a band-pass filter to transmit certain frequencies within its bandwidth and to attenuate frequencies outside its bandwidth.

Third, the irregular shape of the vocal tract makes it a broadly tuned resonator that transmits a wide range of frequencies around each RF.

Fourth, the vocal tract is a variable resonator whose frequency response changes depending on its shape. Each time a speaker moves his or her articulators into position for a different sound, the resonant frequencies of the vocal tract change because the cross-sectional diameter of the different cavities changes. The tube becomes more constricted in some areas and more open in other areas. The different areas of the cavities then resonate at different frequencies. Figure 6.22 shows different vocal tract constrictions and cross-sectional areas when producing the /i/ and /m/ compared with the schwa.

The resonant frequencies of the vocal tract are called formants. Being a tube resonator, the vocal tract resonates at numerous frequencies.
Because it is a quarter-wave resonator, the higher formant frequencies are odd-number multiples of the lowest formant frequency. The formant frequencies can be calculated based on the length of the vocal tract when the articulators are positioned for the mid-central schwa vowel (/ə/). In this articulatory position, the vocal tract has almost the same cross-sectional width throughout. The calculation is based on an adult male's vocal tract, which is about 17 cm long. Table 6.10 shows how the formant frequencies for the schwa vowel are derived.

Figure 6.22 Vocal tract shapes for /ə/, /i/, and /m/
The red outline indicates the portion of the vocal tract that is acting as an acoustic resonator.

Table 6.10 Calculation of Vocal Tract Resonant Frequencies for the Schwa Vowel for an Adult Male
Length of male vocal tract: 17 cm
Wavelength of lowest resonant frequency: 17 × 4 = 68 cm
Convert wavelength to frequency: F = speed of sound divided by wavelength
F = 34,000 cm per sec / 68 cm
F = 500 Hz
Higher RFs are odd-number multiples: 500 × 3, 500 × 5, 500 × 7, etc. = 1500 Hz, 2500 Hz, 3500 Hz, etc.

The vocal tract has very many resonant (formant) frequencies. However, because most of the acoustic energy generated by the vocal folds is at frequencies below 5000 Hz, typically only the first three formants are considered in speech production. The formants are numbered formant 1 (F1), formant 2 (F2), and formant 3 (F3). F1 is always the lowest in frequency. Changing tongue position from the neutral schwa to the position for a different vowel raises or lowers formants 1, 2, and 3 from 500, 1500, and 2500 Hz to other resonant frequencies.

Like any other resonator, the vocal tract will respond more strongly to driving frequencies that are within its bandwidth. The driving frequencies are those that are present in the complex periodic sound generated by vocal fold vibration.
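The arithmetic in Table 6.10 can be sketched in a few lines of code. The example below is only an idealization: it treats the vocal tract as a uniform tube closed at one end, which the text applies to the schwa position only. The function name `quarter_wave_resonances` and its parameters are illustrative, and the speed of sound is the 34,000 cm per second value used in the table.

```python
SPEED_OF_SOUND_CM_PER_S = 34_000  # value used in Table 6.10


def quarter_wave_resonances(length_cm, count=3, speed=SPEED_OF_SOUND_CM_PER_S):
    """Resonant frequencies (Hz) of a uniform tube closed at one end.

    The lowest resonance has a wavelength of four times the tube length,
    so F1 = speed / (4 * length). Higher resonances are the odd-number
    multiples (3, 5, 7, ...) of F1.
    """
    if length_cm <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * speed / (4 * length_cm) for n in range(1, count + 1)]


# Adult male vocal tract of 17 cm, as in Table 6.10.
print(quarter_wave_resonances(17))  # [500.0, 1500.0, 2500.0]
```

Because the length appears in the denominator, a shorter tube gives higher resonances. Under the same idealization, a 14.5 cm tract (a midpoint of the adult female range given earlier, chosen here only for illustration) would have a lowest resonance of about 586 Hz.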
Like all complex periodic sounds, the glottal sound has a specific F0 and harmonics that are whole-number multiples of the fundamental. As the sound travels through the vocal tract, the F0 and harmonic frequencies are amplified or damped by the vocal tract formants. We know that the vocal tract is a broadly tuned resonator, so the bandwidth of each formant is relatively wide, and many harmonic frequencies that are reasonably close to a formant will be amplified. The sound that emerges from this filtering system has the same F0 and harmonics as the glottal sound. What changes are the amplitudes of the harmonics: Some harmonics have been amplified, while others have been damped. Thus, it is the harmonic content and quality of the sound that has changed from its initial creation at the glottis to its emergence at the lips.

Source-Filter Theory of Vowel Production

The manner in which the vocal tract filters the glottal sound is formalized in the source-filter theory of vowel production, developed by the Swedish scientist Gunnar Fant in 1960. The theory takes the three elements involved in vowel production (the glottal sound, the vocal tract resonator, and the sound at the lips) and represents them on three graphs (Figure 6.23).

The first graph represents the glottal spectrum, showing the sound as it exists at the larynx before being modified by the filtering properties of the vocal tract. The F0 has the greatest amplitude, with successively higher harmonics losing