JSS Institute of Speech and Hearing Past Paper PDF

**JSS INSTITUTE OF SPEECH AND HEARING, MYSURU-04**

**SUBJECT: SPEECH PERCEPTION**

***TOPIC: Cues for the Perception of Stops (Manner, Place & Voicing)***

**Submitted to**
Dr Asha Yathiraj, Professor of Audiology, JSSISH, Mysuru

**Submitted by**
Ms. Sahana P, P01IJ23S023012, II MSc (Audiology), JSSISH, Mysuru

**Contents**

- 1. **Introduction**
  - 1.1 Stop consonants
  - 1.2 Stops and their production
  - 1.3 Stops Classification
- 2. **Manner of Articulatory Cues for Stop Consonants**
  - 2.1 Stopgap/Closure duration/Silent gaps
  - 2.2 Release burst
  - 2.3 Formant transition
  - 2.4 Voice Onset Time
  - 2.5 Cues to differentiate Stop Consonants from Other Consonants such as Fricatives, Glides, Nasals, and Affricates
- 3. **Place of Articulatory Cues for Stop Consonants**
  - 3.1 Spectral shape of the release burst
  - 3.2 Frequency position of burst in relation to vowel
  - 3.3 Burst amplitude
  - 3.4 Duration of the burst
  - 3.5 Closure duration
  - 3.6 Voice onset time
  - 3.7 Rate of transition of the formants
  - 3.8 Transition duration
  - 3.9 Formant transitions
- 4. **Voicing cues**
  - 4.1 Voice onset time
  - 4.2 Formant transition
  - 4.3 F1 transition onset and offset
  - 4.4 F1 cut back before closure
  - 4.5 F1 cut back after closure
  - 4.6 Voicing bar
  - 4.7 Closure duration
  - 4.8 Burst frequency
  - 4.9 Fundamental frequency
  - 4.10 Preceding vowel duration
  - 4.11 Aspiration
  - 4.12 Decay time
  - 4.13 Burst duration
  - 4.14 Burst amplitude
- **Summary**
- **References**

**1. INTRODUCTION**

A consonant is a speech sound that is articulated with a complete or partial closure of the vocal tract. The acoustic characteristics of consonants are more complicated than those of vowels. Some consonants are produced with a period of complete obstruction of the vocal tract, but others are produced with a narrowing of the vocal tract.
Some consonants are strictly oral in their sound transmission, but others involve nasal transmission of acoustic energy. All vowels can be described with essentially the same acoustic characteristics, such as duration or formant pattern. However, consonants differ significantly among themselves in their acoustic properties, and it is therefore difficult to describe all of them with a single set of measures. An alternative way to analyze these sounds is to use binary (paired) features, in which a sound is described by the presence or absence of each feature. Consonant classes that are distinctive in their articulatory and acoustic properties are stops, fricatives, affricates, nasals, glides, and liquids.

**1.1 Stop consonants**

Stops are also known as occlusives or plosives. The essential articulatory feature of a stop consonant is a momentary blockage of the vocal tract: a complete closure somewhere in the tract that causes cessation of the airflow. The blockage has a variable duration, usually 50-100 ms, and is then released with a burst as the air pressure impounded behind the obstruction escapes. Typically, the burst lasts 5-40 ms. Stops are among the shortest acoustic events commonly analyzed in speech.

There are six stop sounds in English:

- Voiceless stops: /p/, /t/, /k/
- Voiced stops: /b/, /d/, /g/

Indian languages have 16 stops: /k/, /g/, /t/, /d/, /p/, /b/, /t/, /d/ (the /t/, /d/ pair occurs at both the dental and the retroflex place) and the aspirated form of each. The closure duration varies from 51-81 ms. Unvoiced stops have a longer closure duration and higher articulatory resistance, with durations of 90-164 ms (B. Yegnanarayana et al., 2008).

Key: V = voiced, uV = unvoiced, A = aspirated, uA = unaspirated

**1.2 Stops and their production**

Stop consonants are characterized by a complete constriction of the vocal tract.
The essential articulatory feature of a stop is a momentary blockage of the vocal tract. The blockage is formed at three main sites: the lips (bilabial), the alveolar ridge (alveolar), and the velum (velar). Other places include the uvular, pharyngeal, and glottal. Stops are an important class of speech sounds that are present in all languages.

**1.3 Stop classifications**

![](media/image6.png)

Fig: Phonetic classification of stop consonants (Indic Oral Stops: A Typical Array)

- Based on place of articulation:
  - Bilabials (/p/, /b/)
  - Alveolars (/t/, /d/)
  - Velars (/k/, /g/)
  - Dental
  - Retroflex
- Based on voicing (Chittora, 2015):
  - Voiced (/b/, /d/, /g/)
  - Unvoiced (/p/, /t/, /k/)
- Based on aspiration (Singh & Tiwari, 2013):
  - Voiceless Unaspirated
  - Voiceless Aspirated
  - Voiced Unaspirated
  - Voiced Aspirated

**2. Manner of articulatory cues for stop consonants**

**2.1 Stop gap/Closure duration/Silent gap**

The stop gap is the acoustic interval corresponding to a complete obstruction of the vocal tract. This interval is an energy minimum in the acoustic signal; that is, little or no sound radiates from the obstructed vocal tract.

- Stops have a period of **complete closure** that results in a **silent gap**. The duration of this closure helps in perceiving the stop's manner and differentiates stops from other sounds such as nasals, fricatives, and glides.
- The closure for affricates is similar to that of stops but shorter in duration, and it is followed by frication; for affricates, closure is not a major cue (Stevens, 1998).
- Nasals involve **continuous voicing during closure** and are identified by a **nasal murmur** (low-frequency energy around 250-300 Hz), which is distinct from the silence in stops (Stevens, 1998).
- Fricatives have no closure. Their perception is instead driven by the **continuous nature of fricative noise**, which distinguishes them from stops (Ladefoged & Maddieson, 1996).
- Glides are continuous and have no closure.

**2.2 Release burst**

The release burst follows the silent gap. Stops are characterized by a **brief burst** of noise when the closure is released. Affricates involve a **brief burst** similar to that of stops, but it is followed by **frication noise**; this combination of a **stop-like burst** with **fricative-like noise** is their key perceptual cue (Johnson, 2003). There is no burst associated with nasals, fricatives, and glides (Ladefoged & Maddieson, 1996), so the burst helps to differentiate stops from these classes.

**2.3 Formant transitions**

The articulatory transition from stop to vowel is associated with an acoustic transition in the form of shifting formants. These changes in formant frequency reflect changes in the resonating cavities of the vocal tract. Formant transitions are a minor cue for the perception of manner. The transitions can be from vowel to consonant (VC) or from consonant to vowel (CV).

- Stops typically show very short and quick formant transitions, especially during the transition from the closure phase to the release. F1 rises as the mouth opens after the closure, while F2 and F3 vary depending on the stop and the surrounding vowel (Johnson, K., 2003).
- Diphthongs involve a slow, smooth transition between two vowel sounds, reflecting continuous vocal tract movement. F1 and F2 show clear, noticeable shifts. For example, in /ai/, F1 decreases and F2 increases as the tongue moves from a low to a high position (Ladefoged, P., 2001).
- In glides, formant transitions are shorter in duration than in diphthongs but still dynamic.
- Nasals and fricatives exhibit gradual formant transitions, whereas in affricates the formant transitions are rapid and combined with frication noise, making them distinct from stops (Johnson, K., 2003).

**2.4 Voice Onset Time (VOT)**

**2.5 Cues to differentiate stop consonants from other consonants such as fricatives, glides, nasals, and affricates**

- **Stops** involve a complete closure in the vocal tract, blocking airflow entirely, followed by a burst of sound upon release (Kent & Read, 2001), whereas **fricatives** maintain a narrow constriction, allowing continuous turbulent airflow (Borden, Harris, & Raphael, 2011).
- **Stops** have a distinct release burst of transient noise, while **fricatives** are characterized by sustained high-frequency noise, especially in sibilants like /s/ (Ladefoged & Maddieson, 1996).
- In contrast to **glides**, which have a smooth, vowel-like articulation and involve minimal constriction, **stops** exhibit a brief but complete closure (Kent & Read, 2001).
- **Glides**, such as /w/ and /j/, lack the sudden burst of energy found in stops, relying instead on rapid formant transitions (Borden, Harris, & Raphael, 2011).
- **Nasals** involve the lowering of the velum, allowing airflow through the nasal cavity, which is absent in **stops**, where the velum is raised (Ladefoged & Maddieson, 1996).
- Unlike **nasals**, which produce continuous nasal resonance without a burst, **stops** rely on the oral release for their acoustic energy (Kent & Read, 2001).
- **Affricates** begin with a stop-like closure but are followed by a fricative release, combining the properties of both stops and fricatives (Borden, Harris, & Raphael, 2011).
- The release phase in **affricates** is longer and includes frication noise, whereas in **stops** the release is brief and sharp (Kent & Read, 2001).
- **Stops** exhibit a significant voice onset time (VOT) difference between voiced and voiceless variants, a feature less prominent in **glides** and **nasals** (Ladefoged & Maddieson, 1996).
- **Stops** are characterized by a complete obstruction of airflow, a feature not shared by **fricatives, glides,** or **nasals**, all of which allow continuous airflow to some degree (Borden, Harris, & Raphael, 2011).

**3. Place of articulatory cues for stop consonants**

- 3.1 Spectral shape of the release burst
- 3.2 Frequency position of the burst in relation to vowel
- 3.3 Burst amplitude
- 3.4 Duration of the burst
- 3.5 Closure duration
- 3.6 Voice onset time
- 3.7 Rate of transition of the formants
- 3.8 Transition duration
- 3.9 Formant transitions

**3.1 Spectral shape of the release burst**

The energy of the release burst depends on the location of the constriction in the vocal tract and on the surrounding phonetic context. The burst spectrum is that of a noise source shaped by the resonance properties of the particular articulatory configuration. The burst spectral shape is strongest as a cue in initial and medial positions, where full closure and release occur.

- Bilabials \[b\] and \[p\]: energy concentrated at low frequencies, 500-1500 Hz
- Alveolars \[d\] and \[t\]: energy concentrated above 4000 Hz
- Velars \[g\] and \[k\]: energy concentrated at 1500-4000 Hz

Stevens & Blumstein (1978, 1979) investigated the possibility of a spectral template that could be associated with each place of articulation.

- Bilabial: diffuse, flat or falling spectrum
- Alveolar: diffuse, rising spectrum
- Velar: mid-frequency spectrum

![](media/image10.png)

**3.2 Frequency position of burst in relation to vowel**

The frequency position of the burst for stops is closely related to the place of articulation of the stop and to the adjacent vowel. The frequency position of the burst is strongest as a cue in the initial position, where the burst is distinct and clear, and less prominent in the final position.
Liberman et al. (1952) used the pattern playback technique to generate synthesized speech stimuli. The stop burst is acoustically characterized as a noise pulse with a defined center frequency, while the following vowel is represented by two steady formants. When the synthetic burst and the synthetic vowel are combined, a stop-vowel sequence is perceived. The major conclusion is that the phonetic identification of the noise burst depends on the vowel context.

- Bursts with a center frequency lower than the vowel F2 were identified as /p/
- Bursts with a center frequency approximating the vowel F2 were heard as /k/
- Bursts with a center frequency higher than the vowel F2 were labeled /t/

An exception in the study is that some bursts with energy above the vowel F2 were heard as \[p\] when the vowels were \[o\] and \[u\]. The conclusion from the experiment was that the place of articulation of stops can be perceived from the burst cue.

![](media/image12.jpeg)

**3.3 Burst amplitude (Minor cues)**

Jongman and Blumstein (1985) determined that burst amplitude could serve as a cue to distinguish alveolar and dental stops, with the former having a larger burst amplitude. Bursts are evident for syllable-initial and medial stops, and burst energy increases as the constriction moves back in the mouth. Burst amplitude is **strongest for velars** and **weakest for bilabial stops**. This could be because burst energy varies with the cross-sectional area of the constriction after stop release, the resonant cavity after the point of release, and the release itself.

Fujimura (1961) and Zue (1976) reported that bilabials, which have rapid release gestures without a front cavity, display a weak transient release, while velars, with large cross-sectional areas and narrowly tuned cavities, have strong bursts. Burst energy falls midway for dentals and retroflexes, since they have smaller cross-sectional areas and more broadly tuned front cavities than bilabials.

Burst amplitude is a strong cue in initial and medial positions but weak or absent in final position.

**3.4 Duration of burst (Minor cues)**

Stops with strong aspiration (velars) exhibit significantly longer burst durations than those with only slight aspiration (bilabials). The constriction for velars is made by the tongue body, which is massive and cannot be moved rapidly away from the palate, so a narrow opening is maintained for a longer time even after the release. Burst duration is a key cue in initial and medial positions, but it is not relevant in final position.

**3.5 Closure duration (Minor cues)**

Closure duration decreases from the labial to the velar place of articulation (Zue, 1976).

- Longer closure duration for labials and dentals
- Reduced closure duration for retroflexes

Front sounds have a larger area behind the constriction, so the vocal tract requires a longer occlusion to accommodate the greater amount of air. Retroflexion involves complex tongue positioning and movement; because the closure cannot be held static for long, closure duration is reduced. Closure duration becomes a cue in the medial position.

**3.6 Voice onset time (Minor cues)**

The time from the release of the stop closure to the onset of voicing is called the voice onset time (VOT) (Lisker and Abramson, 1964). The VOT, in milliseconds, can be measured from either the waveform or the spectrogram.

Figure: Voice onset time of /p\^/ and /b\^/

*VOT and place of articulation*

- Bilabials: shortest VOTs
- Alveolars: intermediate VOTs
- Velars: longest VOTs

Delattre, Liberman, and Cooper (1955) gave the VOT for different stop consonants:

- 25 ms: labial
- 35 ms: alveolar
- 40 ms: velar

The voice onset time is inversely proportional to the rate at which release gestures are made (Summerfield & Haggard, 1977). Stevens & Klatt (1974) reported that the duration of the articulatory movement is greatest for the tongue body, smaller for the tongue tip, and least for the lips. An increase in release duration increases the time needed to develop a pressure drop sufficient to initiate voicing, thereby increasing the VOT. Hence VOT is longer for velars and shorter for bilabials.

**3.7 Rate of transition of the formants (Major cues)**

The rate of formant transition follows the order bilabials \> alveolars \> velars. Rate of transition provides essential cues to place of articulation in all positions, but the cue can be reduced in final position.

**3.8 Transition duration (Minor cues)**

Transition duration varies: it is shorter in initial position and longer in medial and final positions, depending on the articulatory movement.

![](media/image15.png)

**3.9 Formant transition**

Bends in the formant pattern occur as the vocal tract moves from the stop closure to the open configuration for the following sound. During the transition from stop to vowel, the formant pattern can be rising, falling, or relatively flat. The direction and magnitude of the transition depend on the place of articulation of the stop and on the vocal tract configuration for the following vowel. The transition lasts about 50 ms, and within this time all formant frequencies shift from their values for the stop to their values for the vowel.

**F2 transition**: Delattre, Liberman & Cooper (1955) reported that the starting frequency is the distinguishing feature of the various F2 transitions. The F2 starting frequency for /b/ was 600-800 Hz, whereas for /d/ it was 1800 Hz. This starting frequency is known as the **locus**.

- 600-800 Hz for bilabials
- 1800 Hz for alveolars
- 1500-2500 Hz for velars
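
The locus values listed above can be read as a simple lookup from an F2 starting frequency to the places of articulation it is compatible with. The following is only an illustrative sketch: the function name `candidate_places` and the ±100 Hz tolerance around the single alveolar value are assumptions made here, not part of the cited study.

```python
# F2 locus ranges (Hz) as listed in the text above.
# The alveolar locus is given as a single value (1800 Hz); the +/-100 Hz
# tolerance around it is an assumption made purely for illustration.
ALVEOLAR_TOLERANCE_HZ = 100

F2_LOCUS_RANGES_HZ = {
    "bilabial": (600, 800),
    "alveolar": (1800 - ALVEOLAR_TOLERANCE_HZ, 1800 + ALVEOLAR_TOLERANCE_HZ),
    "velar": (1500, 2500),
}


def candidate_places(f2_locus_hz):
    """Return every place of articulation whose locus range contains the value."""
    return [
        place
        for place, (low, high) in F2_LOCUS_RANGES_HZ.items()
        if low <= f2_locus_hz <= high
    ]


if __name__ == "__main__":
    for f2 in (700, 1800, 2300, 1000):
        print(f2, candidate_places(f2))
```

A value near 1800 Hz falls inside both the alveolar and the velar range, which illustrates why the velar locus is described as variable and why the F2 locus is normally combined with other place cues.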
Formant transitions are key in all positions to signal place of articulation.

**Summary**:

+-----------------+-----------------+-----------------+-----------------+
|                 | **Bilabials**   | **Alveolars**   | **Velars**      |
+=================+=================+=================+=================+
| **The spectral  | Flat or falling | Diffuse rising  | Mid-frequency   |
| shape of the    |                 | spectrum above  | spectrum        |
| release burst** | 500-1500 Hz     | 4 kHz           |                 |
|                 | dominance       |                 | 1.5-4 kHz       |
+-----------------+-----------------+-----------------+-----------------+
| **Burst         | Weak burst      |                 | Strong burst    |
| amplitude**     | amplitude       |                 | amplitude       |
+-----------------+-----------------+-----------------+-----------------+
| **Duration of   | Shortest        |                 | Longest         |
| burst**         |                 |                 |                 |
+-----------------+-----------------+-----------------+-----------------+
| **Closure       | Longer closure  |                 | Shorter than    |
| duration**      | duration        |                 | bilabials       |
+-----------------+-----------------+-----------------+-----------------+
| **Voice onset   | 25 ms           | 35 ms           | 40 ms           |
| time**          |                 |                 |                 |
+-----------------+-----------------+-----------------+-----------------+
| **Transition    | More than       | Shortest for    | More than       |
| duration**      | alveolars       | alveolars       | alveolars       |
+-----------------+-----------------+-----------------+-----------------+
| **F2 locus**    | 600-800 Hz      | 1800 Hz         | Variable        |
|                 |                 |                 | 1500-2500 Hz    |
+-----------------+-----------------+-----------------+-----------------+
| **F2-F3         | All formant     | Variable        | Rising          |
| transition**    | frequencies     |                 | transition of   |
|                 | drop just       |                 | 2^nd^ formant   |
|                 | before the      |                 | and falling     |
|                 | bilabials       |                 | 3^rd^ formant   |
|                 |                 |                 |                 |
|                 |                 |                 | velar pinch     |
+-----------------+-----------------+-----------------+-----------------+

**4. Voicing cues**

The stops /b/, /d/, and /g/ are the voiced stops; here more pressure is built up behind the closure, and the release is not followed by aspiration. Voicing denotes the simultaneous vibration of the vocal folds.

CUES:

1. Voice Onset Time (VOT)
2. Formant transition
3. F1 transition onset and offset
4. F1 cut back before closure
5. F1 cut back after closure
6. Voice bar
7. Closure duration
8. Burst frequency
9. Fundamental frequency
10. Preceding vowel duration
11. Aspiration
12. Decay time of glottal signal preceding closure
13. Burst duration
14. Burst amplitude
15. Intensity of the preceding vowel

**4.1 Voice onset time (VOT): Major cues**

VOT is defined as the time between the release of the stop and the onset of voicing, or as the duration of the interval between the onset of the burst resulting from stop release and the onset of the glottal signal (Lisker and Abramson, 1964). Studies have shown that a short-lag or lead VOT results in a voiced percept, whereas a long-lag VOT cues an unvoiced sound. VOT carries information about voicing in stops in syllable-initial position and about the place of articulation of a stop (Forrest, Weismer & Turner, 1989; Klatt, 1975; Lisker & Abramson, 1964).

VOT has a range of values that are often classified as:

- Pre-voicing lead (e.g., VOT = -10 ms): voicing begins before the stop is released.
- Simultaneous voicing (VOT = 0 ms): the onset of voicing occurs along with the transient.
- Lag VOT (e.g., VOT = +10 ms): voicing begins after the transient. The lag can be either short or long.
- Short lag VOT (-20 to +20 ms): the onset of voicing begins shortly after the transient and is usually associated with voiced consonants.
- Long lag VOT (25 to 100 ms): the onset of voicing begins considerably later than the transient and characterizes unvoiced stops.
- Lead VOT varies from 40-80 ms and lag VOT varies from -20 ms to +100 ms.
- VOT ranges from 20 to 25 ms.

In Malayalam, Sarah & Paroo (2009) reported that voiced stops exhibited shorter VOT than voiceless stops.
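
The VOT categories above can be summarized as a small decision rule. This is a minimal sketch and not a validated perceptual model: the function name `classify_vot` is hypothetical, and because the text leaves the 20-25 ms region between short lag and long lag unspecified, treating it as an ambiguous boundary is an assumption made here.

```python
def classify_vot(vot_ms):
    """Map a VOT value (ms) to (VOT category, expected voicing percept).

    Thresholds follow the ranges quoted in the text. The 20-25 ms region is
    not assigned to either category there, so it is labeled "boundary" here
    as an assumption.
    """
    if vot_ms < 0:
        return ("lead", "voiced")          # voicing starts before the release
    if vot_ms == 0:
        return ("simultaneous", "voiced")  # voicing starts with the transient
    if vot_ms <= 20:
        return ("short lag", "voiced")
    if vot_ms < 25:
        return ("boundary", "ambiguous")   # assumed: between the quoted ranges
    return ("long lag", "voiceless")


if __name__ == "__main__":
    for vot in (-10, 0, 10, 22, 40):
        print(vot, classify_vot(vot))
```

For instance, the place-specific values quoted earlier (25, 35, and 40 ms) all fall in the long-lag region under this rule, in line with their being given for voiceless stops.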
**4.2 Formant transition**

During the transition from a stop to a vowel (or from a vowel to a stop), all formant frequencies shift from their values for the stop to their values for the vowel; this shift is called the ***formant transition.*** Formant transitions provide information in all positions.

- Usually, the transition is completed within 50 ms.
- These changes reflect changes in the resonating cavities.
- The transition can be in VC or CV form and carries information on the voicing feature of the post-vocalic stop.

**4.3 F1 transition (onset and offset)**

F1 transition refers to the movement of the first formant frequency (F1) during the articulation of a sound, particularly how it changes at the beginning (onset) and end (offset) of a consonant or vowel.

Lisker (1977) reported that a **higher F1 onset** frequency cues the perception of a **voiceless stop**, whereas a **lower F1 onset** frequency cues the perception of a **voiced stop.** The F1 offset frequency transition preceding the stop consonant is an important cue for the perception of voiced and voiceless stops.

Fischer & Ohde (1990) assessed the spectral and durational properties of front vowels that cue word-final velar stops. The cues studied were vowel duration, change in the offset frequency of the F1 transition, and rate of change of the F1 transition, in the vowel contexts /i/ and /ae/. The results showed that for /ae/, voiced responses increased when the F1 offset frequency decreased and the vowel duration increased. For /i/, the effect of F1 offset frequency was less robust, but an increase in vowel duration still increased voiced responses. High F1 offset frequencies cued voiceless stops and low F1 offset frequencies cued voiced stops.
Lisker (1975) conducted further experiments on the **F1 transition** using synthetic stimuli and found that, although F1 had a significant effect on the voiced/voiceless classification, it was neither as necessary nor as sufficient as the VOT duration. In addition, it was not the dynamic (rapidly changing) quality of F1 but the low F1 onset frequency that indicated voicing. F1 offset is a cue in medial and final positions.

**4.4 First formant cutback (before closure)**

If the F1 cutback before closure is small, the stop is perceived as voiced, whereas if the F1 cutback before closure is large, the stop is perceived as voiceless (Lisker, 1977). F1 cutback before closure mainly helps in the voiced-voiceless distinction in medial (intervocalic) position. It is also a cue for initial and medial voiceless stops.

**4.5 First formant cutback (after closure)**

A delay in F1 relative to the beginning of the F2 transition is referred to as an **F1 cutback**. This relationship between the F1 and F2 transitions from the stop to the following vowel can be a cue to perceive voiced vs. voiceless stops.

Liberman et al. (1958) conducted perceptual experiments using synthetic syllable-initial plosives (/b/, /d/, and /g/) in which the onset of F1 was cut back (delayed) by amounts varying between 10 and 50 ms relative to the burst; F2 and F3, however, began immediately after the release. The authors found that the VOT duration (as defined by the amount of F1 cutback) was a cue for voicing and that replacing the harmonics in F2 and F3 with noise increased the voiceless effect.

*To conclude,* if the **F1 cutback is small**, it leads to the **perception of a voiced stop**, and if the **F1 cutback is large**, it leads to the **perception of a voiceless stop.**

**4.6 Voicing bar: Major cues**

The voice bar is a band of energy, typically reflecting the first harmonic of the voice source, that appears on a spectrogram; it is indicative of voicing. It is more prominent in voiced stops than in unvoiced stops. The voicing bar provides clear cues in all positions for voiced stops.

Figure: Voice bar

**4.7 Closure duration: Major cues**

Closure duration is the time interval between the onset of closure and the onset of articulatory release. It is one of the major cues that help listeners differentiate voiced and voiceless stops. The closure duration of unvoiced stops is longer than that of voiced stops, because articulatory resistance is higher for unvoiced stops owing to the absence of vocal fold vibration. Closure duration is variable, usually between 50-100 ms, and is key in medial positions for the voicing contrast.

![](media/image18.png)

In Malayalam, for coronal stops, the tongue made broad contact with the alveolar or post-alveolar region, and the closure duration was longer for voiceless stops than for voiced stops.

Usha Rani (1989) studied the temporal perceptual cues of stop consonants. She reported that a closure duration of less than 60 ms led to a voiced perception and that no difference was noticed between the Kannada and Hindi groups. The study indicates that closure duration is a major cue for the voiced vs. voiceless distinction of stops in medial (intervocalic) position. The absence of a difference between the judgments of the Kannada and Hindi groups suggests that closure duration is a major cue for differentiating voiceless and voiced stops in both languages.

**4.8 Burst frequency: Minor cues**

The burst frequency is defined as the frequency of the spectral peak with the highest amplitude (Stevens, 1980). Burst frequency is a key cue for initial and medial stops. Rami M et al. (1999) reported that burst frequencies above 1500 Hz were associated with voiced velar stop productions, while those below 1500 Hz characterized voiceless velar stops.
**4.9 Fundamental frequency**

The role of F0 as a cue for the voiced-unvoiced distinction is generally considered secondary. F0 has been found to be level or to rise after the release of voiced stops and to fall following the release of voiceless stops (Lehiste & Peterson, 1961; Ohde, 1984; Chistovich, 1969; Haggard et al., 1970). F0 is useful in medial and final positions for voicing distinctions. F0 tends to be higher in vowels that follow voiceless consonants than in vowels that follow voiced consonants, as far as 100 ms from voicing onset (House & Fairbanks, 1953; Umeda, 1981).

**4.10 Preceding vowel duration (PVD or VD)**

Preceding vowel duration is the length of time a vowel is articulated before a following consonant, particularly a stop (plosive), fricative, or other consonantal sound. It is crucial for voicing contrasts in final stops.

Haskins Laboratories experimented on synthetic velar stops in final position in which the other voicing cues had been neutralized, and showed that preceding vowels longer than 200 ms caused listeners to hear /eg/, whereas vowels shorter than 200 ms produced listener judgments of /ek/.

Lawrence J. Raphael (1972) reported that listeners perceived final segments as voiceless when they were preceded by vowels of short duration and as voiced when they were preceded by vowels of long duration. That is, a final consonant or cluster synthesized with cues appropriate for voicing was perceived as voiceless when the preceding vowel was short and as voiced when the preceding vowel was long (figure below right). A final consonant or cluster synthesized with cues for voicelessness was perceived in precisely the same way (figure below left).

![](media/image21.png) ![](media/image23.png)

**4.11 Aspiration: Minor cues**

English uses both phonation and aspiration to differentiate stop voicing categories, depending on context.
For example, in syllable-initial position before a heavily stressed vowel, /p, t, k/ are usually perceived as voiceless and aspirated, whereas /b, d, g/ are perceived as voiced and unaspirated. Early experiments investigated the perception of pre-vocalic stops preceded by /s/ (as in *spill*, *still*, *skill*), which are often described as voiceless and unaspirated (Lotz et al., 1960; Reeds & Wang, 1961). When the /s/-frication was deleted, English-speaking listeners reported hearing words initiated by "voiced" stops (*bill*, *dill*, *gill*), not voiceless stops (*pill*, *till*, *kill*). The experimenters concluded that, in the edited stimuli, "aspiration is a more dominant cue than voicing in the perceptual separation of these two classes of consonants." Aspiration is most prominent in the initial position.

**4.12 Decay time (glottal signal preceding closure)**

Decay time is the gradual reduction in vocal fold vibration (voicing) just before the closure of a stop consonant, particularly voiced stops such as /b/, /d/, and /g/. It describes how voicing fades out as the vocal tract prepares to close fully for the stop articulation, and it occurs between the end of a preceding vowel or sonorant and the closure phase of the stop.

A shorter decay time leads to a voiced perception and a longer decay time leads to a voiceless perception. It is not a major cue for the distinction of voiced vs. voiceless stops. Decay time is a relevant cue in medial and final stops, particularly for voiced stops.

**4.13 Burst duration**

Revoile et al. (1987) studied burst duration as a cue for the perception of word-initial stops. The burst was deleted from voiced and voiceless stops. Because of the duration difference between voiced and voiceless bursts, this deletion removed more of the syllable onset for the voiceless stops than for the voiced stops. The results showed that when the burst release was deleted from voiceless stops, listeners were not able to perceive the voiceless stop correctly.
When the burst release was deleted from voiced stops, listeners were still able to perceive the voiced stop correctly.

**4.14 Burst amplitude**

The stop burst is more intense for voiceless than for voiced stops. A greater burst amplitude leads to a voiceless perception (Repp, 1979). Burst amplitude can be a cue mainly in medial positions of words (in a language such as English).

**4.15 Intensity of the preceding vowel**

Vowels that precede voiceless stops generally terminate with a more abrupt drop in intensity than vowels that precede voiced stops (Derbock, 1977). This is a cue for stops in medial and final positions.

**SUMMARY**

**Manner of Articulatory Cues:**

  **Speech Sounds**   **Burst**   **Closure Duration**   **VOT**   **Formant Transition**
  ------------------- ----------- ---------------------- --------- ------------------------
  **Nasals**          No          No                     No        Yes
  **Fricatives**      No          No                     No        Yes
  **Affricates**      Yes         Yes                    No        Yes
  **Glides**          No          No                     No        Yes
  **Stops**           Yes         Yes                    Yes       Yes
  **Diphthongs**      No          No                     No        Yes

**Place of Articulatory cues:**

+-----------------+-----------------+-----------------+-----------------+
|                 | **Bilabials**   | **Alveolars**   | **Velars**      |
+=================+=================+=================+=================+
| **The spectral  | Flat or falling | Diffuse rising  | Mid-frequency   |
| shape of the    |                 | spectrum above  | spectrum        |
| release burst** | 500-1500 Hz     | 4 kHz           |                 |
|                 | dominance       |                 | 1.5-4 kHz       |
+-----------------+-----------------+-----------------+-----------------+
| **Burst         | Weak burst      |                 | Strong burst    |
| amplitude**     | amplitude       |                 | amplitude       |
+-----------------+-----------------+-----------------+-----------------+
| **Duration of   | Shortest        |                 | Longest         |
| burst**         |                 |                 |                 |
+-----------------+-----------------+-----------------+-----------------+
| **Closure       | Longer closure  |                 | Shorter than    |
| duration**      | duration        |                 | bilabials       |
+-----------------+-----------------+-----------------+-----------------+
| **Voice onset   | 25 ms           | 35 ms           | 40 ms           |
| time**          |                 |                 |                 |
+-----------------+-----------------+-----------------+-----------------+
| **Transition    | More than       | Shortest for    | More than       |
| duration**      | alveolars       | alveolars       | alveolars       |
+-----------------+-----------------+-----------------+-----------------+
| **F2 locus**    | 600-800 Hz      | 1800 Hz         | Variable        |
|                 |                 |                 | 1500-2500 Hz    |
+-----------------+-----------------+-----------------+-----------------+
| **F2-F3         | All formant     | Variable        | Rising          |
| transition**    | frequencies     |                 | transition of   |
|                 | drop just       |                 | 2^nd^ formant   |
|                 | before the      |                 | and falling     |
|                 | bilabials       |                 | 3^rd^ formant   |
|                 |                 |                 |                 |
|                 |                 |                 | velar pinch     |
+-----------------+-----------------+-----------------+-----------------+
| **Rate of       | More            | More than velar | Less than       |
| transition      |                 |                 | bilabials and   |
| formants**      |                 |                 | alveolars       |
+-----------------+-----------------+-----------------+-----------------+

**Voicing Cues:**

  Features                    Voiced stop              Unvoiced stop
  --------------------------- ------------------------ ---------------
  Voicing                     Present                  Absent
  Duration of closure         Short                    Long
  Overall energy              High                     Low
  Total duration              Short                    Long
  VOT                         Lead VOT/short lag VOT   Long lag VOT
  PVD                         Long                     Short
  Closure duration            Short                    Long
  Burst duration              Short                    Long
  Burst amplitude             Weaker                   Stronger
  Burst frequency             High                     Low
  F1 cut back                 Absent                   Present
  F1 onset after closure      Low                      High
  F1 offset frequency         Low                      High
  F0 of the following vowel   Rising                   Falling
  F0 before & after closure   Low                      High
  Aspiration                  Weaker                   Stronger

  **CUES**                                                        **UNVOICED**   **VOICED**
  ---------------- ---------------------------------------------- -------------- -------------
  **Major cues**   **VOT**                                        **Long**       **Short**
                   **Preceding vowel duration**                   **Short**      **Long**
                   **F1 Cutback before closure**                  **Long**       **Short**
                   **F1 Cutback after closure**                   **Long**       **Short**
                   **Voice bar**                                  **Absent**     **Present**
                   **Closure duration**                           **Long**       **Short**
                   **Burst duration**                             **Long**       **Short**
  **Minor cues**   **Transition duration of a preceding vowel**   **Short**      **Long**
**Burst amplitude** **Stronger** **Weak** **Noise on upper formants** **Stronger** **Weak** **Aspiration** **Stronger** **Weak** **Drop in the intensity of preceding vowel** **More** **Less** **F0 before and after closure** **High** **Low** **F1 offset frequency** **High** **Low** **F1 onset after closure** **High** **Low** **Burst frequency** **Low** **High** **References:** - Kent, R. D., & Read, C. (2001). *Acoustic analysis of speech*. Singular Publishing Group. - *Borden G.J & Harris (1980): Speech science primer, physiology, acoustics, and perception of speech.* - Brano H. Repp, 1979: *Relative amplitude of aspiration noise as a cue for syllable initial stops consonants. Journal of language and speech, 22-2, 1979.* - Lawrence J. Raphel, 1972*: Preceding vowel duration as cue to perception of voicing characteristics of word final consonant in American English*. JASA. 51(4) 1296 -- 1303. - *Liegh Lisker & S. Abramson, 1968*: *Voice timing: cross language experiments in identification and discrimination. JASA, 1968.* - *Liegh Lisker, 1978: Cue to voicing, manner and place of consonant occlusion. Haskins laboratory: status report on speech research.* - *Liegh Lisker, 1977: Rapid vs. rapid: a catalogue of acoustic patterns that may be a cue for distinction. Haskins laboratory: status report on speech research, SR- 54, 1978.* - Fisher R. M., Ohde, 1990*: Spectral and duration of front vowels as cues to final stop-consonant voicing,* JASA, 88(3)1250-9. - Kuan- YI chao, and Li- mei Chen, 2008: *A cross linguistic study of voice onset time in stop consonant productions: The association of computational linguists and Chinese language processing, 13(2) 215-232* - Parinitha Shetty, Rubia N Sada and Ajith U Kumar, 2009*: Voice onset time as a perceptual cue in voicing contrast in Tulu Language,* IJP, 84(3) 291-299. - **Lee Williams, 1977*: The perception of stop consonant voicing by Spanish-English bilinguals,* Perception and Psychophysics,21(4), 289-297.** - D. H. 
Whalen, Arthur S. Abramson, Leigh Lisker, and Maria Mody, 1993*:**F0 gives voicing information even with unambiguous voice onset times,*** JASA, 2152. - Berhman A. (2004) speech and voice sciences. plural publishers - Usha Rani & Savithry S. R, 1989: temporal perceptual cues of stop consonants. Unpublished dissertation submitted to University of Mysore. - Borden, G. J., Harris, K. S., & Raphael, L. J. (2011). *Speech science primer: Physiology, acoustics, and perception of speech* (6th ed.). Lippincott Williams & Wilkins. - - Rami, M. K., Kalinowski, J., Stuart, A., & Rastatter, M. P. (1999). Voice onset times and burst frequencies of four velar stop consonants in Gujarati. *J Acoust Soc Am*, *106*(6), 3736-3738. Sarah, & Paroo. (2009). The articulation of Malayalam coronal stops and nasals. - Mohanan, K. P. (2014). *Lexical phonology of the consonant system in Malayalam*. - Emeneau, M. B., & Kausalya, K. (2015). *Tamil expressives with initial voiced stops*. - Fowler, R. (2016). *The segmental phonemes of Sanskritized Tamil*. - Chittora, A. (2015). *Classification of Stop Consonants using Modulation Spectrogram-Based Features*. 145--150. - Plauche, M. C. (2001). *Acoustic cues in the directionality of stop consonant confusions*. - Raphael, L. J., Borden, G. J., & Harris, K. S. (1981). Speech Science Primer: Physiology, Acoustics, and Perception of Speech. In *Annals of Otology, Rhinology & Laryngology* (6th ed., Vol. 90, Issue 4). https://doi.org/10.1177/000348948109000430 - Repp, B. H. (1984). *STOP CONSONANT MANNER AND PLACE OF ARTICULATION \**. *May 1983*, 245--254. - Savithri, S. R. (1989). *TIMING IN SPEECH : A REVIEW OF SANSKRIT LITERATURE AND ITS VERIFICATION*. *22*, 305--315.
