Role of Visual Speech in Phonological Processing by Children With Hearing Loss

Susan Jerger
University of Texas at Dallas, Callier Center for Communication Disorders, and Central Institute for the Deaf at Washington University School of Medicine, St. Louis, MO

Nancy Tye-Murray
Central Institute for the Deaf and Washington University School of Medicine, University of Texas at Dallas

Hervé Abdi
University of Texas at Dallas

Purpose: This research assessed the influence of visual speech on phonological processing by children with hearing loss (HL).

Method: Children with HL and children with normal hearing (NH) named pictures while attempting to ignore auditory or audiovisual speech distractors whose onsets relative to the pictures were either congruent, conflicting in place of articulation, or conflicting in voicing; for example, the picture “pizza” coupled with the distractors “peach,” “teacher,” or “beast,” respectively. Speed of picture naming was measured.

Results: The conflicting conditions slowed naming, and phonological processing by children with HL displayed the age-related shift in sensitivity to visual speech seen in children with NH, although with developmental delay. Younger children with HL exhibited a disproportionately large influence of visual speech and a negligible influence of auditory speech, whereas older children with HL showed a robust influence of auditory speech with no benefit to performance from adding visual speech. The congruent conditions did not speed naming in children with HL, nor did the addition of visual speech influence performance. Unexpectedly, the /ʌ/-vowel congruent distractors slowed naming in children with HL and decreased articulatory proficiency.

Conclusions: Results for the conflicting conditions are consistent with the hypothesis that speech representations in children with HL (a) are initially disproportionally structured in terms of visual speech and (b) become better specified with age in terms of auditorily encoded information.

KEY WORDS: phonological processing, lipreading, picture–word task, multimodal speech perception

Journal of Speech, Language, and Hearing Research, Vol. 52, 412–434, April 2009. © American Speech-Language-Hearing Association. 1092-4388/09/5202-0412

For decades, evidence has suggested that visual speech may play an important role in learning the phonological structure of spoken language (Dodd, 1979, 1987; Locke, 1993; Mills, 1987; Weikum et al., 2007). A proposed link between visual speech and the development of phonology may be especially essential for children with prelingual hearing loss. The importance of this link is challenged, however, by the observation that visual speech has less influence on speech perception in typically developing children than in adults. The McGurk effect illustrates this finding (McGurk & MacDonald, 1976). In a McGurk task, individuals hear a syllable whose onset represents one place of articulation while seeing a talker simultaneously mouthing a syllable whose onset represents a different place of articulation (e.g., auditory /ba/ and visual /ga/). Adults typically experience the illusion of perceiving a blend of the auditory and visual places of articulation (e.g., /da/ or /ða/). Significantly fewer children than adults experience this illusion. As an example, in response to one type of McGurk stimulus (auditory /ba/ – visual /ga/), McGurk and MacDonald (1976) reported auditory capture (individuals heard /ba/) in 40%–60% of children but in only 10% of adults. This pattern of results has been replicated and extended to other tasks by several investigators. Visual speech has less influence on performance by children for (a) identification of nonsense syllables representing congruent and/or incongruent audiovisual pairings such as auditory /ba/ paired with visual /ba/, /va/, /da/, /ga/, /ða/, and/or /θa/ (Desjardins, Rogers, & Werker, 1997; Hockley & Polka, 1994); auditory /aba/ paired with visual /aba/, /ada/, /aga/, and /ava/ (Dupont, Aubin, & Menard, 2005); or auditory /ba/ in noise paired with visual /ba/, /da/, and /ga/ (Sekiyama & Burnham, 2004); (b) identification of synthesized speech stimuli ranging from /ba/ to /da/ paired with visual /ba/ or /da/ (Massaro, 1984; Massaro, Thompson, Barron, & Laren, 1986); and (c) the release from informational masking due to the addition of visual speech for recognizing speech targets (e.g., colored numbers such as red eight) in the presence of competing speech (Wightman, Kistler, & Brungart, 2006). Overall results are consistent with the conclusion that audiovisual speech perception is dominated by auditory input in typically developing children and visual input in adults. The course of development of audiovisual speech perception to adultlike performance is not well understood. A few studies report an influence of visual speech on performance by the pre-teen/teenage years (Conrad, 1977; Dodd, 1977, 1980; Hockley & Polka, 1994), with one report citing an earlier age of 8 years (Sekiyama & Burnham, 2004).

Recently, we modified the Children’s Cross-Modal Picture-Word Task (Jerger, Martin, & Damian, 2002) into a multimodal procedure for assessing indirectly the influence of visual speech on phonological processing (Jerger, Damian, Spence, Tye-Murray, & Abdi, 2009). Indirect versus direct tasks may be demarcated on the basis of task instructions as recommended by Merikle and Reingold (1991). An indirect task does not direct participants’ attention to the experimental manipulation of interest, whereas a direct task unambiguously instructs participants to respond to the experimental manipulation. Both the Cross-Modal and Multimodal Picture-Word tasks qualify as indirect measures. Children are instructed to name pictures and to ignore auditory or audiovisual speech distractors that are nominally irrelevant to the task. The participants are not informed of, nor do they consciously try to respond to, the manipulation. An advantage of indirect tasks is that performance is less influenced by developmental differences in higher level cognitive processes, such as the need to consciously access and retrieve task-relevant knowledge (Bertelson & de Gelder, 2004). The extent to which age-related differences in multimodal speech processing reflect developmental change versus the varying demands of tasks remains an important unresolved issue. In the next section, we briefly describe our original task and the recent adaptation.

Children’s Multimodal Picture-Word Task

In the Cross-Modal Picture-Word Task (Jerger, Martin, & Damian, 2002), the dependent measure is the speed of picture naming. Children name pictures while attempting to ignore auditory distractors that are either phonologically onset related, semantically related, or unrelated to the picture. The phonological distractors contain onsets that are congruent, conflicting in place of articulation, or conflicting in voicing (e.g., the picture “pizza” coupled with the distractors “peach,” “teacher,” or “beast,” respectively). The semantic distractors comprise categorically related pairs (e.g., the picture “pizza” coupled with the distractor “hot dog”). The unrelated distractors are composed of vowel nucleus onsets (e.g., the picture “pizza” coupled with the distractor “eagle”). The unrelated distractors usually form a baseline condition, and the goal is to determine whether phonological or semantic distractors speed up or slow down naming relative to the baseline condition. Picture-word data may be gathered simultaneously with the entire set of distractors, particularly in children, due to (a) the difficulties in recruiting and testing the participants and (b) the idea that an inconsistent relationship between the distractors and pictures may aid participants’ attempts to disregard the distractors. Research questions about the different types of distractors, however, are typically reported in separate articles. In this article aimed at explicating the influence of visual speech on phonological processing, we focus only on the phonological distractors.

The onset of the phonological distractors may be varied to be before or after the onset of the pictures, referred to as the stimulus onset asynchrony (SOA). Whether a distractor influences picture naming depends upon the type of distractor and the SOA. First, with regard to the SOA, a phonologically related distractor typically produces a maximal effect on naming when the onset of the distractor lags behind the onset of the picture (Damian & Martin, 1999; Schriefers, Meyer, & Levelt, 1990). With regard to the type of distractor, a phonologically related distractor speeds naming when the onset is congruent and slows naming when the onset is conflicting in place or voicing relative to the unrelated or baseline distractor. The basis of any facilitation or interference is assumed to be interactions between the phonological representations supporting speech production and perception.

The interaction between speech production and perception is illustrated in Figure 1, which portrays the speech chain (see Denes & Pinson, 1993) with an auditory distractor. The speaker is supposed to be naming the picture “duck” while hearing a phonologically onset-related distractor conflicting in place of articulation, “buckle.” The boxes in Figure 1 illustrate the stages of processing for speech production and perception, which proceed in opposite directions. With regard to producing speech, the picture to be named activates conceptual linguistic information that undergoes transformation to derive its pronunciation, termed output phonology, followed by articulation, termed output. With regard to perceiving speech, the acoustic waveform of the distractor enters the listener’s ear, where it undergoes input phonological processing to derive the word’s phonological pattern, which activates the word’s meaning (i.e., conceptual information).

Figure 1. The speech chain (adapted from Jerger, 2007) for auditory-only input representing the stages of processing for speech production and perception. The speech chain has been augmented to portray a speaker naming the picture, “duck,” while hearing an auditory phonologically related distractor conflicting in place of articulation, “buckle” (see Jerger, 2007).

When congruent or conflicting auditory distractors speed up or slow down naming, performance is assumed to reflect an interaction between production and perception (Levelt et al., 1991). The interaction is assumed to occur when the picture-naming process is occupied with output phonology and the distractor perceptual process is occupied with input phonology. Congruent distractors are assumed to speed picture naming by activating input phonological representations whose activation spreads to output phonological representations, thus allowing the shared speech segments to be selected more rapidly during the naming process. Conflicting distractors are assumed to slow naming by activating conflicting output phonological representations that compete with the picture’s output phonology for control of the response. A novel contribution of our new methodology is that the distractors are presented both auditorily and audiovisually. Thus, we can gain new knowledge about the contribution of visual speech to phonological processing by children with HL.

Results in Typically Developing Children

Originally, we hypothesized that results on our indirect Multimodal Picture-Word Task would consistently reveal an influence of visual speech on performance, in contrast to results of the studies reviewed previously using direct tasks (Desjardins et al., 1997; Dupont et al., 2005; Hockley & Polka, 1994; Massaro, 1984; Massaro et al., 1986; Sekiyama & Burnham, 2004; Wightman et al., 2006). Disparate results on indirect versus direct visual speech tasks have been reported previously by Jordan and Bevan (1997). Results did not support our hypothesis (see Jerger et al., 2009). Although preschoolers (4-year-olds) and preteens/teenagers (10- to 14-year-olds) showed an influence of visual speech on phonological processing (termed positive results), young elementary school–aged children did not (termed negative results). Positive results in younger and older children coupled with negative results in children of in-between ages yielded a U-shaped developmental function.

U-shaped functions have been carefully scrutinized by dynamic systems theorists, who propose that multiple interactive factors, rather than one single factor, typically form the basis of developmental change (Gershkoff-Stowe & Thelen, 2004; Smith & Thelen, 2003). Furthermore, the components of early skills are viewed as “softly assembled” behaviors that reorganize into more mature and stable forms in response to internal and environmental forces (Gershkoff-Stowe & Thelen, 2004). From this perspective, the plateau of the U-shaped trajectory, which seems to reflect the loss or disappearance of a behavior, is instead viewed as reflecting a period of transition and instability. Applying knowledge systems that are in a period of significant growth may require more resources and may overload the processing system, resulting in temporary decreases in processing efficiency. A proposal, in concert with dynamic systems theory, is that the temporary loss of sensitivity to visual speech in the children of in-between ages reflects reorganization and adaptive growth in the knowledge systems underpinning performance on our task, particularly phonology.

In short, the apparent loss of the influence of visual speech on performance in elementary school–aged children may be viewed as a positive sign, an indication that relevant knowledge systems, particularly phonology, are reorganizing and transitioning into more highly specified, stable, and robust forms. The reorganization and transition may be in response to environmental and internal forces such as (a) formal instruction in the reading and spelling of an alphabetic language and (b) developmental changes in multimodal processing as well as in auditory perceptual, linguistic, and cognitive skills. An important point is that the ends of the U-shaped trajectory, which seemed to reflect identical performance in the 4-year-olds and 10- to 14-year-olds, do not necessarily reflect identical substructures for the performance.

Results in the typically developing children with normal hearing (NH) raise questions about whether children with prelingual hearing loss (HL), who rely more on visual speech to learn to communicate, will show a lack of influence of visual speech on phonological processing during the years characterizing reorganization in the children with NH. To the extent that the apparent loss of the influence of visual speech on performance reflects developmental evolution within the relevant knowledge systems, particularly phonology, then perhaps children with HL and no other disabilities, excluding delayed speech and language, also undergo this pivotal transformation. In the next section, we summarize general primary and secondary research questions, predict some of the possible results, and interpret some of the possible implications for speech representations in children with HL. Our model for predicting and interpreting results proposes that performance on the Multimodal Picture-Word Task reflects cross-talk between the output and input phonological representations supporting speech production and perception. The patterns of cross-talk provide a basis for theorizing about the nature of the knowledge representations underlying performance (see Rayner & Springer, 1986, for a discussion). Visual speech is viewed as an extra phonetic resource (perhaps adding another type of phonetic feature such as mouth shape) that can enhance performance relative to auditory speech only (see Campbell, 1988, for a discussion).

Research Questions and Predicted Results in Children With HL

Our general research aim was to explain whether and how visual speech may enhance phonological processing by children with HL relative to auditory speech alone. We addressed the following research questions by comparing performance in NH versus HL groups or in subgroups of children with HL.

Question 1. Does performance on the picture-word task differ in the NH versus HL groups for the auditory and audiovisual distractors?

Question 2. Does performance on the picture-word task differ in HL subgroups representing less mature versus more mature age-related competencies?

With regard to the possible influence of prelingual childhood HL on phonology, auditory input is assumed to play a disproportionately important role in building phonological knowledge (Levelt, Roelofs, & Meyer, 1999; Ruffin-Simon, 1983; Tye-Murray, 1992; Tye-Murray, Spencer, & Gilbert-Bedia, 1995). The presence of HL may degrade and filter auditory input, resulting in less well-specified auditory phonological knowledge. Visual speech may become a more important source of phonological knowledge, with representations more closely tied to articulatory or speechread codes than auditory codes. Relatively impoverished phonological knowledge in children with moderate HL is supported by previous research demonstrating abnormally poor phoneme discrimination and phonological awareness (Briscoe, Bishop, & Norbury, 2001).

With regard to the first research question, the data of Briscoe and colleagues predict that the auditory distractors should affect performance significantly less in the HL group than in the NH group. This outcome would support the contention that speech representations in the children with HL are less well-structured in terms of auditorily based linguistic information. With regard to the audiovisual distractors, the data suggest that the children with HL should show a stronger influence of visual speech on performance relative to the NH group. This outcome would be consistent with the contention that speech representations in the children with HL are encoded disproportionally in terms of visual speech gestural and/or articulatory codes, as previously suggested by Jerger, Lai, and Marchman (2002).

In contrast to this vantage point, there is evidence indicating extensive phoneme repertoires and reasonably normal phonological processes in children with moderate HL, although results are characterized by significant developmental delay and individual variability (Dodd & Burnham, 1988; Dodd & So, 1994; Oller & Eilers, 1981; see Moeller, Tomblin, Yoshinaga-Itano, Connor, & Jerger, 2007, for a discussion). With regard to the first research question, these data predict that results for the auditory distractors in the children with HL will not differ from those in the NH group. This outcome would support the idea that speech representations in these children with HL are sufficiently specified in terms of auditorily encoded linguistic information to support performance without relying on visual speech. Scant evidence with the Cross-Modal Picture-Word Task and auditory nonsense syllable distractors suggests that phonological distractors produce the expected pattern of results, facilitation from congruent distractors and interference from conflicting distractors, in children with moderate HL and good phoneme discrimination but not in children with moderate HL and poor phoneme discrimination (Jerger, Lai, & Marchman, 2002).

With regard to possible results for the second of the primary research questions, one predicted outcome for the auditory distractors from the evidence in the literature about developmental delay is that performance in the children with HL may change with increasing age. To the extent that auditorily encoded linguistic information becomes sufficiently strong with age to support performance without relying on visual speech, we may see a developmental shift in the influence of visual speech on performance as seen in a previous study with children with NH (Jerger et al., 2009). Results would echo the normal pattern of results if there is an influence of visual speech in younger children that decreases with increasing age. This outcome would be consistent with the idea of adaptive growth and reorganization in the knowledge systems underpinning performance on our task, particularly phonology.

In addition to the two primary questions, a secondary question addressed whether performance on the picture-word task differs in HL groups representing poorer versus better hearing statuses as measured by auditory speech recognition ability. We should note that most of the current participants with HL had good hearing capacities as operationally defined by auditory word recognition scores due to stringent auditory criteria for participation (see Method section). Thus, results examining poorer versus better hearing status may be limited, especially if age must also be controlled when assessing the influence of hearing status. To the extent that speech representations may be better structured in terms of auditorily based linguistic information in children with better hearing status, a predicted outcome is that auditory distractors will affect performance significantly more in children with better hearing status versus poorer hearing status. A corollary to this idea is that speech representations in children with poorer hearing status may be encoded disproportionally in terms of visual speech gestural and/or articulatory codes. In this case, a predicted outcome is that audiovisual distractors may influence performance significantly more in children with poorer hearing status due to their greater dependence on visual speech for spoken communication.

These data should provide new insights into the phonological representations underlying the production and perception of words by children with HL. Studying onset effects may be particularly revealing to the extent that incremental speech processing skills are important to spoken word recognition and production and that word-initial input is speechread more accurately than word-noninitial input (Fernald, Swingley, & Pinto, 2001; Greenberg & Bode, 1968; Marslen-Wilson & Zwitserlood, 1989; Gow, Melvold, & Manuel, 1996).

Method

Participants

HL group. Participants were 31 children (20 boys and 11 girls) with prelingual sensorineural hearing loss (SNHL) ranging in age from 5;0 (years;months) to 12;2. The racial/ethnic distribution was 74% White, 16% Black, 6% Asian, and 3% multiracial; of the total group, 6% were of Hispanic ethnicity. We initially screened 63 children with a wide range of hearing impairments to attain the current pool of participants who met the following criteria: (a) prelingual SNHL; (b) English as a native language; (c) ability to communicate successfully aurally/orally; (d) ability to hear accurately, on auditory-only, open-set testing, 100% of the baseline distractor onsets and at least 50% of all other distractor onsets; (e) ability to discriminate accurately, on auditory-only two-alternative forced-choice testing with a 50% chance level, at least 85% of the voicing (p-b, t-d) and place of articulation (p-t, b-d) phoneme contrasts; and (f) no diagnosed or suspected disabilities, excluding the speech and language problems that accompany prelingual childhood HL.

Unaided hearing sensitivity on the better ear, estimated by pure-tone average (PTA) hearing threshold levels (HTLs) at 500, 1000, and 2000 Hz (American National Standards Institute [ANSI], 2004), averaged 50.13 dB HTL and was distributed as follows: ≤20 dB (n = 7), 21–40 dB (n = 5), 41–60 dB (n = 9), 61–80 dB (n = 4), 81–100 dB (n = 2), and >101 dB (n = 4). An amplification device was worn by 24 of the 31 children: 18 were hearing aid users and 6 were cochlear implant or cochlear implant plus hearing aid users. Of the children who wore amplification, the average age at which they received their first listening device was 34.65 months (SD = 19.67 months). Duration of listening device use was 60.74 months (SD = 20.87 months). Participants who wore amplification were tested while wearing their devices. Most devices were digital aids that were self-adjusting with the volume control either turned off or nonexistent. The children were recruited from cooperating educational programs. The type of program was a mainstreamed setting for 25 children with some assistance from special education services for 1 child, deaf education for 5 children, and total communication for 1 child. Again, all children communicated successfully aurally/orally. With regard to nonauditory status, all children passed measures establishing normalcy of visual acuity (including corrected-to-normal) and oral motor function. The average Hollingshead (1975) social strata score, 2.16, was consistent with a minor professional, medium business, or technical socioeconomic status (SES).

NH comparison group. Participants were 62 children with normal hearing (33 boys and 29 girls) selected from a pool of 100 typically developing children who participated in an associated project explicating normal results (see Jerger et al., 2009). Ages ranged from 5;3 to 12;1. The racial/ethnic distribution was 76% White, 5% Asian, 2% Black, 2% Native American, and 6% multiracial; of the group, 15% reported a Hispanic ethnicity. All children passed measures establishing normalcy of hearing sensitivity, visual acuity (including corrected-to-normal), gross neurodevelopmental history, and oral motor function. The average Hollingshead social strata score, 1.52, was consistent with a major business and professional SES. A comparison of the groups on the verbal and nonverbal demographic measures, described in the section that follows, is detailed in the Results section.

Demographic Measures

Materials, Instrumentation, and Procedure

Sensory-motor function and SES. All of the standardized measures in this and the following sections were administered and scored according to the recommended techniques. Hearing sensitivity was assessed with a standard pure-tone audiometer. Normal hearing sensitivity was defined as bilaterally symmetrical thresholds of ≤20 dB HTL at all test frequencies between 500 and 4000 Hz (ANSI, 2004). Visual acuity was screened with the Rader Near Point Vision Test (Rader, 1977). Normal visual acuity was defined as 7 out of 8 targets correct at 20/30 Snellen Acuity (including participants with corrected vision). Oral-motor function was screened with a questionnaire designed by an otolaryngologist who is also a speech pathologist (Peltzer, 1997). The questionnaire contained items concerning eating, swallowing, and drooling. Normal oral-motor function was assumed if the child passed all items on the questionnaire. SES was estimated with the Hollingshead Four-Factor Index (Hollingshead, 1975).

Nonverbal Measures

Visual-motor integration. Visual-motor integration was estimated with the Beery-Buktenica Developmental Test of Visual-Motor Integration (Beery, Buktenica, & Beery, 2004). Visual-motor integration refers to the capacity to integrate visual and motor activities. Visual-motor integration was assessed by children’s ability to reproduce shapes ranging from a straight line to complex three-dimensional geometric forms.

Visual perception. Visual perception was estimated with the subtest of the Beery-Buktenica Developmental Test of Visual-Motor Integration (Beery et al., 2004). Visual perception was assessed by children’s ability to visually identify and point to an exact match for geometric shapes.

Visual simple reaction time (RT). Simple visual RT was assessed by a laboratory task that quantified children’s speed of processing in terms of detecting and responding to a predetermined target. The stimulus was always a picture of a basketball. Children were instructed to push the response key as fast and as accurately as possible. Each run consisted of practice trials until performance stabilized, followed by a sufficient number of trials to yield eight good reaction times.

Verbal Measures

Vocabulary. Receptive and expressive knowledge were estimated with the Peabody Picture Vocabulary Test, Third Edition (PPVT-III; Dunn & Dunn, 1997) and the Expressive One-Word Picture Vocabulary Test (Brownell, 2000).

Output phonology. Output phonology was estimated with the Goldman–Fristoe Test of Articulation (Goldman & Fristoe, 2000). Articulation of the picture-names of the Picture-Word Task (Jerger et al., 2009) was also assessed. Pronunciation of the pictures’ names was scored in terms of both onsets and offsets. Dialectal variations in pronunciation were not scored as incorrect.

Input phonology. Input phonological knowledge was estimated by laboratory measures of auditory and audiovisual onset and rhyme skills. These skills were chosen because they are usually operational in young children and are assumed to be more independent of reading skill than other phonological measures (Stanovich, Cunningham, & Cramer, 1984; Bird, Bishop, & Freeman, 1995). The onset test consisted of 10 words beginning with the stop consonants of the Picture-Word Task. Each word had a corresponding picture response card with four alternatives. The alternatives had a CV nucleus and represented consonant onsets with (a) correct voicing and correct place of articulation, (b) correct voicing and incorrect place of articulation, (c) incorrect voicing and correct place of articulation, and (d) incorrect voicing and incorrect place of articulation. For example, for the target onset segment /dʌ/, the pictured alternatives were “duck,” “bus,” “tongue,” and “puppy.” The rhyme judgment task also consisted of 10 words, each with a corresponding picture response card containing four alternatives. The alternatives represented the following relations to the test item (“boat”): the rhyming word (“goat”) and words with the test item’s initial consonant (“bag”), final consonant (“kite”), and vowel (“toad”). Children were asked to indicate “Which one begins with /dʌ/?” or “Which one rhymes with boat?” Scores for both measures were percentage correct. Individual items making up the tests were administered randomly. The items were recorded and edited as described in the Preparation of stimuli section. In brief, all auditory and audiovisual items were recorded digitally by the same talker who recorded the stimuli for the experimental task.

In addition to the previously mentioned measures used to compare skills in the HL versus NH groups, another input phonological task, phoneme discrimination, was used to screen participants. Phoneme discrimination was assessed with auditory and audiovisual same–different tasks, which were administered in a counterbalanced manner across participants. Test stimuli were the stop consonants of the picture-word test (/p/, /b/, /t/, /d/) coupled with the vowels (/i/ and /ʌ/). The test comprised 40 trials representing a priori probabilities of 60% “different” and 40% “same” responses. The “same” stimuli consisted of two different utterances; thus, the same pairs had phonemic constancy with acoustic variability. The intersignal interval from the onset of the first syllable to the onset of the second syllable was 750 ms. The response board contained two telegraph keys labeled “same” and “different.” The labels were two circles (or two blocks for younger children) of the same color and shape or of different colors and shapes. A blue star approximately 3 cm to the outside of each key designated the start position for each hand, assumed before each trial. Children were instructed to push the correct response key as fast and as accurately as possible. Children were informed that about one-half of trials would be the same and about one-half would be different. Individual items were administered randomly. Again, the items were recorded by the same talker and were edited as described in the Preparation of stimuli section.

Word recognition. Auditory performance for words was quantified by averaging results on a test with a closed-response set, the Word Intelligibility by Picture Identification (WIPI) test (Ross & Lerman, 1971), and one with an open-response set, the auditory-only condition of the Children’s Audio-Visual Enhancement Test (CAVET; Tye-Murray & Geers, 2001). Visual-only and audiovisual word performance were also estimated with the CAVET. Finally, recognition of the distractor words of the Picture-Word Task was assessed, both auditory-only and audiovisually. The Distractor Recognition Task was scored in terms of both words and onsets.

Experimental Picture-Word Task

Materials and Instrumentation

Pictures and distractors. Specific test items and conditions comprising the Children’s Cross-Modal Picture-Word Task have been detailed previously (Jerger, Martin, & Damian, 2002). The pictured objects in this study are the same items. In brief, the pictures’ names always began with /b/, /p/, /t/, or /d/, coupled with the vowels /i/ or /ʌ/. These onsets represent developmentally early phonetic achievements and reduced articulatory demands and tend to be produced accurately by young children, both NH and HL (Abraham, 1989; Dodd, Holm, Hua, & Crosbie, 2003; Smit, Hand, Freilinger, Bernthal, & Bird, 1990; Stoel-Gammon & Dunn, 1985; Tye-Murray, Spencer, & Woodworth, 1995). To the extent that phonological development is a dynamic process, with knowledge improving from (a) unstable, minimally specified, and harder-to-access/retrieve representations to (b) stable, robustly detailed, and easier-to-access/retrieve representations, it seems important to assess early-acquired phonemes that children are more likely to have mastered in an initial study (see McGregor, Friedman, Reilly, & Newman, 2002, for similar reasoning about semantic knowledge). The onsets represent variations in place of articulation (/b/–/d/ vs. /p/–/t/) and voicing (/b/–/p/ vs. /d/–/t/), two phonetic features that are traditionally thought to be differentially dependent on auditory versus visual speech (Tye-Murray, 1998).

Each picture was administered in the presence of word distractors whose onsets relative to the pictures’ onsets represented either all features congruent, one feature conflicting in place of articulation, or one feature conflicting in voicing. The vowel of all distractors was always congruent. A baseline condition for each picture was provided by a vowel-onset distractor, “eagle” for /i/ vowel-nucleus pictures and “onion” for /ʌ/ vowel-nucleus pictures. Appendix Table A-1 details all of the individual picture and distractor items. Linguistic statistics, the ratings or probabilities of occurrence of various attributes of words, indicated that the test materials were of high familiarity (Coltheart, 1981; Morrison, Chappell, & Ellis, 1997; Nusbaum, Pisoni, & Davis, 1984; Snodgrass & Vanderwart, 1980), high concreteness (Coltheart, 1981; Gilhooly & Logie, 1980), high imagery (Coltheart, 1981; Cortese & Fugett, 2004; Morrison et al., 1997; Paivio, Yuille, & Madigan, 1968), high phonotactic probabilities (Vitevitch & Luce, 2004), low word frequency (Kucera & Francis as cited in Coltheart, 1981), and an early age of acquisition (Carroll & White, 1973; Dale & Fenson, 1996; Gilhooly & Logie, 1980; Morrison et al., 1997). At least 90% of preschool/elementary school children recognized the distractor words (Jerger, Bessonette, Davies, & Battenfield, 2007).

for the tester (e.g., delete trial and re-administer item
We should note, however, later), yielding an effective monitor size of about 356 mm. that phonological effects are not affected by whether the The inner facial image was approximately 90 mm in distractor is a known word (Levelt, 2001). In fact, non- height with a width of about 80 mm at eye level. Previous word auditory distractors produce robust phonological research has shown that there are no detrimental effects effects on picture-word naming (Jerger, Lai, & Marchman, on lipreading for head heights varying from 210 mm to 2002; Jerger, Martin, & Damian, 2002; Starreveld, 2000). only 21 mm, when viewed at a distance of 1 m (Jordan & Preparation of stimuli. The distractors—and the stim- Sergeant, 1998). The dimensions of each picture on the uli for assessing phoneme discrimination and input pho- talker ’s chest were about 65 mm in height and 85 mm in nological skills—were recorded by an 11-year-old child width. The auditory track of the Quicktime movie file actor in the Audiovisual Stimulus Preparation Labora- was routed through a speech audiometer to a loud- tory of the University of Texas at Dallas with recording speaker. For audiovisual trials, each trial contained equipment, soundproofing, and supplemental lighting 1,000 ms of the talker’s still neutral face, followed by an and reflectors. The child was a native speaker of English audiovisual utterance of one distractor word with one with general American dialect. His full facial image and colored picture introduced on the chest in relation to the upper chest were recorded. Full facial images have been auditory onset of the utterance, followed by 1,000 ms of shown to yield more accurate lipreading performance still neutral face and colored picture. 
For auditory-only (Greenberg & Bode, 1968), suggesting that facial move- trials, the auditory track and the picture were exactly ments other than the mouth area contribute to lipread- the same as previously mentioned, but the visual track ing (Munhall & Vatikiotis-Bateson, 1998). The audiovisual contained a still neutral face for the entire trial. Thus, the recordings were digitized via a Macintosh G4 computer only difference between the auditory and audiovisual with Apple Fire Wire, Final Cut Pro, and Quicktime conditions was that the auditory items had a neutral software. Color video was digitized at 30 frames/second static face and the audiovisual items had a dynamic face. with 24-bit resolution at 720 × 480 pixel size. Auditory Both the computer monitor and the loudspeaker input was digitized at a 22-kHz sampling rate with 16-bit were mounted on an adjustable height table directly in amplitude resolution. The pool of utterances was edited front of the child at a distance of approximately 90 cm. To to an average RMS level of –14 dB. name each picture, children spoke into a unidirectional The colored pictures were pasted onto the talker’s microphone mounted on an adjustable stand. To obtain chest twice to form SOAs of –165 ms (the onset of the naming times, the computer triggered a counter/timer distractor is 165 ms or 5 frames before the onset of the with better than 1 ms resolution at the initiation of a picture) and +165 ms (the onset of the distractor is 165 ms movie file. The timer was stopped by the onset of the or 5 frames after the picture). To be consistent with re- child’s vocal response into the microphone, which was sults in the literature, we defined a distractor’s onset on fed through a stereo mixing console amplifier and 1-dB the basis of its auditory onset. In the results reported step attenuator to a voice-operated relay ( VOR). A pulse herein, only results at the lagging SOA are considered. 
from the VOR stopped the timing board via a data mod- The rationale for selecting only the lagging SOA is that a ule board. The counter timer values were corrected for lag of roughly 100–200 ms maximizes the interaction the amount of silence in each movie file before the dis- between output and input phonological representations tractor’s auditory onset and picture onset. (Damian & Martin, 1999; Schriefers et al., 1990). When a Procedure phonological distractor is presented about 100–200 ms before the onset of the picture, the activation of its input phonological representation is hypothesized to have Participants were tested in two separate sessions, decayed prior to the output phonological encoding of the one for auditory testing and one for audiovisual testing. picture, and the interaction is lessened (Schriefers et al., The sessions were separated by approximately 13 days 1990). For the present study, the semantic items were for the NH group and 5 days for the HL group. The mo- also viewed as filler items. A rationale for including the dality for first and second sessions was counterbalanced semantic items is that an inconsistent relationship across participants for the NH group. For the HL group, between the picture–distractor pairs helps participants however, the first session was always the audiovisual disregard the distractors. modality because pilot results indicated that recognition Experimental setup. To administer picture-word of the auditory distractor words was noticeably better in items, the video track of the Quicktime movie file was the children with HL who had previously undergone au- presented via a Dell Precision Workstation to a high- diovisual testing. Word recognition for the auditory dis- resolution 457-mm monitor. The outer borders of the mon- tractors remained at ceiling in the NH group regardless itor contained a colorful frame covering control buttons of the modality of the initial test session. 
Children sat at Jerger et al.: Role of Visual Speech 419 Downloaded From: http://jslhr.pubs.asha.org/ by St. John's University, Sarah Azer on 09/18/2014 Terms of Use: http://pubs.asha.org/ss/Rights_and_Permissions.aspx a child-sized table in a double-walled sound-treated booth. about 3% of overall trials. In sum, about 4.5% of trials in A tester sat at the computer workstation, and a co-tester the NH group and 7.5% of trials in the HL group were sat alongside the child, keeping her on task. Each trial missing or excluded. Individual data for each experimen- was initiated by the tester’s pushing the space bar (out of tal condition were naming times averaged across the participant’s sight). Children were instructed to name picture–distractor pairs for which the distractor onset each picture and disregard the speech distractor. They was heard correctly. In other words, if a child with HL were told that “Andy” (pseudonym) was wearing a pic- misheard the onset of one distractor of an experimental ture on his chest, and he wanted to know what it was. condition, his or her average performance for that con- They were to say the name correctly and as quickly as dition was based on seven picture–distractor pairs rather possible. The microphone was placed approximately than the traditional eight picture–distractor pairs. Per- 30 cm from the child’s mouth without blocking her view formance on the auditory distractor onset recognition of the monitor. Children were encouraged to speak into task was at ceiling in all children with NH and averaged the microphone at a constant volume representing a clear 96.06% correct (range = 78%–100% correct) in the HL conversational speech level. If necessary, the child’s speak- group. 
We should clarify that an incorrect response on the ing level, the position of the microphone or child, and/or distractor repetition task could be attributed to mishear- the setting on the 1-dB step attenuator between the ing because all of the children pronounced the onsets of our microphone and VOR were adjusted to ensure that the pictures and distractors correctly. In addition to controlling VOR was triggering reliably. The intensity level of the for mishearing distractor onsets, we also controlled for distractors was approximately 70 dB SPL, as measured phoneme discrimination abilities by requiring that all par- at the imagined center of the participant’s head with a ticipants pass (better than 85% correct) a phoneme dis- sound level meter. crimination task, as described previously. Performance on Prior to beginning, picture naming was practiced. A the auditory phoneme discrimination task was at ceiling tester showed each picture on a 5µ × 5µ card, asking in all children with NH and averaged 98.33% correct children to name the picture and teaching them the tar- (range = 88%–100% correct) in the children with HL. get names of any pictures named incorrectly. Next, the Data for each condition were analyzed with a mixed- tester flashed some picture cards quickly and modeled design analysis of variance (ANOVA) by regression and speeded naming. The child was asked to copy the tester multiple t tests. The problem of multiple comparisons for another few pictures, emphasizing that we are prac- was controlled with the false discovery rate ( FDR) pro- ticing naming the pictures as fast as we can to say them cedure (Benjamini & Hochberg, 1995; Benjamini, Krieger, correctly. Speeded naming practice trials went back and & Yekutieli, 2006). The FDR approach controls the ex- forth between tester and child until the child was nam- pected proportion of false positive findings among rejected ing pictures fluently, particularly without saying “a” hypotheses. 
A value of the approach is its demonstrated before names. For experimental trials, each picture was applicability to repeated-measures designs. Finally, to presented with each type of speech distractor at each of form a single composite variable representing age-related the two SOAs. Test items and SOAs were presented ran- competencies, we analyzed the dependencies among the domly within one unblocked condition (see Starreveld, demographic variables in the HL group with principal 2000, for a discussion). Each test began with two practice component analysis ( PCA; Pett, Lackey, & Sullivan, trials. All trials judged to be flawed (e.g., lapses of atten- 2003). We decided to use composite variables because tion, squirming out of position, triggering the micro- they provide a more precise and reliable estimate of de- phone in a flawed manner) were deleted online and were velopmental capacities than age alone, particularly in re-administered after intervening items. nontypically developing children (Allen, 2004). PCA creates composite scores for participants by computing a linear Data Analysis combination (i.e., a weighted mean) of the original var- iables. Standard scores for simple visual reaction time With regard to picture naming results, the total and articulatory proficiency (number of errors) were mul- number of trials deleted online (with replacement) rep- tiplied by –1 so that good performance would be positive. resented 12% of overall trials in the HL group and 17% of Variables in the PCA analysis were visual motor integra- overall trials in the NH group. The percentage of missing tion, visual perception, visual simple RT, receptive and trials remaining at the end because the replacement expressive vocabulary, articulation proficiency, auditory trial was also flawed was about 3% of overall trials in both onset, auditory rhyme, and visual-only lipreading. Age groups. 
Naming responses that were more than 3 SDs and auditory word recognition scores were not entered from an item’s conditional mean were also discarded. into the analysis in order to allow assessment of the un- This procedure resulted in the exclusion of about 1.5% of derlying construct represented by the component by its trials in both groups. In the HL group, the number of trials correlation with age and auditory word recognition (i.e., deleted due to mishearing the distractor represented hearing status). 420 Journal of Speech, Language, and Hearing Research Vol. 52 412–434 April 2009 Downloaded From: http://jslhr.pubs.asha.org/ by St. John's University, Sarah Azer on 09/18/2014 Terms of Use: http://pubs.asha.org/ss/Rights_and_Permissions.aspx Results Results averaged 19–20 for visual motor integration and 20–24 for visual perception. The raw scores represented In this section, we analyze results for the demo- comparable percentile scores across groups for visual graphic and picture-word tasks. The focus of the demo- motor integration, about the 50th percentile, but not for graphic analyses was (a) to compare the NH and HL visual perception. For the latter measure, average per- groups in terms of verbal and nonverbal abilities and age formance represented about the 40th percentile in the and ( b) to detail results of the PCA analysis in the HL HL group and the 75th percentile in the NH group. group. The focus of the picture-word analyses was (a) to Finally, speed of visual processing as indexed by sim- compare the groups in terms of the influence of hearing ple RT averaged about 725 ms and ranged from ap- impairment, the type of distractor, and the modality of proximately 485 ms to approximately 1,065 ms in both the distractor on performance, and ( b) to detail results in groups. the HL group in terms of age-related competencies and Multiple regression analyses indicated no signifi- hearing status. 
cant difference in overall performance between groups (i.e., group membership could not be predicted from Demographic Results knowledge of the set of nonverbal measures). However, at least one of the nonverbal measures appeared to ex- Comparison between groups. Table 1 compares re- hibit a difference between groups, resulting in a Group × sults on a set of nonverbal and verbal measures in the Nonverbal Measures interaction that approached signif- NH and HL groups. For the nonverbal measures, we icance, F(3, 273) = 2.453, p =.064. Pairwise comparisons attempted to select NH children whose ages and skills carried out with the FDR method, controlling for mul- were comparable to those in the HL group. Chronological tiple comparisons, indicated that visual perception was age averaged approximately 7;9 and ranged from 5 to significantly better in the NH group than in the HL group. 12 years of age in both groups. Visual skills were quan- Age and all other nonverbal skills did not differ signif- tified by raw scores or the number of correct responses. icantly between groups. Table 1. Average performance on sets of demographic variables in the groups with normal hearing (NH) versus the groups with hearing loss (HL). 
| Demographic variable | NH groups | HL groups |
|---|---|---|
| Nonverbal measures [a] | | |
| Age (years;months) | 7;8 (1;9) (5;3–12;1) | 8;0 (1;8) (5;0–12;2) |
| Visual motor integration (raw score) | 20.39 (3.98) (11–28) | 19.42 (3.93) (12–26) |
| Visual perception** (raw score) | 23.71 (3.62) (15–30) | 20.16 (4.09) (9–28) |
| Visual simple RT (ms) | 700 (132.08) (508–1045) | 755 (193.37) (464–1088) |
| Verbal measures [b] | | |
| Vocabulary (raw score): Receptive** | 120.27 (24.77) | 97.52 (28.82) |
| Vocabulary (raw score): Expressive** | 91.18 (18.81) | 76.74 (21.60) |
| Output phonology: Articulation (number of errors)** | 1.13 (2.60) | 4.64 (7.44) |
| Input phonology (%): Onset, auditory** | 98.55 (4.38) | 90.32 (19.06) |
| Input phonology (%): Onset, audiovisual | - | 94.84 (9.62) |
| Input phonology (%): Rhyme, auditory | 91.45 (15.87) | 89.03 (21.35) |
| Input phonology (%): Rhyme, audiovisual | - | 91.61 (14.85) |
| Word recognition (%): Auditory** | 99.48 (1.20) | 87.34 (12.23) |
| Word recognition (%): Audiovisual | - | 93.23 (10.21) |
| Word recognition (%): Visual-only lipreading** | 11.37 (10.68) | 23.06 (15.09) |

Note. Dashes in the NH group column indicate that this was not administered due to ceiling performance on auditory-only task. RT = reaction time. [a] Parentheses show SD and range, respectively. [b] Parentheses show SD. **Adjusted p < .05.

The lower part of Table 1 summarizes results on the set of verbal measures. We did not attempt to form similar verbal skills in the groups. Average raw scores for vocabulary skills differed between the NH versus HL groups, about 120 versus 97 for receptive abilities and 91 versus 77 for expressive abilities. Average performance represented about the 78th percentile in the NH group and the 40th percentile in the HL group. Output phonological skills, as estimated by articulatory proficiency, differed between groups, with an average number of errors of about 1 in the NH group and 5 in the HL group. Input phonological skills for auditory-only stimuli were slightly better in the NH group than in the HL group, averaging about 96% versus 91%, respectively. Results in the HL group may have been influenced by our entry criterion requiring all participants with HL to have at least 85% phoneme discrimination. Input phonological skills for audiovisual stimuli were slightly improved in the HL group, about 92%–95%; the audiovisual condition was not administered in the NH group due to ceiling performance in the auditory-only condition. Finally, performance for auditory word recognition in the NH and HL groups averaged about 99% and 87%, respectively. Results in the HL group yielded the following distribution: ≥90% (n = 16), 80%–89% (n = 9), 70%–79% (n = 3), 60%–69% (n = 2), and 50%–59% (n = 1). Audiovisual word recognition on the CAVET was improved in the HL group, about 93%; again, the audiovisual condition was not administered in the NH group. Visual-only lipreading of words was better in the HL group than in the NH group, about 23% versus 11%, respectively.

Multiple regression analysis of the set of verbal measures, excluding audiovisual measures which were not obtained in the NH group, indicated significantly different abilities in the groups (i.e., group membership could be predicted from knowledge of the set of measures), F(1, 91) = 9.439, p = .003. The pattern of results between groups was not consistent, however, resulting in a significant Group × Verbal Measures interaction, F(6, 546) = 14.877, p < .0001. Pairwise comparisons carried out with the FDR correction for multiple comparisons indicated that receptive and expressive vocabulary, articulation proficiency, auditory onset judgments, and auditory word recognition were significantly better in the NH group than in the HL group. Auditory rhyming skills did not differ significantly between groups. In contrast, visual-only lipreading was significantly better in the HL group than in the NH group. This latter finding in children is consistent with previous observations of enhanced lipreading ability in adults with early-onset hearing loss (Auer & Bernstein, 2007).

Finally, we should note that output phonological skills and auditory word recognition were also assessed for the pictures and distractors, respectively, of the picture-word test (not shown in Table 1). All children, both NH and HL, pronounced all of the onsets of the pictures’ names accurately. The offsets of the pictures’ names were pronounced correctly by 89% of the NH group and 55% of the HL group. Of the children who mispronounced an offset, typical errors in both groups involved the /th/ in “teeth,” the /mp/ in “pumpkin,” the /r/ in “deer,” and/or the /z/ in “pizza.” Performance for auditory-only distractor recognition scored in terms of the word was consistently at ceiling in the NH group and was distributed as follows in the HL group: ≥90% (n = 21), 80%–89% (n = 7), 70%–79% (n = 2), and 60% (n = 1).

PCA analysis in the HL group. Results of a PCA analysis identified two significant (i.e., eigenvalues larger than 1) principal components. The proportion of variance extracted by these two components was 67.842%, with the first principal component accounting for the majority of the variance, 53.018%. We focused only on the first component. This factor reflected relatively high positive loadings on all of the demographic measures except visual speech, as detailed in Table 2. To investigate the underlying construct represented by the component, we assessed the correlation between the participants’ composite scores for the component versus age and hearing status. The composite scores were significantly correlated with age, r = .683, F(1, 29) = 13.977, p < .001, but not with hearing status. The underlying construct for this component appeared to be age-related competencies.

The influence of age-related competencies and hearing status in the HL subgroups was assessed after we studied the general effect of hearing impairment by comparing overall picture-word results in the NH and HL groups. Data are presented separately for the baseline, conflicting, and congruent conditions. An important preliminary step prior to addressing our research questions is to determine whether baseline naming times differed significantly as a function of the type of vowel onset, the modality of the distractor, or the status of hearing.

Picture-Word Results

Baseline condition. Figure 2 shows average naming times in the NH and HL groups for the vowel onset baseline distractors presented in the auditory and audiovisual modalities. The data were collapsed across vowels. Results in Figure 2 were analyzed to assess the following: (a) Do absolute auditory and audiovisual naming times differ for the two vowel onsets, /i/ versus /ʌ/, of the baseline distractors in the groups? (b) Do absolute baseline naming times differ for the auditory versus audiovisual distractors in either group? (c) Do absolute auditory and audiovisual baseline naming times differ in the NH versus HL groups? These issues were addressed with a mixed-design ANOVA by regression with one between-subjects factor, group (NH vs. HL), and two

Table 2. Explanation of the age-related competencies component yielded by the principal components analysis with the set of demographic variables.
| Age-related competencies component | Component loadings | Age subgroup: Less mature (n = 13) | Age subgroup: More mature (n = 18) |
|---|---|---|---|
| Vocabulary (raw score): Receptive** | .183* | 60.46 (14.99) | 88.50 (17.77) |
| Vocabulary (raw score): Expressive** | .173* | 77.31 (21.30) | 112.11 (24.65) |
| Input phonology (% correct): Auditory onset** | .141* | 79.23 (25.32) | 98.33 (5.15) |
| Input phonology (% correct): Auditory rhyme** | .152* | 77.69 (29.48) | 97.22 (4.61) |
| Output phonology: Articulation (number errors)** | .148* | 9.00 (9.45) | 1.50 (3.09) |
| Visual skills: Visual perception (raw score)** | .155* | 17.00 (3.39) | 22.44 (2.87) |
| Visual skills: Visual motor integration (raw score)** | .159* | 16.38 (2.57) | 21.61 (3.24) |
| Visual skills: Visual simple RT (ms)** | .169* | 935.48 (116.03) | 625.71 (117.78) |
| Mixed visual/phonology: Visual-only lipreading (% correct) | .059 | 16.92 (14.07) | 27.50 (14.58) |
| Variables not in component analysis: Age (years;months)** | | 6;9 (1;3) | 8;10 (1;9) |
| Variables not in component analysis: Auditory word recognition (% correct) | | 86.04 (14.28) | 88.28 (10.85) |

Note. In this table, the first two columns detail the variables in the PCA analysis and their loadings for the age-related competencies principal component. The last two columns compare average performance and SDs (in parentheses) on the variables in HL subgroups representing less mature or negative composite scores (M = –0.372) versus more mature or positive composite scores (M = 0.269). Average performance and SDs are shown at the bottom of the table for chronological age and auditory word recognition, variables that were not included in the analysis. *Relatively high loading. **Adjusted p < .05.

within-subjects factors, type of vowel onset (/i/ vs. /ʌ/) and modality of distractor (auditory vs. audiovisual).

Results indicated that naming times for the /i/ versus /ʌ/ onsets did not differ significantly in the groups for either modality. Naming times collapsed across modality for the /i/ and /ʌ/ onsets, respectively, averaged about 1,555 and 1,520 ms in the NH group and 1,665 and 1,620 ms in the HL group. Naming times for the auditory versus audiovisual distractors also did not differ significantly in the groups. The difference between modalities averaged only about 32 ms for each group. Finally, overall naming times did not differ significantly between groups, p = .269, although on average, overall performance was about 100 ms slower in the HL group. This result agrees with our previous findings for baseline verbal distractors in children with NH versus HL (Jerger, Lai, & Marchman, 2002).

Figure 2. Average absolute naming times in normal hearing (NH) and hearing loss (HL) groups for vowel onset, baseline distractors, presented in the auditory versus audiovisual modalities.

In short, baseline naming times did not differ significantly as a function of the type of vowel onset, the modality of the distractor, or the status of hearing. In the subsequent results, we quantified the degree of facilitation and interference from congruent and conflicting onsets, respectively, with adjusted naming times. These data were derived by subtracting each participant’s vowel baseline naming times from his or her congruent and conflicting times, as done in our previous studies (Jerger, Lai, & Marchman, 2002; Jerger, Martin, & Damian, 2002; Jerger et al., 2009). This approach controls for developmental differences in detecting and responding to stimuli and allows each picture to serve as its own control without affecting the differences between the types of distractors.

The adjusted naming times were used to quantify picture-word performance for the conflicting and congruent conditions. Prior to proceeding, we conducted preliminary analyses to determine whether the adjusted naming times differed significantly in the groups as a function of the vowel nucleus (/i/ vs. /ʌ/) of the experimental distractors. Results indicated that adjusted naming times did not differ significantly for the two vowel nuclei in either the NH or HL groups for the conflicting condition. In contrast, naming times for the vowel nuclei did differ significantly for the congruent condition but only in the HL group. In the sections that follow, the data for the conflicting condition are collapsed across the vowel nuclei for all analyses, whereas the data for the congruent condition are analyzed separately for each vowel nucleus. To address our first primary research question concerning whether performance differed in the NH versus HL groups for the auditory and audiovisual distractors, we formulated three a priori queries for the conflicting and congruent conditions: (a) Do overall adjusted naming times differ in the NH versus HL groups? (b) Do adjusted naming times differ for the audiovisual versus auditory distractors in the groups? (c) Do all adjusted naming times differ from zero (i.e., show significant interference or facilitation)? An additional query for the conflicting condition was whether adjusted naming times differ in the groups for the different types of conflicting distractors.

Conflicting conditions. Figure 3 shows the degree of interference as quantified by adjusted naming times in the NH and HL groups for the auditory and audiovisual distractors conflicting in place of articulation or in voicing. We addressed the issues described previously with a mixed-design ANOVA by regression with one between-subjects factor, group (NH vs. HL) and two within-subjects factors, modality of distractor (auditory vs. audiovisual) and type of distractor (conflicting in place vs. in voicing) and with t tests. The dependent variable was the degree of interference as quantified by adjusted naming times. Results indicated that the overall degree of interference differed significantly in the NH and HL groups, F(1, 91) = 7.983, p = .006. The degree of interference collapsed across modality and type of distractor was significantly greater in the NH group than in the HL group, about 147 ms versus 82 ms, respectively. The type of distractor, conflicting in place versus in voicing, did not significantly influence results in the groups. However, the modality of the distractor produced noticeably different patterns of interference in the groups, with a Modality of Distractor × Group interaction that achieved borderline significance, F(1, 91) = 3.725, p = .057. As seen in Figure 3, the degree of interference for the audiovisual distractors was similar, about 128 ms, in the NH and HL groups. In contrast, the degree of interference for the auditory distractors was significantly greater in the NH group, about 154 ms versus only 50 ms in the HL group. Because the interaction between the modality of the distractor and the groups was only of borderline significance, we probed the strength of

Figure 3. Distractors conflicting in place of articulation or in voicing. Degree of interference for auditory versus audiovisual modalities as quantified by adjusted naming times in NH and HL groups. The zero baseline of the ordinate represents naming times for vowel onset baseline distractors (see Figure 2). A larger positive value indicates more interference.
52 412–434 April 2009 Downloaded From: http://jslhr.pubs.asha.org/ by St. John's University, Sarah Azer on 09/18/2014 Terms of Use: http://pubs.asha.org/ss/Rights_and_Permissions.aspx the effect with multiple t tests. Results of the FRD ap- differs in HL subgroups representing different levels proach indicated that the auditory modality produced of age-related competencies. The first two columns of significantly more interference in the NH group than in Table 2 summarize the variables in the PCA analysis the HL group for both types of conflicting distractors. and their loadings for the age-related competencies prin- The pairwise comparisons for the audiovisual distrac- cipal component. The last two columns compare results tors were not significant. Finally, we carried out multiple on the variables and on chronological age and auditory t tests to determine whether each mean differed signif- word recognition in HL subgroups representing less icantly from zero, indicating significant interference. mature and more mature age-related competencies. The Results of the FDR method indicated that the audiovisual less mature subgroup (n = 13) included the 12 children distractors produced significant interference in all groups, with negative composite scores and 1 child whose com- but the auditory distractors produced significant interfer- posite score was essentially zero. The more mature sub- ence in the NH group only. group (n = 18) included the 16 children with positive The pattern of results in the HL group (i.e., sig- composite scores and 2 children with near-zero compos- nificant interference from the audiovisual conflicting ite scores. The children with near-zero composite scores distractors only) suggests a relatively greater influence were assigned to the subgroup whose regression equa- of visual speech on phonological processing by children tion minimized the deviation of their composite scores with HL. 
As we noted earlier, however, results from 4 to from the regression line. Results of t tests controlling for 14 years of age in children with NH showed a U-shaped multiplicity with the FDR method indicated that the developmental function with a period of transition from subgroups differed significantly on all measures in Table about 5 years to 9 years that did not show any influence 2 excepting visual-only lipreading and auditory word of visual speech on performance. Developmental shifts recognition. It seems important to stress the latter re- in the influence of visual speech on performance in the sult indicating that the HL subgroups did not differ on NH group broach the possibility of developmental shifts our proxy variable for hearing status. in the pattern of results seen in Figure 3 in the children Figure 4 shows the degree of interference as quan- with HL. To explore this possibility, we formed HL sub- tified by adjusted naming times in the less mature and groups with less mature and more mature age-related more mature HL subgroups for auditory and audio- competencies from the composite scores expressing the visual distractors conflicting in place of articulation or in first principal component. voicing. We again analyzed results with a mixed-design Age-related competencies versus performance for con- ANOVA by regression with one between-subjects factor, flicting conditions in children with HL. Our second pri- group ( less mature vs. more mature) and two within- mary research question concerned whether performance subjects factors, modality of distractor (auditory vs. Figure 4. Distractors conflicting in place of articulation or in voicing: Age-related competency. Degree of interference for auditory versus audiovisual modalities as quantified by adjusted naming times in HL subgroups representing less mature (M = 6;9 [years;months]) and more mature (M = 8;10) age- related competencies. 
The zero baseline of the ordinate represents naming times for vowel onset baseline distractors (see Figure 2). A larger positive value indicates more interference. Hearing status, as defined by auditory word recognition, did not differ in the subgroups.
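
The multiple t tests reported above were controlled for multiplicity with an FDR method. The text does not say which FDR procedure was applied, so the following is only a minimal sketch that assumes the Benjamini-Hochberg step-up procedure. The function name `benjamini_hochberg`, the level `q = 0.05`, and the p values in the usage lines are hypothetical illustrations and are not taken from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Flag which tests are declared significant at FDR level q.

    Assumption: Benjamini-Hochberg step-up procedure (the article
    only says "FDR method"). Returns booleans in the input order.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank

    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical p values for four tests of "mean differs from zero".
example_p = [0.001, 0.04, 0.03, 0.20]
example_flags = benjamini_hochberg(example_p, q=0.05)
```

Under this assumed procedure, a test can be declared significant even when its own p value exceeds its rank-based threshold, provided a test with a larger p value meets its threshold. That property is what distinguishes a step-up rule from a test-by-test cutoff.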
