Summary

This document discusses various aspects of language acquisition, from nativist perspectives to connectionist models and the role of language in AI. It covers topics such as language learning in children, the acquisition of the English past tense, and computational models like SHRDLU.

Full Transcript

Language

I. Language learning in children
   A. Nativist views of language learning
      1. Proponents
         a. Fodor’s language of thought
         b. Chomsky’s language acquisition device
      2. Challenges to nativist views
         a. Connectionist models of past tense acquisition
         b. Bayesian language learning: word segmentation
   B. Age and language learning
   C. Effects of early exposure to language on cognitive development
II. Neurolinguistics: word processing in reading (direct vs. indirect access hypotheses)
III. Second language acquisition
IV. Social context of speech
V. Language and thought
VI. Natural language processing in AI
   A. Speech perception
   B. ELIZA
   C. SHRDLU

Models of Language Learning

Language is a rule-governed activity, but where does the knowledge of those rules come from? Where does meaning start?

Fodor’s model:
Language is a rule-governed activity, not just in terms of grammatical rules, but also rules giving the meanings of words and governing the deep structure of sentences
The default hypothesis with regard to language learning is that it is a matter of learning those rules
Fodor has built on the default hypothesis to argue that learning a language requires you to formulate, test, and revise hypotheses about truth rules, which must be formulated in the language of thought
➜ The language of thought cannot itself be learned, and so must be innate

Chomsky’s model:
Sophisticated cognitive abilities, such as the use of language, involve stored bodies of information (about phrase structures and transformation rules)
− These bodies of information can be manipulated algorithmically
However, humans are also born with a specialized language acquisition device; that is, they are prewired to learn language
Ø Ex: Baby kittens and human babies may both be exposed to the same language, but humans (almost) always acquire the ability to understand and produce language while cats do not
Chomsky’s theory holds that:
All human languages can be understood in terms of different parameter settings in a universal grammar (see the toy sketch below)
The universal grammar is an innate fixed structure that holds across all languages
The parameter settings are language specific and learned through hypothesis formation and testing
This is thus a nativist view (i.e., the view that language learning is innate)
It is based on poverty of stimulus arguments: young children are simply not exposed to enough information to allow them to learn a language
Much of the speech that children hear is actually ungrammatical, but not flagged as such
Children are typically only exposed to positive information, i.e., they are not told what counts as ungrammatical (e.g., “the bell ringed”)
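The “universal grammar plus parameters” idea can be made concrete with a toy sketch. The grammar fragment below is invented purely for illustration; the head-direction parameter (heads before or after their complements, roughly English-like vs. Japanese-like order) is a standard textbook example of a parameter, but nothing here is Chomsky’s actual formalism.

```python
# Toy sketch: one shared (universal) rule skeleton, with a
# language-specific head-direction parameter deciding whether the
# head of a phrase precedes or follows its complement.

def build_phrase(head, complement, head_first=True):
    """Combine a head word and its complement per the parameter setting."""
    return [head, complement] if head_first else [complement, head]

# English-like setting: the verb precedes its object.
print(build_phrase("eat", "apples", head_first=True))   # ['eat', 'apples']
# Japanese-like setting: the verb follows its object.
print(build_phrase("eat", "apples", head_first=False))  # ['apples', 'eat']
```

On this picture, what the child must learn is not the rule skeleton itself (assumed innate) but only the parameter value for the surrounding language.

Language Learning in Neural Networks

★ How to test the claim of innatism?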
Can try to disprove it by constructing models that simulate the trajectory of human language learning without explicitly representing any rules

Connectionist approach
Provides an alternative to the rule-based conception of language comprehension and learning of the nativist approaches discussed above
Demonstrates that it is possible for a system to learn complex linguistic skills without having any explicit linguistic rules encoded in it
Ø Ex: Simple recurrent networks have been successfully trained to predict the next letter in a sequence of letters or the next word in a sequence of words
The learning trajectory of these networks strongly resembles the learning trajectory of human infants
Ø Ex: learning how to form the past tense of English verbs, both regular and irregular

Children learning the English past tense go through three easily identifiable stages:
Stage 1: They employ a small number of very common verbs in the past tense (e.g., “got,” “gave,” “went,” “was”)
− Most of these verbs are irregular, and the assumption is that children learn them by rote
− At this stage, children are not capable of generalizing from the words they have learned. They also tend not to make many mistakes.
Stage 2: They use a much greater number of verbs in the past tense, some of which are irregular but most of which employ the regular past tense ending “-ed”
− During this stage, they can generate a past tense for an invented word (e.g., “ricked”)
− Children at this stage take a step backward and make mistakes on the past tense of irregular verbs that they had previously given correctly (overregularization errors)
Ø Ex: saying “gived” instead of “gave”
Stage 3: They learn more verbs and cease to make overregularization errors

Connectionist models of past tense acquisition have been developed that display a similar trajectory of language learning without having any rules explicitly coded in them
Ø Rumelhart-McClelland network: Simple pattern associator that uses the perceptron convergence learning rule (see the sketch after this section)
− Recall that a perceptron is a single-layer network that changes its weights and threshold based on feedback (supervised learning)
− The network learns to associate specific input patterns, e.g., the “n” sound, with specific output patterns
− Training:
  o The network was initially trained on ten high-frequency verbs to simulate the first stage of past tense acquisition
  o Then it was trained on a much larger training set of 410 medium-frequency verbs, of which 80% were regular
➜ The network reproduced the overregularization phenomenon
❖ Criticism: The training set is dramatically increased at a certain point, and the enlarged set is predominantly made up of regular verbs, which might itself account for the overregularization phenomenon observed
Ø Plunkett-Marchman multilayer neural network model of tense learning
− Similarly produced overregularization errors, but without a sudden increase in the size of the training set
− However, because this multilayer network is trained with backpropagation, it is not biologically plausible, since human neurons don’t engage in backpropagation
★ The main point, though, is that it is possible to devise neural networks that reproduce the typical trajectory of language learning without having explicit representations of linguistic rules encoded into them
☞ Suggests that the nativist view of language learning is not the only viable model
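The perceptron convergence rule mentioned above is simple enough to sketch in a few lines. Below is a minimal toy version, assuming random binary feature vectors in place of the Wickelfeature encodings the actual Rumelhart-McClelland model used; the sizes, data, and variable names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 16, 16            # feature-vector sizes (chosen arbitrarily)
W = np.zeros((n_out, n_in))     # one weight per input/output unit pair
b = np.zeros(n_out)             # one adjustable threshold per output unit
lr = 0.1                        # learning rate

def predict(x):
    """Each output unit fires iff its summed weighted input exceeds threshold."""
    return (W @ x + b > 0).astype(float)

# Toy training set: random "stem" -> "past tense" feature patterns.
stems = rng.integers(0, 2, (10, n_in)).astype(float)
pasts = rng.integers(0, 2, (10, n_out)).astype(float)

for epoch in range(50):
    for x, t in zip(stems, pasts):
        err = t - predict(x)        # supervised feedback: target minus output
        W += lr * np.outer(err, x)  # perceptron convergence rule:
        b += lr * err               # nudge weights/thresholds toward target

errors = sum(np.abs(t - predict(x)).sum() for x, t in zip(stems, pasts))
print("bit errors after training:", errors)
```

The actual model also had a decoding stage that turned output features back into a phonetic string; that part is omitted here.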
Bayesian Language Learning

Bayesian (probabilistic) models of language learning also argue against innatism – by showing how much can be learned through sensitivity to statistical regularities in heard speech
Ø One of the most basic challenges in understanding speech is word segmentation: segmenting a continuous stream of sounds into individual words
− In English, an actual physical event, such as a pause, marks a word boundary less than 40% of the time
− How does an 8-month-old infant (which is when this skill starts to emerge) figure out which combinations of syllables make words, and which ones don’t?
− This can be explained by a model of transitional probabilities
The transitional probability between any two sounds is the probability that the second sound will follow the first
High transitional probabilities tend to indicate syllables occurring within a word, while low transitional probabilities tend to occur across the boundaries of words
Infants are exquisitely sensitive to the frequency of correlations, and they exploit this sensitivity to parse streams of sound into words
− Transitional probabilities are also used by adults to map the boundaries of phrases
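Since the transitional probability of B given A is just the frequency of the pair AB divided by the frequency of A, the boundary-finding idea is easy to sketch. The toy version below assumes a made-up syllable stream (in the style of Saffran-type infant experiments) and a simple “boundary at a local TP minimum” rule; the stream and rule are illustrative, not a model from the literature.

```python
from collections import Counter

# A made-up stream built from three nonsense "words":
# bi-da-ku, pa-do-ti, go-la-bu.
stream = ("bi da ku pa do ti go la bu bi da ku go la bu "
          "pa do ti bi da ku pa do ti go la bu").split()

pair_counts = Counter(zip(stream, stream[1:]))   # freq of syllable pairs
first_counts = Counter(stream[:-1])              # freq as first of a pair

def tp(a, b):
    """Transitional probability that syllable b follows syllable a."""
    return pair_counts[(a, b)] / first_counts[a]

# Posit a word boundary wherever TP dips below both of its neighbors.
tps = [tp(a, b) for a, b in zip(stream, stream[1:])]
for i in range(1, len(tps) - 1):
    if tps[i] < tps[i - 1] and tps[i] < tps[i + 1]:
        print(f"boundary after '{stream[i]}'  (TP = {tps[i]:.2f})")
```

With this toy stream, every TP dip falls exactly at the seams between the three nonsense words: within-word pairs have TP near 1, cross-boundary pairs much lower.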
Age and Language Learning

In general, it is much easier to learn a language at an early age, before “hardening of the categories” sets in
Ø Ex: We are all born with the ability to recognize speech sounds, or phonemes (e.g., b vs. p), from all the world’s languages but gradually lose this ability
− Adult Hindi speakers and young infants from English-speaking homes can easily discriminate two Hindi t sounds not spoken in English. By age one, however, English-speaking listeners rarely perceive this sound difference
− Japanese speakers have difficulty distinguishing between English r and l
− At birth or a few weeks after, infants can perceive almost all (95%) of the subtle phoneme differences in non-native languages. However, by 8-10 months, accuracy drops to 70% and, by 10-12 months, to 20% (Werker, 1989)

Effects of Early Exposure to Language on Cognitive Development

✧ By age 3, a child growing up in poverty would have heard 30 million fewer words in his home environment than a child from a professional family (Hart & Risley, 1995, 2003)
✧ Also, the greater the number of words children heard from their parents or caregivers before they were 3, the higher their IQ and the better they did in school
✧ TV talk not only doesn’t help, it is detrimental
Ø One study found that infants (12-18 months) could not learn vocabulary by merely watching a (bestselling) DVD in which spoken words were linked with appropriate objects (DeLoache, Chiong, Sherman et al., 2010)
➜ Some researchers have argued that the racial and socioeconomic gap in academic performance can be wholly accounted for by disparities in exposure to language alone!
− New intervention programs target this problem
Question: What happens if someone is not exposed to language in childhood? Can they still acquire language?
They can learn vocabulary, but it will probably not be possible for them to fully master grammar
Ø Genie: The Secret of the Wild Child

Neurolinguistics

Neurolinguistics: the study of the relationship between the brain and language
Hemispheric specialization
The left hemisphere typically performs most language processing (95% for the right-handed; 50% for the left-handed)
However, the right hemisphere interprets a message’s emotional tone, decodes metaphors, and resolves ambiguities

Word Processing in Reading

Dual-route approach to reading: direct vs. indirect access hypotheses (a toy sketch follows at the end of this section)
Do readers recognize a word directly from the printed letters (direct access)? Or do they convert the printed letters into a phonological code to access the word and its meaning (indirect access)?
Ø Petersen, Fox, Posner, et al. (1988) explored this question in a PET study
− Condition 1 (looking): Participants were asked to focus on a fixation point (a small crosshair) in the middle of a screen
− Condition 2a (reading silently): Participants were presented with words flashed on the screen but told not to make any response
− Condition 2b (listening): Participants listened to the same words being spoken
− Condition 3 (reading out loud): Participants were asked to say out loud the word appearing on the screen
− Condition 4 (speaking): Participants were presented with nouns on the screen and asked to utter an associated verb
Ø Ex: Saying “turn” when presented with the word “handlebars”
Results: The areas of activation in Conditions 3 and 4 (speaking words) did not include the areas active in Conditions 2a and 2b (reading silently and listening to words, respectively)
Conclusions: The patterns of activation identified across the different tasks thus supported a parallel rather than a serial model of single-word processing
In addition, the results support the direct access hypothesis: we do not need to sound words out (or subvocalize) to access the meaning of words
Moreover, research in general has indicated that, though readers use both direct and indirect access when reading, direct access is more efficient
Skilled adult readers are more likely to use direct access
Beginning and less skilled readers are likely to sound out words to understand their meaning
The direct and indirect approaches are reflected in two different types of dyslexia (a learning disability that interferes with reading despite average or above-average intelligence)
Phonological dyslexia manifests as severe impairment in reading phonetic script (similar to an alphabetic system), but preserved ability in reading pictographic script
Surface dyslexia manifests as impairment in reading pictographic script (characters)
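The dual-route idea can be made concrete with a toy sketch: a direct route that looks the printed word up whole, and an indirect route that assembles a pronunciation letter by letter. The tiny lexicon and letter-to-sound rules below are invented for illustration and are not drawn from any actual reading model.

```python
# Toy dual-route word recognizer (illustrative only).
LEXICON = {"cat": "KAT", "dog": "DOG", "yacht": "YOT"}  # sight vocabulary
LETTER_SOUNDS = {"c": "K", "a": "A", "t": "T",
                 "d": "D", "o": "O", "g": "G", "s": "S"}

def read_word(word):
    # Direct access: recognize the whole printed word at once.
    if word in LEXICON:
        return LEXICON[word], "direct"
    # Indirect access: convert letters to a phonological code first.
    sounds = "".join(LETTER_SOUNDS.get(ch, "?") for ch in word)
    return sounds, "indirect (sounded out)"

for w in ["cat", "dots", "yacht"]:
    print(w, "->", read_word(w))
```

Note how an irregular word like “yacht” only comes out right via the direct route; sounding it out letter by letter fails. Roughly speaking, phonological dyslexia corresponds to losing the indirect branch and surface dyslexia to losing the direct one.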
Second Language Acquisition

How do we learn language?
Skill building hypothesis: Language is acquired as a result of learning language skills, such as vocabulary and grammar
− Use skill building: learn grammar, study vocab lists, do drills, take tests
− “No pain, no gain”
★ The general public (and government) believe this is the way to learn a language
Comprehension hypothesis (Stephen Krashen): Language skills, such as vocabulary and grammar, result from language acquisition – we acquire language in one and only one way: when we understand messages
− Use “comprehensible input”: listen to stories, read books, have conversations, watch movies
− Immediate gratification: have a good time – the more you enjoy it, the better your comprehension will be
☞ Comprehensible input has won in pretty much every single study comparing the two methods
Some evidence in support of comprehensible input:
The sheer complexity of language, e.g., of vocabulary and grammar, rules out skill building as a possibility
Ø The average native English speaker knows 40,000+ words
A study found that second-language readers who read a lot have larger vocabularies than native speakers who didn’t read a lot
It’s possible to acquire language without any conscious learning
Implications: What is of primary importance is not pushing a person to speak from Day One
Rather, it is listening and picking up comprehensible input

Natural language approach to second language learning
1) Storytelling (Jeff Brown)
Find a language partner who is fluent in the language you are trying to learn, e.g., a friend, family member, or language exchange partner
Find magazines (20%) and children’s stories (80%) with tons of pictures, e.g., a travel magazine with pictures related to travel, food, clothing, etc.
The language partner is not going to translate the story; they will instead describe the pictures and ask you simple questions, and you will ask simple questions
− The partner doesn’t just describe what is in the pictures: they describe the picture the way they might to a young child they love
Ø Ex: “This is a spare tire. A spare tire is very important. You could have a blowout and then you would need to use that spare tire.”
Rules to tell your language partner:
− No English
  o If we don’t understand each other, we’ll use gestures and act or draw
  o If we still can’t understand, we’ll say “It’s not important” in the target language
− No grammar: don’t teach me any grammar
− No corrections: don’t correct me at any time
Factors that affect language acquisition:
Motivation: positive correlation
Self-esteem: positive correlation
Anxiety: negative correlation
➜ For language acquisition to really succeed, anxiety should be zero
Affective filter
− Somewhere in the brain is a language acquisition device, according to Chomsky. Our job is to get input into that device. High anxiety blocks the input.
Ø If a student thinks language class is a place where his weaknesses will be revealed, he may understand the input, but it won’t penetrate
2) Read, read, read – whatever is of interest to you
❖ “Free voluntary reading is the most powerful tool we have in all of language education”
Read things that you’re passionate about
Ø Ex: Watch a baseball game, then go read about it in the newspaper
☞ This is also the key to teaching kids who don’t want to learn to read
3) TPR (Total Physical Response): acquiring a language through movement
Ask your language partner to give you a list of commands or actions
Ø Ex: Jump, walk, run, turn around, sit down, stand up, dance, talk, yell, complain, look, watch TV, turn TV on, turn TV off, cry, laugh
Can use hands to pantomime movements
Can also use gestures to represent words
Do 50-100 per session

Social Context of Speech

Learning language is not merely a matter of learning vocabulary and grammar – the goal is not just to express one’s thoughts; speakers must also take into account other people’s thoughts, feelings, and beliefs
Establishing common ground
We have social rules for the format of our conversations (“winding down” a conversation before saying “good-bye” on the phone)
“Sex brought us together, but gender drove us apart.”
Phrasing of directives (sentences that request someone to do something)
− Could you give me a ride? vs. Will you give me a ride?
Use of indirect speech acts
− I wonder if there’s any butter in the refrigerator? and It’s cold in here
There are gender differences in the use of directives

Language and Thought

❖ Be impeccable with your word: Speak with integrity. Say only what you mean. Avoid using the word to speak against yourself or to gossip about others. Use the power of your word in the direction of truth and love. – Don Miguel Ruiz (The Four Agreements)
Sapir-Whorf hypothesis: the view that language determines thought
This is the underlying assumption of the use of affirmations in cognitive therapy
Ø Language and color:
− Natives of New Guinea who have words for two different shades of yellow more speedily perceive and better recall variations between the two yellows (Davidoff, Davies, & Roberson, 1999)
− Those who speak Russian, which has distinct names for various shades of blue, remember the blues better (Winawer, Witthoft, Frank et al., 2007)
People who are bilingual may think differently in different languages
Bilinguals reveal different personalities when taking the same personality test in their two languages (Chen & Bond, 2010)
Ø China-born students at the University of Waterloo were asked to describe themselves in English or Chinese
− When describing themselves in English, they expressed mostly positive self-statements and moods
− When responding in Chinese, they reported more agreement with Chinese values and roughly equal positive and negative self-statements and moods (Ross, Xun, & Wilson, 2002)
Bilinguals often switch languages depending on which emotion they want to express (Chen, Kennedy, & Zhou, 2012)
When responding in their second language, bilingual people’s moral judgments reflect less emotion – they respond with more “head” than “heart” (Costa, Foucart, Hayakawa et al., 2014)

Speech Perception and AI

Speech perception is an extremely complicated process (which is why computer voice recognition systems are often problematic!)
Listeners need to separate the voice of the speaker from irrelevant background noises, which might include other simultaneous conversations
Pronunciation varies depending on the vocal characteristics of the speaker
Speakers often slur or mispronounce words
The pronunciation of a specific phoneme depends in large part on the previous and following phonemes, e.g., the d in idle vs. the d in don’t
As mentioned, an actual physical event, such as a pause, marks a word boundary less than 40% of the time
Ø Ex: Children’s mispronunciations of lyrics in Christmas carols and the Pledge of Allegiance
People use visual cues to facilitate speech perception
Ø Study in which participants watched a video of a woman making one sound (ga) while a different sound played (ba)
➜ Responses reflected a compromise: participants reported hearing da (McGurk & MacDonald, 1976)
LipNet, developed by a team at Oxford’s AI lab, can now also lipread (i.e., translate lip movements to text) with 95% accuracy
Computers can now replicate human voices extremely accurately
The company Lyrebird has created a program that can replicate the voices of people, including powerful political figures, after analyzing only one minute of audio
It is more challenging to replicate the associated facial movements
Creating a fake video of Obama required 14 hours of high-quality footage of Obama to train the system to translate audio into mouth shapes

Natural Language Processing in AI

ELIZA: an early computer program that could engage in some very elementary forms of conversational exchange
Not capable of anything that really resembled linguistic understanding
Simply programmed to respond to certain cues by making one of a small set of responses
The basic idea behind ELIZA was to create the illusion of conversation by rephrasing statements as questions and by programming the computer to give certain fixed responses where this is not possible (see the sketch below)
Depending upon whom one asks, ELIZA was either based upon or intended to parody typical conversational exchanges between psychotherapists and their patients
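A minimal sketch of that cue-and-canned-response trick follows. The few patterns below are invented for illustration and are far simpler than Weizenbaum’s actual script; among other things, they omit swapping “my”/“your” pronouns in the echoed text.

```python
import random
import re

# Toy ELIZA-style responder: match a cue, rephrase the statement as a
# question, else fall back to a stock response.

RULES = [
    (re.compile(r"\bi am (.*)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.*)", re.I), "How long have you felt {0}?"),
    (re.compile(r"\bmy (.*)", re.I), "Tell me more about your {0}."),
]
FALLBACKS = ["Please go on.", "I see.", "What does that suggest to you?"]

def respond(utterance):
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:  # a cue matched: echo it back as a question
            return template.format(match.group(1).rstrip(".!?"))
    return random.choice(FALLBACKS)   # no cue: fixed response

print(respond("I am unhappy."))       # -> Why do you say you are unhappy?
print(respond("It rained all day."))  # -> one of the stock fallbacks
```

Nothing here understands anything; the illusion of conversation rests entirely on the user reading meaning into the echoes.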
Terry Winograd’s computer model SHRDLU
Initially presented in his 1970 doctoral dissertation at MIT
Illustrates how grammatical rules might be represented in a cognitive system and integrated with other types of information about the environment
One of the first attempts to write a program that was not just trying to simulate conversation, but was capable of using language to:
− Report on its environment
− Plan actions
− Reason about the implications of what is being said to it
Programmed to deal with a very limited virtual micro-world – everything takes place on a computer screen
The micro-world consists simply of a number of colored blocks, colored pyramids, and a box, all located on a tabletop
SHRDLU can carry out various actions through a (virtual) robot arm, e.g., pick up blocks and pyramids, move them around, and put them in the box
In order to do that, SHRDLU needs to be able to carry out:
− Syntactic analysis: it needs to parse the sentence to work out which units in a sentence are performing which linguistic functions, e.g., nouns (picking out objects) and verbs (characterizing events and processes)
− Semantic analysis: it needs to assign meanings to individual words in a way that reveals what the sentence is stating
− Integration of the information acquired with the information the system already possesses, in order to obey a command or answer a question
Example of how SHRDLU represents different types of knowledge:
One of the words in SHRDLU’s vocabulary is CLEARTOP
We can say that something (say, a block) is CLEARTOP when it does not have anything on it
CLEARTOP can also function as a command to remove anything resting on the block
CLEARTOP is represented by a procedure (a toy rendering is sketched below)
☞ SHRDLU illustrates how linguistic understanding can result from the interaction of many independently specifiable cognitive processes
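The original notes show the CLEARTOP procedure as a flowchart, which is not reproduced here. The following is a hedged Python rendering of the same procedural idea; SHRDLU’s actual procedures were written in Micro-Planner/Lisp, and the micro-world representation below is invented for illustration.

```python
# Toy sketch of SHRDLU-style procedural knowledge.  CLEARTOP(x)
# succeeds when nothing rests on x; as a command it keeps removing
# whatever is on top (in the spirit of SHRDLU's GRASP and GET-RID-OF
# subgoals).  The dict maps each object to what sits directly on it.

on_top_of = {"block1": ["pyramid1"], "pyramid1": [], "box": []}

def cleartop(obj):
    """Make obj have nothing on it, moving obstructions into the box."""
    while on_top_of[obj]:               # something still rests on obj?
        item = on_top_of[obj][-1]       # the topmost obstruction ...
        cleartop(item)                  # ... must itself be cleared first
        on_top_of[obj].pop()            # grasp it ...
        on_top_of["box"].append(item)   # ... and get rid of it in the box

cleartop("block1")
print(on_top_of)  # {'block1': [], 'pyramid1': [], 'box': ['pyramid1']}
```

The point of the example is that the “meaning” of the word CLEARTOP just is this procedure: knowing the word amounts to being able to run it against the current state of the micro-world.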
Animals’ Use of Language

Can animals learn language?
Chimpanzees
A baby chimpanzee, Washoe, was taught American Sign Language
− By the age of five, she had learned more than 130 signs
Other chimpanzees trained to use lexigrams (geometrical shapes displayed on a keyboard and linked to a computer) as words showed performances similar to Washoe’s
In general, chimpanzees seem to be able to acquire a vocabulary and comprehension roughly equivalent to those of a 2½-year-old child
Gorillas
Koko had a working vocabulary of over 1,000 signs
Understood approximately 2,000 words of spoken English
African grey parrot
Alex – a parrot with an “attitude” that apparently had an understanding of what he said
Could correctly answer questions about an object’s shape, color, or material

Answers to language trivia questions:
The average adult has an active vocabulary of around 20,000 words and a passive vocabulary of around 40,000 words
− That averages to nearly 7 new words per day between ages 2 and 18!
Linguists estimate that 6,500 languages exist in the world today
− 250 languages are spoken by more than 1 million people
− Only 600 languages have speaking populations robust enough to support their survival past the end of the century. Languages need at least 100,000 speakers to survive the ages.
66% of the world’s children are raised as bilingual speakers; only 6.3% of U.S. residents are bilingual

Video References

Videos excerpted from:
“The Incredible People” show. Bella Devyatkina, polyglot. https://www.youtube.com/watch?v=KXGmt0dusdo
The Secret of the Wild Child (1994). https://archive.org/details/the-secret-of-the-wild-child-1994
The Linguistic Genius of Babies – Patricia Kuhl. https://www.youtube.com/watch?v=M-ymanHajN8
British Council Interviews Stephen Krashen, part 1 of 3. https://www.youtube.com/watch?v=UgdMsOcSXkQ
Stephen Krashen on Language Acquisition. https://www.youtube.com/watch?v=NiTsduRreug
Top 5 Uses of Neural Networks! (A.I.). https://www.youtube.com/watch?v=i9MfT_7R_4w
Fake Videos of Real People – and How to Spot Them. https://www.ted.com/talks/supasorn_suwajanakorn_fake_videos_of_real_people_and_how_to_spot_them?language=en
Wise Elephant. http://fun.mivzakon.com/video/General/2344/2344.html
Suda – The Painting Elephant. https://www.youtube.com/watch?v=foahTqz7On4
