6 Learning the Structure of Sentences

Achieving a vocabulary of 60,000 words or more is an impressive learning feat. But it’s not nearly as impressive as the fact that you combine these words in deft and creative ways. To get a quick feel for the scale of combinatorial possibilities that language offers, consider chemistry: with a measly 118 elements in the periodic table, there are trillions of known molecules that combine these elements. Just think what you can do with 60,000 units! The combinatorial power of language allows you to convey entirely novel ideas that have never been expressed in language before. I’m guessing that you’ve never heard the following sentence:

It was all because of the lucrative but internationally reviled pink hoodie industry that the president came to abandon his campaign promise to ensure that every household parrot had recourse to free legal counsel.

This sentence may be weird, but chances are you had no trouble understanding it. On the other hand, you’d have no hope of understanding that sentence if its words were served up in this order:

Industry ensure because that internationally reviled had legal household parrot was it abandon all pink president every of campaign promise the but lucrative hoodie the came to his to that counsel recourse to free.

Clearly, being able to string multiple linguistic units together is not enough—they can’t just be tossed together in a bag. There has to be some underlying order or structure. That is, language has a syntax, a set of rules or constraints for how the units can be put together. The syntactic structure of a sentence is intimately tied to its linguistic meaning, or its semantics. You can combine the same assortment of words multiple ways with strikingly different results. As any small child learns, the justice systems of the world care deeply about the difference between the sentences Susie punched Billy and Billy punched Susie. The tight coupling between syntax and semantics makes these distinctions possible.

syntax The structure of a sentence, specifying how the words are put together. Also refers to a set of rules or constraints for how linguistic elements can be put together.

semantics The meaning of a sentence; the system of rules for interpreting the meaning of a sentence based on its structure.

Syntactic structure is extraordinarily complex—so much so that the efforts of many linguists over numerous decades have not succeeded in exhaustively describing the patterns of even one language (English being the most heavily studied to date). It is one of the least-understood aspects of linguistics. And yet small children master the bulk of it in their first few years of life, before they are seen fit to learn even the most rudimentary aspects of arithmetic. Figuring out how they do this, and what they know about syntactic structure at any given stage, is fraught with theoretical and methodological challenges. Researchers who take on these challenges need a healthy appetite for complexity and uncertainty.

The combinatorial nature of language becomes evident in children’s speech at a fairly young age. Typically, children move beyond the single-word stage and combine two words at around 18 months of age, and by about 2 years they can speak in telegraphic speech (see Box 6.1), which preserves the correct order of words in sentences but drops many of the small function words such as the, did, or to. At this stage, “kid talk” sounds a bit like the compressed language used in telegrams or text messages:

Mommy go store now.
Doggie no bite finger.

Six months later, by the age of two and a half, most children are speaking in full sentences, though still simple ones. By the time they enter kindergarten, they have mastered virtually all of their language’s syntax.

telegraphic speech Speech that preserves the correct order of words in sentences but drops many of the small function words, such as the, did, or to.

What’s most interesting is that at no point do kids seem to entertain the possibility that sentences can be made by throwing words together in just any order. Even when their output is restricted to two words, they never seem to violate the basic word order patterns of their language. English-speaking children, for example, always place the subject in front of the verb and the object after the verb, so even in the speech of toddlers, Susie eat means something different than Eat Susie. But figuring out what’s inside children’s heads as they learn to combine words is no easy matter. In this chapter, we’ll explore some ideas about what it is that children have to learn about their language’s structure, what their syntactic knowledge might look like at various stages, and what it is that allows them to ultimately learn such an intricate and complicated system.

6.1 The Nature of Syntactic Knowledge

I began this chapter by presenting you with a tossed “word salad” to illustrate how you need the glue of syntax to hold meaning together. But you don’t have to look at long strings of words to get a sense of the importance of syntax. You just have to look at compound nouns.

Compositionality

In Chapter 5, we talked about how simple nouns can be combined into more complex compound nouns such as houseboat or insurance policy. It turns out that this is one of the rarer instances in language where words get combined in a non-compositional way. Compositionality is the notion that there are fixed rules for combining units of language in terms of their form that result in fixed meaning relationships between the words that are joined together. At heart, the idea is similar to the notion of operations in simple arithmetic: once you learn what addition and subtraction do, you can, in principle, add any two (or more) numbers together, or subtract any number from another. You may never have seen the equation:

BOX 6.1 Stages of syntactic development

Children begin to combine words shortly after their first birthday, when they have about 50–60 words in their vocabulary. Their utterances gradually become more complex, and some regular patterns show up across children in their progression to more complex syntactic forms. One of the most detailed studies of language development was published in 1973 by Roger Brown, who based his analyses largely on data from weekly recordings of three children taken over a period of several years. Brown found that children of the same age varied a good deal in terms of the syntactic elements they produced in their own speech. For example, by the age of 2 years and 2 months, a little girl named Eve was already producing various prepositions and complex words like mommy’s, walked, and swimming, while at the same age, her peers Adam and Sarah were still eking out telegraphic speech consisting only of content words unembellished by grammatical morphemes. A much better way than sheer age to predict a child’s repertoire of grammatical markers is the measure of mean length of utterance (MLU).
This refers to the average number of morphemes in a child’s utterances measured at a given point in time (i.e., at a specific age). Here are some examples of how MLU is computed:

mean length of utterance (MLU) The average number of morphemes in a child’s utterances at a given point in the child’s development.

Daddy’s porridge allgone. (4 morphemes; MLU = 4)
My mommy holded the rabbits. (7 morphemes; MLU = 7)
Daddy went to the store. (6 morphemes; MLU = 6)

Based on his analyses, Brown noticed that function words and suffixes tended to emerge in a fairly consistent sequence across children—and that this sequence didn’t necessarily match the frequency with which they were spoken by the children’s parents. He identified five stages of children’s early syntactic development defined by MLU. The table presents a summary of the approximate ages and inventory of grammatical morphemes at each of these stages.

Roger Brown’s five stages of syntactic development

Stage I (15–30 months; overall MLU 1.75)
Morphemes present: content words only
Examples: More juice. Birdy fly. Here book.

Stage II (28–30 months; overall MLU 2.25)
Morphemes present: present progressive; in, on; plural -s
Examples: I falling. Dolly in. Eat apples.

Stage III (36–42 months; overall MLU 2.75)
Morphemes present: irregular past tense; possessive -s; full form of “to be”
Examples: Baby fell down. Mommy’s hat. Is Daddy sad?

Stage IV (40–46 months; overall MLU 3.5)
Morphemes present: articles; regular past tense -ed; third person regular (present tense)
Examples: This is the mommy. I holded it. You fixed it. He likes me.

Stage V (42–52+ months; overall MLU 4.0)
Morphemes present: third person irregular (present tense); full form of “to be” as auxiliary verb; contracted “to be” as main verb; contracted “to be” as auxiliary verb
Examples: She does it fast. Was she swimming? He’s nice. He’s swimming.

After Brown, 1973. A first language: The early stages. Harvard Univ. Press. Cambridge, MA.

compositionality The concept that there are fixed rules for combining units of language in terms of their form that result in fixed meaning relationships between the words that are joined together.

457,910,983.00475386 + 6,395,449,002.03 = x

and you can’t possibly have memorized such an equation in the way you might have memorized the “fact” that 5 + 2 = 7. But you can easily compute it, just as you can compute any combination of numbers joined together by an arithmetic operator. The same thing applies to simple sentences like Susie punched Billy. The operation of joining a subject (Susie) together with the phrase punched Billy yields a predictable meaning result. We know that Susie is the individual who initiated the action of punching, and Billy is the unfortunate individual at the receiving end of the punch. If we replace the word Susie with Danny, then Danny now stands in exactly the same role in the sentence that Susie used to; changing which word appears in the subject slot doesn’t allow the possibility that the new occupant of that slot (Danny) now refers to the recipient of the action rather than its initiator. We can do the same kind of replacement with the object—slide Fred into the slot where Billy used to be, and now Fred is the recipient of the punching action. These examples seem simple and obvious, but this notion of predictability of meaning from the way the parts are put together is at the very heart of the combinatorial nature of language. And it works for excruciatingly complex sentences—there’s no reason to believe that there isn’t the same kind of tight, predictable relationship between the structure and meaning of more convoluted sentences. But, interestingly, noun-noun compounds don’t behave in this rigid, predictable fashion.
For example, consider these words: houseboat, housewife, houseguest, housecoat, house arrest, house lust. Despite the fact that the same structural relationship exists between the component words in all of these compounds, there isn’t a uniform semantic relationship: a houseboat is a boat that is also a house, but a housewife is certainly not a wife that is also a house. While you could construe both a housewife and a houseguest as living in a house (at least some of the time), this certainly can’t apply to a housecoat, which is something you use in a house. And house arrest and house lust fit none of these.

For the sake of comparison, let’s see what happens when you join words that do stand in a compositional relationship to one another, say, an adjective with a noun. Consider the very diverse phrases red dog, corrupt executive, long book, and broken computer. A red dog is a dog that is red. A corrupt executive is an executive that is corrupt. A long book is a book that is long, and so on. In fact, any time you join an adjective with a noun, the adjective serves the purpose of identifying a property that the noun possesses. For the more mathematically inclined, we could say that a phrase formed by grouping an adjective with a noun corresponds to a set of things in the world that is the intersection of the sets picked out by the adjective and the noun—in other words, red dog refers to the set of things in the world that are both red and dogs. Given that this holds for any combination of an adjective and noun, the process of joining these two kinds of words is fully compositional. (I’ll leave aside for the moment tricky cases like small elephant: does this refer to the set of things that are both small and an elephant?)

To see how useful compositionality can be, watch what happens when you come across a phrase involving a noun that you don’t know, let’s say a red dax. You may not know what a dax is, but you know that it’s colored red. But what does house dax mean? Is it a dax for a house? A dax that lives in a house? A dax that is a house? You can’t tell without more information. Even if you do know both of the words in a newly coined compound (such as house book), it can take a lot of guesswork to figure out the actual relationship between the two parts (for example, try to work out the meanings of the novel compounds in Table 6.1).

TABLE 6.1 Novel noun-noun combinations

Take a stab at describing a plausible meaning for each of these novel combinations of nouns when they are made into noun-noun compounds (and if more than one meaning comes to mind, provide several). What range of different semantic relationships between the two words do you find yourself drawing on?

rabbit phone
paper candy
flower card
rain paint
wallet plant
window cup
book dirt
computer organ

You can see from these simple examples how a lack of compositionality would seriously hinder the usefulness of sticking words together into longer units. Essentially, you’d have to memorize the relationships between words, or at best infer them by running through some plausible possibilities. But you couldn’t compute them the way you can compute the result of an arithmetic equation. Creating meanings non-compositionally is uncertain enough with combinations of just two words. Now scale that up to sentences of 10 or 20 or 52 words (the last is probably the average length of a sentence in a Henry James novel) and you can see why compositionality plays such an important role in the expressive powers of language.
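To make the contrast concrete, here is a small Python sketch. It is my own toy illustration, not an example from the chapter: the mini-lexicon, the little "world" of objects, and the stored compound meanings are all invented. It treats adjective-noun combination as one fully general rule (set intersection), while noun-noun compounds have to be looked up pair by pair.

```python
# Toy illustration: compositional adjective-noun combination versus
# memorized noun-noun compounds. All data here is invented for the sketch.

# Word meanings as sets of things in a tiny "world."
ADJECTIVES = {
    "red":    {"fire_truck", "rover_the_dog", "stop_sign"},
    "broken": {"old_laptop", "stop_sign"},
}
NOUNS = {
    "dog":      {"rover_the_dog", "fido_the_dog"},
    "computer": {"old_laptop", "new_laptop"},
}

def adj_noun(adj: str, noun: str) -> set:
    """Compositional rule: an adjective-noun phrase denotes the intersection
    of the two sets, no matter which particular words fill the slots."""
    return ADJECTIVES[adj] & NOUNS[noun]

print(adj_noun("red", "dog"))          # {'rover_the_dog'}
print(adj_noun("broken", "computer"))  # {'old_laptop'}

# Noun-noun compounds have no single rule: the relation between the parts
# has to be stored (or guessed by analogy) compound by compound.
COMPOUND_RELATIONS = {
    ("house", "boat"):   "a boat that is also a house",
    ("house", "wife"):   "a wife who runs a household",
    ("house", "coat"):   "a coat worn around the house",
    ("house", "arrest"): "arrest served by confinement to a house",
}

def noun_noun(n1: str, n2: str) -> str:
    """Non-compositional: look the pair up; fail if it was never learned."""
    return COMPOUND_RELATIONS.get((n1, n2), "meaning unknown -- must be inferred")

print(noun_noun("house", "boat"))  # stored meaning
print(noun_noun("house", "dax"))   # meaning unknown -- must be inferred
```

The point of the sketch is that adj_noun never needs to inspect which adjective or noun occupies its slots, whereas the compound table only covers the pairs that happen to have been stored.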
Still, the existence of noun-noun compounds points to the fact that it is possible to combine words in ways that don’t reflect a general rule—in this case, although the combinations are syntactically uniform, always combining a noun with another noun, they don’t result in a fully general semantic pattern. But regularities do exist; there are several common semantic relations that tend to occur over and over again. For example, many noun-noun compounds express a part-whole relationship: computer screen, car engine, door handle, chicken leg, wheel rim, and so on. (Notice that the second noun is always a part of the first; a chicken leg is most definitely not the same thing as a leg chicken—whatever that means.) Another common semantic relation is one in which the second noun is a thing for the benefit of the first: baby carriage, dog bed, cat toy, student center, employee insurance. Some generalization by analogy is possible, even if it is looser than more compositional kinds of meaning combinations. One way to think about the distinction between compositional and non- compositional meanings is in terms of the words-versus-rules debate you met in the discussion of past tense and plural forms in Chapter 5. In that chapter, we saw that researchers came to intellectual blows over a controversy about whether complex words marked as regular plural or past- tense forms (such as dogs and walked) get formed as the result of a general rule that creates larger units out of smaller morphemes, or whether they arise by memorizing the complex words and then extending their internal structure by analogy to new examples. Memorizing meanings and extending them by analogy is exactly what we have to do with the meanings of noun-noun compounds, since there doesn’t seem to be one fixed rule that does the trick. So it’s important to take seriously the possibility that other ways of combining words might also be achieved by means other than a rigid combinatorial rule over abstract categories. In fact, some researchers have argued that what looks like rule-based behavior (for example, making combinations like red ball) is really just an extension of the memorize-and-extend-by-analogy strategy used in combining units like coffee cup. It’s just that it’s more regular. Others see a very sharp division between constructions like noun-noun compounds, whose meanings depend on the specific nouns that occupy those slots, and more abstract structures such as adjective-noun pairings. Researchers in this camp often argue that the full complexity and creativity of adult language would be impossible to achieve without the benefit of something akin to rules. But even if it did turn out to be true that adult speakers accomplish most of their word combinations by means of rules, this doesn’t necessarily mean that combinations of units at all ages are accomplished the same way. This issue is an important one to keep in mind throughout the chapter. Basic properties of syntactic structure Regardless of whether researchers believe there’s a sharp split between compositional and non-compositional meanings, or whether they see the two as occupying two ends of a single continuum, they agree that eventually children have to develop some mental representations that are highly abstract, applying to a broad range of words of the same category in such a way as to allow an endless variety of new combinations. What are some of the properties of such abstract knowledge? And what kinds of categories does it involve? 
One of the most basic things that knowledge of language structure needs to do is constrain the possible combinations of words into just those that are meaningful. For example, given the bag of words cat, mouse, devoured, the, and the, we need to be able to distinguish between an interpretable sentence like The cat devoured the mouse and other combinations that don’t yield a meaning at all, and are not considered to be possible combinations (marked by an asterisk as not legal or meaningful within a language): *Cat mouse devoured the the. *Cat the devoured mouse the. *The cat the devoured mouse. *The the cat devoured mouse. And we’d like our structural templates to apply not just to this bag of words, but to other bags of words, such as teacher, kid, loves, every, and that, to yield meaningful combinations like: Every teacher loves that kid. and to rule out bad combinations such as: *Every that teacher loves kid. *Teacher kid loves every that. *That every teacher kid loves. Hence, our structures need to be stated in terms of useful category labels as opposed to individual words like teacher or cat—otherwise, we’d never be able to generalize across sentences, and we’d need to learn for each word the possible structures it can fit into. So, it becomes useful to identify words as belonging to particular syntactic categories. Let’s propose some categories, and some useful abbreviations: Det—determiner {the, every, that} N—noun {cat, mouse, teacher, kid} V—verb {loves, devoured} The first line means that Det stands for the category determiner, which is a set that contains the words the, every, and that. Noun (N) is a set that contains cat, etc.; and V stands for the category verb. Using these categories as variables, we can now propose a rule that would limit the possible combinations of units, as specified by the following template: Det-N-V-Det-N This template allows us to indicate the correct way of stringing together these words and to rule out the meaningless ways. But templates that merely specify the correct word order wouldn’t get us very far. Imagine you’re a child learning about the syntactic possibilities of your language, and you’ve come up with the linear sequence as stated above. Now, you encounter the following sentences: She loves that kid. Every kid loves her. There’s no way of fitting these new examples into the sequence Det-N-V- Det-N. You might conclude that now you need another syntactic category that includes pronouns. And of course you’d have to specify where pronouns are allowed to occur in a sentence. So, you might add to your collection of possible templates: Pro-V-Det-N Det-N-V-Pro where Pro stands for pronoun. And, if you encountered a sentence like We love her, you’d have to add a second pronoun: Pro-V-Pro So now you have a minimum of four templates to capture these simple sentences. But the fact that there are four separate sequences misses something crucial: ultimately, structure is as systematic as it is because we want to be able to predict not just which groupings of words are legal, but also what the groupings mean. And, it turns out, the relationship between she and loves that kid is exactly the same as between the teacher and loves that kid. This is obscured by having separate templates that specify what Det + N can combine with and what Pro can combine with. In theory, they should be able to come with different specifications for the resulting meanings—much as the different sequences V + N (e.g., eat cake) and Det + N (the cake) have very different meanings. 
In other words, if Det + N and Pro are really distinct linguistic units, as is implied by the representations we’ve created so far, it might be possible for The teacher loves that kid to mean that the person doing the loving is the teacher, while She loves that kid might mean that whoever she refers to is the recipient of the kid’s love. But this sort of thing never happens in languages. And what’s more, if you looked at all the places where pronouns are allowed to occur, you’d find that they happen to be the exact same slots where Det + N are allowed to occur. What’s needed is some way to show that the syntax of English treats Det + N and Pro as equivalent somehow—in other words, that Det + N can be grouped into a higher-order category, of which Pro happens to be a member. Let’s call this a noun phrase, or NP.

noun phrase (NP) An abstract, higher-order syntactic category that can consist of a single word or of many words, but in which the main syntactic element is a noun, pronoun, or proper name.

As soon as we catch on to the notion that words can be clumped together into larger units, called constituents, our syntactic system becomes extremely powerful. Not only can we characterize the patterns of structure and meaning in the preceding examples, but we can explain many aspects of syntactic structure that would otherwise be completely mysterious. For instance, the syntax of English allows certain phrases to be shuffled around in a sentence while preserving essentially the same meaning. Some examples:

constituent A syntactic category consisting of a word or (more often) a group of words (e.g., noun phrase, prepositional phrase) that clump together and function as a single unit within a sentence.

Wanda gave an obscure book on the history of phrenology to Tariq.
Wanda gave to Tariq an obscure book on the history of phrenology.
An obscure book on the history of phrenology is what Wanda gave to Tariq.

Here the phrase an obscure book on the history of phrenology acts as a single clump. It’s impossible to break up this clump and move only a portion of it around; the following just won’t work:

*Wanda gave an obscure to Tariq book on the history of phrenology.
*An obscure book is what Wanda gave to Tariq on the history of phrenology.

The reason these don’t work is that the unit that’s allowed to be moved corresponds to a higher-order constituent, an NP. The entire phrase an obscure book on the history of phrenology is an NP, so it has to move as a unit. (Notice that the NP slot, indicated by brackets, could easily be filled by a much simpler phrase, as in Wanda gave to Tariq [a puppy]/Wanda gave [a puppy] to Tariq. Again, this illustrates the equivalence of NPs in the syntax, regardless of whether they’re instantiated as very simple or very complex phrases.) This movement pattern can easily be captured with a notion of constituent structure—for example, we could specify a rule that allows NPs to be moved around to various syntactic positions. But the pattern would seem rather arbitrary if our syntactic knowledge merely specified the linear order of individual words, because a rule based solely on linear order would give no indication of why the string of words should be cut exactly where it is. (You might be interested to know that music, like language, is often described as being structured in terms of constituents, as illustrated in Figure 6.1.)

Figure 6.1 We intuitively group words into phrases, or constituents.
It’s been argued that music is structured in a similar way, grouping notes together into constituents based on perceived relationships between pitches and chords. One effect of this is that pausing in the middle of a musical phrase or constituent makes a sequence of notes feel “unfinished” or unresolved. The figure shows an analysis of a musical phrase into its constituent parts. (From Lerdahl, 2005. Tonal pitch space. Oxford University Press. Oxford, UK.)

One of the best arguments for the idea that sentences have internal structure rather than just linear ordering comes from the fact that the same string of words can sometimes have more than one meaning. Consider the following famous joke, attributed to Groucho Marx:

Last night I shot an elephant in my pajamas. What he was doing in my pajamas, I’ll never know.

Not a knee-slapper, perhaps, but the joke gets its humor from the fact that the most natural way to interpret I shot an elephant in my pajamas turns out to be wrong, creating a jarring incongruity (incongruity being an essential ingredient of many jokes). On a first reading, most people don’t group together an elephant in my pajamas as an NP unit, for the simple reason that this reading seems nonsensical. So they assume that the phrase in my pajamas is separate from the NP an elephant. That is, they assign the grouping:

Last night I shot [NP an elephant] in my pajamas.

rather than the grouping:

Last night I shot [NP an elephant in my pajamas].

But the joke ultimately requires you to go back and re-structure the sentence in a different (and very odd) way. The rapid-fire decisions that people make about how to interpret ambiguous structures (see Table 6.2) are a fascinating topic that we’ll explore in Chapter 9. But for the time being, notice how useful the idea of constituent structure can be. If sentences were defined only by word-order templates rather than being specified in terms of their underlying structure, examples like the Groucho Marx joke would seriously undermine the possibility of there being a systematic relationship between structure and meaning—the same word-order template would need to somehow be consistent with multiple meanings.

WEB ACTIVITY 6.1 Constituent structure in music

In this activity, you’ll explore some parallels between the structure of sentences and the internal structure of music.

https://oup-arc.com/access/content/sedivy-2e-student-resources/sedivy2e-chapter-6-web-activity-1

In fact, assigning two possible structures to the string I shot an elephant in my pajamas also explains an interesting fact, namely, that one of the possible meanings of the sentence evaporates if you do this:

In my pajamas, I shot an elephant.

Now, it’s impossible for the elephant to be sporting the pajamas; it can only be the speaker who’s wearing them. This makes absolute sense if you know about constituent structure and you also know that the rules that shuffle phrases around are stated in terms of higher-order constituents. Remember that under the pajama-wearing elephant reading, an elephant in my pajamas makes up an NP unit, so it can’t be split up (much as in the examples earlier about Wanda and what she gave to Tariq). Splitting apart an elephant from in my pajamas is only possible if these two phrases form separate constituents, as is the case in the more sensible reading of that sentence.

TABLE 6.2 Syntactic ambiguities

Can you identify at least two meanings associated with each sentence? Some meanings are clearly more plausible than others.
The children are ready to eat.
You should try dating older women or men.
He offered the dog meat.
What this company needs is more intelligent managers.
Jonathan has given up the mistress he was seeing for three years, to the great dismay of his wife.
Now you can enjoy a gourmet meal in your sweatpants.
Why did Joanie buy the frumpy housewife’s dress?

LANGUAGE AT LARGE 6.1 Constituent structure and poetic effect

I’ve shown that grouping words into larger constituents can account for the equivalence of units like the elephant in my pajamas and she in the larger frame of the sentence. I’ve also argued that without constituent structure, we have no way of explaining why a string of words can have two quite different meanings. But there’s additional evidence that we chunk words into constituents. When you utter a long sentence, notice where you’re most likely to take a breath or pause slightly. No one, for example, can read the following sentence from a Henry James novel (stripped here of its original commas) without slight breaks:

He received three days after this a communication from America in the form of a scrap of blue paper folded and gummed not reaching him through his bankers but delivered at his hotel by a small boy in uniform who under instructions from the concierge approached him as he slowly paced the little court.

The slashes show where you’re most likely to insert brief pauses:

He received three days after this / a communication from America / in the form of a scrap of blue paper folded and gummed / not reaching him through his bankers / but delivered at his hotel by a small boy in uniform / who under instructions from the concierge / approached him / as he slowly paced the little court.

Notice that these breaks line up with boundaries between separate clauses or large phrases. As you’ll see in this chapter, large chunks, or constituents, are in turn made up of smaller chunks. Pauses are most natural between some of the largest constituents in a sentence. The deeper down into the structure you go, the less likely there are to be breaks between words. It would be distinctly odd, for instance, to break up the first few phrases like this:

He received three days after / this a communication from / America in the form of a /

Happily, Henry James made generous use of commas to help his readers group words into the right constituents.* In poetry, although commas may be used, line breaks are often used to even more strongly set off phrases from one another, as in this stanza from Wilfred Owen’s poem “Dulce et Decorum Est”:

Bent double, like old beggars under sacks,
Knock-kneed, coughing like hags, we cursed through sludge,
Till on the haunting flares we turned our backs
And towards our distant rest began to trudge.
Men marched asleep. Many had lost their boots
But limped on, blood-shod. All went lame; all blind;
Drunk with fatigue; deaf even to the hoots
Of tired, outstripped Five-Nines that dropped behind.

With the exception of the second-to-last line, all of Owen’s line breaks segment separate clauses or sentences, carving the stanza at its most natural joints. But poets can skillfully leverage the expectations their readers have about the most natural carving joints in language by creatively violating these expectations (you know what you’ve always heard about what you can do with the rules once you know them). The technique of breaking up a line inside of a natural constituent is called enjambment.
Sometimes, it has the effect of forcing the reader to jump quickly to the next line in order to complete the constituent. The poet e. e. cummings uses enjambment (often even splitting up words) to enhance the manic feel of his poem “in Just-”:

in Just-
spring when the world is mud-
luscious the little
lame balloonman

whistles far and wee

and eddieandbill come
running from marbles and
piracies and it’s
spring

In this next poem—a work by William Carlos Williams titled “Poem”—constituents are broken up in a way that creates an effect of freeze-framing the deliberate motions of a cat:

As the cat
climbed over
the top of

the jamcloset
first the right
forefoot

carefully
then the hind
stepped down

into the pit of
the empty
flowerpot.

In fact, the line breaks and the effect they create are the main reason we think of this piece as a poem in the first place. Think about the poetic effects of breaking up constituents in other poems you encounter. A few examples, easily found on the internet, are:

“We Real Cool” by Gwendolyn Brooks
“The Waste Land” by T. S. Eliot (even the first 6 lines use a striking pattern of enjambment)
“Providence” by Natasha Trethewey

* In case you’re curious, with James’s punctuation in place the sentence reads as follows: “He received three days after this a communication from America, in the form of a scrap of blue paper folded and gummed, not reaching him through his bankers, but delivered at his hotel by a small boy in uniform, who, under instructions from the concierge, approached him as he slowly paced the little court.” (Henry James, The Ambassadors, 1903, p. 182.)

There is no consensus about the exact way to represent the syntactic structures of even quite simple sentences. Two different approaches (rule-based and constructionist) are illustrated in Box 6.2. But for the most part, researchers tend to agree on a number of important points, including the following:

Our knowledge of structure is generative; that is, whatever we know about language structure allows us to recognize and generate new examples of never-before-encountered sentences.

generative With respect to language, a quality of language that allows us to use whatever we know about language structure to recognize and generate new examples of never-before-encountered sentences.

Our knowledge of language structure must be hierarchical—that is, it needs to reflect the fact that words group together into constituents, which in turn can group together with other words or constituents to form larger constituents.

hierarchical Top-down (or bottom-up) arrangement of categories. With respect to language, a quality that involves how words group together into constituents, which in turn can group together with other words or constituents to form ever-larger constituents.

BOX 6.2 Rules versus constructions

Two different approaches to syntax are captured by rule-based accounts and constructionist accounts. Rule-based accounts posit a sharp boundary between memorized lexical representations and abstract rules that combine elements in a compositional way. They’re like conventional recipes that provide a detailed list of ingredients and then separately list the procedures for combining those ingredients. Each lexical “ingredient” includes details about its meaning, but also a tag that specifies its syntactic category (Det, Pro, N, V, etc.). Syntactic rules refer to these tags, but don’t care about most of the other details that are part of the lexical representations.
One way to capture syntactic procedures is through phrase-structure rules, which provide instructions for how individual words can be clumped into higher-order categories and how these are in turn combined together to create well-formed sentences. For example, to capture the fact that a determiner and a noun can be combined to form a noun phrase, we might write:

NP → Det + N

rule-based account A syntactic framework that posits a sharp boundary between memorized lexical representations and abstract rules that combine units in a compositional way.

constructionist account A syntactic framework that rejects the notion of a strict separation between memorized lexical items and combinatorial procedures, and relies instead on structural templates that combine abstract information with detailed information regarding specific words or phrases.

phrase structure rules Rules that provide a set of instructions about how individual words can be clumped into higher-order categories and how these categories are combined to create well-formed sentences.

This rule applies blindly regardless of the meanings of nouns, allowing it to generate any of the phrases the table, an elephant, some ideas, several concertos, etc. Given that we can create more elaborate noun phrases like an elephant in my pajamas, we might amend the rule to look like the following, where the parentheses around the prepositional phrase (PP) indicate that it’s an optional element of a noun phrase:

NP → Det + N + (PP)

prepositional phrase (PP) A syntactic constituent, or higher-order category, that in English consists of a preposition (e.g., in, under, before) followed by an NP.

In turn, we need to specify how a PP can be built, using a lexical item tagged as a preposition (P) combined with a noun phrase:

PP → P + NP

To build a verb phrase (VP), we need to specify that an item tagged as a verb (V) can optionally combine with a noun phrase and a prepositional phrase:

VP → V + (NP) + (PP)

And now we’re ready to build a whole sentence:

S → NP + VP

This small set of rules allows us to generate sentences like the following:

[S [NP The elephant] [VP ate]]
[S [NP The elephant [PP with [NP the scar]]] [VP caressed [NP the trainer]]].
[S [NP The trainer] [VP fed [NP the elephant] [PP in his pajamas]]].
[S [NP The trainer] [VP fed [NP the elephant [PP in [NP my pajamas]]]]].
[S [NP The elephant [PP with [NP the scar]]] [VP ate [NP the sandwich [PP with [NP the cheese [PP from [NP the store [PP down [NP the street]]]]]]]]].

Quite a few more rules would be needed to capture all possible sentences of English. But because such rules can apply broadly over a great many lexical items, a small set of them can generate a vast number of combinations. (A brief computational sketch of this generative capacity appears below.)

Constructionist accounts reject the notion of a strict separation between memorized lexical items and combinatorial procedures, emphasizing the cases in which structural knowledge does have to refer to detailed information about the specific items being combined. At the most extreme end are idioms, phrases whose meanings can’t be separated from the specific lexical items that make up the phrase—for example, the idiom “letting the cat out of the bag” describes a very different action from the compositional phrase “letting the hamster out of the bag.” A rule-based linguist might shrug and say that idioms are just longer, multiword lexical items that are not formed by rule.
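Here is the sketch mentioned above: a minimal Python illustration (mine, not the book's; the small lexicon is invented) of how a handful of phrase structure rules, applied recursively, can generate an open-ended variety of sentences. A depth cap keeps the toy program from embedding prepositional phrases forever.

```python
# Toy generator for the phrase structure rules discussed above:
#   S -> NP VP,  NP -> Det N (PP),  PP -> P NP,  VP -> V (NP) (PP)
import random

LEXICON = {
    "Det": ["the", "an", "my"],
    "N":   ["elephant", "trainer", "scar", "pajamas", "sandwich"],
    "V":   ["ate", "fed", "caressed"],
    "P":   ["with", "in", "from"],
}

def NP(depth: int) -> list:
    """NP -> Det + N + (PP): the PP is optional, and recursion is capped."""
    phrase = [random.choice(LEXICON["Det"]), random.choice(LEXICON["N"])]
    if depth > 0 and random.random() < 0.4:
        phrase += PP(depth - 1)
    return phrase

def PP(depth: int) -> list:
    """PP -> P + NP"""
    return [random.choice(LEXICON["P"])] + NP(depth)

def VP(depth: int) -> list:
    """VP -> V + (NP) + (PP)"""
    phrase = [random.choice(LEXICON["V"])]
    if random.random() < 0.8:
        phrase += NP(depth)
    if random.random() < 0.3:
        phrase += PP(depth)
    return phrase

def S(depth: int = 3) -> str:
    """S -> NP + VP"""
    return " ".join(NP(depth) + VP(depth)).capitalize() + "."

for _ in range(5):
    print(S())
# e.g., "The trainer fed an elephant in my pajamas."
```

Note that the functions consult only the category tags in the lexicon, never the meanings of the individual words; that indifference to specific words is exactly the feature of rules that constructionist accounts push back on.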
But constructionists point out that there is no clear boundary between idioms, whose idiosyncratic meanings have to be memorized, and compositional phrases, whose meanings are predictable based on their structure. For example: idiom A phrase with an idiosyncratic meaning (e.g. “let the cat out of the bag”) that cannot be predicted compositionally on the basis of the combination of its individual words. The apple tree came into bloom. The apple tree came into leaf. The apple tree came into fruit. The magnolia came into bloom. *The apple tree came into twigs *The apple tree came into aphid. Unlike the fixed phrase let the cat out of the bag, this frame allows for some flexibility—just not very much—allowing words like bloom, leaf, and fruit to be interchangeable. The following allows for yet more flexibility: A dog of his own A fancy car of my own Some money in the bank of his own A few good ideas of your own Three children of their own A doll that she loved more than anything in this world of her own This frame allows for many different substitutions, but still has some fixed elements to it. You might represent it with a mix of highly abstract and highly specific elements, somewhat like this: [NP] of [possessive pronoun] own Rather than rules, then, our syntactic knowledge might rely on constructions. Constructions are templates that specify the structure and corresponding meanings of phrases, and they can vary in how narrowly or broadly they apply. At one extreme, they require very specific words in each of their slots; at the other extreme, they can apply as generally as the phrase structure rules listed here. At its heart, this approach proposes that you don’t need deeply different types of knowledge and mental operations to deal with the well-behaved compositional meanings on the one hand and the idiosyncratic, exceptional ones on the other. constructions Templates that specify the structure and corresponding meaning of a phrase, and that vary in how narrowly or broadly they apply. The generative and hierarchical qualities of language allow for recursion, permitting syntactic embeddings that resemble loops that (in theory) could go on forever as in the examples below that embed NPs within NPs or clauses (S) within other clauses (but see Box 6.3): recursion Repeated iterations. With respect to language, refers to syntactic embeddings that nest constituents (such as clauses or NPs) within other constituents in a potentially infinite manner. Last night I shot [NP an elephant in [NP my pajamas]]. [NP The little girl who owns [NP a puppy with [NP a limp]]] lives next door. [S My best friend insisted that [S her brother would quit his job within days if [S he got annoyed with his boss]]]. Now back to the big question: How do kids ever figure out that their language has the syntactic structure it does? Over the course of hearing many sentences, they would need to register that certain words tend to occur in the same syntactic contexts, and group those words into the same type of syntactic category. They’d then have to notice that certain words tend to clump together in sentences, and that the various clumps occur in the same syntactic contexts, forming interchangeable constituents. They would also have to clue in to the fact that these interchangeable word clumps always have the same relationship to the meaning of the sentence, regardless of the clumps’ specific content. 
Finally, as we’ll see in more detail in Section 6.4, the kids would have to figure out the possibilities for moving constituents around in a sentence while keeping track of relationships between them over long spans of words. Any one of these learning tasks involves hearing and tracking lots and lots of sentences and then drawing quite abstract generalizations from these many sentences.

WEB ACTIVITY 6.2 Rules and constructions

In this activity, you’ll try to specify phrase structure rules and constructions that capture certain structural generalizations in English.

https://oup-arc.com/access/content/sedivy-2e-student-resources/sedivy2e-chapter-6-web-activity-2

BOX 6.3 Varieties of structural complexity

In many languages, recursive structures are varied and abundant. For example, in English, recursion makes it possible to create billowing sentences like the opening of the American Declaration of Independence:

When in the Course of human events it becomes necessary for one people to dissolve the political bands which have connected them with another and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature’s God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.

On a more modest scale, it allows us to create useful sentences like these:

John’s brother’s house is big.
Frog and Toad are friends.
Cyprian will either marry his childhood sweetheart, or he’ll spend his life alone.
Ernie thinks that Bert wants to get a pet pigeon.
The letter that Juliet sent to Romeo went astray.

If you looked only at European languages, it would be easy to assume that recursion was a core feature of language. But a much broader sampling of languages reveals that not all languages use much recursion—or perhaps any at all. Dan Everett, a linguist who worked among the Pirahã people of the Amazonian jungle (see Figure 6.2), has claimed that sentences like those above, with their embedded constituents, don’t exist in Pirahã (Everett, 2005). Instead of saying John’s brother’s house, speakers of this language would say something like: Brother’s house. John has a brother. It is the same one. Or, instead of saying The tiger got Jake and Lisa, a Pirahã speaker would say: The tiger got Jake. Lisa also.

Everett’s claims about Pirahã have been challenged by other linguists who question his analysis of the language, and because of the limited access to this small, remote group of speakers, the controversy has yet to be resolved. But linguists have long noted that speakers of many languages produce sentences that are syntactically less complex than your typical English sentence; instead, they pack complexity into the structure of words. For example, the Mohawk word sahonwanhotónkwahse conveys as much meaning as the English sentence “She opened the door for him again.” And in English, you need two clauses (one embedded inside the other) to say “He says she’s leaving,” but in Yup’ik, a language spoken in Alaska, you can use a single word, Ayagnia. (Ayagniuq, in contrast, means “He says he himself is leaving”; Ayagtuq means, more simply, “He’s leaving.”) In general, languages show a trade-off between the complexity of structure of their words versus their phrases. And in many cases, the systems of complex words in these languages are riddled with exceptions and irregularities—exactly the kind of patterns that are hard to capture with general, fully abstract rules.
This trade-off raises some deep questions about language and the mind: Does this mean that there is a natural continuum after all between words and phrases? Or, if there truly is a stark division between words and rules, have speakers of different languages acquired dramatically different language skills? Either way, it suggests that our capacity for language needs to be able to accommodate both types of complexity. Figure 6.2 The Pirahã people live on the banks of the Maici River in Brazil’s Amazon Basin. As of 2018, there were reportedly about 800 individuals in this small community. One of the most hotly debated issues in psycholinguistics is whether children are helped along in the learning process by genetic programming that outfits them with specific assumptions about how languages work. Such innate programming would no doubt make learning syntax less daunting. And, despite the fact that languages of the world come in a great variety of syntactic flavors, there are a surprising number of ways in which they overlap in terms of their structural properties. Of all the possible systems that languages might have evolved to combine words, it seems that only a pretty restricted subset of these options ever turn up. And, interestingly, kids rarely seem to follow dead-end leads that would cause them to posit grammatical systems that stray from what we think of as a human language. As a result, a number of researchers have proposed that children come into the world with a set of constraints that steer them away from ever considering grammars that fall outside the range of human grammars. In other words, children might have in place an innate universal grammar that provides them with a set of learning biases that line up nicely with the syntactic systems of human languages. universal grammar A hypothetical set of innate learning biases that guide children’s learning processes and constrain the possible structures of human languages. Opponents of this view argue that no such extra help is needed. Kids, they suggest, are equipped with robust general-purpose learning machinery that is perfectly up to the task of learning syntactic generalizations from huge quantities of data. The reason children don’t work themselves into a tight corner thinking that English sentences are generated by an alien grammar is because they have access to plenty of data that would quickly disabuse them of syntactic generalizations that fall too far from the mark. What’s more, there are alternative explanations for the syntactic similarities across languages, as explored in Chapter 2. Languages may share certain similarities not because the human brain is genetically programmed for certain structures, but because certain structures do a better job of meeting communicative needs or because they line up better with the brain’s strengths and weaknesses for learning and processing information more generally. WEB ACTIVITY 6.3 Discerning the structure This activity will give you a taste of the task facing children who have to learn about the right syntactic generalizations for their language. You’ll see sets of data from various foreign languages, as well as made-up “data” from invented alien languages. Can you spot the alien languages? What is it about them that makes their syntax seem non-human- like? 
https://oup-arc.com/access/content/sedivy-2e-student-resources/sedivy2e-chapter-6-web-activity-3

The disagreements between the nativist and data-driven views of syntactic learning tend to come down to several issues: First, what information is there in the input that a child hears—that is, is there actually enough evidence in the input to promote the right generalizations and circumvent the incorrect ones that kids might be tempted to make? Second, what kinds of learning mechanisms are available to kids throughout their learning trajectory? Third, what is it that children know about language structure anyway? Does their knowledge actually correspond to the kinds of abstract rules or representations that we’ve posited for an adult’s knowledge of syntax? In the rest of this chapter, we’ll see that these questions have yet to be definitively answered. However, we have an ever-growing set of theoretical and methodological tools with which to address them.

6.1 Questions to Contemplate

1. How would you test someone to know whether their knowledge of a language included abstract knowledge of structure and its relationship to meaning, rather than just the meanings of specific words or phrases they have committed to memory, as if they’d memorized the contents of a foreign-language phrase book?

2. What’s the evidence that structural representations have hierarchical structure, with groups of words or phrases clustered into higher-order syntactic units, rather than just specifying the linear order of classes of words such as nouns, verbs, or pronouns?

6.2 Learning Grammatical Categories

To crawl back into the mind of a young child in the early stages of language development, it always helps to consider an unfamiliar language. Let’s look at Zapotec, an Indigenous language of Mexico that you probably haven’t encountered before. Based on the following sentences, take a stab at drawing some syntactic generalizations, at least at the level of basic word order. What regularities can you see? (Warning: it may take a little while to work through these examples.)

ytaa’az gyeeihlly li’eb
bgu’tya’ bzihny
ytoo’oh li’eb ca’arr
naa’ng banguual
naa li’eb banguual
gwua’ilreng li’ebr
rcaa’za ygu’tya’ bzihny
binydyang dolf ytoo’oh pa’amm ca’rr
re’ihpy pa’aamm laa’reng gwua’llreng li’ebr

Got it? Good. Now, what is the correct way to order the following three words: juaany, be’cw, and udiiny? (Seriously, give this a try before reading any further.)

Actually, I suspect you can’t tell—this wasn’t an entirely fair exercise. You simply didn’t have enough information in this little language sample to be able to figure out much of anything. But things change quite dramatically if I give you a bit of information about how the individual words map onto meanings. Using the following mappings, try again to derive some syntactic generalizations and figure out the correct order of juaany, be’cw, and udiiny.

ytaa’az—beat
bgu’tya’, ygu’tya’—kill
ytoo’oh—sell
naa’ng, naa—be
gwua’llreng—read
rcaa’za—want
binydyang—hear
re’ihp—tell
bzihny—mouse
ca’arr—car
li’ebr—book
banguual—old
udiiny—hit
be’cw—dog
gyeeihlly, li’eb, pa’amm, dolf, and juaany—all proper names

This information makes all the difference. You should now be able to tell that udiiny juaany be’cw would give you a well-formed sentence.

How do children know about grammatical categories?

It’s worth thinking about why it’s so helpful to have information about how the individual words map onto meanings.
It’s useful because of the following assumptions you were implicitly making: (1) words that map onto the same general kinds of meanings (for example, actions rather than people or things) will occupy the same slots in the syntax, and (2) the syntactic patterns are affected by the role that entities play in a sentence (distinguishing, for example, the agents that instigate actions from the entities that are acted upon). These assumptions allowed you to infer grammatical categories based on words’ meanings, and then to derive generalizations about how Zapotec structures its sentences in terms of these categories. Some researchers argue that this is exactly what young children do in the beginning stages of learning syntax; according to the semantic bootstrapping hypothesis, children might jump-start syntactic learning with the expectation that key concept types align with grammatical categories, assuming that nouns tend to be used to refer to objects, for example, or that the subject of a sentence is typically the agent of the action that’s being described. semantic bootstrapping hypothesis The idea that children’s learning of syntactic structure relies on word meanings together with the expectation that certain types of meanings align with certain grammatical categories (e.g. assuming that nouns tend to be used to refer to objects or that the subject of a sentence is typically the agent of the action). But where do these assumptions come from? In your case, quite possibly from your knowledge of English, in which syntactic categories do tend to be made up of words that are similar in meaning. But these assumptions turn out to be universally true of languages. Given that this is the case, and that these assumptions do so much work in breaking into the syntax of a new language, it seems sensible to ask whether children come into the world equipped with certain basic preconceptions about the relationship between language structure and meaning. In order for these assumptions to be accessible at the very outset to a child learning a first language, they’d have to be innate. As useful as it might be for babies to have such innate expectations, this doesn’t necessarily mean that they have them. We need to evaluate the possibility that babies start off ignorant of even the most basic facts about how meaning and syntax relate to each other, that they have to build from the ground up the notions that words fall into specific grammatical categories, and that they eventually learn that such categories constrain both their meanings and their structural patterns. The arguments for innately based assumptions about how meanings map onto grammatical categories become weaker if we can show that babies are able to learn these mappings easily and without any built-in assumptions. To get a feel for how this might happen, consider again the Zapotec sentences you saw earlier. What would happen if you didn’t assume from the get-go that words for people and things could be grouped together into a coherent syntactic category that patterned systematically in the language’s structure? What if you had to rely only on the evidence that was there in front of you in the sentences themselves, without the benefit of any preconceptions that words that are similar in meaning might behave similarly in the syntax of a sentence? How would you form a concept of a grammatical category, and how would you figure out which words belong to it? 
You likely could eventually break into the system, but it would take you many more example sentences than just the few that you were offered here. But eventually, with enough data, you might notice, for instance, that certain words like ytaa’az or re’ihpy only ever occur at the beginnings of sentences, so you’d begin to group them as belonging to the same class of words (let’s call this Class A for the moment). You would then also notice that only certain words can appear immediately after the Class A words, words like li’eb or bzihny, which we’ll call Class B words. Given enough distributional evidence of this sort—that is, evidence about the tendencies of words to appear in certain syntactic contexts—you could come up with some generalizations about word order. Once you moved on to learning the meanings of some of these words, you might then notice that Class A words tend to be words that refer to actions, and that Class B words tend to refer to people, things, or animals. This would then allow you to make very reasonable guesses about whether new words you meet should belong in the syntactic categories of Class A or B, once you had some idea of their meanings. From this point on, you would be in a position similar to the one you started the exercise with—that is, with certain ideas firmly in place about the mapping between syntactic categories and meanings. It just would have taken you a while to get there. distributional evidence The tendency of words or types of words to appear in certain syntactic contexts, allowing extrapolation of these tendencies to newly learned words. Is distributional evidence powerful enough? To make a convincing case that youngsters can form syntactic categories by tracking distributional evidence, we’ll need to answer several questions. The first of these is: If we look beyond just a small sampling of language, how reliable would this distributional evidence be? And if distributional evidence does turn out to be a reliable source of information for forming grammatical categories, our second question would be: Is there any evidence that small children are able to track distributional evidence and group words into categories accordingly? Let’s start with the first question regarding the reliability of distributional evidence. In arguing for a nativist version of the semantic bootstrapping hypothesis—that is, for the position that some assumptions about the mapping between meaning and syntax are innate—Steven Pinker (1987) noted that a child could run into trouble if he were relying solely on distributional patterns in sentences such as these: (1a) John ate fish. (1b) John ate rabbits. (1c) John can fish. If our alert toddler happened to notice that fish and rabbits occur in exactly the same syntactic environments in Examples 1a and 1b, he’d be in danger of concluding that the following sentence is also perfectly good: (1d) *John can rabbits. The problem here is that a single word, fish, actually falls into more than one syntactic category; it can act as either a noun or a verb, while rabbits can’t. But how could our child tell that fish in 1a is in a different category than fish in 1c? If he knew that it referred to a thing in 1a but an activity in 1c, and if he knew that these meanings mapped to different categories, he’d manage not to go astray. But without these assumptions in place, he could be misled. Distributional evidence might be messy in other ways that would preclude our child from learning the right categories for words. 
For example, if he were paying attention to which words can occur at the beginnings of sentences in English, he'd come across examples like these:
(2a) John ate fish.
(2b) Eat the fish!
(2c) The fish smells bad.
(2d) Whales like fish.
(2e) Some of the fish smells bad.
(2f) Quick, catch the fish!
(2g) Never eat fish with a spoon.
In each of the preceding examples, the first word of the sentence belongs to a different syntactic category. Assuming that children aren't fed a carefully regimented sample of speech that manages to avoid these problems until they've properly sorted words into categories, how useful could distributional evidence be?
It's obviously going to matter what distributional evidence we consider. Looking just at the left edge of a sentence doesn't produce great results for English, but other patterns could be much more regular. For example, looking at lexical co-occurrence patterns (that is, at information about which words tend to appear adjacent to each other in a data set) could prove to be more promising. For instance, in English it turns out to be fairly easy to predict which category can occur in which of the following slots in these word pairs, or bigrams:
lexical co-occurrence patterns Information about which words tend to appear adjacent to each other in a given data set.
bigrams Sequences of two words (i.e., word pairs).
(3a) the __
(3b) should __
(3c) very __
Chances are, if asked, you'd supply a noun in 3a, a verb in 3b, and an adjective in 3c. And lexical co-occurrence patterns get even more useful if we look at sequences of three words, or trigrams:
(4a) the __ is
(4b) should __ the
(4c) very __ house
Note, for example, that either an adjective or a noun can occur after the, but only a noun can occur between the and is. Similarly, the trigram frames in 4b and 4c constrain the missing word more tightly than the bigrams in 3b and 3c do.
Pursuing this idea, Toby Mintz (2003) set out to measure the reliability of sequences, or "frames," such as the __ is or should __ the. He looked at a large database containing transcriptions of the recorded speech of parents talking to toddlers and pulled out the 45 most frequent frames: the 45 pairs of words that most often occurred as the first and last words of a three-word sequence. So, for instance, the sequences the doggy is and the bottle is count as two instances of the frame the __ is. He then measured, for each of these frames, how accurate it would be to assume that the middle words in the trigrams would always be of the same grammatical category. This measure of accuracy reveals just how predictive the frame is of that intervening word's category. Mintz found that by relying just on these frequent frames in the database, it would be possible to correctly group the words that occurred in the frames with an accuracy rate of better than 90%. So it seems there's pretty sturdy statistical information to be had just on the basis of distributional evidence (provided, of course, that small children are able to tune in to just this type of handy statistical evidence while perhaps ignoring other, less useful distributional evidence). Using Mintz's "frequent frames" is only one possible way of capturing statistical regularities for words of the same grammatical category. Other formulations have also proven useful to greater or lesser degrees, and the usefulness of various statistical strategies may vary from language to language.
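To make the frequent-frames idea concrete, here is a minimal Python sketch of the extraction step. The tiny corpus, the choice of treating any frame that recurs at least twice as "frequent," and the expected output shown in the comments are all illustrative assumptions for this toy example; Mintz's actual analysis ran over a large transcript database and kept the 45 most frequent frames.

```python
from collections import defaultdict

# A tiny invented sample of child-directed speech, one utterance per line.
corpus = [
    "the doggy is here",
    "the bottle is empty",
    "the kitty is sleeping",
    "you can push it",
    "you can eat it",
]

# For every three-word sequence, record the middle word under its frame,
# where a frame is the pair of words on either side of it.
frames = defaultdict(list)
for utterance in corpus:
    words = utterance.split()
    for a, x, b in zip(words, words[1:], words[2:]):
        frames[(a, b)].append(x)

# Treat any frame that recurs (here, at least twice) as a "frequent frame"
# and inspect the words it groups together.
for (a, b), fillers in frames.items():
    if len(fillers) >= 2:
        print(f"{a} __ {b}: {fillers}")

# Expected output for this toy corpus:
# the __ is: ['doggy', 'bottle', 'kitty']
# can __ it: ['push', 'eat']
```

In this toy sample, each frequent frame happens to group words of a single category (nouns in the first frame, verbs in the second); the accuracy measure discussed below quantifies how often that grouping is correct in real child-directed speech.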
Leaving aside for the moment the issue of exactly how best to capture distributional regularities, it's at least fair to say that information from lexical co-occurrence is reliable enough to allow kids to make reasonably good guesses about the category membership of a large number of words, once they've been exposed to enough examples. So, what about our second question: Are very small children able to make use of these regularities? We saw in Chapter 4 that babies as young as 8 months can track statistical patterns over adjacent syllables and are able to use this information to make guesses about word boundaries. So, there's already good reason to suspect that they might also be attuned to statistical patterns in inferring the grammatical categories of words. And some recent studies of babies in their second year provide more direct evidence for this view. For instance, Mintz (2006) studied how 12-month-olds would react upon hearing novel words in frequent frames within sentences such as She wants you to deeg it or I see the bist. Some words, like deeg, consistently appeared in verb slots, while others, like bist, appeared in noun slots during a familiarization phase. Mintz then measured looking times during a test phase and found that the babies distinguished between grammatical and ungrammatical sentences that contained these novel words. That is, they looked longer at the "ungrammatical" sentence I bist you now than at the "grammatical" I deeg you now (see Figure 6.3).
Figure 6.3 Mintz used the head-turn preference procedure (see Method 4.1) to measure the ability of infants to infer syntactic categories from distributional evidence. (A) Infants heard nonsense words within bigrams (word pairs, shown in italic) or trigrams (three words, also shown in italic) in either noun-supporting or verb-supporting sentence frames, unaccompanied by any semantic context. To make sure some nonsense words weren't intrinsically more "noun-y" or "verb-y" than others, the subjects were divided into two groups. The words presented to Group 1 as nouns were presented to Group 2 as verbs, and vice versa. (B) In the test phase, the words were presented to the infants in grammatical and ungrammatical sentences. (C) Results, showing mean listening times to grammatical and ungrammatical sentences, by frame type. (Adapted from Mintz, 2006, in Hirsh-Pasek & Golinkoff (eds.), Action meets word: How children learn verbs. By permission of Oxford University Press, USA.)
Distributional information in the form of frequent frames seems to work well for English. But what about other languages? Frequent frames are much less likely to be useful in languages like Turkish or Czech, which have very flexible word order—unlike in English, with its rigid subject-verb-object order, subjects, verbs, and objects in these languages can appear in just about any order, with subjects and objects distinguished by the morphological affixes they sport rather than by where they appear in the sentence. In these cases, children might be better off identifying grammatical categories based on their morphological frames rather than their syntactic frames. Even a language like Spanish, which is fairly similar to English, may pose challenges to kids trying to infer categories based on frequent frames.
For example, Spanish has a number of ambiguous function words, such that the same word shape corresponds to different functions—los can be either a determiner (los niños juegan; "the children play") or an object pronoun (Ana los quiere aquí; "Ana wants them here"), and the word como can mean any of "how," "as," or "eat." Each of these meanings will appear in a different frame, making it hard to find regularities in frames that involve these ambiguous words—a problem made worse by the fact that function words occur so frequently. Furthermore, Spanish allows you to drop nouns in certain contexts, so that if you were discussing sweaters, you could say Quiero una azul ("I want a blue") instead of "I want a blue sweater." This means that the frames for adjectives and nouns will often be identical, possibly causing kids to confuse these categories. As researchers examine a variety of languages, we may learn a lot about the kinds of distributional evidence that children are able to use and whether such information is available in all languages (see Researchers at Work 6.1).
WEB ACTIVITY 6.4
Finding syntactic categories
Are you smarter than a 15-month-old? Try this activity to see if you, too, can pull syntactic categories out of a sample of data from an artificial language. The stimuli come from a study by Rebecca Gómez and Jessica Maye (2005).
https://oup-arc.com/access/content/sedivy-2e-student-resources/sedivy2e-chapter-6-web-activity-4
RESEARCHERS AT WORK 6.1
The usefulness of frequent frames in Spanish and English
Source: A. Weisleder & S. R. Waxman. (2010). What's in the input? Frequent frames in child-directed speech offer distributional cues to grammatical categories in Spanish and English. Journal of Child Language 37, 1089–1108.
Question: Are frequent frames as useful for inferring grammatical categories in Spanish as they are in English?
Hypothesis: A statistical investigation of speech directed at young children in English and Spanish will show that frequent frames provide systematic cues to the grammatical categories of noun, adjective, and verb in both English and Spanish, though not necessarily to the same degree or equally for all categories.
Test: Data from six parent–child pairs, three English-speaking and three Spanish-speaking, were extracted from the CHILDES database (see Method 6.1). All of the speech samples were from the parents' speech directed at their child before age 2 years, 6 months. The speech samples were analyzed over three steps.
Step 1: Selecting the frames for analysis. A frame was defined as two linguistic elements with one intervening word. This included mid-frames (A__B), in which the framing words were A and B, and the intervening word varied. It also included end-frames (A__.), in which A was a word, followed by the varying word and the end of the utterance. Every adult utterance was segmented into its three-element frames. For example, "Look at the doggie over there" yielded the following five frames:
look__the
at__doggie
the__over
doggie__there
over__.
The frequencies of each frame drawn from the database were tabulated. The 45 most frequent mid-frames and the 45 most frequent end-frames were identified as frequent frames and analyzed further.
Step 2: Identification of the intervening words and their categories. The researchers listed all of the individual words (e.g., dog, pretty, run) that occurred as intervening words and identified the grammatical category (noun, adjective, verb) of each intervening word.
Step 3: Quantitative evaluation of the usefulness of the frames. An Accuracy score was computed as follows: Every pair of intervening words that occurred in the same frequent frame was compared. If the intervening words were of the same grammatical category, the pair was scored as a hit; if they were of different categories, it was scored as a false alarm. For example, if the database included the sequences "doggie over there," "doggie sleeps there," and "doggie go there," the first and second sequences would be scored as a false alarm; the second and third sequences would be scored as a hit; and the first and third sequences would be scored as a false alarm. The Accuracy score for each frame was calculated as the proportion of hits to hits together with false alarms: Accuracy = hits/(hits + false alarms). A statistical analysis was performed to determine whether the Accuracy score for each frame was significantly higher than would be expected by chance, given the contents of the frames.
Results: Table 6.4 shows the average Accuracy scores for both mid- and end-frames for each of the parent–child pairs (identified here by the child's name), as well as baseline Accuracy scores—that is, the scores that would be expected by chance. For all the pairs, overall Accuracy scores for both mid- and end-frames were significantly higher than the baseline. Further statistical analyses showed that:
1. Accuracy scores for English frames overall were higher than for Spanish frames.
2. Accuracy was higher for mid-frames overall than for end-frames.
3. Accuracy was higher for frames in which the most frequent intervening word was a verb (verb-frames) than it was for noun-frames, which in turn had higher Accuracy scores than adjective-frames. (Note that this information is not apparent in Table 6.4, which averages scores across these grammatical categories rather than presenting them separately.)
Conclusions: Frequent frames in both English and Spanish child-directed speech provide clearly non-random information about the grammatical categories of intervening words, suggesting they may be a useful source of information for children who are beginning to identify grammatical categories. However, this particular type of distributional information is more reliable in English than it is in Spanish. Moreover, in both languages, frequent frames offer more reliable information about verbs than about nouns, and the least reliable information about adjectives.
Questions for further research
1. Given that frequent frames appear to be less reliable in Spanish than in English, do learners need a larger sample of Spanish speech to infer grammatical categories than they do in English?
2. Is there evidence that Spanish-learning children acquire certain grammatical categories later than English-learning children?
3. Spanish offers other distributional cues, such as morphological markers, to a greater extent than English. Given that other useful information may be available to them, do Spanish-learning children pay less attention to frequent frames than English-learning children? For example, if both groups were learning an artificial language with reliable information from frequent frames, would Spanish-learning children make less use of that information than English-learning children would?
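As a quick check on the worked example in Step 3, here is a minimal Python sketch, not the authors' code, that scores the three doggie __ there sequences pairwise and applies the Accuracy formula; the category labels attached to the intervening words are rough annotations added for illustration.

```python
from itertools import combinations

# Intervening words from the frame doggie__there in the worked example above,
# paired with illustrative category labels (not the authors' annotations).
fillers = [("over", "preposition"), ("sleeps", "verb"), ("go", "verb")]

hits = false_alarms = 0
for (_, cat1), (_, cat2) in combinations(fillers, 2):
    if cat1 == cat2:
        hits += 1           # same-category pair
    else:
        false_alarms += 1   # different-category pair

accuracy = hits / (hits + false_alarms)
print(f"hits={hits}, false alarms={false_alarms}, Accuracy={accuracy:.2f}")
# hits=1, false alarms=2, Accuracy=0.33
```

Applied frame by frame across a whole corpus of child-directed speech, this same pairwise scoring is the kind of computation behind the averages reported in Table 6.4 below.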
TABLE 6.4 Accuracy scores for parent–child pairs
Parent–child pair    Mid-frames (A__B)    End-frames (A__.)
English-speaking
Eve        0.93 (0.48)a    0.83 (0.40)
Nina       0.98 (0.46)     0.92 (0.50)
Naomi      0.96 (0.47)     0.85 (0.43)
Mean       0.96 (0.47)     0.87 (0.44)
Spanish-speaking
Koki       0.82 (0.24)     0.75 (0.38)
Irene      0.68 (0.33)     0.52 (0.34)
Maria      0.75 (0.34)     0.66 (0.33)
Mean       0.75 (0.30)     0.64 (0.35)
Adapted from Weisleder & Waxman, 2010.
a Baseline scores—the scores that would be expected by chance—are in parentheses.
Finding evidence that children can form categories on the basis of distributional evidence alone does not, of course, mean that this is the only, or even the most important, method by which kids learn categories. It doesn't rule out the possibility that children also tap into innate preconceptions about the relationship between grammatical categories and their typical contributions to meaning. Nor does it rule out the possibility that children very quickly learn systematic mappings between structure and meaning and use this to refine their notion of categories. It's perfectly possible that children lean on both kinds of information.
In fact, some researchers are deeply skeptical that frequent frames alone could play a major role in children's learning. Frames such as the __ is and should __ the don't correspond to meaningful units in any way (in the way that eat the cake or the nice doll do). Presumably, children's main goal is not to build abstract syntactic structure, but to figure out how complex phrases and sentences convey meanings. What if they achieve this not by first identifying abstract categories such as nouns and verbs and figuring out a set of compositional rules that can generate any sentence imaginable, but by starting more modestly, with small phrases whose meanings they've memorized, and working their way up to abstraction from there? The next section explores how this might work.
6.2 Questions to Contemplate
1. How could innate knowledge of certain grammatical categories, along with their mappings onto meanings, make it easier to learn the syntactic structure of a language?
2. If you were learning a new language by reading text in that language, but without knowing what any of the text meant, how could you possibly learn the regularities of its syntax? And what would that syntactic knowledge look like?
3. How might languages differ in the statistical information they offer to learners about their syntactic categories and how these categories are combined?
6.3 How Abstract Is Early Syntax?
So far, I've tried to persuade you that, to capture what you know about how words can be strung together into sentences, you need to have (1) internalized an abstract set of categories and (2) learned some general patterns for combining these categories. Without these, it's hard to see how we could account for the flexible and systematic nature of your syntactic knowledge. I've also discussed a bit of evidence that children start to distinguish among syntactic categories in the second year of life, along with some ideas about how they might be able to pull this off. And it's also during the second year of life that toddlers begin to combine words together in their own speech. So does this mean we're now seeing evidence that children know about general patterns of syntax or phrase structure rules? Do children understand structure in the same way as adults? Not so fast. To sing the refrain once again, we should avoid making assumptions about what children know at any given stage based on what we know about language as adults.
After all, it's possible that their early combinations of words rely on knowledge that looks very different from ours. It's easier to imagine this if we think of a species that we don't normally consider to be linguistically sophisticated (and hence, quite likely to be very different from us cognitively). Let's suppose, for example, that you've taught your pet pigeon to peck on pictures that correspond to words or phrases of English in order to earn food pellets. I have no doubt you could do this if you were patient enough. I also suspect that you could teach your bird to distinguish between sentences like The man bites the dog and The dog bites the man so that it would peck on different pictures for each of these, each picture corresponding to a depiction of the sentence it was paired with. Could you credibly claim that your pigeon had learned syntax? Not at all. You would have no evidence that the bird wasn't simply treating each of these sentences as if it were a long word with its own particular associations or "meanings." (Actually, it's not even clear that the pigeon has a notion of meanings, based on this test.) The bird probably just memorized which string of sounds goes with each picture, much as we have to simply memorize the meanings of individual words or morphemes rather than composing them out of meaningful parts. To get evidence for syntax, we would need to see much more impressive data.
We should be as skeptical and conservative in imputing linguistic knowledge to babies as we are to pigeons. Sure, the knowledge of young kids is more likely than that of a pigeon to resemble our adult linguistic system, but where's the evidence? That is: When kids are beginning to form larger units of meaning in which they put together recognizable words, are we really seeing evidence of fully abstract syntactic knowledge such as rules? Or are they doing something less flexible and creative, akin to memorizing word-like structures? Or another possibility: they might be doing something in between. For example, they might know that their existing collection of words can be combined with other meaningful words, and they might know that the order in which words get combined matters, but they might not yet have a fully abstract set of rules. They might instead know, for example, that the verb bite can be put together with two other words that pick out people or animals, and that the word that goes before the verb bite picks out the offending biter while the word that follows the verb picks out the bitten victim. This would allow them to form a rudimentary sentence with the verb bite, combining it with their repertoire of nouns. But maybe they don't yet know that it's possible to accomplish similar tricks with similar effects on meaning with just about any combinations of words that happen to be nouns and verbs.
Looking for evidence of abstract knowledge
Where would we look for evidence of abstract knowledge such as syntactic rules? Obviously, the usefulness of having rules in the first place lies in the fact that they can be extended to any novel words that we might learn. If you, with your fully fledged syntactic system in place, learn a new adjective like avuncular, for example, you don't also need to learn where you can shoehorn it into a sentence—this comes for free once you have a fully formed syntax.
You might have first learned it in a sentence such as My boss has somewhat of an avuncular air about him, but you can immediately trot out your newly acquired vocabulary gem in new word orders such as The president strikes me as avuncular or The avuncular but rigorous professor has a loyal student following, and so on. In fact, you don't even need to know the meaning of the word in order to confidently recognize that if the first sentence is grammatical, then so are the others. You can accomplish that feat by substituting the word brillig for avuncular, for example. So, the best test for the abstraction and generality of toddlers' syntactic knowledge would be to check to see how readily their knowledge extends beyond specific words or phrases and can be applied to new words—can the child take a new word and plug it into all the slots that call for a word of that category, even if they've never seen the word in those slots?
In fact, mistakes may provide some of the best evidence for abstract knowledge; when kids overstep the bounds and extend a pattern too generally, producing forms that they've never heard their parents say, this gives us some compelling clues about the learning that's taken place underneath. Remember from Chapter 5, for example, certain "creative" errors in kids' speech such as "He goed to the store" or "The teacher holded it." We saw that one way of interpreting these errors is to say that kids have figured out that there's an abstract rule in which you can take a verb stem and tack the morpheme -ed onto it to signal past tense. The reason these are errors is that English happens to have exceptions—at some point, you have to memorize which verbs are exempt from the rule. The above errors could be seen as evidence that children have the rule, but haven't yet sorted out the exceptions—they've gone a bit wild in their application of the rule.
There's some reason to think that kids do this at the level of syntax as well: sometimes they appear to overapply syntactic structures, assuming that these structures involve a more general category than they actually do. Melissa Bowerman (1982) documented some examples of errors like these:
Don't giggle me! (Meaning "Don't make me giggle!")
You can push her mouth open to drink her. (Meaning "You can make her drink.")
She falled me. (Meaning "She made me fall.")
These seem like fanciful inventions that stray far away from the grammar of actual English, until you realize that there are examples like this in adult English:
Don't melt the butter. (Meaning "Don't make the butter melt.")
You can bounce the ball. (Meaning "You can make the ball bounce.")
You broke it. (Meaning "You made it break.")
All of a sudden, it looks less like the kids are just pulling syntax out of their ear, and more like they might actually be on to something—perhaps they're noticing that verbs that appear in the intransitive form (with just a subject) can also appear in the transitive form (with a subject and an object), and that when they do, the object in the transitive form takes on the same role in the event that the subject takes in the intransitive form:
intransitive verbs Verbs that take a subject but no object, such as (Joe) sneezes or (Keesha) laughs.
transitive verbs Verbs that take both a subject and an object, such as (Joe) kicks (the ball) or (Keesha) eats (popcorn).
The butter melted.    You melted the butter.
The ball bounced.    The girl bounced the ball.
The glass broke.    She broke the glass.
Unfortunately, the children abstracted a pattern that is slightly too general (see Table 6.3). It turns out that for English, this equivalency in syntax can't be applied willy-nilly to all verbs. Rather, it's restricted to particular classes of verbs that tend to signal a change of state or location. As it happens, some languages can apply a rule like this to more or less all verbs, so it's not a crazy hypothesis for the kids to make. Like the overgeneralization of regular past-tense forms, it might be seen as evidence that kids are learning to syntactically manipulate abstract categories like verbs and noun phrases, in much the same way as our phrase structure rules in Section 6.1 did.
Children as cautious learners of syntax
Some researchers have argued that overextensions like She falled me actually play a fairly marginal role in children's very early syntax. On the contrary, these scientists suggest, when children first start to combine words in short sentences, they display a startling lack of adventurousness in generalizing patterns to new forms. Brian MacWhinney (1975) proposed that when children first start to combine words, they're essentially producing non-compositional sequences whose complex meanings they've memorized, much like we adults can produce idiomatic phrases like kick the bucket or cut the apron strings. For example, when a child first produces a two-word sequence like nice kitty, this doesn't mean she's figured out how to combine adjectives with nouns—all she knows is that the phrase nice kitty refers to a particular kind of cat, presumably one that doesn't scratch or hiss. Eventually, she may learn that nice can also combine with other nouns that refer to animals or people (nice doggie, nice baby, nice daddy, etc.). At this stage, her template may look something like: nice + animate entity, with a more general understanding of what the word nice contributes, aside from not scratching. A little later, she encounters nice dress or nice picture and extends her template to nice + object. At this point, she may also be picking up the sequences good doggie, pretty dress, and pretty flower. Like nice baby, these might start out as memorized sequences, but as she extends the templates for good and pretty, she may notice the overlap between things that can combine with nice and things that can combine with good and pretty; this may accelerate her generalization about the slots that go with these words. Eventually, she'll be nudged into adopting the fully abstract sequence adjective + noun, based on pairings like nice idea, good kiss, pretty dance, and so on.
TABLE 6.3 Some examples of children's creative overgeneralizations
Child's utterance    Speaker's age (years; months)
From Bowerman, M. 1988 in J.A. Hawkins (ed.), Explaining language universals. Blackwell, Oxford, UK.
I said her no.    3;1
Don't say me that or you'll make me cry.    2;
I want Daddy choose me what to have.    2;6
Button me the rest.    3;4
I'll brush him his hair.    2;3
Do you want to see us disappear our heads?    6+
I don't want any more grapes. They just cough me.    2;8+
I want to comfortable you.    5;9
It always sweats me. (Said while refusing a sweater.)    4;3
If you don't put them in for a very long time, they won't get staled. (Referring to crackers in a bread box.)    3;6
Mommy will get lightninged. (Meaning "struck by lightning.")    5;2
Can I fill some salt into the bear? (Referring to a bear-shaped salt shaker.)    5;0
I'm gonna cover a screen over me.    4;5
He's gonna die you, David. (Then mother is addressed.) The tiger will come and eat David and then he will be died and I won't have a brother any more.    4+
From C. Hudson Kam (unpublished observations of her son)
Why did you fall me off, Daddy?    2;10
I was just getting you afraid. Did I afraid you?    3;3
I wuv cheesecake! It bounces me off walls!    3;4
This is colding me. (Said while waving hands in front of face.)    3;4
I am running while fastening my legs. (Meaning "making them go faster than they are supposed to go.")    3;6
She deadened him. (Meaning "she killed him.")    3;7
As evidence for this view, MacWhinney pointed to the fact that children undergeneralize many forms. Detecting undergeneralizations is more involved than spotting overgeneralizations; the latter jump out as errors, but evidence of undergeneralization has to be found in what children don't say. This requires combing through a large sample of the child's speech to see if an adjective like nice only ever combines with one or a few nouns, or whether it gets attached to a large variety of them—if you look at too small a sample, there may be too few occurrences of nice to be able to draw any conclusions about what it can combine with (see Method 6.1 later in this chapter). When MacWhinney analyzed 11,077 utterances produced by two children, he found that between 85% and 96% of them were produced by a set of 40 templates that were anchored to very specific lexical items.
Along similar lines, Michael Tomasello (1992) documented his own daughter's early speech and discovered that any particular verb that entered her vocabulary was often produced in only one syntactic frame for quite some time afterward, rather than being incorporated into other possible syntactic frames with verb slots. This happened even though different verbs might enter her usage as part of different frames. So, a verb like break might appear along with a subject and object, as in Mommy break cup, while cut might appear only with an object, as in cut string. His daughter rarely produced strings like break dolly or Daddy cut meat. In other words, break and cut were treated as if each verb had its own construction; the verbs were clearly not considered interchangeable. This led Tomasello to suggest that it takes children a fairly long time to integrate verbs into a full system of general rules. If they did have such a system, he argued, new verbs would appear in multiple syntactic frames, wherever verbs in general are allowed to appear, much as is the case whenever you're introduced to a new adjective like avuncular or brillig.
Tomasello and his colleagues have since conducted a variety of experiments showing that when it comes to verbs, at least, children are conservative learners rather than eager generalizers. In a 1997 study, they compared how readily children generalized nouns and verbs by repeatedly exposing toddlers (average of 2 years and 10 months) to novel words over a period of 10 days. For instance, children saw pictures accompanied by comments such as "Look! The wug!" or "Look what Ernie's doing to Big Bird. It's called meeking!" The children easily learned the meanings of both the nouns and verbs, and used them regularly in their own speech. However, while the nouns tended to occur in lots of different syntactic environments (I see wug; I want my wug; Wug did it), the children hardly ever combined the new verbs with any other words at all—and when they did, they almost always used the same syntactic frame that they had heard the experimenter use.
In another study by Brooks and Tomasello (1999), children of a similar age heard sentences that appeared either in the active form (The cat is gorping Bert) or in the passive form (Ernie is getting meeked by the dog). They were then asked a question that nudged them toward producing an active form: What is the cat doing? (Or, if appropriate, What is the dog doing?) The children almost always produced the new verb in an active transitive form if that's what they'd heard earlier, but if they had been introduced to the verb in the passive form, they used the active form only 28 percent of the time, despite the fact that this particular syntactic frame is the most frequent frame to be found in English. The puzzle, according to these researchers, is not why children make errors of overgeneralization, like those documented by Melissa Bowerman. Rather, we might wonder why they seem to generalize so little.
Like MacWhinney, Tomasello and his colleagues have proposed that children are reluctant to generalize because their early structure doesn't involve general rules at all. Instead, it involves snippets of verb-specific structures that allow for nouns to be slotted in like this:
[N (hitter)] hit [N (hittee)]
[N (thing broken)] break
That is, kids don't have general, broad concepts such as NP subjects or objects. Instead, they have much narrower concepts that specify the roles that nouns play in highly particular syntactic frames involving individual words—verb islands, to use Tomasello's terminology. If we were to ask whether children's early structures involve memorized words or general rules, the answer might be that they're hybrids. Not rules, but little "rulelets"—or constructions that mix abstract and specific knowledge.
verb islands Hypothetical syntactic frames that are particular to specific verbs, and that specify (1) whether that verb can combine with nouns to its left or right and/or (2) the roles that the co-occurring nouns can play in an event (e.g., the do-er, the thing that is acted upon, and so on).
But the fact that children are shy about generalizing verbs in their own speech isn't necessarily a knockdown argument that they're unable to form more general rules or patterns. True, clear evidence of generalization or (especially) overgeneralization would provide some solid clues that chi