
3.3 How could the problem be solved?

3.3.1 Prosodic bootstrapping

The rhythmic structures, or prosody, of language have been argued to play a fundamental role in first language acquisition, perhaps helping children to crack the code and solve the segmentation problems we raised above, as in (24).

24. Prosodic Bootstrapping Hypothesis

Prosodic units which are acoustically signaled in the speech stream may provide critical perceptual cues by which children first discover the existence of linguistic units.15

Could such cues derive from a general perceptual capacity, e.g., one common to music, and/or from other more general cognitive abilities? Certain prosodic features may correlate with constituent structure in language. For example, several acoustic features may mark the end of a unit of speech: (a) lengthening of the terminal segment; (b) fall in fundamental frequency; and (c) decrease in amplitude. Such prosodic markers can aid adults in the acquisition of an artificial language (Morgan, Meier and Newport 1987).16
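The three cues just listed can be made concrete with a small sketch. It is an illustration only, not a model proposed in the literature cited here: the `Syllable` record, the 1.3 lengthening ratio, and all of the measurements are invented assumptions. The code simply counts how many of the cues (a) to (c) are present at each syllable relative to the one before it.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    """Hypothetical per-syllable measurements (all values are invented)."""
    duration_ms: float
    f0_hz: float
    amplitude_db: float


def boundary_cue_count(prev, cur, lengthening_ratio=1.3):
    """Count how many of the three end-of-unit cues appear at `cur`.

    The lengthening_ratio threshold is an arbitrary illustrative choice.
    """
    cues = 0
    if cur.duration_ms >= lengthening_ratio * prev.duration_ms:
        cues += 1  # (a) lengthening of the terminal segment
    if cur.f0_hz < prev.f0_hz:
        cues += 1  # (b) fall in fundamental frequency
    if cur.amplitude_db < prev.amplitude_db:
        cues += 1  # (c) decrease in amplitude
    return cues


def candidate_boundaries(syllables, min_cues=3):
    """Return indices of syllables showing at least `min_cues` cues."""
    return [
        i
        for i in range(1, len(syllables))
        if boundary_cue_count(syllables[i - 1], syllables[i]) >= min_cues
    ]


# Invented five-syllable stream; syllable 2 is long, low and quiet.
stream = [
    Syllable(180, 220, 65),
    Syllable(170, 225, 66),
    Syllable(260, 180, 60),
    Syllable(175, 230, 66),
    Syllable(180, 228, 65),
]

strict = candidate_boundaries(stream, min_cues=3)
lenient = candidate_boundaries(stream, min_cues=2)
```

With these invented values, requiring all three cues picks out only syllable 2, while requiring two cues also picks out syllable 4, where pitch and amplitude happen to dip slightly. The sketch assumes that syllables are already segmented and measured, which is itself part of what the infant would have to achieve.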

Very young infants have demonstrated sensitivity to musical phrase structure. Presented with tapes of Mozart minuets, 4.5-month-old infants distinguished those sequences which were broken at well-formed musical phrase boundaries from those that were broken unnaturally within the phrase, listening longer to the well-formed musical phrases.17 Rhythm perception has been attested in early infancy,18 as well as numerous prosodic sensitivities (see chapter 8). Morgan 1996 argues for a “rhythmic bias in preverbal speech segmentation.” Mehler et al. 1996 and Ramus, Nespor and Mehler 1999 propose that classifiable rhythmic properties of speech ground infants’ first processing and representation of language.

In general, prosodic information in a language does not provide one-to-one mapping to linguistic units so that infants could directly and systematically infer linguistic structure from prosodic structure (see Lieberman 1996). This is because “many of the same acoustic changes that frequently coincide with important syntactic units in speech also occur in utterances for nonsyntactic reasons” (Jusczyk 1997a, 141). If listeners “were to rely on any one of these cues for information about grammatical units, they still would need some other mechanism to let them know when the cues were actually relevant to syntactic matters” (Jusczyk 1997a, 141). If a form of mapping from prosodic units to linguistic units is to be achieved, then it appears that the linguistic units must already be known; “language learners would already need to have a tacit understanding of the relation between prosodic cues and syntactic structure in order to reconstruct the speaker’s intended bracketing” (Gerken 1996, 347).

15 See Morgan and Demuth 1996, and Jusczyk and Kemler-Nelson 1997a for reviews; Gleitman and Wanner 1982, 26; Allen and Hawkins 1980.

16 See Jusczyk 1997, 140, for review; Jusczyk et al. 1992 for experimental study. See Lieberman 1996 for review of language intonation and perception; Inkelas and Zec 1990 for cross-linguistic work; Nespor and Vogel 1986; Couper-Kuhlen 1993 on English speech rhythm. For general review combining adult prosody and child acquisition issues see Gerken 1996; Fowler 1977 for example of psycholinguistic work on speech production and serial ordering; Cooper and Paccia-Cooper 1980 for example of discussion of the relation between temporal effects and syntactic phrase structure and related psycholinguistic studies.

17 Jusczyk and Krumhansl 1993; Krumhansl and Jusczyk 1990. Current research attempts to distinguish the precise acoustic cues which determine this effect; e.g., Jusczyk 1997, 146–147.

18 Demorny and McKenzie 1977.

Language-specific variation in prosodic units must be acquired. Tones, accents and stress vary from language to language. Stress may fall on inflection or on word stems. While English rhythmic structure involves alternating stresses on syllables which contrast by being either strong or weak, French rhythmic structure is “syllable-timed,” where syllables tend to have equal timing. Knowledge of language-specific timing acquired in a first, native language may persist in second-language acquisition.19

Children’s control of prosody advances during the first twelve months, when integration of syntactic and prosodic units develops on the basis of specific language experience (Morgan 1994, 402; Jusczyk 1997a). As we will see, there is empirical evidence that children make use of prosodic factors in language acquisition, but prosody must integrate with other linguistic knowledge in order to be effective. “[Y]oung language learners use prosodic information to discover prosodic structure, not syntactic structure” (Gerken 1996, 348).

3.3.2 Phonological bootstrapping

The “prosodic bootstrapping hypothesis,” in its strong form, is actually a misnomer. “Most proponents of this view assume that learners are drawing on a range of information available in the speech signal that extends beyond prosody” (Jusczyk 1997a, 38). What has been described as a “prosodic bias” in language acquisition might be best categorized as a prosodic factor.

“Prosodic bootstrapping” may be reformulated as “phonological bootstrapping,” wherein it is recognized that “several forms of information are available in input speech – phonetic, phonotactic, prosodic, stochastic – and any or all of these could contribute to syntactically rich representations of input utterances” (Morgan and Demuth 1996, 2). Phonological bootstrapping reflects a type of potential “linguistic bootstrapping” wherein one form of linguistic knowledge may interact with and aid another (cf. chapter 11).

3.3.3 Semantic bootstrapping

Perhaps if language acquisition is not perceptually based, it is conceptually based. Could children first determine what the meaning of language is, and use this external knowledge to begin to crack the code of language? Could this meaning aid children in discovering the formal linguistic units and system? Could language acquisition initially result from a unidirectional mapping from meaning to form? Various forms of this hypothesis have been posed under the term “semantic bootstrapping,” e.g., (25) or (26). Is language acquisition based initially on children’s ability to observe the world around them, and on the basis of this context, to induce meanings of words as well as grammatical knowledge?

19 Cutler, Mehler, Norris and Segui 1986, 1992; Cutler 1994, 80.

25. “The beginnings of language are learned ostensively. The needed stimuli are right out there in front, and mystery is at a minimum . . . Language bypasses the idea and homes on the object . . . we learn the language by relating its terms to the observations that elicit them.” (Quine 1973, 35–37)

26. Semantic Bootstrapping Hypothesis

Initially, children do not have access to language form, but do have access to extra-linguistic forms of meaning. On the basis of these meanings, children “bootstrap” to formal knowledge of language, i.e., to its forms and its units.20

Under this hypothesis, children first observe “real world” situations and then use these observations to formulate word meanings and aspects of grammatical structure. Hearing the word “dog” in the context of dogs, or the word “push” in the context of pushing, will lead children to “induce” the meanings of these words; they will extract the relevant regularities from the contexts observed. In its strong form, this hypothesis, like the prosodic one, cannot provide a complete explanation for the language acquisition problem (cf. chapter 11).

3.3.3.1 Meaning is limitless

There is a limitless set of possible meanings in any particular context. Before knowing a language, how could a language learner possibly determine what meanings to assign?

3.3.3.2 Reference is inscrutable

The philosopher Quine explicated the problem here with a famous example (1960, 52), where the linguist goes out to the jungle to determine an unknown language. “A rabbit scurries by, the native says ‘gavagai’, and the linguist notes down the sentence ‘rabbit’ (or ‘lo, a rabbit’), as tentative translation” (27), which he then subjects to further tests by asking the native to assent or dissent to possible translations. Any of an infinite set of possible meanings could exist and they would all trigger the native’s assent under similar situations:

27. Quine’s reference to “gavagai”:

stages or temporal segments of rabbit
an integral part of a rabbit
all and sundry undetached parts of rabbits
rabbit fusion of parts
the concept of rabbithood
the place where rabbithood is manifested
whole enduring “rabbit”

20See Grimshaw 1981 and Macnamara 1982 for early versions; Pinker 1987, 1984, 1989 for overview and summary; Bloom 1999.


Only on the assumption that the native performs the same translation as English could a shared meaning of “gavagai” be determined, e.g., “whole enduring ‘rabbit,’” if this is in fact what the native had in mind. Quine terms this general problem “indeterminacy of translation.” But this is what we are trying to explain: how do children acquire the correct translations for English, or any language? Children must face the problem of “inscrutability of reference”: the relation between the word and the thing it labels is complex, non-direct and indeterminate (Quine 1960, 80; 1971, 142).

Is it possible that, for children, pointing or gestural reference (i.e., a form of manual ostension) could determine reference and meaning? As Quine points out, “Point to a rabbit and you have pointed to a stage of a rabbit, to an integral part of a rabbit, to the rabbit fusion, and to where rabbithood is manifested” (52). Pointing alone does not in itself resolve the problem.21

If meaning is so indeterminate, how could it possibly be the source of children’s ability to crack the code of the PLD? Even if linguistic units are already known, e.g., the word or sentence, the problem of determining meaning remains. “The difficulty is that neither words nor sentences, nor even propositions, are in any direct way encodings of scenes or situations in the world” (Gleitman and Wanner 1982, 9). When someone says “The cat is on the mat,” (28) gives only a few of the possible meanings which this utterance could have.22

28. Interpreting context

A mat is supporting a cat
A mat is under the cat
The cat is ruining the mat
The floor is supporting the mat and the cat
The cat is sleeping
The cat has come in again
What a good bed that mat makes for the cat

How could children know which interpretation to choose?

In a large corpus of mother utterances to 13–23 month olds, out of 8,000 utterances which contained a verb, 3,000 did not refer to an ongoing event (Beckwith, Tinker and Bloom 1989). For the verb “open,” only 37.5 percent of utterances actually involved the “here and now” (Gleitman 1994). Parents tend to use verbs in contexts like “Put it in here” or “Do you want to roll it to me,” which are nonostensive (Tomasello, Strosberg and Akhtar 1996, 158).

In many adult utterances, the same event can be described by different verbs, e.g., “chase” or “flee” (Gleitman 1994). Many verbs cannot be based on observation, e.g., “think” (Gleitman 1994, 188). Even when contexts are ostensive in some way, the context in itself does not reveal word meaning. Adults were shown videotapes of mothers playing with their young infants (aged about 18 months), with audio eliminated (Gleitman and Gillette 1999, 279). Even with leading information (that the mother is uttering a “target noun” when a beep sounds), adults were only correct in guessing this referent about 50 percent of the time on the basis of the observational context alone. Fewer than 10 percent of the verbs were identified correctly, even though only frequent child-directed verbs were used and the adults knew these in advance (281). Thus, even for an adult given leading cues, there is no determinate 1:1 mapping between observational context and word units.

21 Blake 2000 provides discussion; Bruner 1974/64 is a classic study; Schick 2000 provides empirical study of pointing in hearing and deaf children.

22 See Gleitman and Wanner 1982; Gleitman and Gleitman 1994; Gleitman and Gillette 1999.

Pinker (1984) suggests that a probabilistic word–world mapping would be possible for word learning if we assume children compute over several cross-situational contexts. Some such computation must be involved in the acquisition of concepts and word meaning. Consider the task of determining where the bull’s eye is from a target punctured by arrow holes. One or two holes are not highly informative, but multiple holes may be.23 However, the availability of multiple contexts does not in itself solve the essential problem we have raised here. As in the bull’s eye example, if multiple contexts are to become informative for children, we must posit some form of hypothesis of what is being looked for, i.e., the bull’s eye, and analysis of what is alike about these contexts so they can be compared (cf. abduction in chapter 4). How would children attain this initial hypothesis?
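A minimal sketch can make the cross-situational idea, and its limits, concrete. It is not Pinker’s proposal or any published model: the function names and the three invented “gavagai” situations are illustrative assumptions, and the learner simply counts how often a word co-occurs with each candidate referent across situations.

```python
from collections import Counter, defaultdict


def cross_situational_counts(situations):
    """Tally word/referent co-occurrences across situations.

    Each situation is a pair (words heard, candidate referents present).
    The candidate referents are supplied ready-made, which is an assumption.
    """
    counts = defaultdict(Counter)
    for words, referents in situations:
        for word in words:
            for referent in set(referents):
                counts[word][referent] += 1
    return counts


def best_guess(counts, word):
    """Return the referent(s) that co-occurred most often with `word`."""
    if word not in counts or not counts[word]:
        return []
    top = max(counts[word].values())
    return sorted(r for r, n in counts[word].items() if n == top)


# Invented situations in which "gavagai" is heard.
situations = [
    (["gavagai"], ["rabbit", "grass", "tree"]),
    (["gavagai"], ["rabbit", "river"]),
    (["gavagai"], ["rabbit", "grass", "sky"]),
]

after_one = best_guess(cross_situational_counts(situations[:1]), "gavagai")
after_three = best_guess(cross_situational_counts(situations), "gavagai")
```

After one situation the three candidates are tied; after three, “rabbit” alone remains. As with the arrow holes, more contexts help, but only because each situation was handed to the learner as a short list of whole-object candidates. The sketch therefore presupposes the very hypothesis space whose origin is in question: nothing in the counts distinguishes “rabbit” from “undetached rabbit parts.”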

Children’s fast mapping of new word meanings (cf. chapter 10) also appears to challenge the degree to which cross-situational comparisons are necessary for initial acquisition.

3.3.3.3 Learning in non-ostensive contexts

Children do not depend on ostensive contexts for early word learning. They learn new words by overhearing them as well (Akhtar, Jipson and Callanan 2001). In some cases, young children learned a novel verb best when it was “said in anticipation of an impending event or action” (Tomasello 1995). Infants as young as 18 months were introduced to a new word for a new toy. In an “ostensive condition,” the toy was immediately found by an adult as the word was introduced. In a “non-ostensive context,” the adult first found an incorrect toy, frowned at it and replaced it, only eventually finding the correct toy. Children learned the new word equally well either way when tested in either comprehension (asked to select the toy) or production (asked to name the toy). Children learned a new word for an adult’s intended referent which wasn’t seen at all until a later comprehension test.24 Joint attention between child and caregiver does not determine or explain infant word learning (Carpenter, Nagell and Tomasello 1998) (chapter 10).

3.3.3.4 Learning in the blind child

Children’s lack of direct dependence on ostensive contexts for word learning is revealed also in the congenitally blind child. The blind child’s understanding of “look” and “see” develops similarly to that of the sighted child (Landau and Gleitman 1985). Asked to make it so Mommy cannot “see” a toy, for example, a blind child put the toy in her pocket. Asked to “look behind you,” she explored the area behind her with her hands (see also Gleitman and Gillette 1999, 283). The blind child’s acquisition of reversible pronouns (I, you) was found to be no later than the sighted child’s. Even color terms are acquired. This indirect relation between language, meaning and referential context forms the basis for the development of an alternative “syntactic bootstrapping hypothesis” of acquisition of word meaning (see chapter 11).

23 Suggested by Neil Smith.

24 Tomasello, Strosberg and Akhtar 1996; Akhtar and Tomasello 1996; Tomasello and Barton 1994; Baldwin 1993b; Baldwin et al. 1996; Tomasello and Kruger 1992.

3.3.3.5 Ontological primitives

Are human beings born predetermined to realize that certain meanings are possible and others not? Something like this must be true to some degree,25 but it cannot alone solve the language acquisition challenge for children in the Initial State. We might consider certain concepts unnamable, e.g., a certain “arrangement of leaves on a tree,” but as Chomsky (1975b, 44) argues, an artist could develop such a reference in his work, using it to connote, for example, serenity. It is not clear how and where meaning units can be presumed, nor which lexical mappings can be presumed. The term “concept” is notoriously difficult to define. It does not provide us with a priori units which link to the lexicon. Attaining a concept would not in itself fully determine the acquisition of word meanings (e.g., chapters 2 and 10). Languages differ in concept lexicalization. Consider words which we might consider to be among the most important possible meanings: “hope” or “love” or “tomorrow.” Sinhala has no word for “hope.” There is no direct way in Sinhala to say “I hope this book is in the library” (meaning “it is good to me if it were there”). Malayalam does not have a word closely corresponding to the English “love,” but distinguishes types of love. In some African languages, the verb “defeat” is used to capture the meaning of “is bigger than.” In many languages, the verb “smoke” is equivalent to the verb “drink” (Heine 1992). Children cannot assume, on the basis of general non-linguistic cognition alone, which concepts have been lexicalized in a language or how. Again, paradoxically, it appears that children must know the language in order to know the meaning.

3.3.3.6 Meaning is necessary

There may be something innate about the human assumption of a relation between linguistic form and meaning.26 The evidence above suggests only that meaning in itself cannot be a unique and independent first step, independent of linguistic knowledge, which can solve the essential language acquisition problem for children in the Initial State. A relation between external context and linguistic form will not work unaided (Gleitman 1994, 188). In order to make specific forms of semantic bootstrapping work, the formal linguistic categories which are mapped to meaning categories would presumably have to be made available to children so that they could map to them. As Pinker (1987, 1984; 1989a, 361) suggests, “the syntactic nature of the rules acquired is the basis for the real power of the bootstrapping theory” (1987, 409) (cf. chapter 11).

Fig. 3.1 The child. [Figure labels: “Malayalam? Chinese? Sinhala? English? Spanish? Welsh?”; “continuous speech stream”; “digital units of language knowledge.”]

25 Cf. Chomsky, 1975b, 44 and fn. 15; see also Fodor 1975 on a “language of thought.”

26 Blake and Fink 1987, 229 provide argument that infant babbling may involve “sound–meaning relations.”