The nature of the evidence: searching the speech stream for the units

12.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]

As in examples (5) and (8), children must discover the sounds (which combine into words), the words (which combine into phrases), the phrases (e.g., subjects and predicates which combine to form clauses), and the clauses and clause combinations (which form sentences) (chapter 2).

3.1.1.7 Summary

While both positive and negative evidence appear to be necessary for children to acquire a language, neither appears to be directly available to them.

3.2 The nature of the evidence: searching the speech stream for the units

34 c h i l d l a n g ua g e

when and where elision occurs, a problem which baffles most second language learners and linguists alike.

3.2.1.2 Finding the word units

Words are more difficult to perceive and are less clearly articulated in fluent speech than when isolated (Lieberman 1963; Pollack and Pickett 1964). A sentence of only about seven words can result in “millions of alternative possible word strings” (Jusczyk, Cutler and Redanz 1993; Klatt 1989). Parents do not first present all words individually to their children (Aslin 1993; Brent and Siskind 2001; cf. chapter 6).
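The scale of this ambiguity can be illustrated with a minimal sketch. It assumes a toy lexicon and an unsegmented string of letters standing in for the speech stream; both are hypothetical and chosen only for illustration, since real speech is not a string of letters. The function `count_segmentations` (a name introduced here, not from the cited studies) counts how many ways the string can be divided into known words.

```python
from functools import lru_cache


def count_segmentations(stream, lexicon):
    """Count the ways `stream` can be split into words drawn from `lexicon`."""
    words = frozenset(lexicon)

    @lru_cache(maxsize=None)
    def count_from(start):
        # Reaching the end of the stream completes exactly one segmentation.
        if start == len(stream):
            return 1
        total = 0
        # Try every candidate word that begins at position `start`.
        for end in range(start + 1, len(stream) + 1):
            if stream[start:end] in words:
                total += count_from(end)
        return total

    return count_from(0)


# Hypothetical toy lexicon (an assumption for illustration only).
TOY_LEXICON = {"a", "an", "ice", "nice", "cream"}

# "a nice cream" and "an ice cream" are both possible parses.
print(count_segmentations("anicecream", TOY_LEXICON))
```

Even with five words and ten letters there are two parses; with a realistic lexicon and a longer utterance the count grows rapidly, which is the point made above about alternative word strings.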

3.2.1.3 Finding the sound units: the nature of speech perception

The linguistic units of sound which underlie words do not exist in the speech stream. Table 3.1 provides a list of critical results from the study of adult speech perception showing this. The continuous speech stream underdetermines the discovery of digital sound units (e.g., i and ii in table 3.1), much as a motion picture does not reveal the individual images which compose it.

3.2.1.4 Opacity of the speech stream

In normal speech, the same sound is heard as different in certain cases (15a); in other cases, different sounds are heard as the same (15b), corresponding to iii and iv in table 3.1. Sound units may be null; e.g., in English pronunciation of “pants,” the /t/ may have no phonetic realization.

15. Opacity of the speech stream

a. [ ] b. [ ] [ ]

/ \ \ /

[ ] [ ] [ ]

Given the facts summarized in table 3.1, there are not “criterial” acoustic invariants in the speech stream which regularly and necessarily correspond to the sound units which children must discover. Children make a fundamental conversion from a continuous speech stream to a discontinuous (digital) representation, schematized in (12). The evidence for the units is not direct, concrete or regular.

We do not perceive speech by analyzing individual segments sequentially, like beads on a string. How then do we perceive and understand speech?⁸ Even more puzzling is the question: how can and do children come to discover the relevant

8 These issues continue to pose a challenge to theories of speech perception. See Akmajian, Demers, Farmer and Harnish 2001 and Matthei and Roeper 1983 for general introduction; and Klatt 1989, Remez et al. 1994 for overviews. Liberman 1996 provides a solution in terms of a “motor theory of speech perception.”

Table 3.1 Critical results from the study of speech perception: opacity of the speech stream

i. The rate of transmission of relevant information in the speech stream is very fast (Liberman 1996, 32; Miller 1981, 75). 20–30 sound segments per second are possible (Liberman 1996), a rate faster than that at which we can reliably identify individual sounds in a sequence, i.e., 7–9 per second (Liberman 1970).

ii. Coarticulation. In production, “coarticulation folds information about several successive segments into the same stretch of sound” (Liberman 1996, 33). Coarticulation is necessary; consonants cannot be identified without adjacent vowels, for example (e.g., Delattre, Liberman and Cooper 1955; cf. Jusczyk 1997; Liberman 1996, 33). When sounds combine into larger units such as syllables or words, e.g., b – a – t [b a t], “the acoustic cues that characterize the initial and final consonants are transmitted in the time slot that would have been necessary to transmit a single isolated vowel” (Liberman 1996, 207, 223), reflecting what has been termed “parallel transmission” of the information regarding individual units and coarticulation of the combined units.

iii. The same phone (sound unit) may take on different properties in different environments. The same sound can be perceived differently depending on its context; e.g., [p] when clipped from [pi] and inserted before [a], as in [pa], is heard as [ka]. The same [p] when inserted before [u], as in [pu], is heard as [pu]. Similarly, silence (75 msec of blank tape) inserted in “s#lit” is heard as “split”; inserted in “s#ore,” it is heard as “store” (Cooper, Delattre, Liberman, Borst and Gerstman 1952; Matthei and Roeper 1983; Akmajian et al. 1995, 407).

iv. Different phones (sound units) may appear the same in different environments. Adult speakers judge sounds to be identical which are distinct phonetically. For example, in many American English dialects, the unit [t] “has as many as eight distinct pronunciations,” one of which may be complete silence (Kenstowicz 1994, 65). In many American English dialects, the /t/ in “write” and the /d/ in “ride” will appear as the same sound, a “flap,” in “writer” or “rider.”

units when cracking the code from the speech stream for the first time, without knowing a language? While adults can test hypotheses regarding the specific language they know, and can search the speech stream for cues to these relevant units, infants in the Initial State do not yet know a specific language and must discover these units when they do not actually exist directly in the data which they experience.

ACOUSTIC CUE: Some property of the physical embodiment of language may correlate with linguistic units, e.g., loudness or stress. For example, vowels are differentiated in terms of their formant frequencies (involving the rate of variation in air pressure, which corresponds to the shape and use of the vocal tract) (Ladefoged 1993, 2001).

3.2.1.5 Finding the cues

Cues lie in the speech stream; otherwise we could not accomplish a mapping from sounds to language. Yet how are children to know what constitutes a cue and which cues to use? The cues that indicate unit boundaries in different languages are “apt to be closely tuned to the underlying organization of the sound patterns for a particular language” and differ from language to language.

“Consequently, among the things that one has to learn in order to speak and understand a native language is what the correct cues are for segmenting words from fluent speech in that language” (Jusczyk 1997a, 5).⁹

3.2.1.6 Confounding the search

Every time a word is uttered, e.g., “hat,” it differs physically (acoustically). Variations in different speakers, genders and ages, amplitudes and tones add more variability. Whether the story in (4) or (7) is read by a man, woman or another child, in a soft or loud voice, in a lullaby or story-reading context, with varying intonation, the same segmentation must be captured. The same units must be discovered.

3.2.2 The linguistic evidence

3.2.2.1 Where are the words?

The same units which may be words in one language may be parts of words or multiple words in another. The Arctic Inuktitut language in (16) comes from the natural speech of a two-year-old; (17), from experimental studies.

In Inuktitut and other polysynthetic languages like it, the word, e.g., the verb in (16) or (17), is morphologically complex, capturing the information which might be represented primarily by isolated word units in a language like English.

How are children in the Initial State to know which form of units to be searching for?¹⁰

16. tamaaniiqujinngitualu

ta – ma -ani -it -qu -ji -nngit -juq -aluk

PRE -here -LOC -be -want -ANTP -NEG -PAR.3sS -EMPH

He doesn’t want (me) to be here (Juupi 2.0; Allen 1994, 133).

17. Nattirmik qungutuqturmik quqhugturmik tikkuarit!

Nattiq – mik qungutut- jug – mik quqhuqtuq – mik tikkuag – nit

Seal–INST.sg smile -NOM-INST.sg yellow -INST.sg Point.to-IMP.2sS

Point to the smiling yellow seal (Inuktitut, Parkinson 1999, p. 312).

9 “Cue” is a “term of convenience, useful for the purpose of referring to any piece of signal that has been found by experiment to have an effect on perception . . . any definition of an acoustic cue is always to some extent arbitrary” (Liberman 1996, 22).

10 Current research compares acquisition of Inuktitut and English: Allen 1996; Allen and Crago 1993a, b; Fortescue and Olsen 1992; Fortescue 1984/5; Parkinson 1999; Mithun 1989; Pye 1980, 1992.

3.2.2.2 Where are the sounds?

Children must be able to perceive the same sounds but categorize them differently, or perceive different sounds and categorize them similarly, depending on the “configuration” or “system” of the language being acquired.

Variations in aspiration [+h or −h] occur in English, e.g., distinguishing the acoustic properties of the [k] in the beginnings of words [+h] from those in the middle of words [−h], as (18a) and (18b) exemplify. (Here ‘h’ signifies aspiration.) We recognize a /k/ in each word, regardless of whether the sound involves aspiration. However, in Hindi, an aspiration distinction is linguistically significant (“contrastive”) as in (19a) and (19b); new words result from this difference.

In acquiring Hindi or English, children must consult acoustic variation in aspiration and categorize it differently, depending on the system (phonology) of the language.

18. a. k^hit b. skit

19. a. kal – yesterday, tomorrow b. k^hal – rogue

Children must discover a unit which categorizes all variations of a sound which are similarly significant in a language. The unit to be acquired, traditionally called a “phoneme,”¹¹ is not a physical but a cognitive unit. It is not a sound, but an abstract category of potential sounds.¹²
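The idea that the same phones are grouped differently by different systems can be sketched as a pair of lookup tables. This is a minimal illustration built only from examples (18) and (19); the table contents, the ASCII spellings ("kh" for aspirated [k^h], "k" for unaspirated [k]) and the function names are assumptions made for the sketch, not a full phonology of either language.

```python
# Hypothetical phone-to-phoneme tables for the velar stops in (18) and (19).
# "kh" stands for aspirated [k^h]; "k" stands for unaspirated [k].
PHONEME_TABLES = {
    # English: aspiration is not contrastive, so both phones are one category.
    "English": {"kh": "/k/", "k": "/k/"},
    # Hindi: aspiration is contrastive, so the phones are separate categories.
    "Hindi": {"kh": "/kh/", "k": "/k/"},
}


def categorize(phone, language):
    """Return the phoneme category that `language` assigns to `phone`."""
    return PHONEME_TABLES[language][phone]


def same_phoneme(phone_a, phone_b, language):
    """True if the two phones count as the same unit in `language`."""
    return categorize(phone_a, language) == categorize(phone_b, language)


print(same_phoneme("kh", "k", "English"))  # the [k^h] of "kit" and the [k] of "skit"
print(same_phoneme("kh", "k", "Hindi"))    # the [k] of "kal" and the [k^h] of "khal"
```

The acoustic input is identical in both calls; only the table, standing in for the language's system, differs. The child's problem is that the table is not given in advance.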

3.2.2.3 Discovering the system

Children cannot know a priori which sound variability is phonemic or significant in the system of their language. The number and nature of phonemes vary widely across languages, from eleven to over 100. English is generally thought to have thirty-five to forty-five, while Rotokas (Papua New Guinea) has only eleven (five vowels, six consonants; Comrie, Matthews and Polinsky 1996).

Children must somehow discover how “sounds must be placed” in relation to each other according to “the inner configuration of the sound system of a language” (Sapir 1925, 25).

The child’s task is not discovery of physical entities but discovery of a linguistic system. The linguist Sapir explicated this fact about language knowledge long ago: “phonetic phenomena are not physical phenomena per se, however necessary it may be to get at the phonetic facts by way of their physical embodiment” (1925, 25).

11 The “phoneme” has been debated since its original discovery. Weisler and Milekic 2000, 41–44 introduce the concept.

12 Phonetic forms corresponding to particular sounds are annotated in brackets (e.g., [k]), while phonemes, corresponding to the abstract linguistic category, are annotated as /k/.


3.2.2.4 Making variability tractable: knowing the rules

We are not usually deceived by variability because we “know the rules” and “processes” which underlie speech sound alternations.¹³

PHONOLOGICAL PROCESSES AND RULES

A “phonological process” operates on sounds or features of sounds, changing them in certain ways, e.g., assimilating them to each other, substituting for them, deleting them. If such a process is regular, and generalizable, and we can specify the conditions (or contexts) under which it applies, we use the term “phonological rule.”

The sound assimilation rule in (20) is an example where the plural /s/ appears in several forms depending on context. (Here the notation ‘+/−V’ refers to whether or not the sound is voiced; cf. chapter 8.)

20. Assimilation of sound features in English plural rule

/s/

top [s]        bug [z]        dish [ɪz]

[-V] [-V]      [+V] [+V]
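The rule in (20) can be written out as a small decision procedure. The sketch below is a simplification under stated assumptions: the sound classes are hypothetical, reduced sets written in rough orthographic transcription ("sh" for the final sound of "dish"), "Iz" stands for the syllabic variant shown for "dish," and the function name `plural_allomorph` is introduced here for illustration.

```python
# Simplified, assumed sound classes in rough orthographic transcription.
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}  # hissing/hushing sounds
VOICELESS = {"p", "t", "k", "f", "th"}         # [-V] non-sibilant sounds


def plural_allomorph(final_sound):
    """Choose the surface form of plural /s/ from the noun's final sound."""
    if final_sound in SIBILANTS:
        return "Iz"  # dish -> dish[Iz]
    if final_sound in VOICELESS:
        return "s"   # top -> top[s]; voicelessness is shared
    return "z"       # bug -> bug[z]; voicing is shared


for noun, final in [("top", "p"), ("bug", "g"), ("dish", "sh")]:
    print(noun, plural_allomorph(final))
```

The order of the checks matters: sibilants are tested first because they take the syllabic form regardless of voicing. Because the rule refers only to the final sound, it applies to any new noun, including ones the speaker has never heard.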

In any language, we know these rules and/or processes tacitly, only occasionally becoming conscious of them. A French-speaking second language learner of English may be recognizable when pronouncing words like “ten” because they may not apply the aspiration rule, which produces [t^h] for /t/ word-initially in English.

21.

/t e n/

[t^h e n]

Phonological rules and processes provide a regular way of mapping from an “Underlying Representation” of what we know about the structure of the perceived word to the variable surface form.
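The mapping from underlying representation to surface form in (21) can be sketched as a function over a list of segments. This is a deliberately reduced version of the aspiration rule: it uses only the word-initial condition mentioned in the text, whereas the actual conditions in English also involve stress and syllable position. The segment spellings and the name `apply_aspiration` are assumptions made for the sketch.

```python
# Assumed set of segments that the simplified rule targets.
VOICELESS_STOPS = {"p", "t", "k"}


def apply_aspiration(underlying):
    """Map an underlying segment list to a surface list.

    Simplified rule: a voiceless stop is aspirated when it is the first
    segment of the word; every other segment surfaces unchanged.
    """
    surface = list(underlying)  # copy, so the underlying form is preserved
    if surface and surface[0] in VOICELESS_STOPS:
        surface[0] = surface[0] + "^h"
    return surface


print(apply_aspiration(["t", "e", "n"]))       # /t e n/ as in (21)
print(apply_aspiration(["s", "k", "i", "t"]))  # /k/ is not word-initial, as in (18b)
```

Keeping the underlying list untouched mirrors the distinction drawn above: the speaker's knowledge of the word stays constant while the rule accounts for the variable surface form.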

13 Halle and Clements 1983, 9–10; Weisler and Milekic 2000, 41–45; Cipollone et al. 1998; and Kenstowicz 1994 introduce “phonological rules.”

Even the preschool child productively applies the plural rule in (20) to nonsense words where these could not have been learned, as in a classic study which tested children on examples like (22).

22. This is a wug.

Now there is another one.

There are two of them.

There are two ___.

The preschoolers correctly provided the plural “WUGZ” in 76 percent of cases, and the plural “HEAFS” in 79 percent of cases (Berko 1958, 159; cf. Potts et al. 1979, Pinker 1999), unconsciously but regularly assimilating the sound of the plural /s/ to the sound of the final consonant of the word.

While adults know the rules specific to their language, and thus are not deceived by surface variations, children cannot start with language-specific rules. They must acquire them.¹⁴ In Sinhala, for example, a plural form (23b) has a null inflection (with inanimate nouns), while the singular is inflected (23a).

23. a. Sinhala singular

poTə book        kaduwə sword

b. Sinhala plural

poT books        kadu swords

How do children come to know which sound changes are regular and systematic variations in their language? Should they attend to the beginning of the words, as necessary for Welsh (e.g., Meara and Ellis 1982), to the middle of words (e.g., in Inuktitut as in (16) or (17)), or to the ends of words, as necessary for English?

When is a null form possible?

3.2.2.5 Relating levels of representation: phonology, morphology and syntax

As we saw in chapter 2, language knowledge involves digitization at several “levels of representation” and these levels must be related to each other (figure 2.2). Children must discover the units at each level – and the computation which relates them – in order to acquire a language. In (5) or (8), not only must sounds be grouped to words (e.g., to “Cinderella” or “amma”), but words to phrases, phrases to clauses, and clauses to complex sentences. Even if a language learner is able to determine the relevant phonological or sound units in the speech stream, they still must determine the structural relations among them: that “about Cinderella” modifies the “story” in (4), or that “nattiq” is the object of the verb “tikkuaq (point to)” in Inuktitut in (17). We can assume that the architecture of

14 The surface form is opaque in another way: rules may interact and even contradict one another in deriving the surface forms.


the Language Faculty must inform this computation, which must exceed either positive or negative evidence.

3.2.2.6 Summary

Direct negative evidence may not exist for children or have influence when it does; indirect negative evidence depends on reference to the learner’s pre-existent hypotheses.

Positive evidence is available to the infant in the Initial State, but:

• This evidence is variable and degenerate.

• The speech stream is fundamentally opaque with regard to the units which must be discovered.

• The stimulus provided to the infant is continuous, while language knowledge requires a digital representation.

• There is cross-linguistic variation in the units which must be discovered and in the mode of their realization in the speech stream.

• The units to be discovered are cognitive (linguistic), not physical, and they involve multi-level linguistic computation.

• Children must discover the linguistic units by discovering the linguistic system for their language.

• In order to discover the grammar, children must transform the PLD to which they are exposed. They must create a grammar.
