We saw in chapters 2 and 3 that newborns must convert a continuous speech stream into units of sound which provide a digital representation of language, and must create a representation of how these units are sequentially and systematically related. This analysis of the speech stream and a “combinatorial principle” which applies to the sound units are necessary for children to both produce and perceive any of an infinite number of possible new words and sentences, e.g., (1).
1. We like to hop on top of pop. Stop. You must not hop on Pop. (Seuss, 1963).
CRACKING THE CODE: Discovering the essential units of the sounds of a language and their system of combination, i.e., the phonology of a language, is a necessary and primary step in “cracking the code” of the language surrounding the child.
Over the past several decades, research on development of both speech perception and speech production in young children has exploded with new scientific evidence (see 8.5). In this chapter, we will summarize highlights of research results in this area. Appendices 2a, 2b and 3 summarize developmental results for infant speech perception and production. Appendix 7 provides some common notational conventions in this area.1
8.1.1 What must children acquire?
Children must:
(a) Discover the units required2 in order to map from the continuous acoustic stimulus to a digital knowledge of language.
1See footnote 7, chapter 3 regarding the International Phonetic Alphabet.
2Linguists are not agreed on which units most fundamentally characterize the knowledge of a sound system of a language. Some have characterized phonemes as “bundles of distinctive features” (Halle and Clements 1983, 3), abstract properties of sounds which have linguistic significance. These features (about twenty total) are hypothesized to be sufficient to characterize all the significant sound distinctions of the world’s languages, to differentiate and classify them. Linguists vary in their exact formulation of these features and in the number proposed and some do not admit them at all. Some propose that features are not simply grouped in bundles, but are hierarchically organized in a multi-tiered structure, organized into subgroups with their own geometry (Clements 1999).
144 c h i l d l a n g ua g e
(b) Make fine distinctions in both perception and production. If they do not distinguish the initial sounds in “bop” and “pop”, for example, “Bop the pop” will be indistinguishable from “Pop the bop.” For this, infants must distinguish specific features of sounds, e.g., the “+/− voice” feature which, in English, makes [p] and [b] discrete.
(c) Discover which differences are linguistically significant and which are not, i.e., which are contrastive in their language(s).
(d) Know when to dismiss insignificant sound variations and to treat sounds as equivalent even though they may differ. Since a sound or stream of sounds differs every time it is uttered, and since covariation and phonological rules continually modulate sounds in the speech stream (cf. chapter 3), it is as important for children to dismiss variation as to attend to it. Otherwise, every “pop” could be treated as a different word.
(e) Discover the phonological and phonotactic rules, which provide the forms of words. This involves creating an Underlying Representation (UR) for the sounds of words in the language and a systematic way of mapping from this UR to a representation of their surface forms, or Surface Representation (SR).
(f) Combine sound segments into larger phonological units, i.e., “suprasegmental units,” sequencing them. In this way, acquiring knowledge of phonology can be viewed as acquiring “an orchestral score” (Anderson 1985, 348). This “score” allows us and the child to realize the rhyme in (2), where word pairs match syllable structure and meter as well as subtle sound substitutions.
2. A Simple Thimble or
a Single Shingle?
(Seuss, 1979)
(g) Confront suprasegmental units and systems of their organization. For example, in Berber (Morocco), syllables need not have vowels; utterances like “tsqssft stt” (“you shrank it”) are “quite unexceptional” (Clements 1999, 639). Words in English, Dutch and Sesotho are productively trochaic in metrical structure; the hierarchical “word tree” is frequently strong–weak in foot structure, whereas Quiché Mayan is iambic (weak–strong).3 Infants must determine whether the language is “syllable-timed” like the Romance languages (organized temporally on the basis of syllables) or, like English, organized with regard to stress.
3Jusczyk suggested that approximately 90 percent of content words in English conversations begin with a stressed syllable, cf. Cutler and Carter 1987.
(h) Relate the score and the notes, discovering what we might call the “secret skeleton” which relates these.4 One representation of this is sketched in (3).
3. A Secret Skeleton (Clements and Keyser 1983)
      σ          Syllable tier
    C V C        Skeletal tier
   [p a p]       Segmental tier
    “pop”
(i) Discover what sound combinations are not possible in their language.
(j) Relate speech perception and production.
(k) Map from acoustic “cues” in the speech stream in order to accomplish (a–j).
(l) Discover which speech cues are critical in the language and how (cf. chapter 3).
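Requirements (b), (c) and (h) can be made concrete with a small sketch. The following Python fragment treats each segment as a bundle of distinctive features and derives a skeletal (CV) tier from a string of segments. The feature names and values used here are a simplified assumption made for illustration only; they are not the inventory proposed by Halle and Clements or by any other particular theory, and the function names are hypothetical.

```python
# Illustrative feature bundles. The feature set is a simplified assumption,
# not the inventory proposed by any particular phonological theory.
FEATURES = {
    "p": {"voice": False, "labial": True, "consonantal": True},
    "b": {"voice": True, "labial": True, "consonantal": True},
    "a": {"voice": True, "labial": False, "consonantal": False},
}


def differing_features(x, y):
    """Return the sorted names of the features on which segments x and y differ."""
    fx, fy = FEATURES[x], FEATURES[y]
    return sorted(name for name in fx if fx[name] != fy[name])


def skeleton(segments):
    """Map each segment to a C or V slot on a (toy) skeletal tier."""
    return "".join(
        "C" if FEATURES[seg]["consonantal"] else "V" for seg in segments
    )


pop = ["p", "a", "p"]
bop = ["b", "a", "p"]

# [p] and [b] differ only in voicing in this toy inventory.
print(differing_features("p", "b"))    # ['voice']

# "pop" and "bop" share one skeleton despite the segmental difference.
print(skeleton(pop))                   # CVC
print(skeleton(pop) == skeleton(bop))  # True
```

In this toy inventory a single feature separates “pop” from “bop”, while both words share the same skeletal tier, which is the kind of relation between “notes” and “score” that (h) describes.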
8.1.2 What are the challenges?
As we saw in chapter 3, the speech stream does not directly provide the units or the system for organizing them. In the absence of perceptual invariants, children must create invariant contrasts and categorize sounds, acquiring and integrating a set of levels of representation: the sound categories (phonemic units), the linguistically distinctive features which characterize and distinguish these units, and the syllable and word units. Computation must combine segmental and suprasegmental dimensions in order to derive the “score” of the language. Length, stress and tone, which shape words and word combinations, must be considered. Children must acquire the rules and processes which relate the levels of representation across a hidden skeleton in a constrained but invisible manner, as suggested in figure 8.1.
4Suprasegmental phonological structure includes a temporal level of representation (e.g., the foot, syllable or mora). A “hidden skeleton” can capture this, providing “slots” or a “template” to which segments can be associated. Research is pursuing various theories of this skeleton (Kenstowicz 1994). In “non-linear” phonology, different units may be represented on separate parallel levels (or tiers) in what is called an “autosegmental” representation. Some have proposed a Prosodic Hierarchy (Selkirk 1984; Nespor and Vogel 1986).
[Fig. 8.1 Phonological knowledge. Labels recoverable from the figure: Underlying representation (categorical, featural and other structural knowledge); The System (hidden skeleton; rules, principles, parameters and constraints); Output (Surface representation); The child’s experience.]
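The mapping from an Underlying Representation to a Surface Representation described in this section can be sketched as the ordered application of rules to a string of segments. The fragment below is a minimal illustration, not an analysis taken from this chapter: it assumes a single toy rule, the textbook simplification that English voiceless stops are aspirated word-initially, and the function names are hypothetical.

```python
# Toy inventory of English voiceless stops (an illustrative assumption).
VOICELESS_STOPS = {"p", "t", "k"}


def aspirate_initial_stop(segments):
    """Toy rule: aspirate a word-initial voiceless stop."""
    if segments and segments[0] in VOICELESS_STOPS:
        return [segments[0] + "ʰ"] + list(segments[1:])
    return list(segments)


def surface_form(underlying, rules):
    """Derive a surface form by applying each rule in order to the UR."""
    form = list(underlying)
    for rule in rules:
        form = rule(form)
    return form


ur = ["p", "a", "p"]
sr = surface_form(ur, [aspirate_initial_stop])
print("/" + "".join(ur) + "/ -> [" + "".join(sr) + "]")  # /pap/ -> [pʰap]
```

A child hearing the aspirated surface form faces the inverse problem: recovering the underlying form and the rule from surface tokens alone. This sketch does not attempt to model that step.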
8.1.3 Leading questions
• When do children perceive and produce the fine sound distinctions of language? How and when do children differentiate the contrastive sound categories (the phonemes) of language?
• When do children categorize sound distinctions?
• Are children’s speech perception mechanisms qualitatively different from adults’?
• Do children’s auditory or motoric abilities change over time and determine acquisition of phonology?
• Do relations between speech perception and speech production change with development?
• Does children’s early phonology first reflect input-determined, language-specific structure before universal structures, or do universal structures lead?
• Is the basic architecture of the Language Faculty operative in the Initial State and continuously through development, or are there qualitative changes? Is there a “prelinguistic” stage in children’s early sound perception and production? Is the course of acquisition marked by a passage from “phonetics” to true “phonology” only later (e.g., Vihman and Velleman 2000)? What is the nature of change over time in the course of language acquisition of phonology?
• How do children use the input at the Initial State and over time? For example, how are URs determined from exposure to input data?
• Are children’s perceptions of adult words and their URs for these similar to adults’ (e.g., Smith 1973; Menn and Matthei 1992), or qualitatively different?5 (Contrast Figures 8.2a–8.2b.)
5Ingram 1976; Braine 1976; Macken 1979; Vihman 1982; Fee 1995.