3 Concordances in the classroom without a computer: assembling and exploiting
3.2 From corpus to concordances
Data collection and materials development
children to see if there is any difference between the language of boys and girls at that age. Usually the first step is to gather a corpus (which can be stored electronically), a body of the relevant language, in this case the language of three-year-old boys and girls. This is an obvious step, but it is not an easy one. Decisions must be taken as to the size of the corpus, and care must be taken to see that the corpus is as representative as possible. But in principle the task is a manageable one.
Once a researcher has assembled an appropriate corpus, that corpus can be used to answer relevant research questions.
Increasingly nowadays corpora are used in this way to help researchers analyse and describe the grammar and lexis of the language. A study may be directed at a particular genre of language – spoken as opposed to written, say, or the language of television chat shows or of research articles in medical journals. Corpora can also be used to provide a picture of the language as a whole, but if this is the aim, then a very large corpus running into many millions of words is required. One of the earliest and best known large corpora of this kind is The Bank of English. This corpus, assembled in the 1980s and named the Collins and Birmingham University International Language Database (hence COBUILD), provided the basis for the Collins Cobuild English Dictionary, The Collins Cobuild Student’s Grammar, The Collins Cobuild English Course and many other reference books. COBUILD set a trend and was soon followed by a number of other corpus-building projects directed at learners of English, such as the British National Corpus (BNC) and the many others listed in the Appendix to Chapter 2. These in turn led to more corpus-informed reference books and grammars such as The Cambridge Grammar of English (2005) and The Longman Student Grammar (2002).
The process of gathering a corpus of this kind is extremely complex.
Once the corpus has been assembled, however, and has been stored in computer memory, the process of examining it is relatively simple.
If lexicographers wish to analyse and define a particular word, they can use a computer program called a concordancer to generate a number of concordance lines of that word. Even a limited number of concordances can provide us with some useful insights, as the small set of concordance lines for the word any in Figure 3.1 shows.
These lines were carefully selected to give a tiny but representative sample from the original COBUILD corpus. Pedagogic grammars and coursebooks often give the rule that any is used in negatives and questions, and some is used in statements. As you read down through this set of concordances, try to think of what the word any actually means. In how many cases do the lines here conform to the commonly given rule?
These concordance lines suggest that the rule does not hold up very well.
Around half the examples show any used in positive statements. In fact, in all its uses any seems to carry a general non-specific meaning of ‘It doesn’t matter which’ (which is maybe why it is used commonly in questions and negatives where there is often nothing to be specific about). A far larger set of concordances would be needed if we wished to identify common collocations, patterns and pragmatic uses. But this small sample does accurately reflect the balance of uses of the word any from the research corpus, which in turn reflects typical everyday usage.
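For readers curious about what a concordancer does mechanically, the following is a minimal sketch in Python, not a description of the software used on the COBUILD project. It finds every whole-word occurrence of a keyword and prints it with a fixed amount of context on either side, so that the keyword lines up in a single column. The function name `concordance`, the `width` parameter and the three-sentence mini-corpus (adapted from lines in Figure 3.1) are assumptions made purely for illustration.

```python
import re


def concordance(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of keyword."""
    # Collapse whitespace so line breaks in the source do not split a context.
    flat = " ".join(text.split())
    # \b ensures that 'any' does not match inside 'anything' or 'many'.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the keyword falls in one column.
        lines.append(f"{left:>{width}}{match.group()}{right}")
    return lines


# Hypothetical mini-corpus, adapted from three of the lines in Figure 3.1.
sample = (
    "Any child under two is given a bottle. "
    "We work more overtime than any other country in Europe. "
    "I don't think there was any rain all summer long, was there?"
)

for line in concordance(sample, "any"):
    print(line)
```

A real concordancer would add sorting (for example by the word to the right of the keyword) and sampling, which is how a small, balanced set of lines can be drawn from a very large corpus.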
The corpus research process, then, involves isolating a particular linguistic feature, a word or a pattern, and studying that feature in detail.
From this organised study of the language, researchers are able to produce a description of the language – its grammar and lexis, its typical patterns, collocations, meanings and uses.
Once we begin to view the process of language description in this way, it is a short step to applying the process pedagogically. Teachers want to make language description accessible to students. Students need to discover and internalise regularities in the language they are studying. If we can place students in the position of researchers (as suggested by Johns 1991 and 2002, and illustrated in Willis and Willis 1996), this will accomplish these goals neatly and economically and could well increase the self-esteem and confidence of the students.
are interesting to observe. Any child under two is given a bottle
so the young men went for any job they could rather than a farm job
state of affairs could not go on any longer. Someone had to act soon
they hadn’t dared to strike any more matches - they were just
the longest open tradition of any of the English link that have
complicated. The closing of any of them would be a major engineering
We work more overtime than any other country in Europe, even
dry. I don’t think there was any rain all summer long, was there?
just won’t come out. Have we any stain remover? . . . I thought there
at Steve’s house. just turn up any time after 12. It’ll go on all afternoon
hard pressed. there was never any time for standing back and appraising
Figure 3.1
Source: Cobuild data sheets, 1986
This process of language analysis will inevitably lead to particular aspects of the language becoming salient, which is the first aim of any kind of awareness-raising activity. A rationale for such an approach is outlined in Brian Tomlinson’s Chapter 1, the Introduction to this book. Schmidt (1990) and others argue that ‘noticing’ features of the target language is a necessary initial stage in the learning process. Ellis (1991: 241, fleshed out in Ellis (2003: 163)) argues that ‘consciousness-raising constitutes an approach to grammar teaching which is compatible with current thinking about how learners acquire L2 grammar’. Rather than rely on a diet of ‘practice activities’ which restrict input and expect immediate accuracy in the ‘production’ of small items of language, we should be giving learners plenty of opportunities to discover language and systematise it for themselves before expecting them to proceduralise their knowledge and put it to use. In support of this, Willis (2003) illustrates numerous practical ways to draw learners’ attention to different aspects of language – from words to lexical phrases and pattern grammar. Later in this chapter I show how different kinds of analysis activities based on concordance lines for the most frequent words can highlight a rich array of language features. These activities help students both to recognise and memorise useful patterns and recurrent chunks (fixed phrases, such as a matter of fact, Know what I mean?) and to analyse and make useful generalisations about grammar.