3 Concordances in the classroom without a computer: assembling and exploiting

3.9 Conclusion

Data collection and materials development

principled approach to corpus design is more likely to cover the language that students need than an approach which selects texts and language focus points in a more random fashion. See Willis and Willis (2007: 187–98) for more on the syllabus design process.

It is impossible for most language teachers and course designers to assemble their own research corpus for a particular group of learners, unless the learners’ target discourse is a very narrow, well-defined area which is readily researchable. But there is a growing range of more specialist language corpora with frequency lists already assembled (see Chapter 2 Appendix for sources) and over the next few years more will be made available for public use. It is, however, possible to aim at assembling the learners’ own pedagogic corpus, that is, one that reflects as far as possible their target language needs, even without the insights gained from a computational analysis of a research corpus.

The most frequent words, meanings and patterns are obviously going to be the most useful for learners and give the most efficient coverage of the target discourse. But in addition to the criterion of frequency, we need to take into account factors such as learnability and learners’ immediate interests. Thus the syllabus might well include words that are similar in the two languages, and words from topic areas and types of text (e.g. sport, pop songs, magazine pieces) that students find motivating. Such texts would then become part of the pedagogic corpus, and would undoubtedly also serve to illustrate more common uses of common words.

To further increase their vocabulary and to extend their experience of language, individual learners should always be encouraged to read (and listen) more widely on their own, and to look out for more examples of specific features from outside data, but since this will be part of the individual learner’s corpus, and unfamiliar to other learners, it would not form part of the pedagogic corpus available for concordance analysis.

Concordances in the classroom

narrative – all useful for students wishing to write or to speak with more fluency and naturalness.

A full-length lexical syllabus derived from a suitable research corpus might comprise an inventory of, say, 2,000–3,000 words and their meanings and patterns. This could be used as a checklist, and would allow the teacher or materials writer to gain a far more reliable coverage of language that learners needed. But this is the ideal, and without computational facilities, it would take a long time to find and assemble suitable examples of all these words from the materials selected for the pedagogic corpus.
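The procedures in this chapter are designed to be carried out by hand. Where a computer does happen to be available, the step of finding and assembling examples of a common word can be approximated quite simply. The following is a minimal sketch, assuming the pedagogic corpus is held as a list of plain-text strings; the function name `kwic`, the `width` parameter and the two sample sentences are illustrative assumptions and are not taken from the chapter.

```python
import re


def kwic(texts, keyword, width=30):
    """Return key-word-in-context lines for every occurrence of `keyword`.

    `texts` is a list of strings (hypothetical pedagogic corpus texts).
    `width` is the number of characters of context kept on each side.
    """
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        # Collapse line breaks and repeated spaces so the context reads as one line.
        flat = " ".join(text.split())
        for match in pattern.finditer(flat):
            left = flat[max(0, match.start() - width):match.start()]
            right = flat[match.end():match.end() + width]
            # Right-align the left context so the key word falls in one column.
            lines.append(f"{left:>{width}} {match.group(0)} {right}".rstrip())
    return lines


# Invented sample texts, standing in for texts the learners have already studied.
pedagogic_corpus = [
    "She was waiting at the bus stop when it started to rain.",
    "At first I thought he was joking, but he was not joking at all.",
]

for line in kwic(pedagogic_corpus, "at"):
    print(line)
```

The output is a set of lines with the key word in a central column, which is the layout learners would otherwise write out on paper or on the board.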

The benefits of focusing on a mere 50 or so very common words may at first sight have seemed somewhat limited in scope. However, because these words occur so frequently in all kinds of text and have so many different uses, they provide the cement for a huge number of fixed and semi-fixed expressions and grammatical patterns. Using these common words as ‘bait’, learners are likely to catch a wide variety of other useful words, phrases and patterns, and will inevitably gain insights into new aspects of the target language as exemplified by their pedagogic corpus.

The analysis activities encourage learners to process text more closely, to systematise their knowledge and to look out for similar examples in their own reading outside class. Once attention has been drawn to the meanings, uses and functions of common words in the target language, learners are more likely to notice and reflect on further occurrences of the language items that have been made salient through study of the concordances. This process should lead to the development of the learner’s interlanguage. Analysis activities and awareness-raising procedures can also encourage learner independence and efficient dictionary use (especially with regard to the common words that students often think they know already and do not bother to look up). They help learners to recognise the parts played by collocation and lexical phrases and to realise there is more to language than just vocabulary and grammar.

Working directly from the data, searching for patterns, investigating and describing what is actually there, is a secure and relatively unthreatening activity. It is ideal for mixed-level classes since, being a learner-centred activity, it allows students to work at their own level, in their own time and in their own ways. It also provides solid benefits for teachers. I have constantly found that language analysis activities inform and enrich my own view of the language. Not only learners but also teachers are likely to gain from an investigative approach to language.

References

Batstone, R. 1994. ‘Product and process: grammar in the second language classroom’. In M. Bygate, A. Tonkyn and E. Williams (eds.), Grammar and the Language Teacher. Hemel Hempstead: Prentice Hall International.

Ellis, N. 2003. ‘Constructions, chunking, and connectionism: the emergence of second language structure’. In C. Doughty and M. Long (eds.), The Handbook of Second Language Acquisition. Oxford: Blackwell.

Ellis, R. 1991. Second Language Acquisition and Second Language Pedagogy. Avon: Multilingual Matters.

2003. Task-based Language Teaching and Learning. Oxford: Oxford University Press.

Johns, T. 1991. ‘Should you be persuaded – two samples of data-driven learning materials’. In T. Johns and P. King (eds.), Classroom Concordancing, ELR Journal 4. CELS: University of Birmingham.

2002. ‘Data-driven learning: the perpetual challenge’. In B. Kettemann and G. Marko (eds.), Teaching and Learning by Doing Corpus Analysis. Amsterdam and New York: Rodopi.

Mauranen, A. 2004. ‘Spoken – general: spoken corpus for an ordinary learner’. In J. Sinclair (ed.), How to Use Corpora in Language Teaching. Amsterdam: John Benjamins.

O’Keeffe, A., M. McCarthy and R. Carter. 2007. From Corpus to Classroom. Cambridge: Cambridge University Press.

Römer, U. 2006. ‘Pedagogical applications of corpora: some reflections on the current scope and a wish list for future developments’. Zeitschrift für Anglistik und Amerikanistik, 54(2): 121–34, available at www.uteroemer.com/ZAA 2006 Ute Roemer.pdf

Schmidt, R. 1990. ‘The role of consciousness in second language learning’. Applied Linguistics, 11(2): 129–58.

Sinclair, J. (ed.). 2004. How to Use Corpora in Language Teaching. Amsterdam: John Benjamins.

Skehan, P. 1994. ‘Interlanguage development and task-based learning’. In M. Bygate, A. Tonkyn and E. Williams (eds.), Grammar and the Language Teacher. Hemel Hempstead: Prentice Hall International.

Willis, D. 1990. The Lexical Syllabus. Collins Cobuild. Out of print but available free on www.cels.bham.ac.uk/resources/LexSyll.shtml

2003. Rules, Patterns and Words: Grammar and Lexis in English Language Teaching. Cambridge: Cambridge University Press.

Willis, D. and J. Willis. 1996. ‘Consciousness-raising activities in the language classroom’. In J. Willis and D. Willis (eds.), Challenge and Change in Language Teaching. Oxford: Heinemann ELT. Now available on the authors’ website: www.willis-elt.co.uk/books.html

2007. Doing Task-based Teaching. Oxford: Oxford University Press.

Appendix A: Wordlists from a general research corpus

  1 the        11,110,235    51 out        398,444   101 world      170,293
  2 of          5,116,374    52 about      393,279   102 get        168,694
  3 to          4,871,692    53 so         378,358   103 these      168,486
  4 and         4,574,340    54 can        369,280   104 how        167,461
  5 a           4,264,651    55 what       359,467   105 down       166,119
  6 in          3,609,229    56 no         342,846   106 being      165,168
  7 that        1,942,449    57 its        333,261   107 before     165,119
  8 is          1,826,742    58 new        324,639   108 much       164,217
  9 for         1,716,788    59 two        308,310   109 where      161,691
 10 it          1,641,524    60 mr         302,507   110 made       161,595
 11 was         1,395,706    61 than       297,385   111 should     159,023
 12 on          1,354,064    62 time       293,404   112 off        155,770
 13 with        1,262,756    63 some       293,394   113 make       153,978
 14 he          1,260,066    64 into       290,931   114 good       153,878
 15 I           1,233,584    65 people     289,131   115 still      151,889
 16 as          1,096,506    66 now        287,096   116 ’re        151,359
 17 be          1,030,953    67 after      280,710   117 such       150,812
 18 at          1,022,321    68 them       279,678   118 day        150,684
 19 by            980,610    69 year       272,250   119 know       147,052
 20 but           884,610    70 over       266,404   120 through    145,920
 21 are           880,318    71 first      265,772   121 say        143,888
 22 have          879,595    72 only       260,177   122 president  143,502
 23 from          872,792    73 him        259,962   123 don’t      142,288
 24 his           849,494    74 like       258,874   124 those      142,260
 25 you           819,187    75 do         256,863   125 see        141,845
 26 they          779,636    76 could      255,010   126 think      140,701
 27 this          771,211    77 other      254,620   127 old        140,096
 28 not           704,615    78 my         253,585   128 go         137,929
 29 has           693,238    79 last       238,932   129 between    137,009
 30 had           648,205    80 also       236,350   130 against    136,989
 31 an            629,155    81 just       232,389   131 did        135,593
 32 we            552,869    82 your       227,200   132 work       131,780
 33 will          542,649    83 years      217,074   133 take       131,212
 34 said          534,522    84 then       214,274   134 man        130,580
 35 their         527,987    85 most       208,894   135 pounds     130,095
 36 or            527,919    86 me         206,475   136 too        129,804
 37 one           522,291    87 may        198,700   137 long       127,660
 38 which         513,286    88 because    196,595   138 own        125,299
 39 there         501,951    89 says       193,730   139 life       124,047
 40 been          496,696    90 very       189,285   140 going      124,018
 41 were          485,024    91 well       188,445   141 today      123,869
 42 who           480,651    92 our        186,013   142 right      121,995
 43 all           478,695    93 government 184,618   143 home       121,052
 44 she           469,709    94 back       184,105   144 week       119,115
 45 her           448,175    95 us         182,796   145 here       118,177
 46 would         430,566    96 any        180,222   146 another    116,325
 47 up            428,457    97 even       178,657   147 while      115,963
 48 more          422,111    98 many       173,938   148 under      113,114
 49 when          404,674    99 three      173,093   149 London     112,310
 50 if            401,086   100 way        172,787   150 million    112,138

Table 3.1 The 150 most frequent word forms occurring in The COBUILD Bank of English written corpus of 196 million words

Table 3.2 The 150 most frequent word forms occurring in the COBUILD Bank of English spoken corpus of 15 million words

  1 the        500,843    51 are        51,775   101 okay       18,757
  2 I          463,445    52 got        51,727   102 much       18,567
  3 and        367,221    53 don’t      51,273   103 didn’t     18,521
  4 you        359,144    54 oh         51,013   104 thing      18,480
  5 it         313,032    55 then       44,372   105 lot        18,453
  6 to         308,438    56 were       41,453   106 where      18,440
  7 that       284,422    57 had        41,185   107 something  18,134
  8 a          273,009    58 very       41,128   108 way        17,895
  9 of         242,811    59 she        38,841   109 here       17,819
 10 in         187,523    60 get        38,361   110 quite      17,470
 11 er         178,464    61 my         38,194   111 come       17,089
 12 yeah       155,259    62 people     37,774   112 their      16,892
 13 they       135,084    63 when       37,335   113 down       16,678
 14 was        133,022    64 because    37,172   114 back       16,505
 15 erm        132,836    65 would      35,945   115 has        16,017
 16 we         124,928    66 up         35,894   116 place      15,888
 17 mm         122,674    67 them       34,766   117 bit        15,520
 18 is         113,420    68 go         34,127   118 used       15,267
 19 know       111,741    69 now        33,801   119 only       15,159
 20 but        100,648    70 from       33,633   120 into       15,094
 21 so          91,836    71 really     33,444   121 these      15,064
 22 what        89,364    72 your       33,310   122 three      15,059
 23 there       88,938    73 me         33,278   123 work       15,005
 24 on          88,456    74 going      32,598   124 will       14,939
 25 yes         87,211    75 out        32,015   125 her        14,286
 26 have        84,294    76 sort       31,555   126 him        14,160
 27 he          79,137    77 been       30,405   127 his        14,029
 28 for         77,842    78 which      30,334   128 doing      13,921
 29 do          77,207    79 see        30,325   129 first      13,273
 30 well        75,287    80 did        30,175   130 than       12,998
 31 think       74,543    81 say        29,720   131 went       12,842
 32 right       74,191    82 two        28,817   132 put        12,692
 33 be          66,492    83 an         27,485   133 why        12,653
 34 this        65,424    84 who        27,220   134 our        12,610
 35 like        63,948    85 how        26,837   135 years      12,437
 36 ’ve         63,160    86 some       26,172   136 off        12,393
 37 at          62,654    87 name       26,029   137 those      12,248
 38 with        61,289    88 time       25,990   138 us         12,245
 39 no          60,885    89 ’ll        25,154   139 course     12,211
 40 as          58,871    90 more       24,586   140 mhm        12,112
 41 mean        58,825    91 said       23,143   141 isn’t      12,060
 42 all         58,360    92 ’cos       22,345   142 over       11,874
 43 ’re         57,131    93 things     21,982   143 look       11,297
 44 or          56,857    94 actually   21,131   144 done       11,247
 45 if          56,774    95 good       20,783   145 year       11,224
 46 about       56,321    96 other      20,378   146 take       11,190
 47 not         56,109    97 want       20,375   147 being      11,153
 48 just        55,329    98 by         20,260   148 should     11,007
 49 one         55,189    99 could      19,435   149 school     11,001
 50 can         53,090   100 any        18,958   150 thought    10,786

Appendix B

This table shows what proportion of general English text is covered by the most frequent word forms. By word forms I mean that have, has and had, for example, each count as a separate item, as do the singular and plural forms of a noun.

Table 3.3

The most common     account for this share     and this share
word forms          of written text            of spoken text

 25                 29%                        29%
 50                 36%                        36%
100                 42%                        46%
500                 56%                        66%

(Source: Cobuild Bank of English: figures based on a written corpus of 196 million words and a corpus of unscripted speech of 15 million words.)
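The written-text figures for the first 25 and 50 word forms can be checked against the frequencies in Table 3.1. The sketch below is a worked calculation only: it assumes the written corpus contains exactly 196 million running words (the source gives this as a round figure), and the names `WRITTEN_TOP_50`, `WRITTEN_CORPUS_SIZE` and `coverage` are illustrative.

```python
# Frequencies of the 50 most frequent word forms in the written corpus,
# copied in rank order from Table 3.1.
WRITTEN_TOP_50 = [
    11_110_235, 5_116_374, 4_871_692, 4_574_340, 4_264_651,
    3_609_229, 1_942_449, 1_826_742, 1_716_788, 1_641_524,
    1_395_706, 1_354_064, 1_262_756, 1_260_066, 1_233_584,
    1_096_506, 1_030_953, 1_022_321, 980_610, 884_610,
    880_318, 879_595, 872_792, 849_494, 819_187,
    779_636, 771_211, 704_615, 693_238, 648_205,
    629_155, 552_869, 542_649, 534_522, 527_987,
    527_919, 522_291, 513_286, 501_951, 496_696,
    485_024, 480_651, 478_695, 469_709, 448_175,
    430_566, 428_457, 422_111, 404_674, 401_086,
]

# Assumption: "196 million words" is treated as an exact corpus size.
WRITTEN_CORPUS_SIZE = 196_000_000


def coverage(frequencies, corpus_size, top_n):
    """Share of running words accounted for by the top_n most frequent forms."""
    ranked = sorted(frequencies, reverse=True)
    return sum(ranked[:top_n]) / corpus_size


for n in (25, 50):
    share = coverage(WRITTEN_TOP_50, WRITTEN_CORPUS_SIZE, n)
    print(f"Top {n} word forms: {share:.0%} of written text")
```

Under that assumption the calculation gives about 29% for the first 25 word forms and about 36% for the first 50, in line with the written-text column of Table 3.3.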
