Introduction
Materials evaluation is a procedure that involves attempting to predict or measure the value of the effects of language-learning materials on their users. The macro effects on learners can include the understanding and production of language, the acquisition of language, the development of language skills, and the development of communicative competence, whilst the micro effects can include engagement, motivation, self-reflection, self-esteem, autonomy, and attitudes toward the target language and towards the learning of it. Teachers are also users of materials and the effects on them can include changes of beliefs, engagement, motivation, investment, and teacher development, as well as ease of preparation and delivery. It can even be argued that administrators are users of materials and that the effects they are looking for are standardization and value for money. Interestingly, although the prime users of commercially produced materials are learners, their prime buyers are administrators. Brian once did a confidential research project for a major British publisher in which he investigated what the users of coursebooks wanted from the coursebooks they used. He found in the 12 countries he researched that in most institutions the coursebooks were selected by administrators, that in a few institutions the classroom teachers selected the coursebooks, and that in no institutions were the textbooks selected by learners. Guess who most commercial coursebooks are designed to appeal to?
Materials evaluation is probably the most written about procedure in the field of materials development. This is not surprising as publishers have always evaluated the materials developed by their authors and teachers have always evaluated materials either explicitly when selecting them or implicitly when using them. What is surprising, as you will see below, is that so little has been published about publisher evaluation and about informal teacher evaluation, and so much has been published about formal evaluation for reviews and for materials selection.
Brian’s early experiences of materials evaluation included untrained evaluations of a global coursebook he wanted to make more relevant to his students in Nigeria and then of a coursebook written for Malawi, which was being used in Zambian secondary schools. In Zambia, Brian was teaching in the same secondary school as Rod Ellis and their discontent with the coursebook led them to write their own series of coursebooks to replace it (Ellis & Tomlinson, 1973, 1974). That series was then selected to be the one and only official coursebook for Zambian secondary schools and it was still in use when we visited Zambia 30 years later. Even though the book does have its strengths, Brian and Rod wish they had had more expertise in materials evaluation when they were writing it and that their publishers and the selection committee had been more rigorously evaluative too.
Hitomi started her teaching career in Japan, where there was very little scope for teachers to write replacement coursebooks, but she was immediately involved in formal evaluation in order to contribute to the selection of coursebooks and then in informal evaluation in order to make the coursebooks more relevant and engaging for her students.
Since then she has gone on to gain considerable experience of evaluation in selecting textbooks, in writing coursebook reviews, and in textbook projects.
When we run courses and workshops for teachers on materials development we always focus on materials evaluation as we think that teachers can learn a lot about materials, about language acquisition, and about their own implicit beliefs through spending time on rigorous evaluations of materials in preparation, as published, and in use. Away from the course, the teachers will not have the time to be so rigorous but being so on a course helps them to be principled when evaluating materials which they are developing, which their colleagues are developing, which they are selecting from, which they are reviewing, which they are adapting, or which they are actually in the process of using. We really wish we had received such training early in our teaching careers.
From our now extensive experience of materials evaluation we would say that an evaluation of language learning materials should attempt to predict or measure whichever of the following effects are relevant to the context of learning in which the materials are being or are going to be used:
• The surface appeal of the materials for the learners. (For example, are the illustrations attractive? Are the sections separated by sufficient white space? Is there an effective use of color?)
• The content appeal of the materials. (Do the users like the topic content, the texts, and the activities provided by the materials?)
• The credibility of the materials to learners, teachers, and administrators. (Do the materials look as though they are going to meet their needs and wants?)
• The validity of the materials. (Is the learning they attempt to facilitate worth facilitating?)
• The reliability of the materials. (Would the materials have the same effects with different groups of target learners and when “delivered” by different teachers?)
• The ability of the materials to interest both the learners and the teachers.
• The ability of the materials to motivate the learners to use them and the teachers to “deliver” them.
• The ability of the materials to engage the learners affectively and cognitively.
• The degree of challenge presented by the materials (with achievable challenge being the ideal aimed at).
• The relevance of the materials to the learners’ lives, needs, and wants.
• The value of the materials in terms of short-term learning (important, for example, for performance on tests and examinations).
• The value of the materials in terms of long-term acquisition and development (of language, of language skills, and of communicative competence).
• The learners’ perceptions of the value of the materials.
• The teachers’ perceptions of the value of the materials.
• The assistance given to the teachers in terms of preparation, delivery, and assessment.
• The flexibility of the materials (e.g. the extent to which it is easy for a teacher to adapt the materials to suit a particular context; see Bao (2015) for a focus on flexibility of materials).
• The contribution made by the materials to teacher development.
• The match with administrative requirements (e.g. standardization across classes, coverage of a syllabus, preparation for an examination).
Adapted from Tomlinson (2013b), pp. 21–22.
Before deciding which of the effects listed above to measure it would be necessary to consider which of them are relevant to the specific context of use. No two evaluations can be the same, as the levels, needs, wants, experiences, objectives, and out-of-class backgrounds of the learners will differ from context to context. This is obviously true of an evaluation of the value of a coursebook being used with groups of teenagers preparing for an examination in Thailand compared to an evaluation of the same book being used with groups of young adults preparing for a different examination in Peru. The main point is that it is not the materials that are being evaluated but their effect on the people who come into contact with them (including, of course, the evaluators).
In order to measure the value of any of the general effects listed above it would be necessary to formulate specific criteria (ideally phrased as questions that are unambiguous and answerable). For advice on how to develop and use such criteria see “The Principles and Procedures of Materials Evaluation that we Recommend,” later in this chapter.
Before concluding this introduction we would like to stress that conducting an evaluation is not the same as doing an analysis. Both the objectives and the procedures are different. As we have said, an evaluation makes judgements about the effects of materials on their users. An evaluation can (and should) be structured, criterion-referenced, and rigorous, but it will always be essentially subjective. In contrast, an analysis focuses on the materials themselves and it aims to be objective in its analysis of them.
It “asks questions about what the materials contain, what they aim to achieve and what they ask learners to do” (Tomlinson, 1999, p. 10). So, for example, “Does it provide a transcript of the recorded dialogues?” is an analysis question that can be answered by either “Yes” or “No.” “What does it ask the learners to do before listening to a song?” and “Which tenses does it teach?” are also analysis questions and can be answered factually. As a result of answering many such questions, a description of the materials can be made that specifies what the materials do and do not contain and what they ask the learners to do. On the other hand, “Are the reading texts likely to engage the learners affectively?” is an evaluation question and can be answered on a cline between “very unlikely” and “very likely.” It can also be given a numerical value (e.g. 5 for “very likely”), and, after many evaluation questions have been asked and answered, scores can be calculated that can be used as indicators of the potential value of the materials. For example, a coursebook that scores a total of 80% or more is very likely to be effective but, if it scored a subtotal of only 55% for speaking skills, it would be unlikely to be effective for a class of students whose main objective is to develop their oral communication skills. See Littlejohn (2011) for an example and a discussion of materials analysis and Tomlinson, Dat, Masuhara, & Rubdy (2001), Masuhara, Haan, Yi, & Tomlinson (2008), and Tomlinson and Masuhara (2013) for examples of criterion-referenced and rigorous materials evaluation.
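To make the arithmetic behind such score-based indicators concrete, here is a minimal sketch (in Python, purely for illustration) of how answers on the five-point cline might be aggregated into an overall percentage and per-category subtotals. The criteria, category labels, and scores below are invented examples, not a recommended instrument.

```python
# Illustrative sketch: aggregating criterion-referenced evaluation scores.
# Each criterion is answered on a five-point cline (1 = "very unlikely"
# to 5 = "very likely") and grouped under a category such as "speaking".

from collections import defaultdict

# Hypothetical responses: (category, criterion, score out of 5).
responses = [
    ("reading",  "Are the reading texts likely to engage the learners affectively?", 4),
    ("reading",  "Are the reading texts likely to present an achievable challenge?", 5),
    ("speaking", "Are the speaking activities likely to promote genuine communication?", 2),
    ("speaking", "Are the speaking tasks likely to be relevant to the learners' lives?", 3),
]

def subtotals(responses):
    """Return the percentage score per category and the overall percentage."""
    per_category = defaultdict(list)
    for category, _criterion, score in responses:
        per_category[category].append(score)
    percentages = {
        category: 100 * sum(scores) / (5 * len(scores))
        for category, scores in per_category.items()
    }
    all_scores = [score for _, _, score in responses]
    overall = 100 * sum(all_scores) / (5 * len(all_scores))
    return percentages, overall

percentages, overall = subtotals(responses)
print(f"Overall: {overall:.0f}%")    # indicator of likely general effectiveness
for category, pct in percentages.items():
    # A low subtotal (e.g. for "speaking") warns against selecting the
    # book for a class whose main objective is oral communication.
    print(f"{category}: {pct:.0f}%")
```

On these invented scores the book looks reasonably strong overall (70%) but weak on speaking (50%), which is exactly the kind of subtotal that should override a healthy-looking total.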
A rigorous analysis of a set of materials can be very useful for finding out, for example, if:
• anything important has been missed out of a draft manuscript;
• the materials match the requirements of a syllabus or of a particular course;
• the materials contain what the teachers believe they should contain;
• the materials ask the students to do what they will have to do in an examination they are preparing for;
• the materials are promising enough to be subjected to a subsequent evaluation.
Analysis seems to be objective because the questions are likely to be given the same answers by each of a large number of analysts. There is only one answer, for example, to the question, “Does the coursebook contain practice tests?” or the question, “Does each unit have a pronunciation section?” However, analysts are inevitably and often overtly influenced by their own ideology and their selections of questions are biased accordingly. For example, the question, “Do the listening texts include different regional accents?” implies that they should do. Analysts also often have a hidden agenda when developing their instruments of analysis. For example, an analyst might ask the question, “Are the reading texts authentic?” in order to provide data to support an argument that EAP coursebooks do not typically help to prepare learners for the realities of academic reading. This is valid if the analysis questions are descriptive and the data that the analysis provides is then subjected to evaluative interpretation. For example, Brian conducted an analysis of 10 lower-level coursebooks (Tomlinson, 1999, p. 10) to provide data to support his contention that such books were typically too limited in their emphasis on language forms, on language practice rather than language use, and on low-level decoding skills. His data disclosed that “nine out of the ten books were forms and practice focused and that in these books there were five times more activities involving the use of low-level skills (e.g. pronouncing a word) than there were involving the use of high-level skills (e.g. making inferences)” (Tomlinson, 2013b, p. 23). Brian then made use of his data to put forward a case for making lower-level coursebooks more holistic, more meaning focused, and better able to contribute to the learners’ development of high-level skills. Yet a different analyst, using the same instruments and revealing the same data, could use the results to argue that lower-level coursebooks were actually helping learners to develop from a confident base of low-level skills. Of course, both arguments would need to present data from an evaluation of the effects of the materials on the learners’ development to give credibility to their case.
One problem when looking for advice on evaluation is that many experts writing about materials evaluation mix analysis and evaluation and therefore make it very difficult to use their suggested criteria because, for example, in a numerical evaluation most analysis questions could only be answered by either 1 or 5 on a five-point scale and would thus be weighted disproportionately when combined with evaluation questions, which could yield 2, 3, and 4 as well. For example, Mariani (1983, pp. 28–9) includes, in a section on “Evaluate your coursebook,” such analysis questions as “Are there any teacher’s notes…” and “Are there any tape recordings?” alongside such evaluation questions as “Are the various stages in a teaching unit adequately developed?” The two analysis questions could score 5 each (even if the teachers’ notes and the recordings were not very useful) and the evaluation question might only score 2, thus giving an undeserved high score of 12 out of 15.
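A short sketch (in Python, purely for illustration) makes that distortion concrete. The scores are the hypothetical ones used above; nothing here is taken from Mariani’s actual checklist beyond the three quoted questions.

```python
# Illustrative sketch: how mixing yes/no analysis questions with graded
# evaluation questions inflates a combined five-point score.

# Analysis questions are effectively binary: "yes" maps straight to 5
# and "no" to 1, regardless of how useful the feature actually is.
analysis_scores = {
    "Are there any teacher's notes?": 5,   # present, however unhelpful
    "Are there any tape recordings?": 5,   # present, however unhelpful
}

# Evaluation questions genuinely use the whole 1-5 cline.
evaluation_scores = {
    "Are the various stages in a teaching unit adequately developed?": 2,
}

total = sum(analysis_scores.values()) + sum(evaluation_scores.values())
maximum = 5 * (len(analysis_scores) + len(evaluation_scores))
# Prints "Combined score: 12 out of 15": an undeservedly high indicator,
# driven almost entirely by the two binary analysis questions.
print(f"Combined score: {total} out of {maximum}")
```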
Likewise, Cunningsworth (1984, pp. 74–79) includes both analysis and evaluation questions in his “Checklist of Evaluation Criteria.” He does demonstrate awareness of the problem, though, by saying that “Some of the points can be checked off either in polar terms (i.e. yes or no) or, where we are talking about more or less of something, on a gradation from 1 to 5” (1984, p. 74). Our preference for separating analysis from evaluation is shared by Littlejohn (2011), who puts forward a general framework for analyzing materials (pp. 182–198), which he suggests should be used before evaluating materials and making decisions about them. He proposes a model that is sequenced as follows:
• Analysis of the target situation of use.
• Materials analysis.
• Match and evaluation (determining the appropriacy of the materials to the target situation of use).
• Action.
This is a model that has been used by many of our postgraduate students in their detailed investigations of the value of specific materials in specific contexts of learning, but it is dauntingly detailed and demanding for busy teachers wanting, for example, to make quick but principled decisions about the selection and/or adaptation of coursebooks. McDonough, Shaw, and Masuhara (2013) offer a similar but less daunting model designed to be useful to teachers without demanding too much time and expertise. Their model has two stages: an initial one, which involves an “external evaluation that offers a brief overview of the materials from the outside (cover, introduction, table of contents)” (p. 53), and a subsequent one that involves a criterion-referenced “internal evaluation.”