
To cite this article: Orpwood, Graham (2001) 'The Role of Assessment in Science Curriculum Reform', Assessment in Education: Principles, Policy & Practice, 8(2), pp. 135–151.

To link to this Article: DOI: 10.1080/09695940125120

URL: http://dx.doi.org/10.1080/09695940125120


The Role of Assessment in Science

Curriculum Reform

GRAHAM ORPWOOD

York/Seneca Institute for Science, Technology and Education (YSISTE), 70 The Pond Road, Toronto, Canada M3J 3M6

ABSTRACT The argument of this article is that changes in curriculum need to be closely linked to changes in assessment, and that this is true as much of the forms of assessment as it is of its content. Using science as the case in point, the changes in the goals of science education in the 1960s towards a greater emphasis on inquiry skills were matched some 20 years later with a change in assessment to include performance assessment. Now the new goals of science education are focused on the need to link science to the broader social context, but assessment practices have yet to catch up with this change. Given the relatively greater importance of assessment in the present era, the new curriculum emphasis may well be ignored unless new approaches to assessment are designed and implemented soon.

Developing valid and reliable assessment instruments is complex at the best of times. However, at times of major change in the curriculum, additional challenges and dilemmas present themselves to test developers and force questions about the sometimes competing roles of assessment in the larger educational context. In times gone by, such competing roles might have been of only academic interest. At present, however, assessment—whether international, national or local—has become of such importance, both educationally and politically, that clarifying the roles and purposes of assessment has become a priority.

In the past 10 years, I have had the opportunity of participating in or closely observing several science curriculum development projects and also two science assessment projects. The curriculum development projects have been in the Canadian context and specifically in the province of Ontario, but in the course of undertaking these projects I have also had reason to analyse science curriculum developments elsewhere in the world. The assessment projects with which I have been associated have included the Third International Mathematics and Science Study (TIMSS)—a large international study for which I acted as science co-ordinator—and the Assessment of Science and Technology Achievement Project (ASAP)—an Ontario project to develop curriculum and assessment resources for classroom teachers. Despite the differences in purpose of these two projects, they shared the challenge of developing valid science assessment instruments in a period of significant curriculum change.



Curriculum development should, of course, include the development of appropriate assessment, both for the classroom teacher to use and for any assessment that is used as a summative or external assessment. However, in my experience, many curriculum guides or policy directions are determined by one institution, learning materials (such as textbooks) by another and assessment (both classroom and external) by yet other individuals or examination boards. Among these various players, there may or may not be consistency of understanding or commitment concerning the curriculum, and the assessments (by whomever given) may or may not have high validity. This consistency and validity (or their absence) form the central theme of this article, which takes science as its case in point. However, the central issue may be equally applicable to other subject areas.

The argument of the article begins with a brief contextual account of some of the changes that have been taking place in science curriculum during the past 50 years, at least in the English-speaking world. I shall argue that, while some of these have constituted what can be called 'normal' curriculum change, others—using the new Ontario science curriculum as a case in point—warrant the label of curriculum 'revolutions'. Next, I examine the role of assessment during these periods of curriculum revolution and identify one corresponding revolution in this area. Finally, through reflecting on these experiences, I shall argue that leadership in assessment in support of curriculum change must come through research and the professional development of teachers, rather than through large-scale international assessment projects.

Science Curriculum Change

From the long-term perspective, the science curriculum can be seen always to be in a state of flux. While governments or official curriculum agencies or examination boards may not issue a new curriculum every year, teachers are always finding new ways to present the curriculum to students and, therefore, students always experience a 'new' curriculum. Sometimes, the change is minor and simply constitutes changing instructional routines, but at other times, especially after a new 'official' curriculum has been issued, the changes called for at the classroom level may be more significant.

The International Association for the Evaluation of Educational Achievement (IEA) has developed a useful framework for distinguishing three different senses in which the term ‘curriculum’ is used (Robitaille et al., 1993):

· the intended curriculum—as set out or mandated in official statements of the curriculum;

· the implemented curriculum—as actually taught or delivered in schools;
· the attained curriculum—as achieved by the students.

This is a useful framework as it enables us to conceptualise important relationships among the three levels. In general (and to over-simplify the complexities of the relationships among these three), governments (or other official agencies) control the intended curriculum, teachers the implemented curriculum and students the


attained curriculum. The first two levels are never entirely synchronised and, at times, there may be very significant slippage between them. Those involved with assessment want to make claims about the third level—the 'attained curriculum'. However, since this is not directly observable, we have to use 'indicators' of achievement in the form of assessment instruments, from which the attained curriculum can be inferred. The first part of this article is focused on ways in which the intended curriculum has changed over the years. Later, I shall consider how approaches to assessment have reflected these changes.

Normal Curriculum Change

Of course, the science content of the intended curriculum—the 'what' should be taught and learned that is the substance of all curricula—is constantly undergoing refinement as science itself evolves and as Spencer's century-old question, 'what knowledge is of most worth?', is constantly given new answers. As genetics, microbiology and ecology become recognised as critically important elements of biology, the traditional botany and zoology that characterised curricula of the 1950s have given way more and more to these newer aspects of life science. Earth and space science has moved from Geography to Science in several jurisdictions as the importance of the scientific (as opposed to the social) aspect of these areas has increased. Chemistry increasingly focuses its attention on matters of structure, mechanism and energy change, moving away from the more traditional attention to the classification and properties of materials. Curiously, school physics appears to have retained a more traditional view of appropriate content, with the term 'modern physics' being used to characterise aspects of the subject discovered largely in the period 1890–1920. Even the inclusion of technology in some science courses can be seen as yet another adjustment to the course content. If Kuhnian terminology can be employed in this context, these changes in content can be seen as aspects of 'normal' curriculum change.

Science Curriculum Revolutions

However, overlaying this normal evolutionary change of the science curriculum, the past 50 years have seen at least two more important changes—changes that I believe warrant the term 'curriculum revolutions' [1]. These revolutionary changes—like the paradigm shifts Kuhn described to explain the growth of science—were intended to change, in a fundamental way, how the science curriculum was to be understood, taught and learned. They focused less on the content of the science curriculum, and more on the goals for or purposes of teaching and learning science. Scientific knowledge still represents the core of the curriculum, but the question 'Why are we learning this stuff?' is given a new set of answers.

Roberts’ (1982) concept of curriculum emphasis helps capture the nature of the change involved. For Roberts, the content of science teaching is always presented in a context—he calls it a ‘curriculum emphasis’—which communicates to the student


(often implicitly) the purpose of learning the science content. Roberts has also described a series of seven such curriculum emphases that have characterised science curricula during this century. While elements of most of these can be found in some classrooms today, it was not always so. In the 1950s, barely three of Roberts' seven emphases were to be found, and two of these—'correct explanations' and 'firm foundations'—both see acquiring scientific knowledge as an end in itself, the only worthwhile outcome of the curriculum. In one case—correct explanations—because it (science) is 'true' and in the other—firm foundations—because it sets a foundation for the further study of science. In this context, 'normal change' of the curriculum can be seen as the adjustment of which scientific knowledge should be learned and at which stage. By contrast, 'revolutionary change' involves the introduction of one or more brand new or radically different curriculum emphases, a phenomenon we observed first some 30 or 40 years ago, and are observing, once again, at the present time. The new emphasis not only adds a dimension to the curriculum. It also changes radically the selection of science content seen to be important and changes the ways in which students are expected to interact with that science content.

The first of these periods of revolutionary change began in the late 1950s and 1960s in both the British and American (i.e. USA) education systems, as well as elsewhere in the world. During this period, science curricula became focused on the nature and processes of the scientific discipline itself. This was the period in England of the Nuffield science projects (and those that followed in the same tradition) at both secondary and primary school levels. In the USA, similar emphases were being incorporated into science curriculum projects such as PSSC physics, ChemStudy chemistry and BSCS biology (at the secondary school level) and Science: a process approach (SAPA), the Elementary Science Study (ESS) and the Science Curriculum Improvement Study (SCIS) (at the elementary school level).

A whole literature sprang up, which both articulated the rationale underlying these curriculum projects, and advocated them to teachers and schools (Hurd & Gallagher, 1968; Hurd, 1969, 1970). For example, in a withering critique of traditional school science, Schwab (1965) characterised it as 'a rhetoric of conclusions', which ignored the underlying processes of inquiry that, he argued, more truly represented the essence of science. Spurred on by the release of the Russian Sputnik in 1957, the US government poured millions of dollars into the new science curricula with a concern to generate more and better scientists to support the national security imperatives.

In studying science using these curricula, students were expected not just to learn the concepts and theories of science, but also to acquire an understanding of how science functions as a discipline and the skills associated with scientific investigation. Hodson (1993, p. 106) has summarised the purposes of science education following this period in terms of students 'learning science, learning about science, and doing science'. Learning scientific concepts, laws and theories was still seen as important, but equally important was the context in which the content was to be set. 'Doing science' meant acquiring the skills, strategies and habits of mind associated with


scientific investigation. 'Learning about science' referred to understanding how science functioned as a discipline, its practice, its methods, its logic and its epistemology. Two new emphases—scientific skill development and the structure of science (Roberts, 1982)—had been born, at least on paper.

This science curriculum revolution influenced science curriculum talk (Orpwood, 1998)—in curriculum guides, textbooks and professional development workshops—for nearly four decades. However, classroom teachers were, for the most part, unprepared to teach these new emphases. Few had had any personal experience of 'hands-on' scientific inquiry or received any formal background study in the philosophy of science. Despite the enormous amounts of money and effort that the curriculum projects put into in-service teacher education, the innovations were rarely fully taken up in schools, at least in the form that their developers had in mind (e.g. Stake & Easley, 1978).

Over the subsequent decades, another chapter of the research literature was devoted to explaining why the curriculum revolution had failed to take root fully in American schools. There were many factors involved, but one was the failure of assessment in school science to match the changes in direction adopted by the curriculum. To this point we shall return after considering the second great curriculum revolution of the past half-century.

The second period of revolutionary change began slowly in the early 1980s and has now (in the late 1990s) gathered significant momentum in many countries of the world. If the first revolution focused attention inward towards the structure and processes of science itself, the second balances this with attention outward towards society and the complex relationships among science, technology, society and the environment. A significant literature has now described the development and rationale for the many versions of this new focus for science education (e.g. Hurd, 1975; Aikenhead, 1980; Solomon, 1981; Bybee, 1985; Fensham, 1988; Cheek, 1992a; Solomon & Aikenhead, 1994; Yager, 1996; Black & Atkin, 1996, to name but a few). Now, in addition to acquiring basic scientific knowledge and the skills of scientific investigation, students are being expected to understand how science is related to technology, and how both science and technology impact on society and the environment. This new curriculum emphasis has even acquired its own acronym, STS (for science, technology and society) [2].

Once again, national standards and curriculum guides have begun to embrace this second revolution (e.g. American Association for the Advancement of Science, 1995; National Research Council, 1996; Council of Ministers of Education, Canada, 1997; Government of Ontario, 1998, 1999). Another factor is influencing this revolution in a way that it did not in the 1960s. In the past, goals or aims were usually stated quite independently from the science content. The result was that textbooks, teachers and assessors were free to embrace or ignore them. Now, since many of the newer curriculum guides are stated in the form of outcomes, which incorporate both goals and content, the new emphasis has become an integral part of the curriculum specifications (Orpwood & Barnett, 1997). By way of illustration, I will describe the new Grades 1–8 science and technology curriculum in the Canadian province of Ontario.


The Ontario Curriculum in Science and Technology: a case in point

The document that eventually became The Ontario Curriculum, Grades 1–8, Science and Technology (Government of Ontario, 1998) was developed by a consortium of teachers and school districts led by science educators at York University as a product of the Assessment of Science and Technology Achievement Project (ASAP). It had a series of features that were new for the province. It represented the first curriculum in Ontario in 30 years that clearly articulated expectations in science for each grade of the elementary school. It integrated the study of science with that of technology, the first time technology education was specifically mandated in Ontario. It introduced the study of earth and space sciences into the science curriculum (these areas having been regarded previously as physical geography). It was set out in the form of outcomes for what students should know and be able to do by the end of each year. All of these could be regarded as 'normal' changes to the curriculum, even though these were major changes and presented significant challenges for classroom teachers.

However, the three goal statements represented the element of the curriculum that was revolutionary. These are that students are intended:

· to understand the basic concepts of science and technology;

· to develop the skills, strategies and habits of mind required for scientific inquiry and technological design;

· to relate scientific and technological knowledge to each other and to the world outside the school (Government of Ontario, 1998, p. 4).

These goals emerged from a complex project design that combined analysis of the following factors:

· an up-to-date view of the nature of science and technology;
· curriculum trends nationally and internationally;

· research on children's capacity to learn;
· the experience of classroom teachers;

· consideration of the needs of Canadian society;

· a deliberated consensus about all of these and the needs of Ontario’s children.

This is not the place for an exhaustive account of all of these factors. However, two can serve to demonstrate the origins of the goals adopted by this curriculum. For example, the project sought to link the concepts of 'science' and 'technology' as school subjects to the array of concepts that each embodies in the real world (see Orpwood & Bloch, 1998, pp. 7–9). Both are, first, 'systems of knowledge'—science seeking to describe and explain the natural and physical world, and technology seeking to meet human needs through inventing or modifying devices, structures, systems or processes. Secondly, both science and technology are processes of investigation and exploration—science through the processes of inquiry and technology through those of design. Thirdly—and this represented a new element for


many—science and technology are both social enterprises, which exist in social, economic, political and environmental contexts. Omission of these contexts means that only a partial view of both science and technology is presented.

The changing needs of students in the new millennium were another major component that emerged from the research that examined trends nationally and internationally (Orpwood & Barnett, 1997). The project held deliberations involving a wide variety of stakeholders that led to a clear consensus about the desirable aim of the curriculum: we should ensure that every student receives the opportunity to develop basic scientific literacy and technological capability. These, in turn, involved three elements:

· understanding the core concepts of science and technology;

· acquiring the skills important for life and work in the twenty-first century;
· being able to relate the knowledge and skills acquired in school to real-life situations.

The goals that emerged from this process (which occupied many hundreds of people and lasted two full years) are not coincidentally very similar to those appearing in many other new curricula in jurisdictions around the world. Indeed, analysing these curricula was a component of ASAP. However, they do differ significantly from those of the first curriculum revolution and even more from those from before that time.

Incidentally, at the same time that ASAP was undertaking its curriculum development work, the Council of Ministers of Education, Canada, completed the development of its own framework for science curriculum known as the Pan-Canadian Science Framework (Council of Ministers of Education, Canada, 1997). There was significant interaction between the two projects, since the principal architect of the ASAP document (Marietta Bloch) was also a member of the Pan-Canadian development team. The goals articulated by the Pan-Canadian framework are entirely compatible with those in Ontario and, thus, this document belongs to the new generation of 'second revolution' curriculum frameworks (Aikenhead, 2000).

The three goals, once articulated, form the conceptual glue that binds the rest of the document together. The content is organised in five strands, which effectively integrate the science and technology content knowledge:

· life systems;

· matter and materials;
· energy and control;
· structures and mechanisms;
· earth and space systems.

For each of these strands, at each of the eight grades, the three goals are interpreted in the form of three overall expectations and three sections of specific expectations. The goals are clearly integrated with the content so as to ensure that they are not omitted during implementation.


Assessment in the Context of Curriculum Change

If the curriculum went through ongoing 'normal' change and discrete periods of 'revolutionary' change, then it would be reasonable to expect that the cousin activity of assessment would experience parallel types of change and that current forms of assessment are co-ordinated well with the current curricula. However, I shall argue that this does not turn out to be the case.

If the teaching of science knowledge for its own sake—Roberts' (1982) 'correct explanations' and 'firm foundations' emphases—represents the basic curriculum paradigm of the pre-1960s period, then measuring how much scientific knowledge a student has acquired represents the corresponding assessment paradigm. Throughout the world, science assessment, both in classrooms and in national or international projects, focused on students demonstrating their scientific knowledge chiefly by responding to questions that required recall of memorised information, solving problems through memorised algorithms, and analysis of contrived data or situations that parallel those encountered in school science. This pattern of school science assessment mirrors in many ways patterns experienced in university examinations.

The pattern described here is not restricted to the use of multiple choice items, though these are popular in North America because of the ease and reliability of scoring. The essay-type constructed response items and the short-answer format (more familiar to teachers and students in Europe) are equally likely to call for recall or the simple processing of memorised information. The point I am making here is that 'normal' science assessment comprises a very limited range of student cognitive activities, regardless of the types of assessment item used.

Given this paradigm, validity issues in science assessments usually amount to analysis of the distribution of items across the various science content areas compared to the distribution of the science content topics in the curriculum. One of the problems of assessment thus consists of developing an assessment that is balanced with respect to the many science topics covered, while still maintaining an assessment of reasonable length. In classroom tests, teachers handle this by having frequent 'unit tests' covering small areas of the curriculum. School examinations handle it by a judicious selection from the topics covered—leading, of course, to the students having to try to second-guess which topics will be 'on the exam', with those who guess best being more successful than those whose predictions are less accurate. In large-scale achievement tests, such as TIMSS, the problem is magnified, since the range of topics covered by curricula in the many countries is very broad. This problem can be resolved in part through a complex test design involving a very large pool of items and the use of multiple test booklets (Adams & Gonzalez, 1996). Even so, the problem of test development in TIMSS was significant. With a blueprint based loosely on the science curricula of participating nations (McKnight et al., 1993), it involved making many compromises based on such factors as field-test results, national preferences and the avoidance of large item-by-country interactions (Garden & Orpwood, 1996). While, from a technical (reliability and Item Response Theory (IRT) scaling) perspective, the TIMSS written achievement tests 'worked'


effectively, they have continued to attract criticism from observers in a variety of countries (e.g. Fensham, 1998). The over-arching criticism has been a challenge to their validity, particularly in these times of curriculum change.
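To make the idea of multiple test booklets concrete, the sketch below shows one simple matrix-sampling scheme in which a large item pool is split into clusters and each booklet carries only a few overlapping clusters. It is offered only as an illustration of the principle; the booklet counts, item labels and rotation rule are invented here and are far simpler than the actual TIMSS design described by Adams & Gonzalez (1996).

    def make_clusters(items, cluster_size):
        """Partition the item pool into fixed-size clusters."""
        return [items[i:i + cluster_size] for i in range(0, len(items), cluster_size)]

    def rotate_booklets(clusters, clusters_per_booklet):
        """Assemble booklets by cyclically rotating clusters, so every cluster
        appears in several booklets and neighbouring booklets share clusters
        (the overlap is what allows results to be linked onto a common scale)."""
        n = len(clusters)
        booklets = []
        for b in range(n):
            chosen = [clusters[(b + j) % n] for j in range(clusters_per_booklet)]
            booklets.append([item for cluster in chosen for item in cluster])
        return booklets

    # Hypothetical pool of 40 science items, 8 clusters of 5, 3 clusters per booklet:
    # each student answers only 15 of the 40 items, yet every item is administered.
    pool = [f"S{i:02d}" for i in range(1, 41)]
    booklets = rotate_booklets(make_clusters(pool, 5), 3)
    for i, booklet in enumerate(booklets, 1):
        print(f"Booklet {i}: {len(booklet)} items, beginning with {booklet[0]}")

In such a design no single student faces the full breadth of topics, while the assessment as a whole still covers them all, which is how the breadth-versus-length problem described above can be partly resolved.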

There was much more to TIMSS than the simple assessment of students' science content knowledge and I shall return to further discussion of TIMSS later. First, however, I want to discuss a revolution in assessment that corresponds to the first curriculum revolution described earlier.

Assessment for the 1960s Curriculum Revolution

While many of the curriculum projects that incorporated the new goals (concerning the nature of science and the acquisition of science inquiry skills) attempted to develop their own measures of achievement, these rarely became commonplace in schools or in large-scale assessments. Rather, teachers and national/international assessment projects continued to use traditional assessment measures—measures that, in the main, called for recall of memorised scientific knowledge.

There were perhaps four major reasons for this. First, the assessment technology that would permit valid assessment of students' abilities to conduct investigations in science had not been designed in the 1960s. The first significant 'performance assessments' (as they have now become known) were designed in England in the early 1980s by the Assessment of Performance Unit (APU, 1983), fully 20 years after the goal of instilling 'inquiry skills' in students had first been introduced into the curriculum. Even after its initial successes in the UK, the APU saw its funding cut and it was even longer before this sort of assessment using performance tasks became familiar in North America.

Secondly, even when newer, more authentic assessments had been developed, the psychometric community—particularly in the United States—anxious as it was to maintain the reliability of multiple choice and other objectively scored tests, expressed scepticism about such new measures of assessment. It is only in the last decade that significant research on the characteristics of performance assessments has become commonplace in the educational literature.

The third reason had to do with public credibility: universities and the public thought they knew what traditional tests measured, and new, unproven forms of assessment lacked the familiarity and thus the credibility of the traditional ones. This reason is still, as we shall note later, a problem for science educators who try to make their assessments match the intended goals of the curriculum.

The final reason, I would submit, was a professional inertia amongst teachers themselves, particularly at the secondary school level, for whom assessment has often tended to mimic the examinations experienced at university. Thus, when a national assessment was developed or reviewed by a committee of teachers, the items most likely to be considered acceptable were those of the most traditional variety.

These factors led to a significant delay in the paradigm shift in assessment corresponding to the curriculum revolution of the 1960s. It was the 1980s before performance assessment even made its first significant appearance and the 1990s


before it became at all widespread. Even in TIMSS, the performance assessment component (Harmon et al., 1997), which had initially been described as 'integral' to the study, was later treated as a national option, was reported (at the international level) separately from the paper-and-pencil assessment, and was dropped entirely from the TIMSS-R replication study taking place in 1999.

It appears that the goals that formed the essence of the science curriculum revolution of the 1960s are still not being assessed with the same degree of attention as those that focus on simple recall of scientific information. Stake & Raizen (1997), commenting on this situation in a recent review of curriculum innovations in the United States, observe that:

most reformers in the eight projects we studied agreed that the reconceptualization of science education is incomplete if it leaves out the reconceptualization of assessment. Yet systemic educational reform calls for the use of rigorous, objectively scored, standardized tests as bottom-line criteria. (p. 138)

They also point out the political dilemma:

it is difficult to assure parents, taxpayers, and sceptical teachers that the new curricula and teaching strategies will provide students the information that achievement testing has traditionally required. Reformers who claimed that back in the 1960s failed to be persuasive (Stake & Easley, 1978). (Stake & Raizen, 1997, p. 132)

Assessment for the 1990s Curriculum Revolution

If assessment of the curriculum goals that characterised the 1960s revolution has been delayed and still seeks credibility, that for the STS revolution in science curriculum has barely surfaced at all beyond the research level. Some researchers have recognised the problem (e.g. Aikenhead et al., 1987; Bybee, 1991; Cheek, 1992b) and some projects have attempted to tackle it (e.g. American Chemical Society, 1988; Aikenhead & Ryan, 1992). Cheek (1992b) reports that STS components are contained in the work of the New York State Education Department, the South Australia Senior Secondary Assessment and several examination boards in the UK. In Canada, Alberta Education's assessment branch has also attempted to ensure that the STS components of the curriculum are truly reflected in their provincial assessments.

However, for most classroom teachers and large-scale assessment projects there remains little guidance or exemplary work that addresses how to assess student achievement in the context of an STS-orientated curriculum. In times gone by, this might not have mattered, provided teachers were convinced that an STS emphasis was right to integrate into their science programmes. However, with the increasing importance of assessment and the measuring of students' achievement of the intended outcomes, the absence of STS from classroom and large-scale assessment is likely to have a profound influence on whether the STS-related outcomes in the curriculum are taken seriously.


Item A1

Nuclear energy can be generated by fission or fusion. Fusion is not currently being used in reactors as an energy source. Why is this?

A. The scientific principles on which fusion is based are not yet known.
B. The technological processes for using fusion safely are not yet developed.
C. The necessary raw materials are not yet readily available.
D. Waste products from the fusion process are too dangerous.

FIG. 1. Item A1.


The test development experience of TIMSS once again provides an illustration of some of the difficulties associated with trying to include assessment items that address the STS emphasis in the science curriculum. The main TIMSS item pools for 9- and 13-year-olds—TIMSS populations 1 and 2—contained few items (5% at most) that addressed STS issues and no more than this number that focused on the nature of scientific investigation. However, the Mathematics and Science Literacy (MSL) component of the third TIMSS population (school-leavers) represented the most systematic attempt to include such items, in a category of the test labelled 'Reasoning and Social Utility' (RSU) [3]. While, in the end, RSU was not used as an independent reporting category, several STS items were included in this aspect of the MSL achievement tests. Some of them, such as Item A1, call for students to recall previously learned scientific or technological information (Fig. 1).

Others, such as Item A7, call for the application of scientific principles to a social situation (Fig. 2).

A third type of item, of which there were very few examples in TIMSS, but which perhaps illustrates the STS emphasis more faithfully, is exemplified by Item A11 (Fig. 3).

This item was based on a real-life scenario (described in a newspaper article) and the original item only contained part (B). This version of the item was challenged by the TIMSS subject-matter specialists as 'containing no science' and thus part (A) was added. The second part of the item is clearly an attempt to assess students' STS understanding in that it invites consideration of the social and economic consequences of the introduction of a new technology.

Item A7

Some high-heeled shoes are claimed to damage floors. The base diameter of these very high heels is about 0.5 cm and that of ordinary heels about 3 cm. Briefly explain why the very high heels may cause damage to floors.

FIG. 2. Item A7.
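For readers who want the expected physics behind Item A7, the reasoning is a simple pressure argument; the calculation below uses the diameters given in the item and the usual idealisation of circular heel tips, and is offered only as an illustration of what a full-credit response would contain.

\[
p = \frac{F}{A}, \qquad
\frac{p_{\text{very high heel}}}{p_{\text{ordinary heel}}}
= \frac{A_{\text{ordinary}}}{A_{\text{very high}}}
= \left(\frac{3\ \text{cm}}{0.5\ \text{cm}}\right)^{2} = 36.
\]

The same weight concentrated on a tip with roughly one thirty-sixth the area produces roughly 36 times the pressure, which is why the very high heels can mark or dent a floor surface.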


Item A11

It takes 10 painters 2 years to paint a steel bridge from one end to the other. The paint that is used lasts about 2 years, so when the painters have finished painting at one end of the bridge, they go back to the other end and start painting again.

A. Why must steel bridges be painted?

B. A new paint that lasts 4 years has been developed and costs the same as the old paint. Describe two consequences of using the new paint.

FIG. 3. Item A11.
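One quantitative reading of Item A11 may help show the kind of reasoning part (B) invites; the arithmetic below is only an illustration of one line of argument a student might offer, not an official scoring key, since the item deliberately has no single correct answer.

\[
\text{painter-years per coat} = 10 \times 2 = 20, \qquad
\text{annual painting demand} = \frac{20}{2} = 10 \ \text{painters (old paint)}
\ \longrightarrow\ \frac{20}{4} = 5 \ \text{painters (new paint)}.
\]

With the old paint the ten painters are employed continuously, because the paint wears out exactly as fast as they can renew it; with the longer-lasting paint the same protection requires only half the painting effort, so consequences might include lower maintenance costs, capacity to maintain other structures, or the loss of half the painting jobs: precisely the socio-economic trade-offs the item asks students to consider.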

The item very nearly failed to survive in the MSL achievement test because of a variety of additional factors including:

· difficulties with scoring—the notion of a 'correct answer' is dependent on the socio-political context;

· the view of educators in many countries that it was not appropriate for a science achievement test, even one that focused on science literacy;

· the implicit introduction of ‘values’ into a science assessment—the response one gives to part (B) of this item requires one to adopt a value-laden ‘position’, and think through the assumptions and consequences of that position.

Nevertheless, the item did remain in TIMSS and the results, which are currently being analysed for another paper, show some interesting patterns of response across the world and even within countries.

However, the difficulties encountered in the development and use of this item remain and would appear to be endemic to STS assessment. Aikenhead and his colleagues at the University of Saskatchewan have suggested that 'a new generation of standardized instruments' (Aikenhead et al., 1987) is required or, in the language of this article, a new revolution in assessment. Their work in developing the Views on Science-Technology-Society (VOSTS) instrument certainly represents a challenge to the normal conception of assessment in science. In their words, 'VOSTS requires students to write an argumentative response—a reaction to a statement about a STS topic. Rather than analyzing "right" and "wrong" answers, we let students' arguments define various positions or viewpoints on each STS topic'. While the original VOSTS was not suitable for use in large-scale assessments, it has since been adapted to describe students' views on STS in Ontario (Crelinsten et al., 1993). VOSTS overcame the 'problem of values' by allowing the student to adopt any position, but assessed the quality of the argument.

The new OECD study, the Programme for International Student Assessment, known as PISA, is also attempting to push the bounds of assessment in the area of STS. However, it is resisting the inclusion of items that require values to be analysed. Rather, it is presenting students with scenarios from real life and asking them to demonstrate their abilities at using 'scientific processes' in the analysis of the issues involved (for more information see the PISA Framework document, Programme for International Student Assessment, 1999).


Task 1LS/PT02 (for Grade 1, Life Systems strand): GAME TIME

Design and make a game for a child who is not able to see. Name your game and describe the rules so that others can play it.

What other senses will people who play your game have to use?
Name the materials you used to make the game and describe why you chose them.
Draw a picture of the game and label the parts.
Describe the rules of your game.

FIG. 4. Task 1LS/PT02.

The Assessment of Science and Technology Achievement Project (ASAP) has developed a wide range of assessment tasks for classroom use (Orpwood et al., 1999) corresponding to the full range of the expectations contained in the Ontario science and technology curriculum described earlier. The focus of the collection of 500 tasks covering eight grades is on 'what students can do with what they know', rather than on the traditional 'what they know'. In the area of STS, some of the tasks put students into real-world situations and ask them to reflect on the situation in some important respect. In this respect, the ASAP collection bears similarities to the PISA science assessment. A few sample tasks can serve to illustrate the point (Fig. 4).

The Grade 1, Life Systems unit is entitled ‘Characteristics and Needs of Living Things’ and the task is focused on several expectations from the ‘skills of inquiry and design’ section of the curriculum including asking questions about the needs of living things, planning investigations and communicating results. In addition, the task addresses the STS expectations of comparing the ways in which humans use their senses to meet their needs, and describing ways in which people adapt to the loss or limitation of sensory ability.

Not all the tasks are ‘hands-on’ in the sense of requiring students to undertake practical work in a laboratory setting. Consider the following task, for example, from the Grade 7 Life Systems unit on ‘Interactions within Ecosystems’ (Fig. 5).


Task 7LS/EA04 (for Grade 7, Life Systems strand)

A construction company is about to bulldoze a wood lot with a pond nearby so that a new housing development can be built. Devise a plan so that the new houses get built and yet the environment, and the plants and animals in it, get protected. We want this to be a win-win situation. How can the new houses be built and yet the environment still protected?

FIG. 5. Task 7LS/EA04.


These tasks call for students to think holistically about a real-world situation, taking into account the competing demands of apparently conflicting positions. The Grade 7 task calls for the creative development of solutions to a problem that clearly has no right or wrong answers. In both cases, students must have developed prior knowledge and skills, and in both cases their responses will demonstrate their abilities at these. The focus here is on integrated, open-ended thinking of a kind not usually sought in science assessments. Scoring responses to such a question will be hard, especially if reliability considerations are paramount. Yet both would appear to be entirely appropriate given the expectations of the curriculum. While these examples are not presented as ideal examples of STS assessment items, they represent the direction that the needed assessment revolution must pursue if the latest curriculum revolution is to be reflected adequately in classroom assessments.

Concluding Thoughts: what counts as science assessment?

Clearly, new directions—arguably revolutions—are emerging in science curricula in various parts of the world. The assessments required to determine students' achievement of the new goals of science curricula, however, have been slow to catch up. While recent progress in the use of performance assessments has focused attention on what students can 'do' in science, as well as on what they 'know', the new challenges presented by the STS revolution in science education have not been systematically addressed by most assessments. Indeed, the problem of 'what counts' as science assessment has in many cases not developed much from the pre-revolutionary era when measuring the quantity of students' knowledge of science was the major focus.

Of course, the new STS revolution appears in many varieties. It is not the case that all versions of STS curriculum are focused on the same specific goals or integrate STS with science content in the same way or to the same extent, as Aikenhead (1994) has pointed out. For example, the five items sampled above all reflect some aspect of STS in that all of them link science topics with STS content. However, each of them does so in a different way. Some (e.g. A7 and A11a) call for students simply to apply their scientific knowledge, albeit in an STS context. Others (such as A1) call for students to recall specific STS information. Yet others (e.g. item A11b) require students to demonstrate little knowledge of science content, but rather to be able to reason about the impact of science and technology in a social context.

Those of us who advocate STS in science education have a responsibility to clarify more precisely what we expect students to be able to demonstrate in an assessment context if we expect STS to appear more consistently in science assessments of any kind. A framework is needed that enables analysis of the varieties of STS objective incorporated in a curriculum, and thus of the types of assessment that are appropriate. Aikenhead's framework provides a useful start, but it focuses on the percentage of a complete assessment that is STS. As the items shown here demonstrate, the issue is not simply one of 'how much' of an assessment is STS, but also 'what types' of student performance are called for and how these relate to the intent of the STS curriculum. Any move towards a more comprehensive framework for the assessment of STS must take these complexities into account.


International and other large-scale assessments face a particular dilemma. On the one hand, their validity is sometimes determined (as was the case in TIMSS) not only in reference to the content of the intended curricula, but also partly in relation to the implemented curricula. Even in countries that intend the curriculum to include STS, implementation may lag way behind the intended curriculum changes. It is hard, therefore, for such international projects to provide 'leadership' in terms of promoting new forms of assessment having higher validity in some countries, while also remaining 'acceptable' to all participants. At the same time, the political status and high-profile consequences of large-scale international studies such as TIMSS may encourage the maintenance of the status quo, or even slow down the spread of curriculum revolutions across and within the countries that participate. Leadership is therefore required from all quarters to ensure that innovations such as performance assessment and STS assessment are not allowed to be regarded as 'second-class' or entirely 'optional' ways of assessing achievement in science education. In the case of large-scale assessments, this requires new models for addressing validity to be introduced, such as that proposed (but not implemented) for TIMSS by Shavelson et al. (American Educational Research Association, 1993). They proposed what became known within TIMSS as the 'flower and petals' model, involving a core cluster of items as an assessment for all countries, and other clusters of items, which would be taken by those countries selecting to do so. Such a model might have gone some way to resolving the dilemma of validity across the many countries participating in TIMSS.
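The 'flower and petals' idea can be pictured as a common core of items administered in every country, plus optional item clusters that individual countries choose to add. The sketch below illustrates only that structure; the country names, item labels and petal categories are invented for the example and are not taken from the Shavelson et al. proposal.

    # Core ("flower") items taken by every participating country,
    # plus optional clusters ("petals") that countries opt into.
    CORE = {"C01", "C02", "C03", "C04"}

    PETALS = {
        "STS": {"STS1", "STS2"},            # science-technology-society items
        "PERFORMANCE": {"PERF1", "PERF2"},  # hands-on performance tasks
    }

    # Hypothetical national selections.
    national_options = {
        "Country A": ["STS"],
        "Country B": ["STS", "PERFORMANCE"],
        "Country C": [],
    }

    def national_test(country):
        """Return the item set a country would administer: the common core
        plus whichever optional petals that country has selected."""
        items = set(CORE)
        for petal in national_options.get(country, []):
            items |= PETALS[petal]
        return sorted(items)

    for country in national_options:
        print(country, national_test(country))

International comparisons would then be reported on the core for everyone, while countries sharing a petal could additionally be compared on that cluster, which is how such a design might reconcile cross-national comparability with higher validity for countries whose curricula already include STS.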

At the same time, the professional inertia that resists change in assessment at the classroom, local and national levels needs to be addressed. Here, I believe that the key move is to integrate assessment with the professional development of teachers, as is already the practice of the Ontario provincial assessment programme and in the next phase of ASAP currently under way. Teachers will work on developing new forms of assessment for their classrooms as part of an ongoing series of professional development workshops, and thereby address together the challenges of a new STS curriculum and of assessing it in an appropriate way.

Finally, academic leadership must be shown through greater collaboration between the curriculum and psychometric research communities. One of the casualties of academic specialisation is that those of us schooled in the issues of curriculum, teaching and learning are not often also up-to-date with developments in assessment, while those whose expertise lies in assessment have not had time or interest to understand the complexities of the revolutions that have taken place in the curriculum. Dialogue across this divide is required if the revolutions of the intended science curriculum are to be reflected in the real and reported achievements of students, in whose interests the entire enterprise is undertaken.

NOTES

[1] The term 'revolution' is also used in this way by Atkin et al. (1996).

[2] Sometimes, ‘Environment’ is added as an additional element to STS, making the acronym STSE (see Council of Ministers of Education, Canada, 1997, for example).

[3] Orpwood & Garden (1998) describe the test development for the MSL component of TIMSS in detail.


REFERENCES

Adams, R. & Gonzalez, E. (1996) The TIMSS test design, in: M. Martin & D. Kelly (Eds) Third International Mathematics and Science Study, Technical Report, Volume 1: design and development (Chestnut Hill, Boston College).
Aikenhead, G. (1980) Science in Social Issues: implications for teaching (Ottawa, Science Council of Canada).
Aikenhead, G. (1994) What is STS science teaching? in: J. Solomon & G. Aikenhead (Eds) STS Education: international perspectives on reform, pp. 47–59 (New York, Teachers College Press).
Aikenhead, G. (2000) STS science in Canada: from policy to student evaluation, in: D. Kumar & D. Chubin (Eds) Science, Technology, and Society: a source book on research and practice, pp. 49–89 (Kluwer/Plenum Press).
Aikenhead, G. & Ryan, A. (1992) The development of a new instrument: 'Views on science-technology-society' (VOSTS), Science Education, 76, pp. 477–491.
Aikenhead, G., Fleming, R. & Ryan, A. (1987) High school graduates' beliefs about science-technology-society, Science Education, 71, pp. 145–161.
American Association for the Advancement of Science (AAAS) (1995) Project 2061: science literacy for a changing future, a decade of reform (Washington DC, AAAS).
American Chemical Society (ACS) (1988) ChemCom: chemistry in the community (Dubuque, Kendall/Hunt).
American Educational Research Association (AERA) (1993) TIMSS achievement test item pools, unpublished report (Vancouver, University of British Columbia).
Assessment of Performance Unit (APU) (1983) Science at Age 11 (London, Department of Education and Science).
Atkin, M., Black, P., Britton, E. & Raizen, S. (1996) A global revolution in science, mathematics and technology education, Education Week (April 10).
Black, P. & Atkin, M. (Eds) (1996) Changing the Subject: innovations in science, mathematics, and technology education (New York, Routledge).
Bybee, R. (1985) The Sisyphean question in science education: what should the scientifically and technologically literate person know and be able to do as a citizen? in: R. Bybee (Ed.) Science-Technology-Society, 1985 NSTA Yearbook, pp. 79–93 (Washington DC, National Science Teachers Association).
Bybee, R. (1991) Science-Technology-Society in science curriculum: the policy-practice gap, Theory into Practice, 30(4), pp. 294–302.
Cheek, D. (1992a) Thinking Constructively about Science, Technology, and Society Education (Albany, SUNY Press).
Cheek, D. (1992b) Evaluating learning in STS education, Theory into Practice, 31(1), pp. 64–72.
Council of Ministers of Education, Canada (CMEC) (1997) Common Framework of Science Learning Outcomes (Toronto, CMEC).
Crelinsten, J., DeBoer, R. J. & Aikenhead, G. (1993) Measuring Students' Understanding of Science in its Technological and Social Context (Toronto, Ministry of Education).
Fensham, P. (1988) Approaches to the teaching of STS in science education, International Journal of Science Education, 10, pp. 346–356.
Fensham, P. (1998) Insights from TIMSS for Australian science education, unpublished paper presented at the Annual Meeting of the National Association for Research in Science Teaching, San Diego, April 21–25, 1998.
Garden, R. & Orpwood, G. (1996) Development of the TIMSS achievement tests, in: M. Martin & D. Kelly (Eds) Third International Mathematics and Science Study, Technical Report, Volume 1: design and development, pp. 2.1–2.19 (Chestnut Hill, Boston College).
Government of Ontario (1998) The Ontario Curriculum, Grades 1–8, Science and Technology (Toronto, Ministry of Education and Training).
Government of Ontario (1999) The Ontario Curriculum, Grades 9–10, Science (Toronto, Ministry of Education and Training).
Harmon, M., Smith, T., Kelly, D., Beaton, A., Mullis, I., Gonzalez, E. & Orpwood, G. (1997) Performance Assessment in IEA's Third International Mathematics and Science Study (Chestnut Hill, Boston College).
Hodson, D. (1993) Towards a more critical approach to practical work in school science, Studies in Science Education, 22, p. 106.
Hurd, P. (1969) New Directions in Teaching Science in Secondary Schools (Chicago, Rand McNally).
Hurd, P. (1970) New Directions in Teaching Science for Junior High Schools (Belmont, Wadsworth).
Hurd, P. (1975) Science, technology and society: new goals for interdisciplinary science teaching, Science Teacher, 42, pp. 27–30.
Hurd, P. & Gallagher, J. (1968) New Directions in Elementary Science Teaching (Chicago, Rand McNally).
McKnight, C., Schmidt, W. & Raizen, S. (1993) Test blueprints: a description of the TIMSS Achievement Test Content Design, TIMSS document ICC797/NRC357 (Vancouver, University of British Columbia).
National Research Council (NRC) (1996) National Science Education Standards (Washington DC, NRC).
Orpwood, G. (1998) The logic of science curriculum talk, in: D. Roberts & L. Östman (Eds) Problems of Meaning in Science Curriculum, pp. 133–149 (New York, Teachers College Press).
Orpwood, G. & Barnett, J. (1997) Science in the National Curriculum: an international perspective, Curriculum Journal, 8(3), pp. 331–249.
Orpwood, G. & Bloch, M. (1998) Implementing the Ontario Curriculum, Grades 1–8: science and technology (Toronto, Ontario English Catholic Teachers Association).
Orpwood, G. & Garden, R. (1998) Assessing Mathematics and Science Literacy, TIMSS Monograph No. 4 (Vancouver, Pacific Educational Press).
Orpwood, G., Bloch, M., Bartley, A., Herridge, D. & Marks, M. (1999) Classroom Assessment in Science and Technology: a resource handbook for teachers (Toronto, Nelson).
Programme for International Student Assessment (PISA) (1999) Measuring Student Knowledge and Skills: a new framework for assessment (Paris, Organisation for Economic Co-operation and Development).
Roberts, D. (1982) Developing the concept of 'curriculum emphases' in science education, Science Education, 66, pp. 243–260.
Robitaille, D., Schmidt, W., Raizen, S., McKnight, C., Britton, E. & Nicol, C. (1993) Curriculum Frameworks for Mathematics and Science, TIMSS Monograph No. 1 (Vancouver, Pacific Educational Press).
Schwab, J. (1965) Science as inquiry, in: J. Schwab & P. Brandwein (Eds) Science as Inquiry, pp. 1–103 (Cambridge, Harvard University Press).
Solomon, J. (1981) Science and society studies in the curriculum, School Science Review, 82, pp. 213–220.
Solomon, J. & Aikenhead, G. (Eds) (1994) STS Education: international perspectives on reform (New York, Teachers College Press).
Stake, R. & Easley, J. (1978) Case Studies in Science Education (Urbana, Center for Instructional Research and Curriculum Evaluation, University of Illinois).
Stake, R. & Raizen, S. (1997) Underplayed issues, in: S. Raizen & E. Britton (Eds) Bold Ventures: 1. Patterns among Innovations in Science and Mathematics Education, pp. 11–153 (Dordrecht, Kluwer).
Yager, R. (1996) Science/Technology/Society as Reform in Science Education (Albany, SUNY Press).
