PERFORMANCE ASSESSMENT
Dilemma 2: How seriously should we take the inclusion of conative and affective factors in some of these new sets
At the heart of this research agenda is a need for traditional construct validity studies in which the constructs to be measured are operationalized and explicitly linked to the content of the assessment and its scoring criteria or standards. Coherence at this level is not enough, however. It is also necessary to demonstrate relationships between the construct as measured and other constructs or outcomes. For example, do judgements of a student’s reading accomplishment on performance tasks correlate with the teacher’s judgement of that accomplishment? With performance on other tasks or measures of reading? With general academic success or real-world self-sufficiency?
While proponents of performance assessment hold that the authentic nature of the tasks attests to their validity as measures of reading, a collection of ‘authentic’ tasks may fall short in terms of representing the broad domain of reading. It is impossible to know whether the domain is adequately represented without studies of this kind.
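The correlational questions raised above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the scores are invented, the measure names are hypothetical, and the Pearson coefficient is one reasonable choice among several (a rank-based coefficient may suit ordinal rubric scores better).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: six students' scores on a performance task (1-4 rubric).
task_scores = [1, 2, 2, 3, 4, 4]

# Hypothetical external measures for the same six students, in the same order.
external_measures = {
    "teacher_judgement": [1, 2, 3, 3, 4, 4],
    "other_reading_measure": [35, 40, 52, 55, 70, 68],
    "general_academic_success": [2.1, 2.8, 2.5, 3.0, 3.6, 3.4],
}

# One coefficient per external measure, each paired with the task scores.
validity_coefficients = {
    name: pearson_r(task_scores, scores)
    for name, scores in external_measures.items()
}

for name, r in validity_coefficients.items():
    print(f"{name}: r = {r:.2f}")
```

Coefficients of this kind address only the relationship between the construct as measured and other outcomes. They say nothing about whether the set of tasks covers the domain of reading, which requires a separate analysis of content.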
Think-aloud procedures could be useful for understanding how participants perceive the tasks and the standards used to judge their responses or work (Ericsson and Simon, 1984; Garner, 1987). By asking students to tell us the decisions they make as they construct and present their responses, we can begin to determine the fit between the task as intended and as perceived by the participant and assess the magnitude of the threat to validity imposed by a lack of fit. This research could help us to create tasks that are more resistant to multiple interpretations as well as help us to improve scoring criteria to address a variety of interpretations.
Likewise, we could ask scorers to think aloud while they score performance tasks to gain insight into how individual judges interpret standards and assign scores to student work. Through think-alouds, it may be possible to determine the extent to which the underlying conceptualization of reading as represented in the task and scoring rules is guiding judges’ decisions as well as the extent to which extraneous factors are influencing scoring.
Dilemma 2: How seriously should we take the inclusion of conative and affective factors in some of these new sets
NSP OWNERSHIP DIMENSION FOR READING
• Initiates participation in reading communities outside the classroom.
• Consistently demonstrates self-confidence, independence, and persistence.
• Pursues reading for enjoyment.
• Reads widely and for a variety of purposes.
• Evaluates own reading to set personal goals.
• Analyses personal responses to text.
• Selects challenging texts.
In the USA, treating motivation and other affective dimensions as facets of assessment has become a ‘damned if you do and damned if you don’t’ situation. It is very hard to scale students on these dimensions. Self-report measures are fraught with error, while observations and other surveillance strategies can be personally invasive. On the other hand, motivation is so clearly relevant to most discussions of student achievement that failure to account for it severely limits the validity and utility of the test results. Affective factors occupy a salient but conflicted position in our assessment logic. Rarely, at least until performance assessments came along, have we included them as a part of formal assessment in our schools, yet we do privilege them in other, often equally important forms of evaluation, such as prospective employer checklists (Is this individual reliable? punctual? cooperative?) and letters of recommendation.
Some designers of assessment systems are attempting to assess both cognitive and affective factors and the relationship among them. The Interstate New Teacher Assessment and Support Consortium (INTASC; CCSSO, n.d.) has proposed a set of standards (soon to be followed by assessments) for initial licensure. One of its principles illustrates the point:
Principle 3: The teacher understands how students differ in their approaches to learning and creates instructional opportunities that are adapted to diverse learners.
This principle is unpacked in three interwoven sections: knowledge, dispositions and performances.
Knowledge:
• The teacher understands and can identify differences in approaches to learning and performance, including different learning styles, multiple intelligences and performance modes, and can design instruction that helps use students’ strengths as the basis for growth.
Dispositions refer to beliefs and attitudes that teachers would have to hold in order to implement the standard.
Dispositions:
• The teacher believes that all children can learn at high levels and persists in helping all children achieve success.
• The teacher is sensitive to community and cultural norms.
Finally, performances represent the ‘evidence’ that teachers can meet the standard.
Evidence:
• The teacher identifies and designs instruction appropriate to students’ stages of development, learning styles, strengths and needs.
• The teacher brings multiple perspectives to the discussion of subject matter, including attention to students’ personal, family, and community experiences and cultural norms.
• The teacher makes appropriate provisions (in terms of time and circumstances for work, tasks assigned, communication and response modes) for individual students who have particular learning differences or needs.
Even though INTASC assessments are still in their formative stages and unavailable for review, it would be hard to imagine performance assessments, at least based upon these standards, that would not include indices of affect, will, and disposition.
Of all the dilemmas discussed in this chapter, this one may prove both the most challenging and the most interesting. Both the challenge and the interest come from stepping into a personal world often viewed as the prerogative of the family, or at least of some institution other than the school. By including conative and affective factors in formal assessments, are we in danger of imposing a societally sanctioned view of dispositions? What if a child proves to be a contemptible, anti-social genius? What if a student shows contempt for reading, shuns goals for improving attitude, does not seek challenges, but performs well? Should we be concerned about affect as long as cognitive performance is solid? The answer is neither clear nor simple, involving an examination of the role of schooling in society and issues of family rights and privacy.
Research possibilities
It is difficult to recommend research on this dilemma. While motivational features are an important part of assessment, they are clearly grounded in academic assessments. We know a great deal about motivation but very little about how professionals use formal evidence of motivational factors in educational decision-making. We seem to want evidence of this sort, but at the same time we are concerned about issues of individual and family privacy and prerogative. What may be needed is a line of inquiry examining the value added to overall decision making when information about motivation, attitudes, and dispositions is available. It seems important to include students and parents as well as teachers in the category of educational decision-makers when we conduct this research.
Dilemma 3: Can we come to terms with the social