A core interest of personnel psychology is whether some intervention in selection, training, or motivation relates to some criterion. A criterion is an evaluative standard that is used as a yardstick for measuring employees' success or failure on the job. In many instances, the criterion of interest will be job performance, but a criterion could also be a particular attitude, ability, or motivation that reflects an operational statement of goals or desired outcomes of the organization. Implicit in this definition is that the criterion is a social construct defined by organization leaders, who are responsible for formulating and translating valued organizational outcomes.

THE ULTIMATE CRITERION

Ultimately, we are interested in predicting the ultimate criterion, that is, the full domain of employees' performance, including everything that ultimately defines success on the job. Given the totality of this definition, the ultimate criterion remains a strictly conceptual construct that cannot be measured or observed. To approach it, however, and to describe the connection between the outcomes valued by the organization and the employee behaviors that lead to these outcomes, J. F. Binning and G. V. Barrett introduced the concept of behavior–outcome links. In practice, these links should be based on a thorough job analysis, in the form of either a job description that analyzes the actual job demands or a job specification that reveals the constructs required for good performance.

THE OPERATIONAL CRITERION

The conceptual nature of the ultimate criterion requires practitioners to deduce and develop the criterion measure, or operational criterion, an empirical measure that reflects the conceptual criterion as well as possible. Using this operational criterion as a proxy for the conceptual criterion of interest, the usual approach in personnel psychology is to establish a link between performance on a predictor and performance on the operational criterion as an indication of the predictor's criterion-related validity. Operational criteria might include the following:

• Objective output measures (e.g., number of items sold)

• Quality measures (e.g., number of complaints, number of errors)

• Employees’ lost time (e.g., occasions absent or late)

• Trainability and promotability (e.g., time to reach a performance standard or promotion)

• Subjective ratings of performance (e.g., ratings of knowledge, skills, abilities, personal traits or characteristics, performance in work samples, or behavioral expectations)

• Indications of counterproductive behaviors (e.g., disciplinary transgressions, personal aggression, substance abuse, or voluntary property damage)

In practice, operational criteria should satisfy at least three independent requirements.

1. Operational criteria must be relevant to the organization's prime objectives. Although this may sound obvious in theory, in practice, criterion choices are often based on convenience (e.g., using data from performance records that are "lying around anyway"), habit, or copying what others have used.

Though recorded output data such as sales volume might be easily accessible, they may represent a more suitable criterion measure for some organizations (e.g., car dealers striving for high short-term sales figures) than for others (e.g., car dealers relying on the good word-of-mouth reputation that results from superior customer service).

2. Operational criteria must be sensitive in discriminating between effective and ineffective employees.

This requires (a) a linkage between performance on the operational criterion and the employee's actual performance on the job (i.e., the link between the operational and the conceptual criterion) and (b) variance among employees. Speed of production, for example, may be an unsuitable criterion in cases in which speed is restrained by the tempo of an assembly line; likewise, the number of radioactive accidents is likely to be low across all nuclear power plant engineers, making this theoretically relevant criterion practically useless (see the sketch following this list).

3. Operational criteria need to be practicable, as the best-designed evaluation system will fail if management is confronted with extensive data recording and reporting without seeing a commensurate return for the extra effort.
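To make these requirements concrete, the following minimal sketch (an illustration added here, with hypothetical data and names, not part of the original entry) estimates criterion-related validity as the Pearson correlation between predictor scores and operational criterion scores, and shows how a criterion with almost no variance among employees yields an uninformative coefficient:

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between predictor scores and criterion scores."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical predictor scores (e.g., a selection test) for six employees.
predictor = [42, 55, 61, 48, 70, 66]

# An operational criterion with real variance among employees (units sold).
units_sold = [18, 25, 31, 22, 38, 33]

# A theoretically relevant criterion with almost no variance (accidents).
accidents = [0, 0, 0, 1, 0, 0]

print(pearson_r(predictor, units_sold))  # a sizable validity coefficient
print(pearson_r(predictor, accidents))   # driven entirely by a single case
```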

THE CRITERION PROBLEM

The criterion problem describes the difficulties involved in conceptualizing and measuring how the conceptual criterion of interest, a construct that is multidimensional and dynamic, can best be captured by an operational criterion. According to Binning and Barrett, this problem is compounded when the job analysis on which the criterion is based is of poor quality and when the link between the operational and the conceptual criteria has been weakly rationalized.

CRITERION DEFICIENCY AND CONTAMINATION

Any operational criterion will suffer from at least one of two difficulties. First, criterion deficiency is a formidable and pervasive problem because operational criterion measures usually fail to assess all of the truly relevant aspects of employees' success or failure on the job. Second, operational criteria may be contaminated because many of the measures are additionally influenced by external factors beyond the individual's control. One of the most persistent reasons a measured criterion is deficient is the multidimensionality of the ultimate criterion, which combines static, dynamic, and individual dimensionality. Two other reasons for both contamination and deficiency are the unreliability of performance and of performance observation. Finally, a source associated primarily with criterion contamination is stable biases influencing criterion evaluations.

Static dimensionality implies that the same individual may be high on one facet of performance but simultaneously low on another. Thus, although an employee may do terrific work in terms of classical task performance (i.e., activities that transform raw materials into the goods and services produced by the organization or that help with this process), the same employee may show relatively poor contextual or organizational citizenship behaviors (i.e., behaviors that contribute to the organization's effectiveness by providing a good environment in which task performance can occur, such as volunteering, helping, cooperating with others, or endorsing and defending the organization to outsiders). Moreover, the same individual might engage in counterproductive behaviors or workplace deviance (i.e., voluntary behaviors that violate organizational norms and threaten the well-being of the organization and its members, such as stealing, avoiding work, or spreading rumors about the organization).

Another aspect of static dimensionality addresses whether performance is observed under typical performance conditions (i.e., day-in, day-out, when employees are not aware of any unusual performance evaluation and when they are not encouraged to perform their very best) or under maximum performance conditions (i.e., short-term evaluative situations during which the instruction to maximize efforts is plainly obvious, such as work samples). The distinction is important because performance on the exact same task can differ dramatically between situations, not only in absolute terms but also in terms of the rank order of employees' performance. A likely reason is that typical versus maximum performance situations influence the relative impact of motivation and ability on performance.

Temporal or dynamic dimensionality implies that criteria change over time. This change may take any of three forms. First, the average of the criterion may change because performers, as a group, may grow better or worse with time on the job. Second, the rank order of scores on the criterion may change as some performers remain relatively stable in their performance while others substantially increase or decrease their performance over time. Third, the validity of any predictor of performance may change over time because of changing tasks or changing subjects. The changing task model assumes that because of technological developments, the different criteria for effective performance may change in importance while individuals' relative abilities remain stable (e.g., a pub starts accepting a new billing method and expects employees to know how to handle it). Alternatively, the changing subjects model assumes that it is not the requirements of the task but each individual's level of ability that changes over time (e.g., because of increased experience and job knowledge or decreasing physical fitness).

Individual dimensionality implies that individuals performing the same job may be considered equally good, but the nature of their contributions to the organization may be quite different. For example, two very different individuals working in the same bar may end up with the same overall performance evaluation if, for example, one of them does a great job making customers feel at home while the other one does a better job at keeping the books in order and the bills paid.

Criterion reliability, the consistency or stability with which the criterion can be measured over time, is a fundamental consideration in human resource interventions. However, such reliability is not always given. More precisely, there are two major sources of unreliability. Intrinsic unreliability results from personal inconsistency in performance, whereas extrinsic unreliability results from variability that is external to job demands or the individual, such as machine downtimes, the weather (e.g., in construction work), or delays in supplies, assemblies, or information (in the case of interdependent work) that may contaminate ratings at some times but not necessarily at others. Practically, there is little remedy for criterion unreliability except to search for its causes and to sample and aggregate multiple observations over time and over the domain to which one wants to generalize.
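The gain from such aggregation can be expressed (an illustration added here, assuming classical test theory, not part of the original entry) with the Spearman–Brown formula: if a single criterion observation has reliability $r$, the average of $k$ parallel observations has reliability

$$r_{kk} = \frac{k\,r}{1 + (k - 1)\,r},$$

so averaging, say, $k = 4$ ratings of reliability $r = .40$ yields $r_{kk} = 1.60/2.20 \approx .73$.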

Besides lack of reliability in the criterion itself, lack of reliability in the observation is another cause of discrepancy between the conceptual and the operationalized criterion—that is, the criterion measurement may yield very different results depending on how it is rated and by whom. Thus, objective data and ratings by supervisors, peers, direct reports, customers, or the employee may greatly diverge in their evaluations of an employee's performance for diverse reasons. Although an employee's direct supervisor may seem to be the best person available to judge performance against the organization's goals, he or she may actually observe the employee only rarely and thus lack the basis to make accurate judgments. Such direct observation is more frequent among peers, yet relationships (e.g., friendships, coalitions, in-groups versus out-groups) among peers are likely to contaminate ratings. Direct reports, in contrast, primarily observe the evaluated individual in a hierarchical situation that may not be representative of the evaluated individual's overall working situation, and, if they fear for their anonymity, they, too, may present biased ratings. For jobs with a high amount of client interaction, clients may have a sufficiently large observational basis for evaluations, yet they may lack the insight into the organization's goals needed to evaluate the degree to which employees meet these goals.

Operational criteria can be contaminated because of diverse errors and biases. Error, or random variation in the measurement, lowers the criterion reliability and is best addressed through repeated collection of criterion data. Biases, however, represent a systematic deviation of an observed criterion score from the same individual's true score. Because they are likely to persist across measures, biases may even increase statistical indications of the operational criterion's reliability. If the same biases are equally related to the measure used to predict the criterion (e.g., performance in a specific personnel selection procedure), they may also increase the predictor's statistical criterion-related validity, even though this validity is not based on the actual criterion of interest but on the bias persisting across measurements.
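To illustrate this last point, the following is a hypothetical simulation (added here, not drawn from the original entry) in which the same rater bias enters both the predictor score and the criterion rating; the shared bias inflates the observed predictor–criterion correlation relative to a criterion rated by an independent source:

```python
import random

random.seed(1)

def pearson_r(xs, ys):
    """Pearson correlation between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

n = 500
true_ability = [random.gauss(0, 1) for _ in range(n)]

# The same rater's bias (e.g., an impression formed during selection)
# enters BOTH the predictor rating and the later criterion rating.
rater_bias = [random.gauss(0, 1) for _ in range(n)]

predictor = [a + b + random.gauss(0, 1)
             for a, b in zip(true_ability, rater_bias)]
criterion_same_rater = [a + b + random.gauss(0, 1)
                        for a, b in zip(true_ability, rater_bias)]

# Criterion rated by an independent source: no shared bias.
criterion_independent = [a + random.gauss(0, 1) for a in true_ability]

print(pearson_r(predictor, criterion_same_rater))   # inflated (about .67 in expectation)
print(pearson_r(predictor, criterion_independent))  # closer to the true validity (about .41)
```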

Among the diverse possible biases, biases resulting from knowledge of predictor information or group membership and biases in ratings are particularly prominent. If, for example, a supervisor has been involved in the selection of a candidate, his or her impression during the selection process is likely to influence the evaluation of this candidate’s later fit or performance on the job. Group membership may incur bias if the organization provides support to some groups but not others (e.g., management training for women only) or if different standards are established for different groups (e.g., quota promotions). Finally, biases in criterion ratings may result from personal biases or prejudices on the part of the rater, from rating tendencies (leniency, central tendency, or severity), or from the rater’s inability to distinguish among different dimensions of the criterion (halo). These effects will become more severe as employees’ opportunities to demonstrate their proficiency become more unequal and as the rater’s observation becomes more inaccurate.

—Ute-Christine Klehe

See also Counterproductive Work Behaviors; Employee Selection; Job Performance Models; Training Methods

FURTHER READING

Binning, J. F., & Barrett, G. V. (1989). Validity of personnel decisions: A conceptual analysis of the inferential and evidential bases. Journal of Applied Psychology, 74, 478–494.

Cascio, W. F. (1998). Applied psychology in human resource management (5th ed.). Upper Saddle River, NJ: Prentice Hall.

Guion, R. M. (1998). Assessment, measurement, and prediction for personnel decisions. Mahwah, NJ: Lawrence Erlbaum.