The Current Status of Test Development in Neurobehavioral Toxicology

Ann M. Williamson

The number of tests used in neurobehavioral toxicology has proliferated in recent years, as has the number of groups producing test batteries. Nevertheless, the area remains a difficult one despite the increased interest in it, and many questions remain unanswered.

Neurobehavioral methods have been used to examine an increasing number of toxicants, but the impact of the findings from such studies has differed considerably across countries. Evidence from neurobehavioral testing has been highly influential in lowering acceptable health standards for lead and solvent exposure, for example, in Scandinavian countries, but it has had little effect in many other countries (e.g., Australia and Britain). Political forces no doubt play a large part in these differences, but the fact remains that evidence from neurobehavioral tests is simply not convincing to many decision makers in the latter countries.

The reasons for this are the problems encountered in neurobehavioral testing, namely, problems of selection of controls; accounting for confounding variables such as alcohol, drug use, education, age, and socioeconomic status; selection of sensitive and comprehensive tests; and quantifying exposure. It is apparent that these problems are often regarded as sufficient evidence to reject an overwhelming weight of evidence that would otherwise be seen as convincing.

Behavioral Measures of Neurotoxicity http://www.nap.edu/catalog/1352.html

For occupational lead exposure, for example, a review of the literature since 1980 demonstrates that of the 14 or so papers published on neurobehavioral effects, all but 2 show statistically significant impairments in some tests in lead workers compared to controls. In all studies, lead exposure levels were within the subclinical range (i.e., below 3.8 µmol/L). Despite this, critics of the field tend to "throw the baby out with the bathwater" and concentrate their attention on the undeniable flaws in most of the studies without looking at the literature as a whole. It therefore becomes essential to ensure that neurobehavioral testing for toxic effects is as rigorous as possible if it is to have a significant impact on decision makers.
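The weight of this vote count can be illustrated with a rough calculation. The sketch below is an illustration only, not an analysis taken from the reviewed papers: it treats "14 or so" as exactly 14 independent studies and asks how likely it is that 12 or more would report a significant impairment if each study had some fixed chance of a spurious positive result. The function name prob_at_least and the per-study chances tried are assumptions made for the example, and the calculation ignores publication bias and differences in study quality.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial probability of k or more 'positive' studies out of n,
    when each study independently has probability p of a positive result."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed scenarios (not from the source): a nominal 5% false-positive
# rate per study, and a deliberately pessimistic 50% rate to allow for
# studies that run many tests and report any significant one.
p_nominal = prob_at_least(12, 14, 0.05)
p_pessimistic = prob_at_least(12, 14, 0.5)

print(f"12+ of 14 positive by chance, p=0.05 per study: {p_nominal:.2e}")
print(f"12+ of 14 positive by chance, p=0.50 per study: {p_pessimistic:.4f}")
```

Even under the pessimistic assumption that every study had an even chance of a spurious finding, 12 or more positive studies out of 14 would be expected less than 1 percent of the time.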

RESEARCH DESIGN AND TEST SELECTION

Although the major factor of interest here is the choice of tests, it is not really possible to divorce questions of experimental design from those relating to test selection. A number of workers have reviewed the study design problem for neurobehavioral toxicology (Gamberale, 1985; Valciukas and Lilis, 1980).

Virtually all studies in this area are cross-sectional, in which exposed workers are compared to nonexposed workers on their performance of a battery of neurobehavioral tests. Test selection is a problem in this type of design because many tests are sensitive to extraneous differences between exposed and nonexposed workers, and these differences can contribute to, and potentially confound, any observed difference in test performance. For example, in a study by Parkinson et al. (1986), when the effects of age, education, and income were removed statistically, significant differences between lead-exposed workers and controls on some neurobehavioral tests disappeared.
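A small worked example shows how a group difference can vanish once a confounder is removed statistically. It is a minimal sketch on invented data, not the data or the method of Parkinson et al. (1986): the test score is constructed to depend on age alone, and the exposed group is simply older. The function adjusted_group_effect is a hypothetical helper that returns the ordinary least-squares coefficient for group membership in a model with one covariate.

```python
from statistics import mean


def adjusted_group_effect(group, covariate, score):
    """OLS coefficient of `group` in the model score ~ group + covariate."""
    g_bar, c_bar, s_bar = mean(group), mean(covariate), mean(score)
    g = [x - g_bar for x in group]
    c = [x - c_bar for x in covariate]
    s = [x - s_bar for x in score]
    s_gg = sum(x * x for x in g)
    s_cc = sum(x * x for x in c)
    s_gc = sum(x * y for x, y in zip(g, c))
    s_gs = sum(x * y for x, y in zip(g, s))
    s_cs = sum(x * y for x, y in zip(c, s))
    # Closed-form solution of the two-predictor normal equations.
    return (s_gs * s_cc - s_cs * s_gc) / (s_gg * s_cc - s_gc**2)


# Invented data: 0 = control, 1 = exposed; exposed workers are older.
group = [0, 0, 0, 0, 1, 1, 1, 1]
age = [25, 30, 35, 40, 35, 40, 45, 50]
# The score depends on age only; exposure has no effect by construction.
score = [100 - 0.5 * a for a in age]

exposed = [s for s, g in zip(score, group) if g == 1]
control = [s for s, g in zip(score, group) if g == 0]
raw_difference = mean(exposed) - mean(control)
adjusted = adjusted_group_effect(group, age, score)

print("raw exposed - control difference:", raw_difference)
print("difference after adjusting for age:", adjusted)
```

The unadjusted comparison shows a 5-point deficit in the exposed group, while the age-adjusted coefficient is zero. A test that is insensitive to age would not have shown the spurious deficit in the first place.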

Problems of confounding in this design are usually dealt with by matching exposed and control groups or by statistical means. Appropriate test selection, however, can produce the same effect: tests can be chosen that are not vulnerable to the effects of confounding factors such as age, education, or ethnic background, at least for working populations. This has not been done in any study to date. It is common, though, for researchers to investigate the effect of putative confounding variables prior to using various statistical techniques to minimize confounding (Hogstedt et al., 1983; Valciukas and Lilis, 1982).

For the few prospective cohort studies done in this area, the problem of test selection lies in selecting tests that are not susceptible to the effects of practice. Because workers are tested on more than one occasion, it is essential that test-retest results are not muddied by the fact that workers will improve from one test session to another simply because they have seen the test before. The results of the single prospective study performed on occupational lead exposure (Mantere et al., 1984), for example, were less convincing because of strong training effects on the performance of a number of tests.
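The way practice can muddy test-retest results is easy to show with invented numbers. The sketch below is an illustration only and is not the analysis used by Mantere et al. (1984). It assumes that exposed and control workers gain equally from practice, so the controls' improvement between sessions can be subtracted from the change seen in exposed workers. The function name practice_corrected_change and all scores are hypothetical.

```python
from statistics import mean


def practice_corrected_change(exposed_t1, exposed_t2, control_t1, control_t2):
    """Change in exposed workers minus the change in controls.

    Assumes the practice gain is the same in both groups, so the
    controls' session-to-session gain estimates the practice effect.
    """
    practice_gain = mean(control_t2) - mean(control_t1)
    exposed_change = mean(exposed_t2) - mean(exposed_t1)
    return exposed_change - practice_gain


# Invented scores (higher = better) at two test sessions.
control_t1 = [50, 52, 48, 50]
control_t2 = [55, 57, 53, 55]  # controls improve by 5 points with practice
exposed_t1 = [50, 51, 49, 50]
exposed_t2 = [52, 53, 51, 52]  # exposed workers improve by only 2 points

raw_change = mean(exposed_t2) - mean(exposed_t1)
corrected = practice_corrected_change(exposed_t1, exposed_t2, control_t1, control_t2)

print("raw change in exposed workers:", raw_change)
print("change after removing the practice gain:", corrected)
```

Taken alone, the exposed workers appear to improve by 2 points. Relative to the practice gain seen in controls, their performance has fallen by 3 points. A test with little or no practice effect would make this correction, and the assumption behind it, unnecessary.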

REASONS FOR TEST SELECTION

In most studies a set or battery of psychological or behavioral tests is used. Typically, the tests are chosen for one or more of the following four reasons:

1. The test is a well-standardized psychological test for which the distribution of scores in the population is already established, e.g., subtests from the Wechsler Adult Intelligence Scale (WAIS; Wechsler, 1955).

2. The tests will measure some aspects of functioning that are known to be influenced by the toxic substance based on clinical evidence; e.g., knowing that inorganic mercury exposure produces unintentional hand tremor would result in a tremor test or some other motor test being included in the test battery.

3. Each test corresponds roughly to a particular psychological function, e.g., including the Santa Ana test for motor functions, the Digit Symbol test for attention, and Digit Span test for memory functions.

4. Tests may be chosen on some theoretical grounds such that each test is juxtaposed against another in some logical manner (Williamson et al., 1982).

In some studies, however, no rationale is provided for the choice of tests in a particular battery.

Various test batteries are currently employed for a range of neurotoxic substances, and there are a number of reviews that describe them (Anger, 1984, 1986; Gullion and Eckerman, 1986; Johnson and Anger, 1983). Some of these batteries, such as that devised at the Finnish Institute of Occupational Health (Hanninen and Lindstrom, 1979), have been used in multiple studies of a range of toxic hazards. Consequently, some estimate can be made about their sensitivity. Many batteries, however, are simply put together for a single purpose, and if no rationale is provided for test selection, the reliability, validity, and sensitivity are largely unknown, particularly if no sound rationale exists from previous work.

Neurobehavioral Core Test Battery

The Neurobehavioral Core Test Battery (NCTB) was developed at a meeting of a group of experts from a range of disciplines as a joint initiative of the World Health Organization (WHO) and the National Institute for Occupational Safety and Health (NIOSH). The aim of the meeting was to devise a test battery that could be used to screen for neurotoxic effects with particular reference to its use in developing countries. Although the rationale for test selection was not much different from that used in many other studies, this initiative constitutes a significant leap forward because it is an attempt to set up norms for each test that are applicable to comparisons between and within cultural boundaries. It is argued that by applying the test battery in a range of countries it should be possible to estimate the influence of cultural differences on test performance. Moreover, on a more practical level, such cross-cultural comparison should allow better comparison of results from studies performed in different countries.

The Computerized Battery Approach

The computerized battery approach capitalizes on the recent boom in personal computer development by designing a battery that is administered by computer. Two advantages of computerized testing are that test administration is standardized and reproducible, requiring minimal involvement by the test administrator, and that data handling and scoring are made easier so that the results can be reported immediately.

Two main research groups have developed computer-administered test batteries. Probably the best known is the battery developed by Baker and Letz at Harvard (Baker et al., 1985). After some standardization procedures, the final test battery includes three memory tests from the Wechsler Memory Scale (WMS) or Wechsler Adult Intelligence Scale (WAIS); two tests classified as measuring verbal concept formation, both from the WAIS; four visuomotor tests, two of which are from the WAIS; and a mood scale. Three of these tests are from the NCTB.

This battery has been subjected to some validation (Baker et al., 1983), although the description of the investigation of test reliability is not clear (Fidler et al., 1987). Some estimates can be made as to the sensitivity of this battery because it has been used to show impairments in both lead-exposed (Baker et al., 1984) and solvent-exposed workers (Fidler et al., 1987).

The second well-known computer-assisted battery was developed at the University of North Carolina by Foree et al. (1984). This battery (also known as the Microtox battery) was based on the work of Carroll (1980), who proposed a theory of 10 factors to represent the range of cognitive abilities.

Carroll holds that all test performance is partitionable into smaller building blocks, which he called elementary cognitive tasks (ECTs). The battery consists of three sensory tests, two psychomotor tests, two attention tests, eight memory tests, and one test that is classified as "other."

Investigations of validity and reliability were performed by Carroll in the development of the theory. The sensitivity of the battery has been evaluated to some extent in studies of the effects of carbon monoxide and alcohol (Foree et al., 1984).

Model or Theory-Based Tests and Test Batteries

A few test batteries have some theory of psychological function as their basis. The aim of this approach is to facilitate interpretation of results. Smith and Langolf (1981), for example, take the view that tests selected for a battery should be ability-specific and have some underlying theoretical structure. This, they argue, allows interpretation to be made in terms of the processing stages or systems that produce performance on a test; furthermore, scores can be given for individual stages in processing. The argument advanced by Smith and Langolf appears to rest mostly on their use of the Sternberg memory scanning test (Sternberg, 1966, 1975), which has a well-developed theoretical basis. There has been considerable debate, however, on the adequacy of Sternberg's theory to account for all aspects of performance on this test. Gullion and Eckerman (1986) state that this debate is sufficient to make unwarranted any inferences about the intactness of underlying cognitive processes on the basis of performance on the memory scanning task.

Rather than choosing particular tests that have well-developed theoretical structures, two research groups have employed theory-based test batteries. The first is the Microtox test battery described above. In this battery, Carroll's elementary cognitive task theory (Carroll, 1980) is used to guide test selection.

The second is a battery devised by Williamson and colleagues (Williamson and Teo, 1986; Williamson et al., 1982, 1987), in which the choice of tests is based on information-processing theory (Wickens, 1984, 1987). This battery includes a sensory test, three psychomotor tests, a sustained attention test, and four memory tests: a sensory store memory test, two short-term or working memory tests, and a long-term memory test. Validation has been carried out on this test battery, and some limited studies of reliability have been done, but these results have not been published. The sensitivity of this battery has also been evaluated in detecting effects of inorganic mercury (Williamson et al., 1982), inorganic lead (Williamson and Teo, 1986), and prolonged exposure to the underwater environment (Williamson et al., 1987).

Model or theory-based test batteries have the advantage of providing a comprehensive coverage of neurobehavioral functions that is often missing from other approaches to battery design. Screening test batteries in particular should be designed on this basis. For screening purposes, battery design should proceed as if nothing is known about the neurobehavioral effects of the toxin in question because clinical symptoms may be very misleading. For example, a commonly reported symptom in toxically exposed workers is fatigue (Fidler et al., 1987; Hanninen et al., 1979; Valciukas et al., 1979). However, a researcher would have great difficulty selecting an appropriate test to investigate this symptom further unless a holistic or theoretical approach was taken in designing the test battery. Clinically manifested fatigue can have mental or physical origins, so which should be tested? In addition, if, for example, the fatigue is due to problems in maintaining mental effort, it would be impossible to determine which of the functional areas or stages of processing is responsible. Unless all possibilities are tested, no clear conclusions can be drawn regarding the action of the supposed toxin.
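The coverage argument can be made concrete by listing the functions a model distinguishes and checking which of them a battery actually tests. The sketch below is illustrative only. The list MODEL_FUNCTIONS is an assumed simplification of an information-processing model, the test names are hypothetical labels, and the classification of each test under a single function is an assumption made for the example. The composition of the theory-based battery follows the description given above (one sensory test, three psychomotor tests, one sustained attention test, and four memory tests).

```python
# Assumed functional areas of a simplified information-processing model.
MODEL_FUNCTIONS = [
    "sensory",
    "psychomotor",
    "sustained_attention",
    "sensory_store",
    "working_memory",
    "long_term_memory",
]


def untested_functions(battery, model_functions=MODEL_FUNCTIONS):
    """Return the model functions that no test in the battery addresses."""
    covered = set(battery.values())
    return [f for f in model_functions if f not in covered]


# Hypothetical test labels mapped to the function each is taken to measure.
theory_based_battery = {
    "sensory_test": "sensory",
    "psychomotor_test_1": "psychomotor",
    "psychomotor_test_2": "psychomotor",
    "psychomotor_test_3": "psychomotor",
    "sustained_attention_test": "sustained_attention",
    "sensory_store_test": "sensory_store",
    "working_memory_test_1": "working_memory",
    "working_memory_test_2": "working_memory",
    "long_term_memory_test": "long_term_memory",
}

# A hypothetical battery built only around known clinical symptoms.
symptom_based_battery = {
    "tremor_test": "psychomotor",
    "digit_span": "working_memory",
}

print("gaps, theory-based battery:", untested_functions(theory_based_battery))
print("gaps, symptom-based battery:", untested_functions(symptom_based_battery))
```

Any function left in the list of gaps is one in which an impairment could go undetected, whatever the results on the tests that were given.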

Another difficulty with designing a battery without some global structure is that interpretation of test results can be made only for single tests. For example, a typical study of the effects of lead exposure in which the battery included tests of a range of neurobehavioral functions may have found impairments in reaction time, learning, and memory functions in lead workers, compared to nonexposed controls, but no apparent impairment of other functions. In this case the conclusion would be made that lead exposure affects motor, learning, and memory functions. If, however, the tests could be related on the basis of a cohesive theory, the interpretation might be very different. For example, in the study of lead exposure by Williamson and Teo (1986) the clustering of performance impairments seen in lead workers compared to controls suggested that sensory motor, learning, sensory store, and short-term but not long-term memory functions were affected. Applying the information-processing principles on which the tests were selected, however, suggests another reading: because vision was involved in the performance of each test and vision was impaired, it is just as likely that lead is affecting only the sensory function measured (in this case, vision) and that problems relating to the adequacy of stimulus input would explain the impairments in the other functions. This possibility is being pursued.
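This style of interpretation can be sketched as a simple set operation: list the processing stages each test depends on, then look for a stage shared by every impaired test and by none of the unimpaired ones. The example is a toy illustration, not a reconstruction of the Williamson and Teo (1986) battery or results. The test names, the stage names, the mapping between them, and the pattern of impairments are all hypothetical, and the mapping assumes that the unimpaired tests do not rely on visual input.

```python
# Hypothetical mapping from each test to the processing stages it requires.
TEST_STAGES = {
    "visual_sensory_test": {"visual_input"},
    "visual_reaction_time": {"visual_input", "response_execution"},
    "visual_sensory_store": {"visual_input", "sensory_store"},
    "visual_short_term_memory": {"visual_input", "working_memory"},
    "hand_steadiness": {"response_execution"},
    "spoken_long_term_memory": {"auditory_input", "long_term_memory"},
}


def single_stage_explanations(impaired, unimpaired, test_stages=TEST_STAGES):
    """Stages required by every impaired test and by no unimpaired test.

    A non-empty result means one upstream deficit could account for the
    whole pattern, so separate deficits need not be assumed.
    """
    shared = set.intersection(*(test_stages[t] for t in impaired))
    spared = set().union(*(test_stages[t] for t in unimpaired))
    return shared - spared


# Hypothetical pattern of results in exposed workers.
impaired = [
    "visual_sensory_test",
    "visual_reaction_time",
    "visual_sensory_store",
    "visual_short_term_memory",
]
unimpaired = ["hand_steadiness", "spoken_long_term_memory"]

candidates = single_stage_explanations(impaired, unimpaired)
print("stages that alone could explain the pattern:", candidates)
```

Read test by test, the toy results would suggest deficits in reaction time, sensory store, and short-term memory. Read through the mapping, a single deficit in visual input is enough to account for all of them.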

INFLUENCE OF DIFFERENT TESTING OBJECTIVES ON TEST SELECTION

The position taken above regarding the utility of theory-based test batteries is also relevant to the question of the most important objectives for neurobehavioral testing. It is commonly agreed that there are two levels of testing (Hanninen, 1981). The primary level focuses on screening for neurobehavioral insult by particular substances; the secondary level, on determination of the functional and, if possible, the neurological sites and mechanisms of the toxic action. Test selection is affected by the level or reason for testing. For screening batteries it is argued that tests should be quick and easy to administer and should concentrate on the known effects of the toxin, whereas for the second level of testing it is maintained that tests can be more complex and time-consuming.

Although this dichotomy of levels has real merit on practical grounds, there has been a tendency to focus too much on the "getting the job done" approach (Gullion and Eckerman, 1986) at the expense of "understanding the phenomenon." The result has been a tendency to use tests that are fast and expedient rather than comprehensive and informative about the toxic effect on the system. Comprehensive screening does not necessarily mean understanding the phenomenon but, rather, being aware of the breadth of the problem. A test battery designed around an information-processing model, for example, will provide a proper screening tool that reduces the possibility of Type II errors (false negatives) occurring simply because the affected function was not tested adequately or at all. As Wickens (1987) states about information-processing theory,

From the perspective of human factors, the importance of the distinction between processing stages results because knowing that a particular environmental stressor, chemical toxicant or system characteristic influences one processing stage and not another has important implications for system redesign or reconfiguration. For example, knowing that a given stressor influences response processes and not encoding should lead the designer to focus on the improvement in control, rather than the display interface.

In the same way, neurobehavioral toxicology needs to be able to distinguish effects on function by appropriate choice of tests at the screening level of testing. Questionnaires are simply not adequate for initial screening.
