9
North American neuropsychology is largely based on the psychometric tradition of measurement. Many neuropsychologists practicing in North America were trained in the context of graduate psychology departments. Because they received their training from psychologists, rather than from neurologists who may have provided ancillary training experiences, clinical neuropsychologists largely have chosen assessment instruments that involve standardized assessment. Many clinical neuropsychologists utilized a fixed battery approach in the past. In the early 1980s, the Battle of the Batteries consisted of adherents of either the Halstead-Reitan (HRNB) or of the Luria-Nebraska (LNNB) conducting studies, publishing results, and trading comments. It seemed as if neither battery would achieve hegemony. Surveys such as that described by Guilmette and Faust (1991) indicated that although the HRNB had greater frequency of usage, both batteries were commonly used.
Since that time, changes in some of the influences on clinical neuropsychology have resulted in a lesser emphasis on comprehensive evaluations using instruments that have been normed and standardized on a single sample. Whether this is a positive trend is subject to review.
These influences include the development of sophisticated, highly focused assessment instruments such as the California Verbal Memory Test and the Warrington Recognition Memory Test in the realm of memory assessment. Yet another influence of change has been restrictions brought about by the carriers of reimbursement. Managed care organizations (MCOs) have severely limited the amount of time they will pay for in a neuropsychological evaluation. As a result, it is likely that fewer comprehensive evaluations are being conducted using full fixed batteries. A recent survey indicates that the use of the flexible battery approach has increased in frequency while the fixed battery approach has decreased (Sweet, Moberg, & Westergaard, 1996). Even with this decrease, the fixed battery will remain with us for a long time. If publications are any indication, there is not a strong decline in interest in these batteries as dependent measures (e.g., Haltiner, Temkin, Winn, & Dikmen, 1996; Hom, Haley, & Kurt, 1997). Nonetheless the battery approach may undergo some modification in the future. One likely scenario would be for components of batteries to be normed on the same sample but for those components also to be applicable in their single manifestations. An example of this approach can be seen in the Wechsler scales in which the same subset of subjects was used to norm the intelligence scales (WISC-III and WAIS-III), the achievement scales (Wechsler Individual Achievement Test or WIAT), and the memory scales (WMS-III).
Rather than the battle of the batteries described earlier, the recent controversy regarding the HRNB has centered on criticisms from a leading proponent of the flexible approach (Lezak, 1995), who has been very vocal in criticizing the fixed battery approach.
Earlier editions of her textbook had dismissed the LNNB, and the third edition contains a spirited and opinionated critique of the HRNB. Russell (1998) provided an ample response to the criticisms and suggested that Lezak had little understanding of the HRNB and of standardized testing. Some of Lezak's (1995) comments regarding the HRNB do seem to be misinformed, but some of Russell's (1998) responses assume a specific and particular use of the HRNB. It is true that there is an underlying assumption of pattern recognition being an important model of interpretation of the HRNB, but this assumption has not been well articulated by Reitan. It was Russell himself who in a series of writings (Russell, 1984, 1986, 1994, 1997) has provided the most cogent exegesis of this set of concepts. It is also true that other authors have provided different models of interpretation of the HRNB. For example, Jarvis and Barth (1994) presented a model, based on Reitan (1967), in which level of performance and the presence of pathognomonic signs also figure in the interpretation.
Bradford (1992) also discussed process relationships in the interpretation of data from the HRNB. Regardless of the method of interpretation favored by an individual, the accuracy of that interpretation is evaluable empirically, and that, not opinion, is the main issue in critiquing an instrument.
THE HALSTEAD-REITAN NEUROPSYCHOLOGICAL BATTERY
The HRNB is the forerunner of comprehensive neuropsychological batteries. Ward Halstead collected the procedures from the psychological literature after an extensive review. Later Ralph Reitan extended and modified the battery. The intent of the battery was to identify the psychological aspects of impaired brain function. Although there is general agreement that the HRNB is highly reliable and valid (Hevern, 1980), there is much more empirical information regarding its validity than there is regarding its reliability. Perhaps because of its longevity, the HRNB or its subtests are sometimes used as criterion measures in evaluating new tests (e.g., Coleman, Moberg, Raglund, & Gur, 1997). The original HRNB consisted of the following seven tests which were selected for their ability to discriminate between subjects with frontal lobe lesions and patients with other lesions or normal subjects (Halstead, 1947; Reitan & Davison, 1974): (1) the Category Test; (2) the Tactual Performance Test (a modification of the Seguin-Goddard Form Board); (3) the Rhythm Test (which originally appeared in the Seashore Measures of Musical Talent);
(4) the Speech Sounds Perception Test; (5) the Finger Oscillation Test (Finger Tapping Test); (6) the Critical Flicker Fusion Test; and (7) the Time Sense Test. The Flicker Fusion Test and the Time Sense Test are not typically included in current modifications of the HRNB, as they have not been shown to reliably differentiate neurologically impaired subjects from unimpaired subjects (Boll, 1981; Russell, Neuringer, & Goldstein, 1970). The five remaining tests produce seven individual scores, three scores (total time, memory, and location) being derived from the Tactual Performance Test (TPT). These scores are used to calculate an Impairment Index, which represents the proportion of the patient's scores that fall within the impaired range. An Impairment Index of .5 is considered the cutoff for overall performance within the impaired range.
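The Impairment Index calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a scoring program: the cutoffs are those listed in Table 9.1, the use of the dominant-hand Finger Tapping score as the seventh score and the rounding of the index to one decimal place are assumptions, and the names `CUTOFFS`, `is_impaired`, and `impairment_index` are hypothetical.

```python
# Illustrative sketch only; all names here are hypothetical.
# Cutoffs follow Table 9.1. Each entry is (direction, cutoff):
# ">" means scores above the cutoff are impaired, "<" means scores below it are.
CUTOFFS = {
    "category_errors": (">", 50),
    "tpt_total_seconds": (">", 942),
    "tpt_memory_correct": ("<", 6),
    "tpt_location_correct": ("<", 5),
    "rhythm_errors": (">", 4),
    "speech_sounds_errors": (">", 7),
    "finger_tapping_dominant": ("<", 51),  # assumption: dominant hand only
}


def is_impaired(name, score):
    """Return True if the score falls in the impaired range for that measure."""
    direction, cutoff = CUTOFFS[name]
    return score > cutoff if direction == ">" else score < cutoff


def impairment_index(scores):
    """Proportion of the seven scores in the impaired range, to one decimal."""
    missing = set(CUTOFFS) - set(scores)
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    impaired = sum(is_impaired(name, scores[name]) for name in CUTOFFS)
    # Assumption: the index is reported to one decimal place.
    return round(impaired / len(CUTOFFS), 1)


# Invented scores for a single hypothetical patient.
example = {
    "category_errors": 62,          # impaired
    "tpt_total_seconds": 800,       # not impaired
    "tpt_memory_correct": 7,        # not impaired
    "tpt_location_correct": 3,      # impaired
    "rhythm_errors": 6,             # impaired
    "speech_sounds_errors": 5,      # not impaired
    "finger_tapping_dominant": 48,  # impaired
}

index = impairment_index(example)   # 4 of 7 scores impaired
overall_impaired = index >= 0.5     # the .5 cutoff given in the text
```

Representing each cutoff with an explicit direction keeps error-type scores (higher is worse) and correct-response scores (lower is worse) in one table, which is where hand scoring most easily goes wrong.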
Although the Impairment Index has proved to be a clinically useful measure, it is important to note that diagnostic conclusions regarding the simple presence or absence of brain damage that are based on this measure have been found to be less accurate than those obtained by clinical judgment based on tests, interviews, and medical history (Tsushima & Wedding, 1979). Because of the amount of time required to obtain the Halstead Impairment Index (HII), an alternative index (Alternate Impairment Index or AII; Horton, 1995a,b) using some of the same tests was proposed. Further research indicated that the AII may not be as accurate as the HII in describing the degree of severity of brain damage (Horton, 1997). [Additionally, there is another summary index score available from the HRNB, namely, the General Neuropsychological Deficit Scale (GNDS). This score sums information from the results of the individual subtests and appears to be sensitive to both localized and diffuse damage. The GNDS is discussed further in the section on validity.]
In addition to the tests listed in the preceding, many clinicians augment this core battery with tests of verbal and visuospatial memory (e.g., Dodrill, 1978, 1979; Matarazzo, Wiens, Matarazzo, & Goldstein, 1974; Russell, 1980), perceptual integrity (e.g., Reitan, 1966; Russell et al., 1970), and motor performance (e.g., Harley, Leuthold, Matthews, & Bergs, 1980; Matthews & Haaland, 1979). In addition to the original tests comprising the Halstead battery, Reitan has included several additional procedures: the Wechsler Adult Intelligence Scale, an aphasia and sensory-perceptual battery, the Trail Making Test, and a measure of grip strength. It should be noted that the titles "Halstead-Reitan Neuropsychological Battery" and "Halstead-Reitan Battery and Allied Procedures" refer to three separate test batteries (adult, intermediate, and young children). The adult battery, the Halstead Neuropsychological Test Battery and Allied Procedures, is used for persons 15 years old and older. The battery for children aged 9 to 15 years is the Halstead Neuropsychological Test Battery for Children and Allied Procedures. The battery for children aged 6 to 9 years is the Reitan Indiana Neuropsychological Test Battery for Children. Each of these batteries includes a minimum of 14 separate tests and 26 variables, as well as an aphasia and constructional praxis test of 31 separate items (Boll, 1981).
Normative Data
Table 9.1 provides a listing of cutoff scores for the adult HRNB and many of its commonly used additional procedures. The norms for this table were adopted from Halstead (1947), Reitan (n.d.), Russell et al. (1970), and Golden (1977, 1978a).
Although the research literature in general is supportive of the HRNB, it is important to note that the original norms for this battery (see Boll, 1981) were not well founded. It is questionable whether the 29 subjects that were used as normals in this research were appropriate. For example, ten of the subjects were diagnosed as having "minor" psychiatric problems, one subject was awaiting criminal sentencing (either life imprisonment or execution) at the time of testing, and four subjects were awaiting lobotomies because of aberrant behavior. In spite of these criticisms, the HRNB has proved to be quite robust in its ability to assess neurological impairment.
For those reasons, recent publications have sought to provide more comprehensive normative information. Bornstein (1986c) presented normative information on the differences between left- and right-sided performance on the Grooved Pegboard, the Smedley dynamometer, and the Finger Tapping Test. Bornstein (1986a) examined the cutoff scores in relation to performance by 365 normal subjects and found that between 15% and 80% of the subjects would have been misclassified as impaired. Fromm-Auch and Yeudall (1983) presented normative data based on a sample of 193 normal adult subjects. Steinmeyer (1986) reviewed the available normative data and calculated metanorms for the HRNB. Alekoumbides, Charter, Adkins, and Seacat (1987) provided age and education corrections based on a limited sample of 235 patients.

TABLE 9.1 Ranges for Brain-Injured Performance on the Halstead-Reitan Battery

Test                                   Impaired range
Aphasia exam                           >6 points or 72 errors
Category test                          >50 errors
Finger Agnosia                         >2 errors
Finger Tapping Test (dominant)         <51 taps
Finger Tapping Test (nondominant)      <46 taps
Fingertip Number Writing               >3 errors
Grip Strength (dominant)               <40 kg
Grip Strength (nondominant)            <35 kg
Impairment Index                       >.4
Rhythm Test                            >4 errors
Speech Sounds Perception Test          >7 errors
Suppressions (all modalities)          >0
Tactile Form Recognition               >0 errors
Tactual Performance Test
  Total time                           >942 seconds
  Memory                               <6 correct
  Location                             <5 correct
Trail Making Test (Part A)             >39 seconds
Trail Making Test (Part B)             >91 seconds
Subsequently, Heaton, Grant, and Matthews (1991) published a set of normative information regarding the HRNB in which scores could be translated to standardized T-scores, stratified by gender, age, and education. The publication of this information started a controversy in which Reitan questioned the wisdom of using age or education corrections (Reitan & Wolfson, 1995a). The basic argument is that the demographically based corrections may be unnecessary because the effects of brain damage are stronger than any effects of age or education. Shuttleworth-Jordan (1997) and Vanderploeg, Axelrod, Sherer, Scott, and Adams (1997) reexamined the data presented by Reitan and Wolfson (1995a) and instead concluded that the demographic corrections are necessary, especially when the effects of subtle neurological impairment may actually be masked by or mimicked by age or education. This conclusion is similar to earlier reports that age and education have consistent effects on test performance whether or not the subject is neurologically impaired (Sherer & Adams, 1993).
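The translation of a raw score to a standardized T-score can be illustrated with a short Python sketch. It assumes the conventional T-score metric (mean of 50 and standard deviation of 10 in the reference group) and a simple mean-and-standard-deviation lookup per demographic stratum. The function name `t_score` is hypothetical, and the normative means and standard deviations below are invented purely for illustration; they are not values from Heaton et al. (1991).

```python
# Illustrative sketch only. All normative values below are invented
# and are NOT taken from any published set of norms.

def t_score(raw, norm_mean, norm_sd, higher_is_better=True):
    """Convert a raw score to a T-score (mean 50, SD 10) for one stratum."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    if not higher_is_better:
        # For time or error scores, flip the sign so that a higher
        # T-score always means better performance.
        z = -z
    return 50.0 + 10.0 * z


# The same hypothetical raw time of 95 seconds, evaluated against two
# hypothetical strata (time scores: lower is better).
raw_seconds = 95
t_younger = t_score(raw_seconds, norm_mean=60, norm_sd=20, higher_is_better=False)
t_older = t_score(raw_seconds, norm_mean=90, norm_sd=25, higher_is_better=False)
# t_younger is 32.5 and t_older is 48.0: one raw score, two interpretations.
```

The example shows why the choice of stratum matters: an identical raw score falls well below average in one reference group and close to average in another.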
The presence of age and education effects is consistent with current theory and empirical knowledge regarding neuropsychological test data, and it is probably wise to consider these factors when interpreting the results of the HRNB. However, there is another cautionary consideration here. The cell sizes for the various stratifications of age, education, and gender are not given in the Heaton et al. (1991) manual. However, using simple arithmetic and the total number of subjects, we can see that the normative bases may be quite limited in size, especially for the lower education and upper age ranges. As a result, even these norms should be interpreted with caution. Fastenau and Adams (1996) presented these and other criticisms of the Heaton et al. (1991) published norms in a very negative review. Heaton, Grant, Matthews, and Avitable (1996) responded with the argument that the tables are based on a regression analysis of the entire sample rather than on just the individual cells of age, education, and gender permutations. This rejoinder is unconvincing because the tables are presented as comparisons to substantial collections of subjects complete with percentiles and T-scores. Other suggestions by Heaton et al. (1996), such as that the scores on the 60-item version of the Boston Naming Test be prorated to use the 85-item experimental version represented in the comprehensive norms, also are not convincing. Heaton et al.'s (1996) replies to other criticisms raised by Fastenau and Adams (1996) are reasonable. It is important to not lose sight of the fact that the Heaton comprehensive norms are superior in some aspects to those provided in the original publications and therefore represent a substantial contribution to the resources available to clinicians who use the HRNB. However, it should also be said that clinicians would be wise to not rely exclusively on this data base in the interpretation of HRNB scores. Fastenau (1999) examined the use of the Heaton norms in interpreting the TMT-A, TMT-B, and Boston Naming Test scores of 63 healthy older adults. Unfortunately, applying the Heaton norms created education effects in the sample, and age influences were overcorrected. Clearly, caution should be used, at least in this age range (over 60 years old).
Reitan and Wolfson (1995a) argued against the use of these demographic corrections. Their point is that the effects of cerebral impairment will largely overcome the effects of age, education, or gender. Reitan and Wolfson (1995a) marshalled evidence that the influence of age and education is much stronger for normal subjects than for impaired subjects. In their sample of 100 individuals equally divided between control subjects and subjects with documented brain damage, the General Neuropsychological Deficit Scale (GNDS) clearly separated the two groups of subjects. In addition, the GNDS was correlated with age and education only for the control group. The use of demographic correction factors may not affect the sensitivity of the HRNB to neurological impairment, but it might affect the specificity by increasing the false-positive rate in subjects with advanced age or low education. Moses, Pritchard, and Adams (1999) evaluated the Heaton et al. (1991) demographic corrections in a sample of 290 neurological and 346 psychiatric patients. Although these researchers concluded that the norms were helpful in reducing the effect of age and education on HRNB performance, they warned against interpreting the scores as reflecting the same levels of impairment on different tests because they were generated on a sample of only normal individuals.
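The distinction drawn above between sensitivity and specificity can be made concrete with a small Python sketch. It assumes the standard definitions (sensitivity is the proportion of truly impaired subjects classified as impaired; specificity is the proportion of intact subjects classified as intact; the false-positive rate is one minus specificity). The function name `classification_rates` is hypothetical, and the counts are invented for illustration rather than drawn from any study cited here.

```python
# Illustrative sketch only; the counts below are invented.

def classification_rates(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity, and false-positive rate from counts."""
    impaired = true_pos + false_neg   # subjects who truly are impaired
    intact = true_neg + false_pos     # subjects who truly are intact
    if impaired == 0 or intact == 0:
        raise ValueError("both groups must contain at least one subject")
    return {
        "sensitivity": true_pos / impaired,
        "specificity": true_neg / intact,
        "false_positive_rate": false_pos / intact,
    }


# Hypothetical sample: 50 impaired and 50 intact subjects.
rates = classification_rates(true_pos=45, false_neg=5, true_neg=40, false_pos=10)
# sensitivity 0.9, specificity 0.8, false-positive rate 0.2
```

Because sensitivity is computed only from the impaired group and specificity only from the intact group, a change in norms can shift one without moving the other, which is the possibility raised in the text.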
It is not just the use of demographic corrections that is at issue here. The choice of normative groups and standards has extremely important implications for the interpretation of data. Kalechstein, van Gorp, and Rapport (1998) found very different conclusions in a set of Monte Carlo data when the fictitious test scores were compared to different sets of published norms. There is much variability in test performance even among normal subjects. It is extremely important to understand the characteristics of the normative sample and compare them to the subject at hand in choosing a normative set of data for interpretation purposes.
The GNDS is another method suggested for interpretation of the HRNB. The GNDS (Reitan & Wolfson, 1993) is a sum reflecting four different areas of information from the HRNB; namely, level of performance, pathognomonic signs, patterns and relationships among test results, and right-left differences. There are 42 different factors comprising these areas. In addition, it is possible to calculate a Left Neuropsychological Deficit Scale (LNDS) and Right Neuropsychological Deficit Scale (RNDS). The GNDS was intended as an improvement on the Halstead Impairment Index and the Average Impairment Index and as a way to detect diffuse neuropsychological impairment.
Reitan and Wolfson (1993) reported that the GNDS could separate impaired from intact subjects using a cutoff of 26 points. Sherer and Adams (1993) found that although the GNDS could identify neurologically impaired subjects in their sample, it had a fairly high misclassification rate for pseudoneurologic subjects. Reitan and Wolfson (1995b) later cross-validated their original classification study and criticized Sherer and Adams (1993) for using subjects with low GNDS scores in their pseudoneurologic group. Then again, that appears to be the point of using pseudoneurologic subjects, to examine sensitivity in subtle cases.
Collingwood and Harrell (1999) examined the GNDS in a sample of psychotic and substance abusing patients with and without a history of closed head injury. The majority of the patients exhibited GNDS scores above the cutoff of 26 points, but there were no significant differences in comparisons of the patients with and without head injury. The GNDS may be sensitive to diffuse involvement. Sweeney (1999) found that the GNDS was more sensitive to the effects of nonimpact acceleration injuries from motor vehicle accidents than was either the comparison of subtest performance to the original cutoffs or the translation of raw scores to demographically corrected T-scores. All of the individuals in this study had subjective complaints of cognitive difficulty, and 94% were involved in litigation. Because there was no external criterion, it can be concluded only that the GNDS was more sensitive, not that it was more accurate. Rojas and Bennett (1995) examined the GNDS in two groups of fairly well matched subjects in which one group had experienced mild traumatic brain injury and the other group had not. The GNDS was superior to the HII in accurately identifying subjects and the HII was superior to any subtest of the HRNB in isolation, although the difference between the accuracy of the HII (.80) and the Category test (.72) was small. The GNDS was superior to the HII in discriminating among learning-disabled, head-injured, and normal subjects (Ostreicher & O'Donnell, 1995).
Effect of Age
The restricted age range of the normative group for the HRNB merits serious concern, especially because many cognitive functions are known to vary with age, and older individuals may be likely to receive neuropsychological evaluations. The subjects ranged from 14 to 50 years of age; the average age was 28.3. However, as noted by Lewinsohn (1973), Prigatano and Parsons (1976), and Bak and Greene (1980), performance on most of the subtests decreases with age. As a result, erroneous diagnostic conclusions may be reached with older adults (Ehrfurth & Lezak, 1982; Price, Fein, & Feinberg, 1979). In general, the research literature indicates that HRNB tests that are more complex measures of cognitive skills show stronger age effects than measures of specific motor or sensory skill (Fitzhugh, Fitzhugh, & Reitan, 1964; Golden & Schlutter, 1978; Reed & Reitan, 1963; Reitan, 1967). Older subjects tend to demonstrate greater deficits on tasks requiring immediate adaptive ability, and to perform better on tasks requiring use of stored information or previous experience. There are some age-related norms to address this problem. Pauker (1977), for example, supplied means and standard deviations for five age levels (for subjects between the ages of 19 and 76) for the seven commonly used HRNB scores and the Impairment Index using WAIS Full Scale IQ as a covariate. Harley et al. (1980) provided T-score conversions based on a sample of veterans for five age ranges from 55 to 79. Schludermann, Schludermann, Merryman, and Brown (1983) provided an excellent review of the neuropsychological changes associated with aging using Halstead's data from older subjects.
Effect of Education
The patient's educational level can be a confounding variable in performance on the HRNB (Finlayson, Johnson, & Reitan, 1977). Well-educated subjects with neurological deficits may attain relatively high scores, whereas less educated, neurologically intact subjects may obtain low scores indicative of neurological impairment. Vega and Parsons (1967) reported that the Speech Sounds Perception Test and the Seashore Rhythm Test are most susceptible to this effect. Finlayson et al. (1977) reported that education is a confounding variable across all of the HRNB subtests.
Effect of Sex
Male-female differences have been reported on the HRNB. For example, in a study using a matched-subjects design for a sample of 47 nonneurological subjects and a sample of 47 neurological patients, Dodrill (1979) found that males obtained significantly higher scores on tasks containing a strong motor and/or spatial component (e.g., Finger Tapping; dynamometer; and Wechsler Memory Scale, Visual Reproduction). On the WAIS, females performed better on the Digit Symbol subtest, whereas males performed better on the Arithmetic, Picture Completion, and Block Design subtests. No significant differences were noted on the WAIS summary measures (Verbal IQ, Performance IQ, and Full Scale IQ). In their evaluation of the Finger Tapping Test, the Form Board, and the State-Trait Anxiety Test, King, Hannay, Masek, and Burns (1978) found a sex effect on the Finger Tapping Test: females performed more slowly. In females, trait anxiety was also found to be negatively correlated with finger tapping performance and to be positively correlated with the time used to complete the Form Board.
Reliability
Only a limited number of reliability studies have been reported that have assessed the HRNB in its entirety. The focus of the majority of this research has been test-retest reliability. Typically, research in this area has involved calculating the traditional psychometric index of reliability, that is, Pearson's coefficient of correlation (r).
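As a reminder of what this index involves, the following Python sketch computes Pearson's r between scores from two testing occasions. It uses the standard product-moment formula (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations). The function name `pearson_r` is hypothetical, and the scores are invented for illustration; they are not data from any study of the HRNB.

```python
import math

# Illustrative sketch only; the scores below are invented.

def pearson_r(first, second):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    ss_first = sum((a - mean_first) ** 2 for a in first)
    ss_second = sum((b - mean_second) ** 2 for b in second)
    if ss_first == 0 or ss_second == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return cross / math.sqrt(ss_first * ss_second)


# Hypothetical error scores for six subjects tested on two occasions.
test_scores = [42, 55, 61, 38, 70, 49]
retest_scores = [40, 50, 63, 35, 66, 47]
reliability = pearson_r(test_scores, retest_scores)
```

Note that r reflects only the consistency of subjects' relative standing across occasions. A uniform practice effect, in which every subject improves by the same amount on retest, leaves r unchanged.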