Hanjoon Lee
Linda M. Delene
Mary Anne Bunda
WESTERN MICHIGAN UNIVERSITY
Chankon Kim
SAINT MARY'S UNIVERSITY
Service quality is an elusive and abstract construct to measure, and extra effort is required to establish a valid measure. This study investigates the psychometric properties of three different measurements of health-care service quality as assessed by physicians. The multitrait-multimethod approach revealed that convergent validity was established for measures based on the single-item global rating method and multi-item rating method. On the other hand, almost no evidence of convergent validity was found for the measures based on the constant-sum rating method. Furthermore, discriminant validity for the seven health-care service quality dimensions measured by the three methods was not well established. The high levels of interdimensional correlations found suggested that the service quality dimensions may not be separable in a practical sense. The study suggested an ongoing effort is needed to develop a new service quality scale suitable to this unique service industry. J BUSN RES 2000. 48.233–246. © 2000 Elsevier Science Inc. All rights reserved.
The health-care delivery system has been undergoing formidable challenges in the 1990s. Rapid movement toward systems of managed care and integrated delivery networks has led health-care providers to recognize real competition. To be successful or even survive in this hostile environment, it is crucial to provide health-care recipients with service that meets or exceeds their expectations. At the same time, it is important to know which dimensions of health-care services physicians believe are necessary to constitute excellent service. It is crucial to have a better understanding of service quality perceptions possessed by both recipients and providers when shaping the health-care delivery system.
The traditional medical model has focused on the technical nature of health-care events; the focus has been on the training and updated skills of the physicians and the nature of the actual medical outcome (O'Connor, Shewchuk, and Carney, 1994). A series of services marketing research, however, has looked at the relationship between the services expected and the service actually perceived as received by recipients (Carman, 1990; Finn and Lamb, 1991; Parasuraman, Zeithaml, and Berry, 1985, 1988; Zeithaml, Parasuraman, and Berry, 1988). The services marketing approach places an emphasis on quality evaluation from the recipients' perspectives, but ignores the necessity for including an evaluation of the technical skill of the provider and the nature of the medical outcome. Especially in the area of health-care service, the services marketing approach seems to neglect the important role of physicians in shaping patients' service expectations. A balanced approach, therefore, utilizing aspects of service quality from both the services marketing and health-care approaches may be required. At the same time, the physicians' view toward the quality of their own services needs more research attention.
For the success of health-care organizations, accurate measurement of health-care service quality is as important as understanding the nature of the service delivery system. Without a valid measure, it would be difficult to establish and implement appropriate tactics or strategies for service quality management. The most widely known and discussed scale for measuring service quality is SERVQUAL (Parasuraman, Zeithaml, and Berry, 1988). Since the scale was developed, various researchers have applied it across such different fields as securities brokerage, banks, utility companies, retail stores, and repair and maintenance shops. The scale has also been applied to the health-care field in numerous studies (Babakus and Mangold, 1992; Brown and Swartz, 1989; Carman, 1990; Headley and Miller, 1993; O'Connor, Shewchuk, and Carney, 1994; Walbridge and Delene, 1993). However, with a few exceptions, they did not systematically examine the psychometric properties of their scale, because these studies dealt with pragmatic and managerial issues for health-care services.
Validity of the SERVQUAL scale seems not to be fully established. A more stringent psychometric test has been recommended for the improvement of the service quality measurement (for a recent review, please see Asubonteng, McCleary, and Swan, 1996).
In this study, we sought to examine rigorously the psychometric properties pertaining to alternative methods of measuring health-care service quality as perceived by physicians. Specifically, physicians were asked to assess health-care service quality along the seven dimensions of a modified SERVQUAL scale. Dimensional responses were collected using three measurement methods: single-item global rating method, constant-sum rating method, and multi-item rating method, thus resulting in multitrait-multimethod (MTMM) data. Based on the results of construct validation conducted on the MTMM data, we reported findings regarding the convergent validity of the three methods and the discriminant validity of the seven service quality dimensions as measured by the three methods.
Address correspondence to Hanjoon Lee, Marketing Department, Haworth College of Business, Western Michigan University, Kalamazoo, Michigan 49008, USA.
Journal of Business Research 48, 233–246 (2000)
© 2000 Elsevier Science Inc. All rights reserved. ISSN 0148-2963/00/$–See front matter
Previous Research
Two Approaches in Health-Care Service Quality
Service quality is an elusive and abstract concept because of its "intangibility" as well as its "inseparability of production and consumption" (Parasuraman, Zeithaml, and Berry, 1985). Various approaches have been suggested regarding how to define and measure service quality. The services marketing literature has defined service quality in terms of what service recipients receive in their interaction with the service providers (i.e., technical, physical, or outcome quality), and how this technical quality is provided to the recipients (i.e., functional, interactive, or process quality) (Grönroos, 1988; Lehtinen and Lehtinen, 1982; Berry, Zeithaml, and Parasuraman, 1985).
Parasuraman, Zeithaml, and Berry (1985) asserted that consumers perceive service quality in terms of the gap between received service and expected service. They identified 10 dimensions of service quality: access, communication, competence, courtesy, security, tangibles, reliability, responsiveness, credibility, and understanding or caring. They then classified the 10 dimensions into three categories: search properties (credibility and tangibles; dimensions that consumers can evaluate before purchase), experience properties (reliability, responsiveness, accessibility, courtesy, communication, and understanding/knowing the consumer; dimensions that can be judged during consumption or after purchase), and credence properties (competence and security; dimensions that a consumer finds hard to evaluate even after purchase or consumption).
In the area of traditional health-care research, the quality of health care has been viewed from a different perspective. Quality has been defined as "the ability to achieve desirable objectives using legitimate means" (Donabedian, 1988, p. 173), where the desirable objective implied "an achievable state of health." Thus, quality is ultimately attained when a physician properly helps his or her patients to reach an [...] the most widely used quality assessment approaches has been proposed in the structure-process-outcome model of Donabedian (1980). In this model, the structure indicates the settings where the health care is provided, the process indicates how care is technically delivered; whereas, the outcome indicates the effect of the care on the health or welfare of the patient. In the structure-process-outcome model, quality was viewed as technical in nature and assessed from the physicians' point of view. It is well known that physicians pay significantly more attention to the technical and functional dimensions of health-care service (Donabedian, 1988; O'Connor, Shewchuk, and Carney, 1994). This tendency might be attributable to physician education and training. Considering the potentially fatal and irrevocable consequences of poor medical quality (malpractice) in health care, in contrast to other service industries, it would be logical and desirable for physicians to hold such an attitude.
A difference has been observed between the service marketing approach emphasizing recipients' perspectives and the traditional health-care approach honoring physicians' concerns. Both patient groups and physician groups are important constituents of the health-care system. However, it has been found that health-care recipients have difficulty in evaluating medical competence and the security dimensions (i.e., credence properties) considered to be the primary determinant of service quality (Bopp, 1990; Hensel and Baumgarten, 1988). This inability or impossibility of assessing the technical quality received in health-care service leads patients to rely more heavily on other dimensions, such as credibility or tangibility (i.e., search properties) when inferring the quality of health-care service (Bowers, Swan, and Taylor, 1994). This lack of patient ability to make a proper evaluation raises a question regarding the gap analysis paradigm suggested by Parasuraman, Zeithaml, and Berry (1985). If customers in the health-care delivery system cannot evaluate the important service dimensions, can they have a reasonable expectation about services they will receive? If they cannot, the contribution of the health-care recipients' views in influencing the design of an efficient system may not be as significant as we formerly thought.
If the health-care service industry were similar to other industries that provide services for their customers, a patient could choose among many physicians who offer different prices, and provide service that differs in terms of medical technical quality (i.e., competence and security) or other service-related dimensions. The reality in the health-care industry is different. Patients do not have enough information about their physicians. Even if more information were available and accessible, patients probably could not weigh the information properly. Physician choice is often made not by the patients themselves, but through referral from the patient's primary doctor, from his or her health organization (HMO), and/or from friends. Although service recipients' perceptions toward [...]
it is as crucial to understand physicians' perceptions of service quality when designing and improving the health-care delivery system. Therefore, this study placed its focus on how physicians perceive health-care service quality.
Measurement Issues in Health-Care Service Quality
A system cannot be designed and operated effectively unless the quality of the product or service can be understood or correctly measured. One major stride toward developing quantitative measures of service quality was made by Parasuraman, Zeithaml, and Berry (1985), and the SERVQUAL scale was the consequence of this effort (Parasuraman, Zeithaml, and Berry, 1988). The 10 dimensions discussed in the 1985 study were reduced into five dimensions in SERVQUAL after an empirical test. Their original objective was to discover dimensions that were generic to all services. If this assumption is correct, dimensional patterns for service quality should be similar across different service industries. Several researchers have since examined the stability of SERVQUAL dimensions (Asubonteng, McCleary, and Swan, 1996; Babakus and Boller, 1992; Carman, 1990; Dabholkar, Thorpe, and Rentz, 1996). Carman (1990) found that the numbers of service quality dimensions were not stable across different services in his factor analysis results. He also found that, among the five dimensions, items measuring "tangibles" and "reliability" consistently loaded on the expected factors across different services. However, items tapping "assurance" and "empathy" broke into different factors. A similar finding was reported by Babakus and Boller (1992). There seems to be a consensus that SERVQUAL is not a generic measure for all service industries and that service-specific dimensions other than those suggested in SERVQUAL may be needed to understand service quality perceptions fully.
Although these studies have generated insight into the measurement properties of SERVQUAL, their measurement analyses, which were aimed primarily at checking dimensionality, were inadequate for testing the construct validity of the scale. Construct validity is defined as the degree of correspondence between constructs and their measures (Peter, 1981). A systematic and rigorous construct validation requires multitrait-multimethod data, which is the correlation matrix for two or more traits where each trait is measured by two or more methods. Demonstration of construct validity requires evidence of convergent validity and discriminant validity (Campbell and Fiske, 1959).
The two main sources of variance in measures of a construct are the construct or trait being measured and measurement error. Measurement error can be divided further into random error and systematic error (e.g., method variance). Single measures do not allow us to make an assessment of measurement error. With a single method, we cannot separate trait variance [...] the confounding effects of random and systematic errors from trait variance. Without disentangling the variation in measures attributable to the trait, we cannot assess the extent of the true relationship between the measures and traits (i.e., the convergent validity) or the true relationships between traits (the discriminant validity).
Despite its seminal role in the understanding and assessment of construct validity, the original Campbell and Fiske (1959) approach to MTMM analyses has limitations. Most notably, it prescribes no precise standards for determining evidence of construct validity. Furthermore, the procedure does not yield specific estimates of trait, method, and random variance. Several alternative procedures have been proposed for analyzing MTMM data (for a review, see Bagozzi, 1993). The construct validation process in this study utilized two of these alternative MTMM approaches; namely, application of the confirmatory factor analysis (CFA) model (Joreskog and Sorbom, 1993) and the correlated uniqueness (CU) model (Marsh, 1989).
Research Design
Design of the MTMM Study
Previous studies have indicated that SERVQUAL must be modified for each unique service sector (Carman, 1990; Babakus and Boller, 1992). Haywood-Farmer and Stuart (1988) empirically tested SERVQUAL and found it did not encompass all the dimensions of professional service quality. They suggested that service dimensions for core service, service customization, and knowledge and information be added to the five dimensions of SERVQUAL. Of these additional dimensions, core service was found to be the most important factor not represented in the SERVQUAL instrument. Related research of professional service quality perception was done by Brown and Swartz (1989). This study found that "professionalism" and "professional competence" were significant factors for both providers and patients in the evaluation of service quality.
The modified SERVQUAL approach utilized in this research, therefore, included the five dimensions of SERVQUAL (Parasuraman, Zeithaml, and Berry, 1985), as well as the "core medical service" (Haywood-Farmer and Stuart, 1989) and the "professionalism/skill" (Brown and Swartz, 1989) dimensions. The latter two dimensions were included to measure the technical aspects of health-care service. These same service quality dimensions were also used in the earlier research of Walbridge and Delene (1993), which involved physician attitudes toward service quality. The seven dimensions, their origins, and their definitions can be found in Table 1.
It is well known that measurement methods can affect the nature of a respondent's evaluation (Kumar and Dillon, 1992; Phillips, 1981). Of the various methods used in measurement, three were selected for this research: single-item global rating
Table 1. Service Quality Attributes
Attribute | Definition | Authors
Assurance | Courtesy displayed by physicians, nurses, or office staff and their ability to inspire patient trust and confidence | Parasuraman, Zeithaml, and Berry, 1988
Empathy | Caring, individualized attention provided to patients by physicians and their staffs | Parasuraman, Zeithaml, and Berry, 1988
Reliability | Ability to perform the expected service dependably and accurately | Parasuraman, Zeithaml, and Berry, 1988
Responsiveness | Willingness to provide prompt service | Parasuraman, Zeithaml, and Berry, 1988
Tangibles | Physical facilities, equipment, and appearance of contact personnel | Parasuraman, Zeithaml, and Berry, 1988
Core medical service | The central medical aspects of the service: appropriateness, effectiveness, and benefits to the patient |
Professionalism/skill | Knowledge, technical expertise, amount of training, and experience | Swartz and Brown, 1989
[...] respondent with dimensions and definitions of each service dimension. With this method, the respondent reported his or her evaluation rating on each dimension, without evaluating the multiple indicators (components) of each dimension and without comparing it to other service dimensions. The constant-sum rating method, in contrast, is comparative in nature, requiring the respondent to allocate a given number of "importance points" among various dimensions. In this method, respondents were forced to think about the relative importance of each service dimension. In the multi-item rating method, multiple indicators were developed that were intended to capture each of seven service quality dimensions.
It is generally accepted that the multi-item rating method can provide a better sampling of the domain of content than the single-item global rating method (Bagozzi, 1980). Thus, content validity can be enhanced with multiple-item measures. They also have the advantage of allowing the computation of reliability coefficients (e.g., Cronbach's alpha [Cronbach, 1951]). Reliability assessment with the single-item global rating method is a problem in typical survey research studies, because measurement error cannot be estimated with a single item.
However, a drawback in using the multi-item rating method in place of the single-item global rating method is the tendency toward questionnaire length along with possible detrimental effects on response rate and respondent fatigue. In other words, the single-item global rating method has the potential advantage of parsimony for the respondent. Therefore, in areas where there is little or no difference between the explanatory power of single- and multi-item methods, the single-item global rating method may be preferable in studies where parsimony is important.
[...] to gather information regarding the relative importance of each dimension. One way to do this is through the use of the constant-sum rating method. The constant-sum rating method forces respondents to identify the comparative importance of each service dimension. In a health-care study, this constant-sum method was used to examine determinant dimensions in hospital preference (Woodside and Shinn, 1988). The constant-sum method also tends to eliminate individual response styles of "nay-saying" and the "halo effects," which cause respondents to carry over their judgments from one dimension to another (Churchill, 1991). In an earlier, related study (Walbridge and Delene, 1993), it was believed that physicians may be reluctant to rate any service quality dimension as unimportant. Thus, the constant-sum rating method was employed in this research to determine its applicability as an efficient measurement method of health-care service quality where physicians' perceptions were surveyed.
There are a few drawbacks to using the constant-sum rating method. The first is its inherent increase in task complexity for respondents. It requires more mental effort from the individual than either the single-item or multi-item methods. Each rating decision affects other ratings because of the constraints imposed by the nature of the measurement process. As the number of attributes increases, respondents become more taxed (Aaker, Kumar, and Day, 1994; Malhotra, 1995). This increase in complexity may lead the subject to use a subset of the dimensions instead of including all of them in his or her evaluation (Churchill, 1991). This effect may be heightened if the subject does not view the dimensions as being completely independent. This lack of independence was found to produce spurious correlations sometimes (Kerlinger, 1973).
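The point that each constant-sum rating constrains the others can be illustrated with a small simulation. The sketch below is not part of the study: the respondents are simulated, and the function names, the 100-point total, the uniform weight range, and the sample size of 500 are all assumptions chosen for illustration. Each simulated respondent draws independent raw weights for seven dimensions, and the weights are then rescaled to a fixed total, which is the only thing linking the dimensions.

```python
import random


def simulate_constant_sum(n_respondents, n_dims, total=100.0, seed=0):
    """Return one row of points per simulated respondent.

    Each respondent draws an independent raw weight for every dimension,
    then the weights are rescaled so that the points add up to `total`.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_respondents):
        raw = [rng.uniform(0.5, 1.5) for _ in range(n_dims)]
        scale = total / sum(raw)
        rows.append([w * scale for w in raw])
    return rows


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_pairwise_correlation(rows):
    """Average correlation over all distinct pairs of dimensions."""
    columns = list(zip(*rows))
    k = len(columns)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


# Hypothetical setup: 500 simulated respondents, seven dimensions, 100 points.
ratings = simulate_constant_sum(n_respondents=500, n_dims=7)
avg_r = mean_pairwise_correlation(ratings)
print(f"mean inter-dimension correlation: {avg_r:.3f}")
```

Because the points must add up to a constant, the variance of the row total is zero, so for k equally variable and equally correlated dimensions the correlation between any two of them is -1/(k-1). The simulated average should therefore come out negative even though the raw weights were generated independently, which is one way a constant-sum format can induce correlations that do not reflect the underlying judgments.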
[...] seven dimensions of health-care service quality as measured by three different methods: single-item global rating method, multi-item rating method, and constant-sum rating method.
Questionnaire Development
A panel of physicians was consulted on questionnaire design and semantics, and input was also received from a state university hospital. The questionnaire was divided into four sections, with one section for each of the three measurement methods and the last section containing demographic questions. Section One utilized the single-item global rating method. The subjects were given the name and definition of each dimension indicated in Table 1 and asked to rate the importance of each dimension on a seven-point scale. Pretesting with physicians showed that a conventional scale using the two bipolar adjectives "unimportant" and "important" was inappropriate. Physicians were reluctant to rate any of the dimensions as "unimportant" or "less important." Further pretesting results suggested the use of "Important" for the low end (one) and "Critical" for the high end (seven) in a seven-point scale.
Section Two was a constant-sum rating method that asked the subjects to distribute a fixed number of "importance points" among the seven dimensions. This led respondents to rate the comparative importance of each service dimension relative to the others. The same names and definitions of the dimensions were used as in Section One.
Section Three consisted of forty-three (43) "practice characteristics." Placed in random order, each practice characteristic corresponded with one of the seven service quality dimensions, with between five and seven characteristics pertaining to each service quality dimension based on a previous study (Walbridge and Delene, 1993) (please see Appendix A). In this section, physicians evaluated the practice characteristics, without referring to the names or definitions of the pertinent service quality dimensions. Practice characteristics were evaluated using the same "important–critical" dichotomy used in Section One. Respondents then answered questions related to demographic variables in the last section.
Sampling
Physicians (1,428) were randomly selected by a commercial mail-order vendor from a national database leased from the American Medical Association. Some professional categories were eliminated to remove nonphysicians from the list, as well as specialties considered divergent from the mainstream of health-care service (for a listing of the specialties used, please see Appendix B). The four-page, self-administered questionnaire was mailed to physicians. To attain a higher response rate, physicians received a "warm-up" postcard announcing the arrival of the questionnaire within the next week. The initial mailing of the questionnaire included a cover letter explaining the purpose of the research and the confidentiality of responses. Approximately 6 weeks later, a follow-up mailing of 1,200 questionnaires was sent to physicians who had not [...]
Of the original 1,428 addresses, 72 were invalid. Six questionnaires returned were unusable. A total of 348 responses were received from the two mailings with an effective response rate of 24.4%. Demographic characteristics of our sample were compared with those of the physician population in the United States in Table 2. The similarities become apparent through simple visual inspection. The population of physicians in the United States is 16.4% female; whereas, the sample was 18.7% female. The age distribution of the sample was also somewhat similar, especially for physicians under the age of 65, which accounted for about 90%. The sample was similar to the population on the basis of practice specialty. The goodness-of-fit tests were performed for sex, age, and specialty group categories. The results were χ² = 0.4 (DF = 1, p = 0.729) for sex, χ² = 14.4 (DF = 4, p < 0.000) for age, and χ² = 4.8 (DF = 3, p = 0.084) for specialty group. These results suggest that the sample reflected the population's sex and specialty group compositions, but consisted of physicians who were somewhat older than the population.
Table 2. Demographics: Population vs. Sample
Category | Population(a) (%) | Sample (%)
Age
Under 35 | 24.4 | 20.3
35–44 | 39.8 | 32.1
45–54 | 21.6 | 18.5
55–64 | 12.5 | 17.1
65 and over | 1.7 | 12.1
Gender
Male | 83.6 | 81.0
Female | 16.4 | 18.7
Specialty group
Primary care | 34.5 | 39.4
Surgical | 12.9 | 10.3
Hospital based | 10.8 | 12.9
Other specialties | 22.5 | 33.3
Analysis
Instrument Reliability for the Multi-Item Rating Method
It is necessary to derive a composite score for each of the seven service quality dimensions measured by the multi-item rating method. For this purpose, the level of internal consistency was checked as a way of assessing the homogeneity of items comprising each dimension. The Cronbach's alpha indices for the seven dimensions ranged from 0.80 to 0.90, with a mean of 0.85. This high degree of internal consistency (Nunnally, 1978) allowed us to sum the ratings to get composite scores for each of the seven dimensions. Each composite score indicated a measure of each service quality dimension obtained by the multi-item rating method. These composite scores were used for the MTMM analysis along with the other
scores assessed by the single-item global rating method and the constant-sum rating method.
Construct Validation of the Modified SERVQUAL Scale
Our investigation of the construct validity of the modified SERVQUAL involved CFA of the multitrait–multimethod (MTMM) data. CFA allows methods to affect measures of traits in different degrees; whereas, methods are assumed to covary freely among themselves. CFA then provides assessments of over-all goodness of fit for the variable specification of the given MTMM data, while enabling the partition of variance in measures into trait, method, and error components. Trait variance reflects the shared variation for measures of a common trait and can be used to assess convergent validity. Discriminant validity among traits is indicated by intertrait correlations significantly lower than unity (Bagozzi and Yi, 1991).
As suggested by Bagozzi (1993) and Widaman (1985), we first tested a CFA model based on the hypothesis that the variation in measures can be explained by traits and random error (i.e., the trait-only model). In this model, there are seven traits (i.e., seven service quality dimensions), and each trait is indicated by three measures. Each of the three measures is related to its own rating method (i.e., single-item global rating method, constant-sum rating method, or multi-item rating method). This model resulted in poor fit, as indicated by χ²(168) = 1935.04, p = .00. A probable cause for the trait-only model's poor fit was the presence of method factors as important sources of variation in the measures (Bagozzi, 1993; Widaman, 1985). Subsequently, another CFA model that incorporated trait and method factors was tested. Estimation of this trait-method model was not possible, however, because iterations failed to converge. Such occurrence in the confirmatory factor analysis of a trait-method model is not uncommon (Marsh and Bailey, 1991; Bagozzi, 1993; Van Driel, 1978). Also, frequently found in the CFA solution are improper parameter estimates, such as negative variances. In all these instances, the confirmatory factor analysis model is construed as an inappropriate specification of the variable structure and must be rejected (Bagozzi and Yi, 1991).
In view of these problems that frequently accompany the application of CFA models to MTMM data, Marsh (1989) proposed the CU model as an alternative. The CU model differs from the CFA model primarily in the interpretation of method effects. In the CFA model, method effects are inferred by squared method factor loadings. In contrast, the CU model specification does not include method factors. Instead, method effects are depicted as and inferred from correlations among error terms corresponding to the measures based on common method. This depiction of method effect is the main reason why the CU model seldom produces an ill-defined solution (Marsh, 1989, p. 341). Another difference between the two approaches rests on the assumption regarding method [...] related, no such assumption is necessary for the CFA model. Finally, both CFA and CU models are premised on the additive effects of traits and methods on measures.
Given its robust nature, the CU model is an attractive alternative when the CFA model results in an ill-defined solution or nonconvergence (Bagozzi, 1993; Marsh, 1989). We subsequently tested the CU model's fit to our MTMM data (see Figure 1 for the diagram of the CU model). Another application of the CU model in a similar situation can be found in Kim and Lee's (1997) construct validation study involving measures of children's influences on family decisions. The CU model's fit as indicated by the χ² test result (χ²(105) = 311.62, p = .00) was unsatisfactory. However, because of the χ² test's sensitivity to sample size, some researchers (Bentler, 1990; Bagozzi and Yi, 1991) have suggested fit assessments based on other goodness-of-fit indices when the sample size is suspected to be the cause for rejecting the hypothesized model. One frequently used measure is the comparative fit index (CFI), which evaluates the practical significance of the variance explained by the model (for a detailed discussion, see Bentler, 1990 and Bagozzi, Yi, and Phillips, 1991). For our CU model, computation of the CFI yielded .96. This is much greater than the .90 rule of thumb suggested as the minimum acceptable level by Bentler (1990). Therefore, the CU model captures a significant proportion of variance of our MTMM data from a practical point of view; hence, little variance remains to be accounted for.
Table 3 presents the estimated factor loadings for the CU model. Significant trait factor loadings (t > 2.0) establish the convergent validity of the measures (Widaman, 1985; Bagozzi and Yi, 1991). Although the trait factor loadings for all the measures based on global single-item method and multi-item method are significant, only three of the seven constant-sum measures were significant. The three dimensions of health-care service quality for which the constant-sum measure exhibited convergent validity are assurance, responsiveness, and tangibles. An assessment of the extent of convergence shown by each measure requires a decomposition of the total variance into proportions attributable to the corresponding trait and random error. As in the CFA model, the amount of trait variance in a measure is inferred by the squared trait factor loading for that measure. For all seven dimensions of health-care service quality, trait variances for constant-sum measures were extremely low, with a range between 0.00 and 0.03 (or 0 and 3%). The best results were found for the global single-item measures. Their trait variances ranged between 0.45 and 0.83, with a mean level of .57. The seven multi-item measures showed levels of trait variance generally lower than the global single-item measures. Trait variances for these measures ranged from 0.30 to 0.49, with a mean of 0.39. According to Bagozzi and Yi (1991), strong (weak) evidence for convergent validity is achieved when at least (less than) half of the total variation in a measure is caused by trait. According to this [...]
Figure 1. Correlated uniqueness model for the MTMM
Table 3. Summary of Parameter Estimates for the Correlated Uniqueness Model: Factor Loadings
Traits (columns, in order): Assurance; Core Medical Service; Empathy; Professionalism/Skills; Reliability; Responsiveness; Tangibles
Single-item global measure
Assurance 0.69 (0.10) 0.00 0.00 0.00 0.00 0.00 0.00
Core medical service 0.00 0.67 (0.10) 0.00 0.00 0.00 0.00 0.00
Empathy 0.00 0.00 0.75 (0.10) 0.00 0.00 0.00 0.00
Professionalism/skills 0.00 0.00 0.00 0.71 (0.10) 0.00 0.00 0.00
Reliability 0.00 0.00 0.00 0.00 0.71 (0.10) 0.00 0.00
Responsiveness 0.00 0.00 0.00 0.00 0.00 0.83 (0.11) 0.00
Tangibles 0.00 0.00 0.00 0.00 0.00 0.00 0.91 (0.11)
Constant-sum Measure
Assurance −0.18 (0.06) 0.00 0.00 0.00 0.00 0.00 0.00
Core medical service 0.00 −0.01 (0.06) 0.00 0.00 0.00 0.00 0.00
Empathy 0.00 0.00 0.05 (0.06) 0.00 0.00 0.00 0.00
Professionalism/skill 0.00 0.00 0.00 0.06 (0.06) 0.00 0.00 0.00
Reliability 0.00 0.00 0.00 0.00 0.10 (0.06) 0.00 0.00
Responsiveness 0.00 0.00 0.00 0.00 0.00 0.16 (0.06) 0.00
Tangibles 0.00 0.00 0.00 0.00 0.00 0.00 0.16 (0.06)
Multi-item Measure
Assurance 0.63 (0.09) 0.00 0.00 0.00 0.00 0.00 0.00
Core medical service 0.00 0.55 (0.09) 0.00 0.00 0.00 0.00 0.00
Empathy 0.00 0.00 0.70 (0.09) 0.00 0.00 0.00 0.00
Professionalism/skill 0.00 0.00 0.00 0.62 (0.09) 0.00 0.00 0.00
Reliability 0.00 0.00 0.00 0.00 0.66 (0.09) 0.00 0.00
Responsiveness 0.00 0.00 0.00 0.00 0.00 0.60 (0.09) 0.00
Tangibles 0.00 0.00 0.00 0.00 0.00 0.00 0.62 (0.09)
Standard errors of the estimates are shown in parentheses.
All zero values indicate that their corresponding parameters were fixed.
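The trait variances discussed in the text can be recomputed from the standardized loadings in Table 3 by squaring each loading. The Python sketch below is illustrative only: the names `LOADINGS` and `summarize` and the `cutoff` parameter are assumptions introduced here (the 0.5 cutoff follows the Bagozzi and Yi rule of thumb cited in the text), and the two negative constant-sum loadings are entered as -0.18 and -0.01.

```python
# Standardized trait loadings transcribed from Table 3, in the order:
# assurance, core medical service, empathy, professionalism/skills,
# reliability, responsiveness, tangibles.
LOADINGS = {
    "single-item global": [0.69, 0.67, 0.75, 0.71, 0.71, 0.83, 0.91],
    "constant-sum": [-0.18, -0.01, 0.05, 0.06, 0.10, 0.16, 0.16],
    "multi-item": [0.63, 0.55, 0.70, 0.62, 0.66, 0.60, 0.62],
}


def summarize(loadings, cutoff=0.5):
    """Square each loading to get its trait variance, then summarize."""
    trait_var = [x * x for x in loadings]
    return {
        "min": round(min(trait_var), 2),
        "max": round(max(trait_var), 2),
        "mean": round(sum(trait_var) / len(trait_var), 2),
        # Number of measures with at least `cutoff` of their variance due to trait.
        "n_at_or_above_cutoff": sum(v >= cutoff for v in trait_var),
    }


if __name__ == "__main__":
    for method, loadings in LOADINGS.items():
        print(method, summarize(loadings))
```

Because the inputs are the rounded table entries, the recomputed summaries should be read as approximate checks on the reported ranges and means rather than as a re-estimation of the model.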
for most of our global single-item measures (5 out of 7). Trait variances for all seven multi-item measures fall below the level of 0.5. Therefore, evidence for convergent validity is weak for the measures using the multi-item rating method, whereas the constant-sum measures exhibit little or no convergent validity.
As noted before, the effects of methods under the CU model are represented as correlations among error (uniqueness) terms. Although the CFA model enables the separation of the variance portion that is caused by method bias, in the CU model analysis we can only infer the significance and size of the method bias from an examination of the estimated uniqueness correlations. Table 4(a), 4(b), and 4(c) display the estimated error variances and covariances for single-item global measures, constant-sum measures, and multi-item measures, respectively. For the single-item measures, significant covariances between error terms were found in 14 of 21 possible cases (see Table 4a). When these covariances were converted into correlations, the values ranged from 0.28 to 0.82, with an average of 0.59. These levels of uniqueness correlations demonstrate a considerable degree of method effect contained in the measurement. Therefore, a substantial portion of the variations in the global single-item measures can be attributed to the measurement procedure.
For the constant-sum measures, 16 of the 21 uniqueness [covariances were significant. Although this] indicates the existence of a significant method effect, the magnitudes of the uniqueness correlations (range: 0.03–0.36; mean 0.19) suggest that the size of method effect is small. The very large error variances shown in Table 4b demonstrate that almost all the variations in the constant-sum measures are attributable to random error. With regard to the multi-item measures, as can be seen in Table 4c, all uniqueness covariances are significant. Uniqueness correlations were also generally high (range: 0.37–0.71; mean 0.59).
Our next investigation focused on discriminant validity among the seven dimensions of health-care service quality. It consisted of verifying whether the correlations among the seven dimensions (i.e., traits) as measured by three different methods were significantly different from unity (+1 or −1) (Widaman, 1985; Bagozzi, Yi, and Phillips, 1991). As shown in Table 5, all of the correlations among the dimensions are significant and very high (range: 0.69–0.99; mean: 0.84). Seven of the 21 correlations were above the 0.90 level. Such high correlations among service quality dimensions (range: 0.67–0.92; mean: 0.82) were also observed in the study conducted by Dabholkar, Thorpe, and Rentz (1996). It should be noted, however, that these correlations are disattenuated correlations (i.e., corrected for measurement error) and are larger than those correlations among measures. Particularly [...]
Table 4. Summary of Parameter Estimates for the Correlated Uniqueness Model: Error Variances and Covariances
Traits (columns, in order): Assurance; Core Medical Service; Empathy; Professionalism/Skills; Reliability; Responsiveness; Tangibles
(a) Error Variance and Covariance for Single-Item Global Measures
Assurance 0.51 (0.12)
Core medical service 0.37 (0.11) 0.53 (0.12)
Empathy 0.35 (0.13) 0.33 (0.12) 0.41 (0.14)
Professionalism/skills 0.35 (0.10) 0.34 (0.11) 0.28 (0.10) 0.51 (0.13)
Reliability 0.32 (0.11) 0.38 (0.11) 0.30 (0.11) 0.41 (0.12) 0.49 (0.13)
Responsiveness 0.27 (0.11) 0.23 (0.11) 0.22 (0.11) 0.26 (0.12) 0.26 (0.13) 0.31 (0.16)
Tangibles 0.14 (0.11) 0.09 (0.11) 0.11 (0.11) 0.08 (0.13) 0.10 (0.13) 0.09 (0.14) 0.17 (0.19)
(b) Error Variance and Covariance for Constant-Sum Measures
Assurance 0.99 (0.08)
Core medical service −0.03 (0.06) 1.00 (0.08)
Empathy 0.24 (0.06) −0.18 (0.06) 0.99 (0.08)
Professionalism/skills −0.35 (0.06) −0.14 (0.06) −0.13 (0.06) 0.99 (0.08)
Reliability −0.32 (0.06) −0.21 (0.06) −0.21 (0.06) 0.20 (0.06) 0.99 (0.08)
Responsiveness −0.16 (0.06) −0.20 (0.06) −0.15 (0.06) −0.16 (0.06) 0.27 (0.06) 0.97 (0.08)
Tangibles −0.10 (0.06) −0.23 (0.06) −0.09 (0.06) −0.18 (0.06) 0.07 (0.06) 0.26 (0.06) 0.97 (0.08)
(c) Error Variance and Covariance for Multi-Item Measures
Assurance 0.58 (0.11)
Core medical service 0.35 (0.09) 0.68 (0.09)
Empathy 0.34 (0.11) 0.36 (0.09) 0.49 (0.12)
Professionalism/skills 0.39 (0.08) 0.41 (0.09) 0.32 (0.09) 0.60 (0.11)
Reliability 0.32 (0.09) 0.32 (0.09) 0.28 (0.09) 0.31 (0.10) 0.55 (0.11)
Responsiveness 0.31 (0.08) 0.34 (0.08) 0.28 (0.08) 0.29 (0.08) 0.37 (0.09) 0.63 (0.10)
Tangibles 0.38 (0.08) 0.38 (0.08) 0.32 (0.08) 0.43 (0.09) 0.41 (0.09) 0.37 (0.08) 0.61 (0.10)
All error variance and covariance estimates differing significantly from zero are underscored. Standard errors of the estimates are in parentheses.
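The conversion of error covariances into uniqueness correlations described in the text can be sketched for Table 4(a) as follows. This is a minimal illustration, not the authors' procedure: the names `TABLE_4A` and `uniqueness_correlations` are hypothetical, and the sketch assumes the usual definition of a correlation, that is, each covariance divided by the square root of the product of the two corresponding error variances.

```python
import math

# Lower triangle of Table 4(a), single-item global measures. Row i lists the
# error covariances with traits 0..i-1, followed by the error variance of
# trait i. Trait order: assurance, core medical service, empathy,
# professionalism/skills, reliability, responsiveness, tangibles.
TABLE_4A = [
    [0.51],
    [0.37, 0.53],
    [0.35, 0.33, 0.41],
    [0.35, 0.34, 0.28, 0.51],
    [0.32, 0.38, 0.30, 0.41, 0.49],
    [0.27, 0.23, 0.22, 0.26, 0.26, 0.31],
    [0.14, 0.09, 0.11, 0.08, 0.10, 0.09, 0.17],
]


def uniqueness_correlations(lower):
    """Return {(i, j): correlation} for every off-diagonal pair i > j."""
    variances = [row[-1] for row in lower]
    corrs = {}
    for i, row in enumerate(lower):
        for j, cov in enumerate(row[:-1]):
            corrs[(i, j)] = cov / math.sqrt(variances[i] * variances[j])
    return corrs


if __name__ == "__main__":
    r = uniqueness_correlations(TABLE_4A)
    values = list(r.values())
    print(len(values), min(values), max(values), sum(values) / len(values))
```

Since the inputs are rounded to two decimals, the recomputed values may differ from the reported range and mean in the last digit.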
and empathy (0.99), which is near unity. This high correlation between the assurance dimension and the empathy dimension seemed to be consistent with the findings of past studies that discovered the dimensional instability of the SERVQUAL scale (Babakus and Boller, 1992; Carman, 1990). A formal test of discriminant validity was conducted by computing a 95% confidence interval (the estimated correlation ± twice its standard error estimate) for each of the estimated correlations among the seven dimensions. Despite the high levels of correlation observed between the dimensions, unity fell within the interval for only one pair (assurance and empathy). Hence, from a strict statistical point of view, discriminant validity was established, except between assurance and empathy. However, whether these dimensions are distinct from a practical standpoint is highly questionable.
In summary, the above results of the CU model analysis of the MTMM data first led us to conclude that convergent validity was established for two of the three measures, the single-item global measure and multi-item measure. Based on Bagozzi and Yi’s (1991) rule of thumb, only the single-item global measure, which captured an average trait variance [...] the multi-item measure. For the constant-sum measure, on the other hand, there was virtually no sign of convergence. Almost all of the variance in the seven constant-sum measures (for the seven service dimensions) was attributed to random error.
With respect to discriminant validity, from a strict statistical viewpoint, discrimination was demonstrated among the seven health-care service quality dimensions, except for one instance (between “assurance” and “empathy”). That is, all intertrait (or interdimensional) correlations except one were significantly less than unity. However, the magnitudes of the intertrait correlations were generally very high, with a mean value of 0.84. Hence, the seven dimensions did not seem separable in a practical sense. We should note, however, that the interpretation of discriminant validity is meaningful only when convergent validity is established (Bagozzi, 1993). Given our finding that convergent validity was established for two of the three types of measures tested, the evidence relating to discriminant validity should be viewed with caution.
Implications and Conclusion
Table 5. Summary of Parameter Estimates for the Correlated Uniqueness Model: Trait Intercorrelations
Traits (columns, in order): Assurance; Core Medical Service; Empathy; Professionalism/Skills; Reliability; Responsiveness; Tangibles
Assurance 1.00
Core medical service 0.94 (0.03) 1.00
Empathy 0.99 (0.02) 0.91 (0.03) 1.00
Professionalism/skills 0.80 (0.04) 0.92 (0.03) 0.77 (0.04) 1.00
Reliability 0.89 (0.03) 0.90 (0.03) 0.83 (0.03) 0.95 (0.02) 1.00
Responsiveness 0.81 (0.04) 0.80 (0.05) 0.76 (0.04) 0.85 (0.04) 0.91 (0.03) 1.00
Tangibles 0.72 (0.05) 0.78 (0.05) 0.69 (0.05) 0.83 (0.04) 0.83 (0.03) 0.83 (0.03) 1.00
All error variance and covariance estimates differing significantly from zero are underscored. Standard error of estimates are within parentheses.
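The confidence-interval check for discriminant validity (estimated correlation plus or minus twice its standard error) can be illustrated with the Table 5 entries. The sketch below is an assumption-laden illustration rather than the authors' computation: `TABLE_5` and `classify_pairs` are hypothetical names, and values are stored as integer hundredths so that boundary cases are not blurred by floating-point rounding. With the rounded published values, the interval for assurance and core medical service has an upper bound of exactly 1.00, so the sketch reports intervals that touch unity separately from those that strictly contain it.

```python
TRAITS = [
    "Assurance", "Core medical service", "Empathy", "Professionalism/skills",
    "Reliability", "Responsiveness", "Tangibles",
]

# Lower triangle of Table 5 as (correlation, standard error) pairs, both in
# hundredths; row i holds the correlations of trait i with traits 0..i-1.
TABLE_5 = [
    [],
    [(94, 3)],
    [(99, 2), (91, 3)],
    [(80, 4), (92, 3), (77, 4)],
    [(89, 3), (90, 3), (83, 3), (95, 2)],
    [(81, 4), (80, 5), (76, 4), (85, 4), (91, 3)],
    [(72, 5), (78, 5), (69, 5), (83, 4), (83, 3), (83, 3)],
]


def classify_pairs(table, unity=100):
    """Split trait pairs by whether r +/- 2*SE strictly contains or only touches unity."""
    contains, touches = [], []
    for i, row in enumerate(table):
        for j, (r, se) in enumerate(row):
            lower, upper = r - 2 * se, r + 2 * se
            pair = (TRAITS[j], TRAITS[i])
            if lower < unity < upper:
                contains.append(pair)
            elif unity in (lower, upper):
                touches.append(pair)
    return contains, touches


if __name__ == "__main__":
    contains, touches = classify_pairs(TABLE_5)
    print("interval strictly contains 1.00:", contains)
    print("interval bound equals 1.00 (rounded inputs):", touches)
```

Under the strict reading, only the assurance and empathy pair fails the test, which agrees with the conclusion stated in the text; the touching case depends on how the published estimates were rounded.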
[...] dimensions constituting health-care quality and valid approaches to their measurement. This research focused on conceptual and measurement issues relating to the study of health-care quality. In contrast to most of the past research in this area, we took the physician’s (service provider’s) rather than the patient’s (service recipient’s) perspective. This approach is justified in view of the prevalent understanding that health-care recipients are often unable to evaluate key dimensions of health-care service (Bopp, 1990; Hensel and Baumgarten, 1988), and, thus, may not have as much to contribute to the design of an effective health-care system as providers. Another contrast is found in methodological approach. Whereas past studies that investigated the validity of the SERVQUAL scale tended to lack methodological rigor and scope, our construct validation procedure based on the MTMM data analysis allowed for a more systematic scrutiny of key measurement properties of the scale (i.e., convergent validity, discriminant validity, and method bias).
First, we compared the performance of the constant-sum rating method, the single-item global rating method, and the multi-item rating method in measuring the health-care service quality. All seven measures based on the constant-sum method showed almost complete lack of convergence with the measures based on other methods. One plausible explanation for this is the relatively high degree of complexity inherent in the measures using the constant-sum method. This measure requires more effort on the part of the respondents, and, thus, is likely to create cognitive strains. Consequently, resulting responses may not be as reliable as those obtained by other methods. In fact, many physicians seemed to have difficulty allocating the importance points among the seven categories.
In contrast to common expectation, the single-item global measures performed better than the multi-item measures in capturing the intended dimensions. An attempt to generalize this finding beyond health-care providers may be inappropriate, because the result could have been caused by the high [...] understanding of the issues involved in the questions reduces measurement error in responses. Thus, such an outcome may not be obtained from health-care recipients, who may not possess such a clear understanding. Nonetheless, this finding suggests that single-item global measures may elicit responses that are as reliable as the multi-item measures when knowledgeable service providers are involved, and do so with greater parsimony. The single-item global rating method may be useful if the goal of a study is to gain an understanding of the general nature of health-care service issues. We should add, however, that assessment of reliability level for single-item measures is not possible in most cases. This remains a major problem for the single-item global rating method.
When the research is to be diagnostic in nature, focusing on specific characteristics of the service offering in an effort to identify areas for improvement, the multi-item rating method has greater utility. The multi-item rating method has the distinct advantage of being able to generate detailed information on specific aspects of service quality that can be used as a basis for action plans. As a caveat, it should be noted that our recommendation regarding the use of the single-item global rating method and the multi-item rating method is limited to future research involving health-care service providers’ perceptions. For research involving the perceptions of patients who do not understand the key dimensions of health-care service quality, the multi-item rating method seems to be a better choice, because this method is less susceptible to measurement error than the single-item global rating method.
In terms of the discriminant validity of the seven health-care service quality dimensions, our results were not supportive of the validity. The computed magnitudes of interdimensional correlations were very high. Although all correlations except one satisfied the statistical criterion applied (i.e., significantly less than unity), their magnitudes (ranging between 0.69 and 0.99) cast much doubt on the separability of these dimensions from a practical viewpoint. Considering that a similar finding [...] SERVQUAL scale or its modified versions in health-care service quality research. Because the validation of a measure is an ongoing process, we suggest that more research be directed toward producing a suitable adaptation of the SERVQUAL scale. It is important for this research to take into consideration the unique aspects of this particular service sector.
This study limited its research scope to physicians’ perceptions toward health-care service quality. Under CQI or TQM, patients’ perceptions or evaluations of health-care services also play a critical role. If health-care providers do not understand how service recipients evaluate health-care services, it is difficult for providers to design or improve strategic planning and marketing activities effectively. Therefore, research based upon the patients’ perspective is necessary. Based upon the perceptions of both parties in the health-care delivery system, we can identify areas where mutual understanding exists, means to inform and educate the public, and ways to improve the current delivery system.
References
Aaker, David A., Kumar, V., and Day, George S.: Marketing Research. John Wiley & Sons, Inc., New York, NY. 1995.
Asubonteng, Patrick, McCleary, Karl J., and Swan, John E.: SERVQUAL Revisited: A Critical Review of Service Quality. The Journal of Services Marketing 10(6) (1996): 62–71.
Babakus, Emin, and Mangold, W. Glynn: Adapting the SERVQUAL Scale to Hospital Services: An Empirical Investigation. Health Services Research 26 (February 1992): 767–786.
Babakus, Emin, and Boller, Gregory W.: An Empirical Assessment of the SERVQUAL Scale. Journal of Business Research 24(3) (1992): 253–268.
Bagozzi, Richard P.: Causal Models in Marketing. John Wiley and Sons, New York. 1980.
Bagozzi, Richard P., Yi, Youjae, and Phillips, Lynn W.: Assessing Construct Validity in Organizational Research. Administrative Science Quarterly 36(3) (1991): 421–458.
Bagozzi, Richard P., and Yi, Youjae: Multitrait–Multimethod Matrices in Consumer Research. Journal of Consumer Research 17 (March 1991): 426–439.
Bagozzi, Richard P.: Assessing Construct Validity in Personality Research: Applications to Measures of Self-Esteem. Journal of Research in Personality 27(1) (1993): 49–87.
Bentler, Peter: Comparative Fit Indexes in Structural Models. Psychological Bulletin 107(2) (1990): 238–246.
Berry, Leonard, Zeithaml, Valarie, and Parasuraman, A.: Quality Counts in Services, Too. Business Horizons 28 (May/June 1985): 44–52.
Bopp, Kenneth D.: How Patients Evaluate the Quality of Ambulatory Medical Encounters: A Marketing Perspective. Journal of Health Care Marketing 10(1) (March 1990): 6–15.
Bowers, Michael R., Swan, John E., and Taylor, Jack A.: Influencing Physicians Referrals. Journal of Health Care Marketing 14 (Fall 1994): 42–50.
Brown, Stephen W., and Swartz, Teresa A.: A Gap Analysis of Professional Service Quality. Journal of Marketing 53(4) (April 1989): [...]
[...] Discriminant Validity by the Multitrait-Multimethod Matrix. Psychological Bulletin 56(2) (1959): 81–105.
Carman, James M.: Consumer Perceptions of Service Quality: An Assessment of the SERVQUAL Dimensions. Journal of Retailing 66(1) (Spring 1990): 33–55.
Churchill, Gilbert A., Jr.: Marketing Research: Methodological Foundations, 5th ed. The Dryden Press, Chicago. 1991.
Cronbach, Lee J.: Coefficient Alpha and the Internal Structure of Tests. Psychometrika 16 (1951): 297–334.
Dabholkar, Pratibha A., Thorpe, Dayle I., and Rentz, Joseph O.: A Measure of Service Quality for Retail Stores: Scale Development and Validation. Journal of Academy of Marketing Science 24(1) (1996): 3–16.
Donabedian, Avedis: Quality Assessment and Assurance: Unity of Purpose, Diversity of Means. Inquiry (Spring 1988): 175–192.
Donabedian, Avedis: Explorations in Quality Assessment and Monitoring, vol. 1: The Definition of Quality and Approaches to Its Assessment. Health Administration Press, Ann Arbor, MI. 1980.
Finn, David W., and Lamb, Charles W., Jr.: An Evaluation of the SERVQUAL Scale in a Retail Setting, in Solomon, R.H., ed. Advances in Consumer Research, vol. 18, Association of Consumer Research, Provo, UT. 1991.
Grönroos, Christian: Service Quality: The Six Criteria of Good Perceived Service Quality. Review of Business (Winter 1988): 1–9.
Haywood-Farmer, John, and Stuart, F. Ian: Measuring the Quality of Professional Services. The Management of Service Operations. Proceedings of the 3rd Annual International Conference of the UK Operations Management Association. 1988.
Headley, D.E., and Miller, S.: Measuring Service Quality and Its Relationship to Future Consumer Behavior. Journal of Health Care Marketing 13(4) (December 1993): 32–41.
Hensel, James S., and Baumgarten, Steven A.: Managing Patient Perceptions of Medical Practice Service Quality. Review of Business 9(3) (Winter 1988): 23–26.
John, Joby: Improving Quality Through Patient-Provider Communication. Journal of Marketing Management 1(1) (Fall 1991): 51–60.
Joreskog, Karl G., and Sorbom, Dag: LISREL 8: Structural Equation Modeling with the SIMPLIS Command Language. Lawrence Erlbaum, Hillsdale, NJ. 1993.
Kerlinger, Fred N.: Foundations of Behavioral Research. Holt, Rinehart, and Winston, Inc., New York. 1973.
Kim, Chankon, and Lee, Hanjoon: Development of Family Triadic Measures for Children’s Purchase Influence. Journal of Marketing Research 34(3) (August 1997): 307–321.
Kumar, Ajith, and Dillon, William R.: An Integrative Look at the Use of Additive and Multiplicative Covariance Structure Models in the Analysis of the MTMM Data. Journal of Marketing Research 24 (February 1992): 51–64.
Lehtinen, Uolevi, and Lehtinen, Jarmo: Service Quality: A Study of Quality Dimensions. Unpublished working paper, Service Management Institute, Helsinki, Finland. 1982.
Malhotra, Naresh K.: Marketing Research. Prentice Hall, Upper Saddle River, NJ. 1996.
Marsh, Herbert W.: Confirmatory Factor Analyses of Multitrait–Multimethod Data: Many Problems and a Few Solutions. Applied Psychological Measurement 13 (1989): 335–361.
[...] Alternative Models. Applied Psychological Measurement 15 (1991): 47–70.
Nunnally, Jum C.: Psychometric Theory, 2nd ed. McGraw-Hill, New York. 1978.
O’Connor, Stephen J., Shewchuk, Richard M., and Carney, Lynn W.: The Great Gap. Journal of Health Care Marketing 14(2) (1994): 32–39.
Parasuraman, A., Zeithaml, Valarie, and Berry, Leonard: A Conceptual Model of Service Quality and Its Implications for Future Research. Journal of Marketing 49 (Fall 1985): 41–50.
Parasuraman, A., Zeithaml, Valarie, and Berry, Leonard: SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of Service Quality. Journal of Retailing 64(1) (1988): 12–40.
Peter, J. Paul: Construct Validity: A Review of Basic Issues and Marketing Practices. Journal of Marketing Research 18 (May 1981): 133–145.
Phillips, Lynn W.: Assessing Measurement Error in Key Informants’ Reports: A Methodological Note on Organizational Analysis in Marketing. Journal of Marketing Research 18 (November 1981): 395–415.
[...] Perceptions of Hospital Operations by a Modified SERVQUAL Approach. Journal of Health Care Marketing 10(4) (December 1990): 47–55.
Swartz, Teresa A., and Brown, Stephen W.: Consumer and Provider Expectations and Experiences in Evaluating Professional Service Quality. Journal of the Academy of Marketing Sciences 17(2) (Spring 1989): 189–195.
Van Driel, O.P.: On Various Causes of Improper Solutions of Maximum Likelihood Factor Analysis. Psychometrika 43(2) (1978): 225–243.
Walbridge, Stephanie W., and Delene, Linda M.: Measuring Physician Attitudes of Service Quality. Journal of Health Care Marketing 13(1) (Winter 1993): 6–15.
Widaman, Keith F.: Hierarchically Nested Covariance Structure Models for Multitrait–Multimethod Data. Applied Psychological Measurement 9(1) (March 1985): 1–26.
Woodside, Arch, and Shinn, Raymond: Customer Awareness and Preferences Toward Competing Hospital Services. Journal of Health Care Marketing 8(1) (March 1988): 39–47.
Zeithaml, Valarie A., Parasuraman, A., and Berry, Leonard L.: Problems and Strategies in Services Marketing. Journal of Marketing [...]
Appendix A. Specifications
Quality Attribute Activity
Reliability: Ability to perform the expected service dependably and accurately
—Accuracy in patient billing
—Current, accurate and neat medical record
—Correct performance of the service the first time
—Physician reputation among patients
—Physician reputation among other physicians
—Reputation of the hospital
—Physician compliance with Universal Precautions
Professionalism/skill: Knowledge, technical expertise, amount of training, experience, etc.
—Knowledgeable, skilled nurses and support staff
—Residency trained physicians
—Highly experienced physicians
—Physician specialty board certification
—Knowledgeable and skilled physicians
—Explaining trade-offs between service and cost to patient
—Physician history of malpractice
Empathy: Caring, individualized attention provided to patients by physicians and their staffs
—Alleviating patient concerns about the medical treatment
—Personal demeanor of the physician
—Learning the patient’s individual needs
—Providing individual consideration to the patient
—Remembering names and faces of patients
Assurance: Courtesy displayed by physicians, nurses, or office staff and their ability to inspire patient trust and confidence
—Courteous, friendly nurses and support staff
—Explaining the cost of service to the patient
—Explaining the medical service to the patient
—Courteous and friendly physicians
—Sensitivity to patient confidentiality
Core medical services: The central medical aspects of the service; appropriateness, effectiveness, and benefits to the patient
—Physicians who have published in medical journals
—Well-established physician referral base
—Effective utilization of services
—Physicians who participate in medical research
—Appropriate utilization of services (non-defensive)
—Positive medical outcome
—Orientation to preventative medicine
—Emphasis on patient education
Responsiveness: Willingness to provide prompt service
—Providing the service at the time promised
—Prompt service without an appointment
—Physician accessibility to patients by phone
—Convenient office hours for patients
—Adherence to patient appointment schedule
Tangibles: Physical facilities, equipment and appearance of contact personnel
—Professional appearance/dress of the support staff
—Professional appearance/dress of the physician
—Location of the office
—Location of the hospital
Appendix B. Specialty Groupings
Primary carea Specialists Hospital-Based Surgical
Family practitionerb Allergist/immunologist Anesthesiologist General surgeon
General practitioner Chest physician Emergency medicine Neurological surgeon
Pediatrics Dermatologist Pathologist Ophthalmologist
Internal medicine Geriatric Radiologist Orthopedic surgeon
Occupational medicine Nuclear medicine Plastic surgeon
Oncologist Thoracic surgeon
Physical medicine Ob/gyn
Psychiatrist Colon/rectal
Neurologist Otolaryngologist
Urologist