Reconsidering performance evaluative style
K. Vagneur (PricewaterhouseCoopers, London) and M. Peiperl (London Business School)
Abstract
Hopwood, A. G. (1972), An empirical study of the role of accounting data in performance evaluation, Journal of Accounting Research, 10, 156-182, modeled ``performance evaluative style'' as the predictor of unintended effects from performance measurement control systems, stimulating one of the few areas of cumulative research in behavioral accounting. However, despite twenty-five years of empirical testing, this stream of research has failed to converge. This paper considers the validity issues created by evolution in the conceptualisation and specification of the relevant variables and classifies them by calculation type. Results of an empirical test designed to explore comparability between the variable types are reported. Finally, implications for interpreting prior research and for future research directions are considered. (c) 2000 Elsevier Science Ltd. All rights reserved.
Traditionally, management theory considers performance an outcome. Performance measurements are used as surrogates for performance outcomes, implicitly assuming measurement does not influence performance. Argyris (1952) challenged this practice by positing that performance measurement control systems influence organizational outcomes. Since then, a small but growing cross-disciplinary literature has explored these ``unintended effects''. Hopwood (1972, 1973) modeled subordinate perceptions of ``performance evaluative style'' as a predictor of various unintended behavioral outcomes such as job related tension and dysfunctional decision making. He also argued that these behaviors could negatively affect long-term performance (Hopwood, 1973, p. 192). Since then, accumulated evidence on these models has been complex and has failed to converge. This paper aims to help in the development of the evaluative style concept by exploring some subtle differences in conceptualisation and method found in this research area.
1. Evaluative style in the literature
Argyris (1952) and then Simon et al. (1954) explored the human side of formal measurement control systems. Both studies concluded that budgets and budgeting processes can be associated with important human relations problems. These included worker-management separation, cross-boundary conflict and job-related tension. This was a substantial departure from the mechanistic approach to performance measurement found in traditional management theory (e.g. Taylor, 1911; Chandler, 1962; Anthony, 1965). Subsequent research has tended to focus on Argyris' suggestion that it is the way formal measurement controls are used that stimulates organizational problems.
Hopwood (1972, 1973) concentrated analysis on one independent construct to embody Argyris' concept of variation in use. He operationalized a four-level categorical variable to measure subordinate perceptions of the importance of unit budget results in their superior's evaluation of the respondent's performance. Hopwood found that perceptions of high budget importance in evaluation, a ``budget constrained style'', correlated with increased job related tension, less favorable relations with peers, and unit size (1973, p. 170). He posited that high budget emphasis in performance evaluation would be associated with higher levels of data manipulation, distrust, rivalry and dysfunctional decision making vis-a-vis costs, customer service and innovation, and argued that these would negatively affect performance. Hopwood's evidence for manipulation and dysfunctional decision making was developed from interviews (n = 20) and detailed analysis of accounts; it was not tested statistically on the larger survey sample (n = 167). His argument was consistent with White (1961), who suggested that inter-departmental conflict was associated with certain cost allocation processes. However, the only performance variable Hopwood measured was budget performance. It had no significant association with evaluative style, although a subset of the sample (32%), respondents in departments with high variation in evaluative style scores, did reflect a very weak association (p = 0.11).
1.1. The Otley-Hopwood debate
Otley (1978) sought to replicate Hopwood's study with some modification in variable specification and method. In particular, Otley chose a site with low between-unit interdependence, suggesting that unit budgets might not be an appropriate control device when interdependencies are high (which was the case in the Hopwood study). The Hopwood-Otley findings have attracted much debate and discussion, as well as some confusion. The principal difference between the two studies is that Hopwood found job related tension to be predicted by a budget constrained evaluative style, while Otley did not. Otley reported that his subject managers ``appeared to adapt, with little felt stress, to the budgetary system as they perceived it to be operated'' (1978, p. 136). Otley highlighted the level of interdependence as a main factor behind this difference. He also suggested that the difference may have occurred because the respondents in his sample had profit center responsibility, while Hopwood's were cost center managers. Finally, Otley suggested that the accounting system in his sample may have provided more complete measures of performance than the budgetary system Hopwood described, and that the resulting performance outcomes may have influenced senior managers' future choice of evaluative style.
Although Otley found no significant linear association between performance evaluative style and job related tension, he did find evidence of behavioral outcomes predicted by his evaluative style variable. For example, a budget constrained style had a positive association with trust in supervisor (r = 0.19, p = 0.05), a negative association with perceived ambiguity in evaluation (r = -0.22, p = 0.03) and a very weak negative association with felt ambiguity of job (r = -0.14, p = 0.10). Otley also found evidence of non-linear associations between behavioral variables and evaluative style (cf. Vagneur, 1994, 1995).
Otley tested unit budget performance on a subset of his sample (49%), respondents who had been in their positions long enough to be ``able to influence'' both the budgeting process and unit performance over one budget cycle. For this subset, the association between evaluative style and output budget performance was strong (r = 0.51, p = 0.002), although the subset was small (n = 19). Other types of budget performance reflected little or no significant association with evaluative style.
1.1.1. Mixed results in the accumulated evidence
The accumulated evidence now spans a substantial number of studies and a large number of potential model variables because of the many contingent relationships that have been proposed. Table 1 summarizes a sample of tests of association between evaluative style and performance, including tests of both main effects and interactions (the variable types are explained below). Overall, the results have been mixed (see Briers & Hirst, 1990; Vagneur, 1995; Otley et al., 1996 for in-depth discussions).
2. Evolution in variable conceptualization and specification
Subtle differences and ambiguities have emerged within this literature stream, beginning with the Otley-Hopwood conflict itself. That the stream of post-Hopwood empirical testing has failed to converge is due at least in part to changes in conceptualization and to operationalization of the relevant variables.
To operationalize performance evaluative style, Hopwood (1972) created a four-level categorical variable, based on whether a respondent had nominated each of two items (Table 2, column I, items 5 and 7) as being among the three most important criteria used by the supervisor in evaluating the individual's performance. Relative position within the top three did not affect the category score.
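As a minimal sketch (our own illustration, not Hopwood's instrument), the mapping from a respondent's top-three nominations to the four categories can be expressed as follows; the category labels and scores follow Table 3, and the item numbers follow Table 2, column I:

```python
# Illustrative sketch: Hopwood's four-level categorical variable, derived from
# whether items 5 (cost/efficiency concern) and 7 (meeting the budget) appear
# in a respondent's top-three nominations. Relative rank within the top three
# is deliberately ignored, as in the original specification.

BUDGET_ITEM = 7      # "Meeting the budget"
EFFICIENCY_ITEM = 5  # "My concern with costs" / "How efficiently I run my unit"

def hopwood_style(top_three):
    """Map a list of up to three nominated item numbers to a category."""
    budget = BUDGET_ITEM in top_three
    efficiency = EFFICIENCY_ITEM in top_three
    if budget and not efficiency:
        return 4, "budget constrained"
    if budget and efficiency:
        return 3, "budget-profit"
    if efficiency:
        return 2, "profit conscious"
    return 1, "non-accounting"

# A respondent nominating quality (2), budget (7) and effort (1):
print(hopwood_style([2, 7, 1]))  # -> (4, 'budget constrained')
```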
Otley (1978) modified both the question content and the calculation of Hopwood's variable (see Table 2, column II). The content changes accounted for differences in language and frame of reference that Otley observed in his organization relative to Hopwood's. Otley's modification to the calculation was to take account of the relative position of items 5 and 7 within the top three list, if
Table 1
Examples of evaluative style associations with performance
Type Study N Association Specification Performance
Main effects
A Hopwood (1972, 1973) 167 Ns Unknown Unit-budget
B Otley (1978) 19 Sig Obj (1) Unit-budget
D Kenis (1979) 169 Sig Ss (1) Unit-budget
E Govindarajan (1984) 58 Ns Ss (12) Unit-overall
D Gupta (1987) 58 Ns Ss (12) Unit-overall
E Imoisili (1989) 102 Ns Ss (7) Individual
Interaction effects
Participation:
C Brownell (1982) 38 Sig Ss (9) Individual
D Brownell and Dunk (1991) 79 Sig Ss (9) Individual
Participation and task uncertainty:
D Brownell and Dunk (1991) 79 Sig Ss (9) Individual
Manufacturing automation:
D Dunk (1992) 24 Sig Ss (1) Unit
Environmental complexity:
D Brownell (1987) 56 Neg Ss (9) Individual
Function:
D Brownell (1985) 40 Ns Ss (9) Individual
Strategy:
B Govindarajan (1988) 121 Sig Ss (10) Unit
D Gupta (1987) 58 Sig Ss (12) Unit
Strategic "mission":
D Gupta (1987) 58 Ns Ss (12) Unit
Association: Neg, negative significant; Sig, significant; Ns, not significant.
Performance measure type: Obj, objective measure; Ss, respondent self-rated. Number of criteria used: (1), (7), (9), (10), or (12).
both were nominated. This change resulted in five categories, by splitting one of Hopwood's categories into two.
Changes in the specification of evaluative style have continued as this literature stream has developed. In fact, at least some specification change has occurred in the majority of studies in the post-Hopwood literature (see Table 3).
The content of most evaluative style variables relates to respondent perceptions of budget use in performance evaluation (e.g. Hopwood, 1972, 1973; Otley, 1978; Brownell, 1982; Govindarajan, 1988), but the construct has been conceptualized in other ways. For example, Hirst (1983) designed a study around the perceptions of the relative importance of quantitative measures in performance evaluation. Another variation used questions on budget process, including participation, evaluation on variance, goal difficulty and punishment (Kenis, 1979). Govindarajan (1984) articulated the evaluative style concept as the extent to which subjective factors (vs formulae) were used to determine bonuses.
Each of these approaches has subtle differences in its conceptual base. Reconciliation of such differences would require consideration of both individual psychological responses to performance assessment and the nature of the systemic effects created by budgets and other formal and informal management control processes (e.g. reward, planning, training and information systems). This presents a significant opportunity for further research drawing on psychology, organizational behavior and behavioral accounting research. In the present study, however, our approach was not to reconcile the various approaches conceptually but to understand and test their operational differences.
2.1. Content evolution and content validity
Most evaluative style measures have been structured for respondents to rate or select from a list of alternative choices (question content) determined by the researcher. Those scores are then manipulated (a calculation) to form the variable. The research strategy developed by Hopwood (1972, 1973) and replicated by Otley (1978) was to select a sample from one company and to undertake extensive inductive field research within that company in order to develop the question content. This approach assumes that every field site will be different, and that it is necessary to identify the particular vocabulary of the site by inductive research in order to provide meaningful choices to respondents.
Table 2
Comparison of variable content dierences
Item | I. Hopwood's content | II. Otley's equivalent | III. Used in this study
1. | How much effort I put into the job. | The effort I put into my job. | The effort I put into my job.
2. | My concern with quality. | My concern with quality. | My concern with quality.
3. | How much profit I make. | My contribution to company profits. | My contribution to company profits.
4. | My ability to handle my men. | The relationships I have established with my staff and men. | The relationships I have established with staff.
5. | My concern with costs. | How efficiently I run my unit. | How efficiently I run my unit.
6. | How well I get along with my boss. | How well I get on with group staff. | How well I get on with my superiors.
7. | Meeting the budget. | How well I meet my budget. | How well I meet my budget.
8. | - | - | Objective customer service ratings.
9. | My attitude to my work and company. | My attitude toward my work. | My attitude toward my work.
10. | - | - | How well I develop a team.
11. | - | - | How well I cooperate with colleagues.
Hopwood's respondents were asked:
1. ``When your departmental supervisor is evaluating your performance how much importance do you think he attaches to the following items?'' (A 5-point anchored Likert-type scale was provided to score each criterion.)
Other studies have used a conceptually different research strategy, employing deductive methods to assess the appropriateness of question content (e.g. Govindarajan, 1984, 1988; Imoisili, 1989; Harrison, 1992). Content validity rests on the issue of providing sufficient choice to measure variation within the sampling frame. For evaluative style, it would be dependent on the extent to which
Table 3
Evolution in evaluative style variable specification
Study Source Description
A Hopwood (1972, 1973) | New | Categorical. Nomination to a rank order (top three) list of budget and cost concern from a list of seven alternatives (Table 2, column I). Relative rank order within the list did not affect the category scoring. The evaluative styles are:
4. Budget constrained style: meeting the budget (item 7, Table 2, column I) but not efficiency (item 5) was nominated to the list of the three most important.
3. Budget-profit style: both meeting the budget and efficiency were nominated to the top three.
2. Profit conscious style: efficiency, but not meeting the budget, was nominated to the top three.
1. Non-accounting style: neither meeting the budget nor efficiency was nominated to the top three.
B Otley (1978) | Modified Hopwood | Ordinal. Nomination to a rank order list. Modified question content (see Table 2, column II). Relative position within the rank order of budget and efficiency used. This splits Hopwood's budget-profit style (number 3 above) into budget-profit (budget precedes efficiency) and profit-budget (efficiency ranked before budget).
Kenis (1979) | New | Continuous. Summed 5-point Likert-type ratings on budgeting characteristics (e.g. evaluation on variance, goal difficulty, punishment).
C Brownell (1982) | New | Binary. Nomination to rank order. Content unclear; modified either Otley or Hopwood. Assigns a value of 1 to Hopwood's categories 1 and 2 (above) and a 0 to categories 3 and 4.
Hirst (1983) | New | Continuous. 5-point Likert-type ratings on five questions on quantitative measurement use in evaluation and reward.
Govindarajan (1984) | New | Continuous. Raw decimal. Score of respondent perceptions of the percent that performance bonus is formula-based vs subjectively based.
Gupta (1987) | Govindarajan (1984) | Continuous. Raw decimal. Score of respondent perceptions of the percent that performance bonus is formula-based vs subjectively based.
Imoisili (1989) | Hybrid | Continuous. 7-point Likert-type ratings. Hopwood content. Manipulation reported: ``average of the raw scores to determine the styles''.
D Brownell (1985) | New | Continuous. 5-point Likert-type ratings. Content not reported; modified Hopwood or Otley. Sums ratings for budget and cost/revenue concern.
Brownell and Hirst (1986) | Modified Hopwood | Categorical. Nomination to rank order. Modified Hopwood/Otley to ten items.
Brownell (1987) | Brownell (1985) | Continuous. 5-point Likert-type ratings. Same as Brownell (1985).
Govindarajan (1988) | Hopwood | Categorical. Nomination to rank order. Same as Hopwood (1972, 1973).
Brownell and Dunk (1991) | Modified Brownell (1985) | Continuous. 7-point Likert-type ratings. Used Hopwood content. Calculated as Brownell (1985): sums ratings on budget and cost concern.
Dunk (1992) | Modified Brownell (1985) | Continuous. 5-point Likert-type ratings. Content unclear. Summed ratings for budget and cost concern.
E Harrison (1992) | New | Continuous. 5-point Likert-type ratings. Content unreported; used the Brownell and Hirst (1986) instrument. Created a ratio of the sum of budget and cost concern ratings divided by the sum of ratings of the other items, to capture relative scores of accounting criteria to non-accounting criteria.
question content includes the most important elements in the performance evaluative environment. If important elements are missing, responses may reflect variation in less important alternatives, and thus may not adequately measure within-sample variation in evaluative style. Therefore, when the evaluative environment is different, question content may need to evolve as well. For example, if team development and objective customer service ratings are important criteria in performance evaluation, the Hopwood and Otley specifications (Table 2, columns I and II) might have lower content validity than would a larger set including these two items.
Thus the evolution of content sets would not necessarily reduce between-study comparability so long as the environment in each case has been adequately assessed to ensure that the content offers sufficient choice. Hopwood's content was developed from issues he identified in interviews and used language significant to the respondents (shop floor cost center managers at an American integrated steel producer). Otley modified Hopwood's content to reflect differences he observed during face-to-face interviews in his sample (profit center managers in Britain's nationalized coal industry). It thus seems reasonable to assume that the response sets of Hopwood and Otley's studies were sufficiently inclusive to have high content validity. Implicitly, by providing sufficient choice to respondents, they would also have high between-study comparability.
Some studies have introduced modified response sets without explanation (e.g. Brownell, 1982, 1985; Dunk, 1992). It must be assumed that these modifications were intended to provide improved content validity. However, whether this was accomplished by inductive or deductive means is unclear. Other researchers have adopted content unchanged from earlier studies. For example, Harrison (1993) adopted Brownell and Hirst's (1986) content unchanged, and Imoisili (1989) and Govindarajan (1988) adopted Hopwood's (1972, 1973) content without change. Hopwood's content was designed to capture the evaluative style environment of a 1960s industrial plant; as such it may not be adequate for a sample of Fortune 500 general managers during the 1980s and 1990s. Unless the appropriateness of a content set has been empirically determined before data collection, its validity must, ex post, remain a matter of speculation.
2.2. Calculation evolution and validity
Like variable content, calculation methods have also changed in this literature stream. Hopwood (1972, 1973), Otley (1978) and Brownell (1982) all used nominations to a ranking list, producing somewhat different discrete variables (categorical, ordinal and binary, respectively). Brownell (1985), Harrison (1992) and others later used Likert-type ratings on a set of criteria and manipulated those rating scores. The use of continuous variables constituted a fundamentally different approach to measuring evaluative style.
The calculations which have been used in the evaluative style literature can be classified into five basic types: (A) categorical variables based on inclusion in a ranking nomination list, (B) ordinal (loosely, ``continuous'') variables based on relative rankings in a nomination list, (C) binary variables based on inclusion in a nomination to a ranking list, (D) ratings or arithmetic sums of ratings and (E) algebraic manipulations of ratings. Table 4 classifies evaluative style studies by calculation type (these types are also indicated in Tables 1 and 3).
The different specifications of evaluative style may have reduced between-study comparability, and therefore external validity. Because calculation validity could be directly assessed by empirical testing, we designed a study to explore the comparability of the five calculation types by exploring both their intercorrelations and their relationship to performance outcomes.
3. Method
targeted improvements. Companies were also selected to show a reasonable spread by industry, internal diversity (number of business areas represented), overall size (£ million to £63 billion in revenue; 110 to 120,000 employees), unit size (£4 million to £4.3 billion in revenue; 67 to 55,000 employees) and unit size relative to the total company (1 to 100%). Only one company which was approached declined to participate in the study.
Eighty-two managers (three to six from each company, all of whom had budget responsibility for functional or departmental areas within business units) were interviewed, and the evaluative style criteria they perceived as important were assessed. This research strategy was consistent with the approach undertaken by Hopwood and Otley, in that it sought to develop the relevant content criteria by inductive means. This approach is conceptually different from research which uses deductive development of the variable content.
Managers were advised that the discussion was confidential and only aggregate data from multiple companies would be reported. Once all of the interviews were complete, a follow-up questionnaire was developed which provided the data to calculate a set of evaluative style variables. The questionnaire was distributed by mail to all eighty-two managers with an accompanying cover letter again assuring confidentiality of the data. A postage-paid return envelope was enclosed.
Of the 82 questionnaires distributed, 68 were returned. After checks for consistency between ratings and rankings, two subjects whose responses were highly inconsistent were excluded. The usable response rate was thus 80%. Because of the high response rate, no test was made to see whether those who did not respond represented a systematic sub-set (i.e. response bias). In addition, because respondents came from multiple companies, tests were made for effects from company and unit size, CEO and respondent time in position, and industry sector and market served by the business unit. None of these sampling variables reflected any systematic associations with the evaluative style variables (see Vagneur (1995) for further discussion).
Environmental uncertainty and economic conditions had been previously identified as affecting evaluative style (Govindarajan, 1984; Imoisili, 1989). The data for this study were collected just as the British economy began to emerge from its deepest post-war recession. All of the companies in the sample (no two of which competed in the same sector) had experienced the effects of the recession. The level of international competition was high, and cost and headcount reduction initiatives were under way in all of the units. Therefore variation in environmental uncertainty and economic conditions was not expected to influence the analysis.
3.1. Variable speci®cation
3.1.1. Content
The follow-up questionnaire sent to respondents asked, ``When your performance is evaluated, how much importance do you think is attached to each of the following?'' The ten criteria provided
Table 4
Classi®cation of studies by calculation type
Type | Based on (manipulation to) | Studies
A | Nomination to ranking (categorical) | Hopwood (1972, 1973); Brownell and Hirst (1986)
B | Nomination relative rankings (ordinal) | Otley (1978); Govindarajan (1988)
C | Nominations (binary) | Brownell (1982)
D | Ratings (summed/continuous) | Kenis (1979); Hirst (1983); Brownell (1985, 1987); Gupta (1987); Brownell and Dunk (1991); Dunk (1992)
E | Ratings (algebraic/continuous) | Govindarajan (1984); Imoisili (1989); Harrison (1992)
(Table 2, column III) were based on data collected during the interviews. A fully anchored seven-point Likert-type scale (from ``not at all'' to ``critically'') was provided for respondents to rate each of the 10 items. A second question then asked respondents to list in rank order the three most important criteria from the content list. In order to ensure that the criteria were sufficiently inclusive, a third question asked, ``If there are important factors which are missing from the list above, please make a list of the most important factors your superior uses in the assessment of your performance.''
To the extent appropriate to the organizations under study, the criteria were designed to capture the essence of those used by Hopwood and Otley. One criterion (item 2) was unchanged by Otley (1978) from that used by Hopwood (1972); it was unchanged here as well. Two items were refined in order to improve relevance across the sample (items 4 and 6). Where there were differences between Otley and Hopwood, Otley's content was adopted (items 1, 5, 7, and 9). Team development and objective customer service ratings (items 8 and 10) were included because the interviews had disclosed that these were important evaluative criteria in some companies.
3.1.2. Calculation
Five variables were operationalized to reflect the calculation types (Table 4; see Table 3 for more specification detail). The variables were as follows:
Variable A, a four-level categorical variable based on nominations to a list of the three most important criteria; Hopwood's (1972, 1973) method was used.

Variable B, an ordinal (``continuous'') variable based on relative rankings of the most important items in a nomination list; Otley's (1978) method was selected.

Variable C, a binary variable based on nominations to a ranking list; Brownell's (1982) method was used. This assigned a value of 1 to Hopwood's budget constrained and budget-profit styles. Profit conscious and non-accounting styles were assigned a value of zero.

Variable D, a ``continuous'' variable based on summed Likert-type ratings, using Brownell's (1985) approach. This technique summed the ratings of budget and efficiency, creating an absolute measure of the perceived importance of content items 5 and 7.

Variables E (1 and 2), ``continuous'' variables based on algebraic manipulations of ratings:

E1: Harrison's (1992) approach was selected. This method calculated the ratio of the sum of the absolute ratings of budget and efficiency (content items 5 and 7) to the sum of the other eight criteria. This provided a relative measure of the perceived importance of content items 5 and 7 vs the other items.

E2: The algebraic manipulation of variable E1 was modified by moving the score on item 3 (contribution to profits) from the denominator to the numerator of the ratio. This provided a relative measure of the perceived importance of all three criteria which might be considered financial measurement based. This addressed the potential for interpreting content item 3 as an accounting-based criterion.
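Under stated assumptions (a `ratings` dict keyed by content item number from Table 2, column III, holding 1-7 Likert scores, and a ranked `top_three` nomination list), the rating-based calculations can be sketched as follows. The function names are ours, for illustration only; variables A and B use only the nomination list and are omitted here for brevity.

```python
# Illustrative sketch of calculation types C, D, E1 and E2 for one respondent.
# Item numbers (Table 2, column III): 3 = contribution to profits,
# 5 = efficiency of unit, 7 = meeting the budget.

def variable_c(top_three):
    """Brownell (1982) binary: 1 for budget constrained or budget-profit
    styles (i.e. budget was nominated to the top three), 0 otherwise."""
    return 1 if 7 in top_three else 0

def variable_d(ratings):
    """Brownell (1985): absolute sum of the budget and efficiency ratings."""
    return ratings[5] + ratings[7]

def variable_e1(ratings):
    """Harrison (1992): ratio of the budget + efficiency ratings to the sum
    of the other eight criteria."""
    others = sum(v for item, v in ratings.items() if item not in (5, 7))
    return (ratings[5] + ratings[7]) / others

def variable_e2(ratings):
    """E1 modified: item 3 (contribution to profits) moved from the
    denominator to the numerator of the ratio."""
    others = sum(v for item, v in ratings.items() if item not in (3, 5, 7))
    return (ratings[3] + ratings[5] + ratings[7]) / others

# A hypothetical respondent:
ratings = {1: 5, 2: 5, 3: 6, 4: 5, 5: 6, 6: 5, 7: 6, 8: 4, 9: 5, 10: 5}
print(variable_d(ratings))             # -> 12
print(round(variable_e1(ratings), 3))  # -> 0.3
```

Note that D is an absolute measure while E1 and E2 are relative ones, which is why the same ratings can rank respondents differently under the two approaches.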
3.1.3. Performance
There is a lack of consensus as to what constitutes valid performance measurement (Steers, 1975). Therefore, three kinds of performance variables were included: objective, self-reported, and researcher-rated. All were standardized before being included in the analysis.
Objective (longitudinal measures):

Abnormal shareholder returns: five-year average of data (from Datastream) reflecting the difference between company returns and a market index with the same beta.

Actual returns: five-year average of accounting profit plus dividends.
Self-reported (all on seven-point Likert-type scales):
Weighted strategic performance: a comprehensive set of strategy factors (sales growth, market share, operating profit, new technology development, profit margin, budget performance, return on investment, new product development, market development, operating cash flow, cost reduction, personnel development, public affairs, and cooperation) weighted using the method of Steers (1975).
Budget performance (from the list above).

Sales growth (from the list above).
Researcher-rated (seven-point Likert-type scale):
Consistency in objectives and priorities. Two independent investigators scored within-unit variation in interviewees' views on objectives and priorities, following the method of Machin and Tsai (1983). (All organizations in the sample had improvement in consistency or coordination as stated objectives.)
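The first objective measure above can be illustrated with a minimal sketch. The figures and helper names below are invented for the example; the study itself used five years of Datastream series and a beta-matched market index:

```python
# Hypothetical illustration of the abnormal-return measure: the five-year
# average difference between a company's yearly return and the return on a
# market index with the same beta. All figures are invented.

def abnormal_return(company_returns, matched_index_returns):
    """Average yearly excess return over a beta-matched benchmark."""
    assert len(company_returns) == len(matched_index_returns)
    excess = [c - b for c, b in zip(company_returns, matched_index_returns)]
    return sum(excess) / len(excess)

company = [0.08, -0.02, 0.11, 0.05, 0.07]   # five yearly returns (invented)
benchmark = [0.06, 0.01, 0.09, 0.04, 0.05]  # beta-matched index (invented)
print(round(abnormal_return(company, benchmark), 4))  # -> 0.008
```

Matching on beta means the benchmark carries the same market risk as the company, so the averaged difference isolates performance beyond what risk exposure alone would predict.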
4. Results
Table 5 presents summary descriptive statistics for the content ratings. Profit, unit efficiency, and meeting the budget (items 3, 5 and 7) were the most frequent choices in the nomination rankings. Only one respondent failed to nominate at least one of these three. Except for responses on item 8 (customer service rating), means and standard deviations for the content items clustered in a small range (means from 5.0 to 5.8; standard deviations from 1.03 to 1.50). Factor analysis found all of the content items were independent, with items 1, 2 and 3 forming three independent factors that explained 61% of the variance.
Eighteen pairs of content ratings had significant correlations (Table 6). Contribution to profits (item 3) was correlated with budget performance (item 7) but not with efficiency of unit (item 5). Only seven of the significant correlation pairs were strong (r >= 0.40), and only one of these pairs involved any of the three most frequently selected content items (efficiency, item 5, with effort into job, item 1; r = 0.41, p < 0.01).
There were significant associations among the criteria which described the language of effort, quality, customer service, attitude, and teams (content items 1, 2, 8, 9, and 10). Yet there was no strong pattern of correlations among the criteria selected by those who scored above the mean on the importance of meeting the budget, except for a link with profit contribution (item 3). In other words, if a respondent rated one of the quality/team criteria high, they were more likely to score the others high as well. Those who scored meeting budget highly also cared about profits, but provided no predictable pattern in the remainder of their responses. This suggested that there might generally be two groups of respondents: those who perceived quality/team matters as highly
Table 5
Summary statistics for variable content
Item Question content Mean S.D. Min Max % 1st choice % in top three
1. Effort put into job 5.0 1.4 1 7 6 18
2. Concern with quality 5.5 1.5 2 7 5 24
3. Contribution to profits 5.5 1.4 1 7 30 50
4. Relationships with staff 5.2 1.5 1 7 1 14
5. Efficiency of unit 5.8 1.0 3 7 26 53
6. Get on with superiors 5.1 1.1 3 7 3 15
7. How well meet budget 5.4 1.3 2 7 11 47
8. Customer service rating 4.3 2.0 1 7 8 20
9. Attitude to work 5.6 1.3 1 7 1 24
10. Team development 5.2 1.4 2 7 9 35
important in their evaluations, and those who rated financial matters as highly important. While some of these might have clustered within particular companies or business units, an overall ANOVA showed that company was not a significant predictor of evaluative style.
The correlations between content scoring and the six evaluative style variables were even more telling. As expected, budget importance in evaluation (item 7) was correlated with all of the evaluative style variables. Budget importance had an exceptionally strong correlation with variable (D), which was also strongly correlated with efficiency (item 5). This was to be expected because variable (D) was the sum of the ratings of items 5 and 7.
Surprisingly, variable (C) had a negative corre-lation with importance of budgets (item 7). The variables had been structured to eliminate any reverse eects (high scores on all variables (A±E2) indicated a budget-constrained style). Examina-tion of the data disclosed that 40% of the sample had scored budget importance (item 7) above the mean (5.4) yet did not nominate the item to the top three ranking. That subset was coded as
non-accounting style by Brownell's (1982) specification. An additional 29% of the sample rated the importance of budgets in evaluation below the mean, and paradoxically nominated it to the ranking list; hence they were coded as having a budget-constrained style. This counter-intuitive response by 69% of the sample appeared to be the cause of the reverse correlation. A review of interview data, followed by additional telephone interviews to explore this result, found it to be logically consistent: while budget outcomes were highly important to all managers in the sample, for some, budgets were not the direct focus of their attention and therefore of evaluation; rather, these managers sought to control operational factors, through which budget performance could be achieved.
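The nomination-based coding and the anomaly just described can be sketched as follows. This is an illustrative reconstruction, not the study's actual code; the function names are ours, and the 5.4 mean and item numbering are taken from the discussion above.

```python
# Illustrative sketch (not the study's code) of Brownell's (1982) binary
# specification as described above: style is coded solely by whether the
# budget item is nominated to the respondent's top-three ranking.
def brownell_1982_code(top_three, budget_item=7):
    """1 = budget-constrained (item nominated), 0 = non-accounting."""
    return 1 if budget_item in top_three else 0

def anomalous(rating, top_three, mean_rating=5.4, budget_item=7):
    """Flag the counter-intuitive patterns discussed in the text: an
    above-mean rating without nomination, or a below-mean rating with it."""
    nominated = budget_item in top_three
    return (rating > mean_rating) != nominated
```

Under this sketch, a manager who rates budget importance 6 (above the 5.4 mean) but ranks items 5, 3 and 2 highest is coded non-accounting yet flagged as anomalous, matching the 40% subset described above.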
Contribution to profits (item 3) did not correlate significantly with the (A) or (B) measures (Hopwood and Otley). It figured prominently in the variables based on Brownell's two approaches, correlating negatively with (C) and positively with (D), as well as (E2), where it had been purposely put in the numerator of the ratio variable.
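For reference, the coefficients in Tables 6 and 8 are presumably product-moment correlations (an assumption; the paper does not state the correlation type). A minimal sketch:

```python
# Minimal product-moment (Pearson) correlation; assumed, not stated, to be
# the statistic reported in Tables 6 and 8.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # covariance term
    sxx = sum((a - mx) ** 2 for a in x)                   # variance of x
    syy = sum((b - my) ** 2 for b in y)                   # variance of y
    return sxy / (sxx * syy) ** 0.5
```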
Table 6
Correlations for variables and content
Item 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.
Content item 1. Effort put into job
2. Concern with quality 0.42**
3. Contribution to profits −0.03 0.08
4. Relationships with staff 0.23 0.18 0.24
5. Efficiency of unit 0.41** 0.29* 0.23 0.35**
6. Get on with superiors 0.16 −0.03 −0.08 0.03 0.09
7. How well meet budget −0.15 −0.06 0.35** 0.17 0.24 0.08
8. Customer service rating 0.05 0.40** 0.21 0.36** 0.12 −0.23 0.25*
9. Attitude to work 0.32** 0.37** 0.10 0.31* 0.14 0.04 −0.03 0.28*
10. Team development 0.25* 0.41** 0.28 0.56** 0.24 −0.28* 0.06 0.58*** 0.41***
Calculation type
A Hopwood (1972, 1973) −0.18 −0.22 0.22 −0.01 −0.12 −0.08 0.44*** −0.01 −0.25* −0.02
B Otley (1978) −0.17 −0.17 0.23 −0.02 −0.15 −0.10 0.43*** 0.04 −0.24 −0.03
C Brownell (1982) 0.24 0.20 −0.31* −0.03 0.11 0.07 −0.42*** 0.01 0.32* 0.08
D Brownell (1985) 0.11 0.09 0.34** 0.24 0.70*** 0.07 0.84*** 0.15 0.01 0.14
E1 Harrison (1992) −0.32** −0.41** −0.07 −0.35** 0.28* 0.03 0.58*** −0.38** −0.50*** −0.50***
E2 Modified Harrison −0.43** −0.49*** 0.41** −0.33** 0.15 −0.06 0.50*** −0.38** −0.52*** −0.45***
Table 7 provides the summary descriptive statistics for the six variable calculations. Standardizing the means to zero and the ranges to 1 disclosed that variable (E2) had a very small standard deviation, and variables (A) and (B) had the largest. All combinations of the six evaluative style variables were significantly correlated (Table 8). Those representing types (A), (B) and (C) were highly correlated. Although not as strong, variables (E1) and (E2) also had a high correlation, suggesting there was little effect in the Harrison-based (1992) approach from moving item 3 (profits) from the denominator in variable (E1) to the numerator in variable (E2). The remainder of the combinations had only weak correlations.
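The standardization just described can be sketched as follows; this is an illustrative reconstruction (the paper reports only the resulting means and standard deviations), with the scaling inferred from the stated procedure.

```python
# Sketch of the standardization reported in Table 7: each variable is centred
# on a mean of zero and scaled so that its possible range equals 1, making
# the standard deviations comparable across calculation types.
def standardize(scores, range_min, range_max):
    width = range_max - range_min          # possible range of the variable
    mean = sum(scores) / len(scores)
    return [(x - mean) / width for x in scores]
```

For a Hopwood-type variable on a 1 to 4 scale, the width is 3, so a raw standard deviation of 1.11 becomes 0.37, as in Table 7.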
A comparison of the correlations of variable (E1) with variables (A), (B) and (C) to those of (E2) with (A), (B) and (C) shows some increase in the correlations (0.07, 0.07, and 0.14, respectively) for the modified Harrison (E2) calculation. This suggests some limited sensitivity in calculation types (A), (B), and (C), the Hopwood, Otley and Brownell (1982) variables, to the presence of item 3. That is, attention to profit is linked to attention to budget, as mentioned in the discussion of Table 6, above. Conversely, the correlation with calculation type (D), the Brownell (1985) variable, showed a decrease (0.14) between (E1) and (E2), possibly because the elements of the (E1) numerator (budget + cost) were identical to those of (D), and the introduction of any new item, no matter how closely related, would skew this relationship.
Because performance evaluative style has been modeled as a predictor of performance-related variables, it was important to explore the relationship of each style measure with possible performance outcomes. We therefore tested each of the six variable types against the set of performance measures described above. One-way ANOVA models were used to test the Hopwood, Otley, and Brownell (1982) variables, and linear regressions were used to test the Brownell (1985), Harrison, and modified Harrison constructs. The results are shown in Table 9.
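The two test families can be sketched as follows. This is an illustrative reconstruction, not the analysis code: it computes only the one-way ANOVA F statistic (for the categorical style variables) and a simple regression slope (for the ratings-based ones), omitting the degrees-of-freedom bookkeeping needed for the p-values in Table 9.

```python
# Illustrative one-way ANOVA F statistic: ratio of between-group to
# within-group mean squares across style categories.
def anova_f(groups):
    all_vals = [x for g in groups for x in g]
    n, k = len(all_vals), len(groups)
    grand = sum(all_vals) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Illustrative OLS slope of a performance measure on a continuous style score.
def ols_slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var
```

A large F indicates that mean performance differs across style categories; the sign of the slope gives the direction of the relationships reported in Table 9.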
The table reports the results of 36 tests of association. For each significant association, the direction of the relationship is also shown. The resulting pattern suggests that the Hopwood and Otley variables are measuring essentially the same relationships, and that these are not the same
Table 8
Correlation matrix for calculation types
Calculation type Source A B C D E1
A Hopwood (1972, 1973)
B Otley (1978) 0.98***
C Brownell (1982) 0.88*** 0.89***
D Brownell (1985) 0.28* 0.25* 0.25*
E1 Harrison (1992) 0.30* 0.26* 0.29* 0.62***
E2 Modified Harrison 0.37** 0.33** 0.43*** 0.48*** 0.82***
*p<0.05; **p<0.01; ***p<0.001.
Table 7
Summary statistics for calculation types
Type Source Mean S.D. Possible range Actual min Actual max Standardized(a) mean Standardized(a) S.D.
A Hopwood (1972, 1973) 2.52 1.11 1–4 1 4 0 0.37
B Otley (1978) 3.21 1.51 1–5 1 5 0 0.38
C Brownell (1982)* 0 or 1 0 1
D Brownell (1985) 11.2 2.01 1–15 5 15 0 0.14
E1 Harrison (1992) 1.1 0.24 0.25–1.75 0.4 1.8 0 0.16
E2 Modified Harrison 0.48 0.11 0.06–3 0.2 0.8 0 0.04
n=66.
a Standardized to a range of 1.
relationships measured by the binary (Brownell, 1982) or the ratings-based measures. In the case of the binary measure, the difference may be explained by the high degree of aggregation in the measure. In fact, the only performance item which this measure predicted was similarly highly aggregated (the weighted strategic performance self-reported index). In the case of the Hopwood and Otley variables, which were significantly associated with the two objective measures (abnormal and actual returns), the ANOVAs revealed a U-shaped association with performance. That is, both budget-constrained and non-accounting styles were associated with higher long-term performance, while the more balanced styles were not.
Although the table shows inconsistencies among the six style variables in their predictive ability across the performance indices, nonetheless the significant relationships identified, in particular for the ratings-based measures, are consistent in their direction across variable type. They also support Otley's conclusion that ``style of budget use should be matched to circumstances'' (1978, p. 146). That is, in the case of poor performance, a high budget emphasis was appropriate. The negative associations of the ratings variables with five-year abnormal returns, sales growth, and consistency suggest that this may have been what was happening. By contrast, the positive association with budget performance (and negative association with abnormal returns) supports Hopwood's argument that the use of a budget-constrained style produces not real long-term performance, but rather performance to budgets, and possibly manipulation of budgets in order to ensure this.
5. Discussion and conclusions
The earliest contributors to the performance evaluative style literature, Hopwood (1972, 1973) and Otley (1978), both conducted their studies in single companies and used an inductive research design to develop the content for the scoring of their evaluative style variables. Subsequent studies have tended to use a deductive assessment of the suitability of the evaluative style criteria. While
Table 9
Relationship of performance evaluative style variables with performance measures
Columns (in order): Hopwood, Otley, Brownell (1982) (one-way ANOVA); Brownell (1985), Harrison, Modified Harrison (regression)

Abnormal returns (mean 8.23, S.D. 51.60, range −75 to 43): F 3.15, 2.48, 0.01, 4.71, 2.64, 5.17; p 0.03, 0.04, 0.75, 0.03, 0.11, 0.03; Relationship: U (Hopwood), U (Otley), Neg (Brownell 1985), Neg (Modified Harrison)
Actual returns (mean 36.08, S.D. 55.29, range −34 to 237): F 4.90, 3.68, 1.19, 1.81, 1.10, 2.94; p 0.003, 0.008, 0.28, 0.18, 0.30, 0.09; Relationship: U (Hopwood), U (Otley)
Weighted strategic (mean 0.55, S.D. 0.15, range 0.25 to 1.28): F 1.74, 1.58, 4.27, 0.02, 0.81, 0.33; p 0.16, 0.19, 0.04, 0.89, 0.37, 0.57; Relationship: Neg (Brownell 1982)
Budget performance (mean 5.13, S.D. 1.43, range 1 to 7): F 1.02, 0.77, 2.02, 5.32, 6.37, 4.73; p 0.39, 0.54, 0.16, 0.02, 0.01, 0.03; Relationship: Pos (Brownell 1985), Pos (Harrison), Pos (Modified Harrison)
Sales growth (mean 4.46, S.D. 1.43, range 1 to 7): F 0.76, 0.90, 0.31, 0.25, 5.22, 4.59; p 0.52, 0.47, 0.58, 0.62, 0.03, 0.04; Relationship: Neg (Harrison), Neg (Modified Harrison)
Consistency (mean 3.50, S.D. 1.70): F 0.60, 0.48, 0.13, 5.02, 10.37, 3.81; p 0.62, 0.75, 0.72, 0.03, 0.002, 0.06
some studies provide evidence which suggests high content validity can be assumed, others do not. Because ex post the validity of content can only be speculated upon, researchers should report their method of validating the ability of the variable content to reflect the environment being classified. Calculation methods used in the evaluative style literature were here classified into five types. All of the variable types were significantly correlated with the rating of budget importance (item 7). The negative correlation between item 7 and variable type (C) appeared to be the result of the scoring anomaly mentioned above; this was at least in part due to managers' differing approaches to their operational and budgetary responsibilities. This raised the question whether two different types of performance focus existed: one based on financial outcomes and the other on operational improvement (Vagneur, 1995).
All of the variable types were significantly intercorrelated. However, types (A), (B) and (C), all based on nomination to a ranking list, had high between-type correlation, compared to the other types (D, E1 and E2). These between-type correlation results (Table 8) must be viewed as a set of associations among variables all attempting to measure (at least in part) the same thing, namely budget-constrained performance evaluative style. The fact that these measures were all significantly correlated is to be expected; the more interesting issue is how and where, and thus why, they diverge from one another. The more the variables were modified for subsequent research, the further they appear to have moved from capturing the same dimension that Hopwood originally captured, and therefore the lower their correlations in the A and B columns. The divergence of the associations with performance, first between the Brownell (1982) variable and the Hopwood and Otley variables, and subsequently between the ratings variables and the nomination measures generally, supports this conclusion.
The lower convergence for the ratings-based methods may reflect an increased complexity relative to the nominations-based methods. Arguably, however, such complexity may be capturing more of the underlying essence of performance evaluative style. The larger number of significant associations between these variables and the performance measures suggests that this is in fact the case. If it is, then the divergence of the more recent research may be necessary and appropriate.
5.1. Implications for future research
The impact of the differences in measurement methods and subtly different conceptualizations in the contributions to this literature needs to be carefully assessed. Reconsidering the evaluative style evidence would be a useful step in this process. For example, Table 1 presents two pairs of main effects studies with highly comparable calculation types and the same level of analysis, but conflicting results (Hopwood, 1972, 1973 with Otley, 1978; Kenis, 1979 with Gupta, 1987). In other words, those pairs of main effects studies have variable types that should be highly comparable, yet the results conflict. The interaction studies listed in Table 1 provide no pairs of highly comparable calculation types, yet two pairs with the same level of analysis report consistent results (Brownell, 1982 with Brownell & Dunk, 1991; Govindarajan, 1988 with Gupta, 1987). Reconciliation of such apparent paradoxes will require an analysis of the subtle differences in the conceptual bases of the different approaches. These include differences in conceptualization of constructs, method, and research strategy and design, in particular the idea of performance evaluative style as a continuum rather than as a set of discrete categories.

Most Anglo-American companies use budget-centered performance measurement as a central feature of their management control systems. Since budgets are explicitly intended to influence individual decision making, evidence linking budgets and negative behavior or performance suggests that, in practice, the management of management controls may be important. Mainstream management theorists rarely consider control systems, yet variation in evaluative style may be an indicator of system factors which are weak or have failed, rather than a causal effect itself (Otley, 1978; see Vagneur, 1996). This is a potential research area that should be explored, as should the nature of the systemic effects created by budgets and other formal and informal management control processes (e.g. reward, planning, training and information systems). This would require synthesis of two levels of analysis (individual and system), as well as consideration of psychology, organizational behavior, behavioral accounting and systems theory research. This area presents an exciting and significant opportunity to shape the next stage in the development of the stream of research that has followed Hopwood (1972, 1973) and Otley (1978). This may be especially apposite and timely, as there is ``a tendency for management control work springing from very different philosophical foundations to complement one another'' (Vagneur, 1996; Vagneur et al., 1996, p. ii). This is perhaps unique to the management control research area. Therefore, the dialogue now underway among management control researchers provides a useful starting place for such an effort.
Acknowledgements
Special thanks to Tony Berry, Jonathan Levie, David Otley, Alexander Roberts and three anon-ymous reviewers for their thoughtful comments on earlier drafts of this paper.
References
Anthony, R. N. (1965). Planning and control systems: a framework for analysis. Boston, MA: Harvard University Press.
Argyris, C. (1952). The impact of budgets on people. The Controllership Foundation. Cornell University, School of Business and Public Administration.
Briers, M., & Hirst, M. (1990). The role of budgetary information in performance evaluation. Accounting, Organizations and Society, 15(4), 373–398.
Brownell, P. (1982). The role of accounting data in performance evaluation, budgetary participation and organizational effectiveness. Journal of Accounting Research, Spring, 12–27.
Brownell, P. (1985). Budgetary systems and the control of functionally differentiated organizational activities. Journal of Accounting Research, 23(2), 502–512.
Brownell, P. (1987, May). The role of accounting information, environment and management control in multi-national organizations. Accounting and Finance, 1–16.
Brownell, P., & Dunk, A. S. (1991). Task uncertainty and its interaction with budgetary participation and budget emphasis: some methodological issues and empirical investigation. Accounting, Organizations and Society, 16(8), 693–703.
Brownell, P., & Hirst, M. (1986). Reliance on accounting information, budgetary participation, and task uncertainty: tests of a three-way interaction. Journal of Accounting Research, 241–249.
Chandler, A. D. (1962). Strategy and structure. Garden City, NY: Doubleday.
Dunk, A. S. (1992). Reliance on budgetary control, manufacturing process automation and production subunit performance: a research note. Accounting, Organizations and Society, 17(3), 195–203.
Govindarajan, V. (1984). Appropriateness of accounting data in performance evaluation: an empirical evaluation of environmental uncertainty as an intervening variable. Accounting, Organizations and Society, 9(2), 125–135.
Govindarajan, V. (1988). A contingency approach to strategy implementation at the business-unit level: integrating administrative mechanisms with strategy. Academy of Management Journal, 31(4), 828–853.
Gupta, A. K. (1987). SBU strategies, corporate–SBU relations, and SBU effectiveness in strategy implementation. Academy of Management Journal, 30(3), 477–500.
Harrison, G. L. (1992). The cross-cultural generalizability of the relation between participation, budget emphasis and job-related attitudes. Accounting, Organizations and Society, 17(1), 1–15.
Harrison, G. L. (1993). Reliance on accounting performance measures in superior evaluative style: the influence of national culture and personality. Accounting, Organizations and Society, 18(4), 319–339.
Hirst, M. K. (1983). Reliance on accounting performance measures, task uncertainty, and dysfunctional behavior: some extensions. Journal of Accounting Research, Autumn, 596–605.
Hopwood, A. G. (1972). An empirical study of the role of accounting data in performance evaluation. Journal of Accounting Research, 10, 156–182.
Hopwood, A. G. (1973). An accounting system and managerial behaviour. London: Saxon House (originally an unpublished Ph.D. dissertation, University of Chicago, 1971).
Imoisili, O. A. (1989). The role of budget data in the evaluation of managerial performance. Accounting, Organizations and Society, 14(4), 325–335.
Kenis, I. (1979). Effects of budgetary goal characteristics on managerial attitudes and performance. The Accounting Review, 54(4), 707–721.
Machin, J. L. J., & Tsai, C. H. S. (1983). A communication-based methodology for research into and development of management control systems. In T. Lowe & J. L. J. Machin (Eds.), New perspectives in management control (pp. 193–226). London: Macmillan.
Otley, D. T. (1978). Budget use and managerial performance. Journal of Accounting Research, Spring, 122–149.
Otley, D. T. (1980). The contingency theory of management accounting: achievement and prognosis. Accounting, Organizations and Society, 5(4), 413–428.
Otley, D. T., Berry, A. J., & Broadbent, J. (1996). Research in management control: an overview of its development. In K. Vagneur, C. Wilkinson, & A. J. Berry (Eds.), Beyond constraint: exploring the management control paradox (pp. 5–19). Sheffield, UK: The Management Control Association.
Simon, H., Guetzkow, H., Kozmetsky, G., & Tyndall, G. (1954). Centralization vs decentralization in organizing the controller's department. Controllership Foundation Paper.
Steers, R. M. (1975). Problems in the measurement of organizational effectiveness. Administrative Science Quarterly, 20, 546–558.
Taylor, F. W. (1911). The principles of scientific management. New York: Harper and Row.
Vagneur, K. (1994). The unintended effects of performance measurement: reassessing the importance to management theory. A London Business School working paper.
Vagneur, K. (1995). Financial performance measurement effects on hierarchical consistency and performance. An unpublished Ph.D. thesis, London Business School.
Vagneur, K. (1996). An exploration of the association between performance and non-financial measurement reliance. A London Business School working paper.
Vagneur, K., Wilkinson, C., & Berry, A. J. (Eds.). (1996). Beyond constraint: exploring the management control paradox. Sheffield, UK: The Management Control Association.
White, H. (1961). Management conflict and sociometric choice.