Journal of Economic Behavior and Organization, Vol. 43 (2000) 487–497

Full-scale real tests of consumer behavior using experimental data

Aurelio Mattei

DEEP, University of Lausanne, Ecole des HEC, CH-1015 Lausanne, Switzerland

Received 16 September 1998; accepted 22 September 1999

Abstract

This paper reports on the results of three experiments based on the neoclassical theory of consumer behavior. All these experiments were carried out with the aim of simulating real-world behavior. Nonparametric tests, applying the theory of revealed preference, show that a significant number of individuals exhibit inconsistent behavior. If nearly optimizing behavior is postulated, then most of these inconsistencies disappear, but random choices produce almost the same results. The hypothesis of a representative consumer (statistical average behavior) seems a more fruitful approach. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Consumer theory; Experimental economics; Revealed preference; Nonparametric tests

JEL classification: C91; D12

1. Introduction

Controlled experiments of individual decision-making under certainty are rare. This is very surprising given the central place of the neoclassical model of consumer behavior in economics.

The first empirical tests of consumer behavior were generally favorable to the theory. Later estimations with flexible functional forms provided evidence contrary to the homogeneity and symmetry conditions. Subsequently, Varian (1982) implemented a simple test of consumer behavior based on the theory of revealed preference. When this nonparametric test was applied to the data that had been used for the estimation of flexible functional forms, the evidence that these data could have been generated by a utility-maximizing consumer was much stronger. The negative results with flexible forms would seem to be due to the approximation.


Varian was aware that the conclusions of a test based on the theory of revealed preference can be misleading if budget sets do not intersect. Indeed, any model which satisfies the budget constraint is consistent with these data.

With controlled experiments, we can achieve a high power test and, at the same time, satisfy all the hypotheses of the theoretical model. Artificial conditions should be kept at a minimum in order to simulate as closely as possible real behavioral conditions.

An experiment with actual consumption of the goods chosen was recently carried out by Sippel (1997).1 Students of law or economics were paid DM 25 to participate in a laboratory experiment implying a choice among eight goods in 10 different budget situations. One of the 10 bundles of goods was drawn at random and consumed on the spot.

In a first experiment, 12 subjects had to fill out 10 order sheets and then spend 60 min to consume the goods. The choices of 11 subjects out of 12 did not satisfy the strong axiom of revealed preference. Only seven subjects (58.3%) satisfied the generalized axiom of revealed preference.

In a second experiment with 30 students, Sippel (1997) implemented a Slutsky-type compensation for the price changes. In this case, 22 subjects (73.3%) did not satisfy the strong and 19 (63.3%) the generalized axiom of revealed preference.

While Sippel’s experiments are a great improvement on previous studies, they are not as close as desirable to the behavior of a regular consumer. In particular, all the goods had to be consumed in the laboratory within a limited period of time (1 h).

In this paper, we report the results of three experiments with a total of about 450 individuals. Real goods, real money and the same incentives as in real conditions were used.

The paper is set out as follows. Section 2 provides the theoretical framework of the nonparametric tests which are used to check if the data are consistent with utility maximization. In Section 3, the real power of these tests is appraised using a model of random choices. The experiments and their results are presented in Section 4. Finally, Section 5 offers some concluding remarks.

2. Theoretical framework

It can be proved that a set of data is consistent with the model of utility maximization if, and only if, it satisfies the generalized axiom of revealed preference (GARP, see Varian, 1982, p. 947). This axiom allows multivalued demand functions, whereas the more common strong axiom of revealed preference (SARP) requires single-valued demand functions.

When we use real data, we have to be careful in applying the test of revealed preference. Indeed, a difference of just one cent (of a franc or dollar) between the costs of two bundles is enough to reject the neoclassical model of consumer behavior. Moreover, a ‘nearly optimizing behavior’ might be more appropriate for an individual consumer. Therefore, it is worthwhile using a weaker test.

Let $q^i = (q_1^i, q_2^i, \ldots, q_m^i)$ denote the vector of the quantities bought in period $i$ and $p^i = (p_1^i, p_2^i, \ldots, p_m^i)$ the vector of the corresponding prices. We can say that $q^i$ is directly revealed preferred to $q^j$ if the cost of $q^i$ is significantly different from that of $q^j$. Formally, $q^i \, R^0(e) \, q^j$ if

$$e \, p^i \cdot q^i \ge p^i \cdot q^j, \qquad 0 \le e \le 1,$$

where $e$ is called the Afriat efficiency index2 and $p^i \cdot q^j$ the inner product of the vectors $p^i$ and $q^j$. For example, if $e = 0.8$, then the cost of bundle $q^i$ must be 25 percent greater than that of bundle $q^j$ in order to say that $q^i$ is directly preferred to $q^j$.

Table 1
The power of the test

| Afriat efficiency index | Proportion of inconsistent behavior |
|---|---|
| 1.00 | 0.994 |
| 0.99 | 0.952 |
| 0.98 | 0.886 |
| 0.97 | 0.777 |
| 0.96 | 0.594 |
| 0.95 | 0.432 |
| 0.90 | 0.029 |

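We can check if the data are consistent with ‘near-optimizing behavior’ by constructing a matrix $A$, whose element $a_{ij}$ is equal to 1 if $e \, p^i \cdot q^i \ge p^i \cdot q^j$ and 0 otherwise. The transitive closure of $R^0(e)$ is computed using the Warshall algorithm, which yields a matrix $B$ whose element $b_{ij}$ is equal to 1 if $q^i$ is directly or indirectly revealed preferred to $q^j$. A violation of this weakened GARP (or GARP$_e$) is obtained if $b_{ij} = 1$ and $e \, p^j \cdot q^j > p^j \cdot q^i$.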
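The test just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for the experiments: the function name `garp_violations` and the two-observation data set are hypothetical, and the matrices follow the definitions of $a_{ij}$ and of the transitive closure given above.

```python
def garp_violations(prices, quantities, e=1.0):
    """Return all pairs (i, j) that violate GARP at efficiency index e.

    prices[i] and quantities[i] are the price and quantity vectors of
    observation i; e is the Afriat efficiency index (0 <= e <= 1).
    """
    n = len(prices)

    def cost(i, j):
        # Inner product p^i . q^j: cost of bundle j at the prices of observation i.
        return sum(p * q for p, q in zip(prices[i], quantities[j]))

    # Direct relation R^0(e): a[i][j] is True if e * p^i.q^i >= p^i.q^j.
    a = [[e * cost(i, i) >= cost(i, j) for j in range(n)] for i in range(n)]

    # Warshall algorithm: b becomes the transitive closure of a.
    b = [row[:] for row in a]
    for k in range(n):
        for i in range(n):
            if b[i][k]:
                for j in range(n):
                    if b[k][j]:
                        b[i][j] = True

    # Violation: q^i is revealed preferred to q^j, although q^j is strictly
    # directly revealed preferred to q^i at efficiency level e.
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and b[i][j] and e * cost(j, j) > cost(j, i)]


# Hypothetical data: two goods, two budget situations.
PRICES = [(1, 2), (2, 1)]
QUANTITIES = [(1, 2), (2, 1)]

strict = garp_violations(PRICES, QUANTITIES, e=1.0)     # both pairs violate GARP
relaxed = garp_violations(PRICES, QUANTITIES, e=0.75)   # no violation remains
```

In this made-up data set each bundle costs 5 at its own prices and 4 at the other prices, so the choices are inconsistent for $e = 1$ but become consistent once a waste of 25 percent is tolerated.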

3. The power of the test

It is important to analyze the data before checking for consistency with preference maximization because, if there are no budget set intersections, it is impossible to find violations of GARP. Every chosen bundle of goods which exhausts the budget is consistent with GARP. The power of the test is zero.

Bronars (1987) calculates the approximate power of the test by using random consumption data which exhaust the budget set. The data are obtained with an algorithm which generates random budget shares. His second algorithm computes the budget shares as follows:

$$w_i = \frac{z_i}{\sum_j z_j}$$

where $z_i$ are independent and identically distributed (i.i.d.) uniform random variables. Budget shares are multiplied by total expenditure and divided by the prices of the corresponding goods to obtain random consumption data. The percentage of times that GARP is rejected indicates the power of the test. Table 1 shows the power of the test when prices and total expenditure are those of our second experiment (see below). These figures were obtained using 1000 random consumption data sets.
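The following Python sketch illustrates how such an approximate power could be simulated. It is a minimal example under stated assumptions: the prices and budgets are hypothetical (they are not those of the experiments), the function names are invented for the illustration, and the uniform draws are taken on the unit interval.

```python
import random


def random_bundles(prices, budgets, rng):
    """One random bundle per budget situation, exhausting each budget."""
    bundles = []
    for p, m in zip(prices, budgets):
        z = [rng.random() for _ in p]                  # i.i.d. uniform draws z_i
        total = sum(z)
        shares = [zi / total for zi in z]              # w_i = z_i / sum_j z_j
        bundles.append([w * m / pi for w, pi in zip(shares, p)])  # q_i = w_i * m / p_i
    return bundles


def violates_garp(prices, bundles, e=1.0):
    """True if the data violate GARP at Afriat efficiency index e."""
    n = len(prices)
    c = [[sum(p * q for p, q in zip(prices[i], bundles[j])) for j in range(n)]
         for i in range(n)]                            # c[i][j] = p^i . q^j
    b = [[e * c[i][i] >= c[i][j] for j in range(n)] for i in range(n)]
    for k in range(n):                                 # Warshall transitive closure
        for i in range(n):
            for j in range(n):
                b[i][j] = b[i][j] or (b[i][k] and b[k][j])
    return any(b[i][j] and e * c[j][j] > c[j][i]
               for i in range(n) for j in range(n) if i != j)


def approximate_power(prices, budgets, e=1.0, draws=1000, seed=0):
    """Share of random data sets for which GARP(e) is rejected."""
    rng = random.Random(seed)
    rejected = sum(violates_garp(prices, random_bundles(prices, budgets, rng), e)
                   for _ in range(draws))
    return rejected / draws


# Hypothetical prices and budgets (NOT those of the experiments).
PRICES = [(1.0, 2.0, 3.0), (2.0, 1.0, 3.0), (3.0, 2.0, 1.0), (2.0, 3.0, 1.0)]
BUDGETS = [40.0, 40.0, 40.0, 40.0]

power_strict = approximate_power(PRICES, BUDGETS, e=1.0)
power_relaxed = approximate_power(PRICES, BUDGETS, e=0.95)
```

Because both calls use the same seed, they evaluate the same random data sets, so the power for $e = 0.95$ cannot exceed the power for $e = 1$: any violation found with the weaker relation is also a violation of the strict test.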

The first line of Table 1 implies that GARP is almost never satisfied with random choices: the power of the test is very high. One can accept the model of utility maximization if the data satisfy this test. The hypothesis of a ‘near-optimizing behavior’ is much more difficult to test. If a waste of five percent of the income is considered as an acceptable error, then it becomes very difficult to discriminate between a random choice and a utility-maximizing behavior (56.8% of the random choices are considered as consistent behavior). A low value of this approximate power means that it is very unlikely to find violations of GARP.

4. The experiments

In a first experiment, we recruited 20 students attending a micro-economics class. The theory of revealed preference was one of the topics of the course. The students had to choose among eight goods in 20 different budget situations. We told them that we wanted to simulate the behavior of a consumer going to 20 different stores with a given amount of money to spend in each store. As an incentive to participate in the experiment and to behave as real consumers, we informed them that they would receive one of these bundles of goods.

The amount of money to spend in each store varied between SF 30 and 40. Therefore, the value of the chosen bundle of goods was about SF 35 (US$ 23).

We chose eight goods (milk chocolate, salted peanuts, biscuits, text marker, ball-point pen, plastic folder, writing pad and ‘post it’) that could interest the students and were not too difficult to hand out.

The instructions for the experiment were then distributed.3 To participate in the experiment, the students had to go to the computer center and use a program based on the EXCEL spreadsheet. To facilitate the computations, we prepared a program, written in Visual Basic, with 20 spreadsheets corresponding to the different stores to visit. The subjects had to fill out the quantities bought in each store. All the information was available on the screen, in particular the price of each good.

In the first store, the prices were those of a grocery store. The prices in the other stores were increased or decreased in order to have a maximum number of budget intersections.

The program checked the values selected and informed the participants if the budget constraint was satisfied or not and the amount left, if any. After the quantities in the 20 stores were chosen, the program recorded in a file all the choices of the subject and his name.

Table 2 shows the results of the test. With 25 percent of inconsistent subjects, we obtain better results than in previous experiments. Taking a course in micro-economics may improve the behavior of the subjects. Nevertheless, it would be wrong to assume, without reservations, that the neoclassical model describes the behavior of every consumer. One possible solution to this problem is to allow for some kind of error on the part of the consumer.

Table 2
Experiment I

| Afriat efficiency index | Number of inconsistent subjects | Percentage |
|---|---|---|
| 1.00 | 5 | 25 |
| 0.99 | 2 | 10 |
| 0.98 | 1 | 5 |
| 0.95 | 0 | 0 |

Indeed, with an efficiency index of 0.95 all the subjects are consistent, but the power of the test is just 0.432. Hence, it is premature to draw a conclusion for these five subjects.

Without transaction costs, the best behavior is to buy in each store only the relatively cheapest goods and then exchange or sell them to get the desired quantities. We did not observe such a behavior, but after the bundles of goods were distributed, some participants exchanged their goods.

The subjects of our second experiment were business students. We wanted to find out if our results were robust by allowing a large number of participants. The instructions for the experiment were distributed to about 500 students and 100 subjects participated in the second experiment. As in the first experiment, they had to go to the computer center and use a program based on the EXCEL spreadsheet. Half of the goods used in the first experiment and the prices were changed.4 The amount of money to spend in each store varied between SF 42 and 56. Therefore, the value of the chosen bundle of goods was about SF 48 (US$ 32).

In spite of the computer warning signal, four subjects exceeded the budget constraint by more than one percent. On the other hand, 47 subjects spent less than 99 percent, but more than 95 percent of the amount available in at least one store.5 The expected value of the residual amount was just 2.4 cents. Hence, the incentive to use the available amount to the last penny is not very high. Indeed, some students did not bother to spend much time for such small amounts.

We would have encountered the same problem even if all the goods chosen had been distributed. There is always a last penny to be spent. Moreover, we do not know all the motivations of the participants. Several experiments with the ultimatum bargaining game show that an increase in the financial stakes does not alter the conclusions (see Slonim and Roth, 1998). Some participants complained that the experiment was too long. We found that only 16 subjects spent less than 99 percent of the amount available in the first store. Thus, the motivation to use all the available amount decreased in the course of the experiment.

If satiety is excluded, the residual amount should not exceed the price of the cheapest good. This is the case for half of the participants. However, for most of the goods fractional quantities were allowed in order to arrive as close as possible to the available amount. The residual amounts of 14 students satisfied this condition.

4 The goods were: milk chocolate, biscuits, orange juice, iced tea, writing pads, plastic folders, diskettes, ‘post it’.

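Table 3
Experiment II

| Afriat efficiency index | Number of inconsistent subjects | Percentage |
|---|---|---|
| 1.00 | 44 | 44 |
| 0.99 | 30 | 30 |
| 0.98 | 16 | 16 |
| 0.95 | 4 | 4 |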

We think that it would be wrong to enforce strict observance of the budget constraint. The consumers have only a rough idea of budget constraint. They often pay their purchases with a credit card. Normally, they do not spend all the money they have in their purse. Moreover, it is very easy to give an example of inconsistent behavior as a consequence of strict observance of the budget constraint.

A tourist buys a bottle of Coca-Cola and two sandwiches when their prices are, respectively, SF 3 and 7. He then arrives at the airport with SF 14 in coins. He wants to spend the whole amount because the coins are worthless outside the country. The prices of the two goods are SF 5 and 4, respectively. In order to spend SF 14, he has to buy two bottles of Coca-Cola and a sandwich. This implies inconsistent behavior. If we require that all the money must be spent, we force the consumer to behave inconsistently. Consistent behavior would be to continue buying a bottle of Coca-Cola and two sandwiches even if all the coins are not used.
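The arithmetic behind this example can be checked with a short Python sketch. The prices and quantities are those of the example; the variable and function names are only illustrative.

```python
# Prices and bundles are ordered as (Coca-Cola, sandwiches).
p_first, q_first = (3, 7), (1, 2)        # first purchase
p_airport, q_airport = (5, 4), (2, 1)    # purchase that uses up the SF 14


def cost(p, q):
    """Cost of bundle q at prices p (inner product)."""
    return sum(pi * qi for pi, qi in zip(p, q))


# At the first prices the airport bundle was cheaper than the chosen bundle,
# so the first bundle is revealed preferred to the airport bundle.
first_over_airport = cost(p_first, q_first) > cost(p_first, q_airport)

# At the airport prices the first bundle was cheaper than the chosen bundle,
# so the airport bundle is revealed preferred to the first bundle.
airport_over_first = cost(p_airport, q_airport) > cost(p_airport, q_first)

# Each bundle is strictly revealed preferred to the other: a violation.
inconsistent = first_over_airport and airport_over_first
```

The first bundle costs SF 17 at the first prices, where the airport bundle would have cost SF 13; at the airport the chosen bundle costs SF 14, while the first bundle would have cost only SF 13.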

Hence, we think that the best solution is to consider the amount spent in each store as the effective budget constraint of the participant. Indeed, inconsistent behavior is more frequent among the subjects that strive to seek the combination of goods which satisfies the budget constraint (57% of inconsistent subjects among the 14 participants with strict observance of budget constraint).6 A similar result will be reported below.

The changes in the power of the test, due to the use of the amount spent as the new budget constraint, were insignificant. The average power was 0.989 for e=1 (instead of 0.994) and 0.431 for e=0.95 (instead of 0.432).

Table 3 presents the results. There were 44 inconsistent subjects, and even a tolerance of five percent is not enough to reconcile all of them with ‘near-optimizing behavior’.7

If we take the first 10 stores, we obtain 26 inconsistent subjects, but the average power of the test is only 0.770 instead of 0.989.

6 A statistical test of the difference between the two proportions gives a p-value of 0.36.

In the second experiment, we also wanted to test if the individual demands were homogeneous of degree zero. The prices of store 17 and the amount of money available there were 25 percent higher than those of store 12. The test requires that the consumer spends exactly all the available amount of money since both prices and income must be multiplied by the same factor. Only in four of the 39 cases, where this condition was fulfilled, was the hypothesis accepted.8 The average angle between the consumption vectors in stores 12 and 17 was 20°. We have here a striking example where a hypothesis may be valid for a representative consumer, but is clearly not verified for the individual consumer.
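As an illustration of the angle used above, the following Python sketch computes the angle between two consumption vectors. The bundles are hypothetical, and the function name is an assumption made for this example. Since a proportional change of all prices and of income leaves the budget set unchanged, a consumer whose demand is homogeneous of degree zero would choose the same bundle in both stores, which gives an angle of zero.

```python
import math


def angle_degrees(q_a, q_b):
    """Angle, in degrees, between two consumption vectors."""
    dot = sum(x * y for x, y in zip(q_a, q_b))
    norm_a = math.sqrt(sum(x * x for x in q_a))
    norm_b = math.sqrt(sum(y * y for y in q_b))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


# Hypothetical bundles chosen by one subject in stores 12 and 17.
q_store_12 = (2.0, 1.0, 3.0)
q_store_17 = (1.0, 2.0, 3.0)

angle = angle_degrees(q_store_12, q_store_17)
```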

In order to facilitate the awarding and, at the same time, to test the superiority of money over goods, we let the participants choose between SF 50 and the bundle of goods they selected in one of the stores drawn at random. In a letter we asked the participants if they wanted to buy themselves the bundle of goods. As expected, all the subjects preferred to do so and receive SF 50 even if in 16 cases the goods were worth more than SF 55. This behavior does not mean that the subjects did not maximize their utility, because the eight goods are only a very small part of all available goods. This result is very robust and confirms a classical statement found in every textbook: to give money is always better than giving goods as in a food stamp program.

Moreover, to give a choice between money and goods does not mean that we disappointed the subjects, because the choice set was enlarged. The participants were interested in the results of our experiment. Of course, we were fully aware that we could not use the subjects for a new experiment.

The great majority of the subjects used in experimental economics are students just because it is easier to recruit them. However, the behavior of a student may not correspond to that of the vast majority of consumers.9 To analyze the behavior of experienced consumers, we put an announcement in a magazine for consumer affairs (circulation of 40 000 copies). There were 434 consumers interested in our experiment. We sent the instructions to these consumers. In this experiment, it was not possible to use our computer program. Hence, we prepared a questionnaire where the participants had to write down their choices. A table with all the prices and the amounts available was enclosed so that one could see at once all the relevant data as in a perfect information case. Moreover, in each store the list of goods whose prices had increased or decreased with respect to those of the previous store was also given. We changed some of the goods to take into account the tastes of these consumers.10 The prices and the available amount of money were those of our second experiment.

Without a computer, it is more difficult to satisfy the budget constraint. Since most of these participants were not familiar with computers, a questionnaire was the best instrument to use in this experiment.

The number of returned questionnaires was 320. As expected, the budget constraint was more often exceeded in this experiment. For four percent of the subjects, the excess was in some cases beyond 10 percent of the amount available. On the other hand, 12 percent of the participants left more than 10 percent of the amount available unspent in at least one store.11 The majority of the participants (62%) spent (in at least one store) less than 99 percent, but more than 90 percent of the amount available.

Only two subjects left more than 10 percent of the amount available unspent in the first store. As in the previous experiment, we found a decrease in the motivation to use all the available amount.

9 In particular, economics students might adapt their behavior to the axioms of the theory they study (see Carter and Irons, 1991).

10 See Appendix A for the list of the goods.

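Table 4
Experiment III

| Afriat efficiency index | Number of inconsistent subjects | Percentage |
|---|---|---|
| 1.00 | 101 | 32 |
| 0.99 | 66 | 21 |
| 0.98 | 50 | 16 |
| 0.95 | 7 | 2 |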

If satiety is excluded, the residual amount should not exceed the price of the cheapest good. This was the case for 90 participants. However, for most of the goods fractional quantities were allowed in order to arrive as closely as possible to the available amount. The residual amounts of 40 consumers satisfied this condition.

From the many comments received, we are convinced that the participants did their best to satisfy the budget constraint. Of course, we could have helped them with an interviewer and a computer program, but our aim was to simulate real behavior with the consumers. We did not want to interfere with their choices. Moreover, we wanted to check if there was a difference with respect to the second experiment.

For the same reasons as those given for the second experiment, we take the amount spent in each store as the effective budget constraint of the participants. Changes in the power of the test, due to the use of the amount spent as the new budget constraint, were insignificant. The average power was 0.989 for e=1 (instead of 0.994) and 0.428 for e=0.95 (instead of 0.432).

Table 4 shows the results of the test. The inconsistent subjects were 101 (32%).12 This percentage lies between the values obtained in the first and in the second experiment.

If we take the first 10 stores, we obtain 67 inconsistent subjects, but the average power of the test is only 0.767 instead of 0.989.

The uniformity of the results of these and other experiments is striking. A significant number of subjects behave inconsistently. Therefore, we can conclude that it is wrong to assume that all individual consumers have a utility maximizing behavior.

With an efficiency index of 0.95, only 2% of the subjects show inconsistent behavior. However, as Table 1 shows, the power of the test is not very high in this case.

As in the previous experiment, the percentage of inconsistent subjects (56.3%) is greater among the 32 consumers with strict observance of budget constraint.

The validity of our results can also be tested with a sensitivity analysis. We perturbed the expenditure on each good using 1000 i.i.d. uniform random variables with a mean deviation of SF 0.50. Only 20.6 percent of the inconsistent subjects became consistent, whereas 90.1 percent of the consistent subjects remained consistent.

As in the second experiment, we were able to test if the individual demands were homogeneous of degree zero. The hypothesis was accepted in four of the 116 cases when all the available amount of money was spent.13 The average angle between the consumption vectors in stores 12 and 17 was 20°.

We chose the same awarding as in the second experiment. As expected, 96.6 percent of the subjects preferred to receive SF 50, even if for 32 percent of the participants the goods were worth more than SF 55. Ten of the remaining subjects eventually agreed to receive SF 50. The reason given for preferring the goods was that they just wanted to carry on the experiment to the end.

Most of the subjects were proud to participate in an experiment to study the behavior of consumers. They wanted to know the main results and their motivation was very high. Even if in the end they chose to receive the money, they did not feel at all disappointed. Some of the participants asked to be kept in the panel for further experiments.

If we apply a normative interpretation to the neoclassical model, we could say that the inconsistent consumers should be prepared to correct their errors. We did not test this hypothesis because we expected that the majority of these consumers would answer that they felt indifferent as to their choices and the corrected bundles of goods.

A more promising solution of this problem is the rehabilitation of the hypothesis of the representative consumer as a statistical average behavior.

The neoclassical model of consumer behavior was never intended to describe the behavior of ‘an actual person, the Mr. Brown or Mr. Jones who lives round the corner’ (Hicks, 1956, p. 55). The demand function of Mr. Brown may be very special or idiosyncratic and without any analytical interest.

The difference between economics and psychology is that economics is not interested in the behavior of an individual consumer per se, but only inasmuch as he represents the behavior of a significant group of individuals.

In this way, we can define the representative consumer as the hypothetical individual whose behavior corresponds to the average behavior of a group of consumers with the same income and prices.14 The heterogeneity of preferences should make the aggregate behavior more regular (see Grandmont, 1987) and the individual errors should be canceled out on average.

The data of our three experiments confirm this hypothesis. The behavior of all these three average consumers is consistent. This statistical regularity is very interesting given the well-known negative theoretical results15 (see Shafer and Sonnenschein, 1982).
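The idea of an average consumer can be illustrated with a small Python sketch. The data are entirely hypothetical (two subjects, two goods, two stores, a budget of 6 in each store) and the function names are assumptions made for this example; the sketch only shows how averaging the bundles chosen at the same prices and income can produce consistent behavior even when one of the subjects is inconsistent.

```python
def violates_garp(prices, bundles, e=1.0):
    """True if the data violate GARP at Afriat efficiency index e."""
    n = len(prices)
    c = [[sum(p * q for p, q in zip(prices[i], bundles[j])) for j in range(n)]
         for i in range(n)]                            # c[i][j] = p^i . q^j
    b = [[e * c[i][i] >= c[i][j] for j in range(n)] for i in range(n)]
    for k in range(n):                                 # Warshall transitive closure
        for i in range(n):
            for j in range(n):
                b[i][j] = b[i][j] or (b[i][k] and b[k][j])
    return any(b[i][j] and e * c[j][j] > c[j][i]
               for i in range(n) for j in range(n) if i != j)


def average_bundles(subjects):
    """Average, store by store, the bundles chosen by all subjects."""
    n_subjects = len(subjects)
    return [[sum(goods) / n_subjects for goods in zip(*store_choices)]
            for store_choices in zip(*subjects)]


# Hypothetical data: the same prices and the same budget (6) for both subjects.
PRICES = [(1.0, 2.0), (2.0, 1.0)]
subject_a = [(0.0, 3.0), (3.0, 0.0)]   # buys only the expensive good in each store
subject_b = [(6.0, 0.0), (0.0, 6.0)]   # buys only the cheap good in each store

representative = average_bundles([subject_a, subject_b])

a_inconsistent = violates_garp(PRICES, subject_a)                 # True
b_inconsistent = violates_garp(PRICES, subject_b)                 # False
average_inconsistent = violates_garp(PRICES, representative)      # False
```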

Appendix A shows the mean budget shares of the representative consumer, their standard deviations and the highest standard deviations for the individual participants. As expected, the variations of the budget shares of the representative consumer are more regular and less abrupt.

5. Concluding remarks

This paper shows that the behavior of a significant number of individual consumers is inconsistent with the neoclassical model, at least in its textbook or naïve form. The results of these and other experiments cast serious doubts on the careless use of this model. However, we often forget that the neoclassical model was never intended to describe the behavior of any individual consumer. The consumer of the theory is an ideal individual whose behavior is meant to correspond to that of a group of consumers with the same income and confronted with the same prices.

14 For more general definitions of the representative consumer, see, e.g. Jerison (1994).

The results of our three experiments confirm this hypothesis. The data for the average or representative consumer support the neoclassical model of consumer behavior.16

It would be interesting to test, given similar negative theoretical results (see Kirman, 1992), if the utility function of the representative consumer can be used as a measure of welfare for society at large.

Acknowledgements

Financial support from the Swiss National Foundation for Scientific Research is gratefully acknowledged. Laurent Cretegny and David Escher provided excellent research assistance. We would like to thank the editor and the anonymous referees for useful comments and suggestions that helped to improve the paper.

Appendix A

The budget shares are presented in Table 5.

Table 5
Budget shares (Experiment 3)

| Ser. No. | Commodity | Representative consumer: mean | Representative consumer: s.d.a | Individual consumers: highest s.d.a |
|---|---|---|---|---|
| 1 | Milk chocolate | 0.1555 | 0.0514 | 0.3656 |
| 2 | Biscuits ‘petit beurre’ | 0.1392 | 0.0513 | 0.3833 |
| 3 | Orange juice | 0.1835 | 0.0520 | 0.4132 |
| 4 | Iced tea | 0.1248 | 0.0511 | 0.3254 |
| 5 | ‘Post it’ | 0.0644 | 0.0501 | 0.2804 |
| 6 | Audiocassette C90 | 0.1452 | 0.0504 | 0.4690 |
| 7 | Ball-point pen | 0.0280 | 0.0500 | 0.3097 |
| 8 | Battery (R6, 1.5 V) | 0.1594 | 0.0510 | 0.4625 |

a Standard deviation.

References

Battalio et al., 1973. A test of consumer demand using observations of individual consumer purchases. Western Economic Journal 11, 411–428.

Carter, J.R., Irons, M.D., 1991. Are economists different, and if so, why? Journal of Economic Perspectives 5, 171–177.

Grandmont, J.-M., 1987. Distributions of preferences and the law of demand. Econometrica 55, 155–161.

Hicks, J.R., 1956. A Revision of Demand Theory. Clarendon Press, Oxford.

Jerison, M., 1994. Optimal income distribution rules and representative consumers. Review of Economic Studies 61, 739–771.

Knetsch, J.L., 1992. Preferences and nonreversibility of indifference curves. Journal of Economic Behavior and Organization 17, 131–139.

Kirman, A.P., 1992. Whom or what does the representative individual represent? Journal of Economic Perspectives 6, 117–136.

MacCrimmon, K.R., Toda, M., 1969. The experimental determination of indifference curves. Review of Economic Studies 36, 433–451.

Mattei, A., 1999. Economie expérimentale et modèle intertemporel du consommateur. Swiss Journal of Economics and Statistics 135, 591–605.

May, K.O., 1954. Intransitivity, utility, and the aggregation of preference patterns. Econometrica 22, 1–13.

Shafer, W., Sonnenschein, H., 1982. Market demand and excess demand functions. In: Arrow, K.J., Intriligator, M.D. (Eds.), Handbook of Mathematical Economics, Vol. II. North-Holland, Amsterdam, pp. 671–693.

Sippel, R., 1997. An experiment on the pure theory of consumer’s behaviour. Economic Journal 107, 1431–1444.

Slonim, R., Roth, A., 1998. Learning in high stakes ultimatum games: an experiment in the Slovak Republic. Econometrica 66, 569–596.