Random Utility Theory
3.3 Some Random Utility Models
3.3.1 The Multinomial Logit Model
[…] parameter $\theta$.⁶ The marginal probability distribution function of each random residual is given by:
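$$F_{\varepsilon_j}(x) = \Pr[\varepsilon_j \le x] = \exp\left[-\exp(-x/\theta - \Phi)\right] \tag{3.3.1}$$
where $\Phi$ is Euler's constant ($\Phi \approx 0.577$). In particular, the mean and variance of the Gumbel variable expressed by (3.3.1) are, respectively,
$$E[\varepsilon_j] = 0 \quad \forall j, \qquad \operatorname{Var}[\varepsilon_j] = \sigma_\varepsilon^2 = \frac{\pi^2\theta^2}{6} \quad \forall j \tag{3.3.2}$$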
Further characteristics of the Gumbel random variable are given in Appendix 3.B.
The independence of the random residuals implies that the covariance between any pair of residuals is zero:
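$$\operatorname{Cov}[\varepsilon_j, \varepsilon_h] = 0 \quad \forall j, h \in I,\ j \ne h \tag{3.3.3}$$
From this it can be deduced that alternative $j$'s perceived utility $U_j$, which is the sum of its systematic utility $V_j$ (a constant) and the random residual $\varepsilon_j$, is also a Gumbel random variable with probability distribution function, mean and variance given by:
$$\begin{aligned}
F_{U_j}(x) &= \Pr[U_j \le x] = \Pr[\varepsilon_j \le x - V_j] = \exp\left[-\exp\left(-(x - V_j)/\theta - \Phi\right)\right] \\
E[U_j] &= V_j, \qquad \operatorname{Var}[U_j] = \frac{\pi^2\theta^2}{6}
\end{aligned} \tag{3.3.4}$$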
Based on the above assumptions about the residuals $\varepsilon_j$ (and therefore about the perceived utilities $U_j$), the variance–covariance matrix of the residuals $\Sigma_\varepsilon$ is a scalar multiple (by $\sigma_\varepsilon^2$) of an identity matrix having the same number of rows and columns as the number of alternatives. Figure 3.2 shows, for a multinomial logit model involving four choice alternatives, a graphic representation of the assumptions made regarding the distribution of the random residuals and their variance–covariance matrix. This representation, known as a choice tree, should be compared to those of the hierarchical logit models described in the following sections.
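The distributional assumptions in (3.3.1) and (3.3.2) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the value of `theta`, the sample size, and the helper name `sample_gumbel_residuals` are illustrative choices, not part of the text.

```python
import numpy as np

def sample_gumbel_residuals(theta, size, rng):
    """Draw residuals whose CDF is exp(-exp(-x/theta - Phi)), as in (3.3.1)."""
    # NumPy's Gumbel CDF is exp(-exp(-(x - loc)/scale)). Matching terms with
    # (3.3.1) gives scale = theta and loc = -theta * Phi, where Phi is
    # Euler's constant (np.euler_gamma). This choice makes the mean zero.
    return rng.gumbel(loc=-theta * np.euler_gamma, scale=theta, size=size)

theta = 2.0                       # illustrative value (assumption)
rng = np.random.default_rng(0)    # fixed seed for reproducibility
eps = sample_gumbel_residuals(theta, size=1_000_000, rng=rng)

sample_mean = eps.mean()
sample_var = eps.var()
theoretical_var = np.pi**2 * theta**2 / 6   # right-hand side of (3.3.2)

print(f"sample mean     = {sample_mean:.4f} (theory: 0)")
print(f"sample variance = {sample_var:.4f} (theory: {theoretical_var:.4f})")
```

With a large sample, the printed mean should be close to zero and the variance close to $\pi^2\theta^2/6$, up to sampling error.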
The Gumbel variable has an important property known as stability with respect to maximization: the maximum of a set of independent Gumbel variables, all with scale parameter $\theta$, is also a Gumbel variable with parameter $\theta$. More specifically, if $\{U_j\}$ is a set of independent Gumbel variables having equal parameter $\theta$ but different means $V_j$, the variable $U_M$:
$$U_M = \max_j \{U_j\}$$
is again a Gumbel variable with parameter $\theta$ and with mean $V_M$ given by
$$V_M = E[U_M] = \theta \ln \sum_j \exp(V_j/\theta) \tag{3.3.5}$$
⁶ Some texts define the Gumbel distribution scale parameter to be the reciprocal of $\theta$; that is, $\alpha = 1/\theta$. In this text, the $\theta$ notation is normally used because of its analytical convenience in the specification of hierarchical logit models. Clearly, it is possible to express all results in terms of the parameter $\alpha$ with a simple variable substitution.
[Figure 3.2: choice tree with leaves A, B, C, D, shown together with the variance–covariance matrix below; rows and columns are ordered A, B, C, D]
$$\Sigma_\varepsilon = \sigma_\varepsilon^2 I = \frac{\pi^2\theta^2}{6}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}$$
Fig. 3.2 Choice tree and variance–covariance matrix of a multinomial logit model
The variable $V_M$ is called the Expected Maximum Perceived Utility (EMPU)⁷ or the inclusive utility. The variable $Y$
$$Y = \ln \sum_j \exp(V_j/\theta)$$
which is proportional to it, is called the logsum because of its analytical form.
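The stability property and expression (3.3.5) can be illustrated by simulation. The sketch below assumes NumPy; the utilities in `V`, the value of `theta`, and the function name `empu` are hypothetical and chosen only for illustration. It compares the simulated mean and variance of the maximum perceived utility with $\theta \ln \sum_j \exp(V_j/\theta)$ and $\pi^2\theta^2/6$.

```python
import numpy as np

def empu(V, theta):
    """Expected Maximum Perceived Utility, theta * ln(sum_j exp(V_j/theta))."""
    z = np.asarray(V, dtype=float) / theta
    z_max = z.max()   # shift by the maximum to avoid overflow in exp
    return theta * (z_max + np.log(np.exp(z - z_max).sum()))

V = np.array([-5.6, -4.6, -5.0])   # hypothetical systematic utilities
theta = 1.5                        # hypothetical Gumbel parameter
rng = np.random.default_rng(1)
n = 500_000

# Independent zero-mean Gumbel residuals with parameter theta, one per alternative.
eps = rng.gumbel(loc=-theta * np.euler_gamma, scale=theta, size=(n, V.size))
U_max = (V + eps).max(axis=1)      # U_M = max_j U_j for each simulated decision-maker

simulated_mean = U_max.mean()
simulated_var = U_max.var()
closed_form = empu(V, theta)
gumbel_var = np.pi**2 * theta**2 / 6

print(f"E[U_M]:   simulated {simulated_mean:.3f}, closed form {closed_form:.3f}")
print(f"Var[U_M]: simulated {simulated_var:.3f}, Gumbel value {gumbel_var:.3f}")
```

The variance of the maximum matching $\pi^2\theta^2/6$ is consistent with $U_M$ keeping the same parameter $\theta$ as the individual utilities.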
Because of the property of stability with respect to maximization, the assumption of Gumbel-distributed residuals is particularly convenient in random utility models.
In fact, under the assumptions made here, the probability of choosing alternative $j$ from among those available $(1, 2, \ldots, m) \in I$, given by (3.2.4), can be expressed⁸ in closed form as
$$p[j] = \frac{\exp(V_j/\theta)}{\sum_{i=1}^{m} \exp(V_i/\theta)} \tag{3.3.6}$$
Expression (3.3.6) defines the multinomial logit (MNL) model, which is the simplest and one of the most widely used random utility models. Under the common assumption that the parameter $\theta$ is independent of the systematic utility, the MNL model is invariant (see Sect. 3.4) and has a number of important properties, which are described in the following.
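Expression (3.3.6) translates directly into code. The following is a minimal sketch, assuming NumPy; the function name `mnl_probabilities` and the numerical values are illustrative assumptions. It also compares the closed form with the frequency at which each alternative attains the maximum perceived utility in a simulation with independent Gumbel residuals.

```python
import numpy as np

def mnl_probabilities(V, theta):
    """Multinomial logit choice probabilities, p[j] = exp(V_j/theta) / sum_i exp(V_i/theta)."""
    z = np.asarray(V, dtype=float) / theta
    z = z - z.max()    # a common shift cancels in the ratio and avoids overflow
    w = np.exp(z)
    return w / w.sum()

V = np.array([-5.6, -4.6, -5.0])   # hypothetical systematic utilities
theta = 1.0                        # hypothetical Gumbel parameter
p = mnl_probabilities(V, theta)

# Simulation check: draw perceived utilities U_j = V_j + eps_j and record
# which alternative has the maximum perceived utility in each draw.
rng = np.random.default_rng(2)
n = 400_000
eps = rng.gumbel(loc=-theta * np.euler_gamma, scale=theta, size=(n, V.size))
choices = (V + eps).argmax(axis=1)
freq = np.bincount(choices, minlength=V.size) / n

print("closed form:", np.round(p, 3))
print("simulated  :", np.round(freq, 3))
```

Subtracting the largest exponent before exponentiating is a numerical convenience only; it relies on the fact that the probabilities depend on differences among systematic utilities.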
Dependence on the differences among systematic utilities.⁹ In the case of only two alternatives ($A$ and $B$), the MNL model (3.3.6) is called binomial logit and can be expressed as
$$p[A] = \frac{\exp(V_A/\theta)}{\exp(V_A/\theta) + \exp(V_B/\theta)} = \frac{1}{1 + \exp[(V_B - V_A)/\theta]}$$
⁷ The Expected Maximum Perceived Utility variable is dealt with extensively in Sect. 3.4.
⁸ A proof of the Gumbel random variable's stability with respect to maximization and a derivation of the multinomial logit model from the general expression (3.2.3) are presented in Appendix 3.B.
⁹ This property and its implications hold for the entire class of invariant models, as was stated in Sect. 3.2. In the following, the general results are particularized for the logit model, where they can be obtained analytically.
Fig. 3.3 Diagram of choice probability $p[A]$ of a binomial logit model
As can be seen, the choice probability of alternative $A$ depends on the difference between the systematic utilities. Furthermore, as shown in Fig. 3.3, this choice probability is equal to 0.5 if the two alternatives have equal systematic utilities ($V_B - V_A = 0$). Its graph is S-shaped and semisymmetric for positive and negative values of $V_B - V_A$. In addition, it tends to one as $V_B - V_A$ tends to $-\infty$ (as the systematic utility of alternative $A$ becomes infinitely greater than that of $B$) and it tends to zero as $V_B - V_A$ tends to $+\infty$. The rate of variation of the choice probability of $A$ with respect to variations of $V_B - V_A$ is largest for values of $V_B - V_A$ close to zero, where the curve is almost linear, and it increases as the variance of the random residuals (parameter $\theta$) decreases. As the absolute value of $V_B - V_A$ increases, the slope of $p[A]$ approaches the horizontal; for large differences $V_B - V_A$, the choice probability has low sensitivity to variations of $V_B - V_A$.
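The remarks on the slope can be made explicit with a short derivation. This is a sketch in which the shorthand $d = V_B - V_A$ is introduced only for illustration; it differentiates the binomial logit expression with respect to $d$:

$$\frac{\partial p[A]}{\partial d} = -\frac{1}{\theta}\,\frac{\exp(d/\theta)}{\left[1 + \exp(d/\theta)\right]^2} = -\frac{1}{\theta}\,p[A]\,\bigl(1 - p[A]\bigr)$$

The product $p[A](1 - p[A])$ is largest when $p[A] = 0.5$, that is at $d = 0$, where the slope equals $-1/(4\theta)$; its magnitude therefore grows as $\theta$ decreases. As $|d|$ becomes large, $p[A](1 - p[A])$ tends to zero and the curve flattens.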
Similar considerations apply to the more general case of the multinomial logit model with $m$ alternatives. From expression (3.3.6) it can be seen that:
$$p[j] = \frac{1}{1 + \sum_{h \ne j} \exp[(V_h - V_j)/\theta]}$$
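The intermediate step, written out as a sketch, consists of dividing the numerator and denominator of (3.3.6) by $\exp(V_j/\theta)$ and separating the term with $i = j$ from the sum:

$$p[j] = \frac{\exp(V_j/\theta)}{\sum_{i=1}^{m} \exp(V_i/\theta)} = \frac{1}{\sum_{i=1}^{m} \exp[(V_i - V_j)/\theta]} = \frac{1}{1 + \sum_{h \ne j} \exp[(V_h - V_j)/\theta]}$$

Only the differences $V_h - V_j$ appear, so adding the same constant to all systematic utilities leaves the choice probabilities unchanged.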
Influence of residual variance. From (3.3.6) it can be seen that a smaller random residual variance (smaller parameter $\theta$) leads to a larger choice probability for the alternative with maximum systematic utility. This probability tends to one (a deterministic utility model) as the variance tends to zero. Conversely, as the variance of the residuals increases, the exponents $V_j/\theta$ tend to the same value (zero) and the choice probabilities of the different alternatives tend to the same value, equal to $1/m$. The effect of the random residual variance is graphically illustrated in Fig. 3.2 and numerically in Fig. 3.4 for two choice alternatives corresponding to two paths with attributes given by travel time ($t$) and monetary cost ($mc$).

$$p[A] = \frac{\exp[(-0.1 \cdot t_A - 1 \cdot mc_A)/\theta]}{\exp[(-0.1 \cdot t_A - 1 \cdot mc_A)/\theta] + \exp[(-0.1 \cdot t_B - 1 \cdot mc_B)/\theta]}$$

Alternative   t (min)   mc (units)   V
A             20        3.6          −5.6
B             40        0.6          −4.6

         θ = 10   θ = 1   θ = 0.5
p[A]     0.48     0.27    0.12
p[B]     0.52     0.73    0.88

Fig. 3.4 Effect of the variance of random residuals on choice probabilities for a binomial logit model
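The values in Fig. 3.4 can be reproduced with a few lines of Python. This is a minimal sketch using only the coefficients and attributes given in the figure; the function names `systematic_utility` and `binomial_logit_pA` are illustrative, not taken from the text.

```python
import math

def systematic_utility(t, mc):
    """Systematic utility from Fig. 3.4: V = -0.1 * t - 1 * mc."""
    return -0.1 * t - 1.0 * mc

def binomial_logit_pA(V_A, V_B, theta):
    """Binomial logit probability of A: 1 / (1 + exp((V_B - V_A) / theta))."""
    return 1.0 / (1.0 + math.exp((V_B - V_A) / theta))

V_A = systematic_utility(t=20, mc=3.6)   # -5.6
V_B = systematic_utility(t=40, mc=0.6)   # -4.6

results = {}
for theta in (10, 1, 0.5):
    p_A = binomial_logit_pA(V_A, V_B, theta)
    results[theta] = (p_A, 1.0 - p_A)
    print(f"theta = {theta:>4}: p[A] = {p_A:.2f}, p[B] = {1.0 - p_A:.2f}")
```

As $\theta$ decreases, the probability concentrates on path $B$, the alternative with the larger systematic utility, in line with the figure.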
Independence from irrelevant alternatives. From expression (3.3.6), another general property of the logit model can easily be deduced. Choice probability ratios between any two alternatives depend only on the systematic utilities of the two alternatives and, in particular, are independent of the number and systematic utilities of other choice alternatives:
$$\frac{p[j]}{p[h]} = \frac{\exp(V_j/\theta)}{\exp(V_h/\theta)} \tag{3.3.7}$$
This property, known in the literature as Independence from Irrelevant Alternatives (IIA), can sometimes lead to unrealistic results.
Consider, for example, the choice between two alternatives $A$ and $B$ having equal systematic utility. In this case, the logit model probability (3.3.6) of choosing each alternative is 0.50 and the ratio between the probabilities of choosing $A$ and $B$ is equal to one:
$$\frac{p[A]}{p[B]} = \frac{\exp(V_A/\theta)}{\exp(V_B/\theta)} = 1$$
Suppose now that a third alternative $C$ is added to the choice set. Alternative $C$ has the same systematic utility as the other two and is otherwise very similar to alternative $B$. To give a specific example, suppose that the choice is between transport modes, where alternative $A$ is a car and alternative $B$ is a bus. Suppose further that the systematic utilities of the two are the same, so that they have the same choice probability. A third alternative $C$ is introduced, consisting of a new bus line that runs on the same timetable, makes the same stops, and is generally perceived the same as $B$. Alternatives $B$ and $C$ would have the same choice probabilities. Moreover, because of the IIA property, the ratio between the probabilities of choosing car $A$ and bus $B$ remains equal to one. Therefore, each of the three alternatives would have a probability of 1/3 of being chosen. Thus, the probability of choosing the car would change from 0.50 to 0.33 simply because of the illusory increase in the number of choice alternatives. This result is clearly paradoxical and derives from the lack of realism of the basic assumptions of the logit model in the case described: namely, that the decision-maker perceives the alternatives as completely distinct, and therefore that their random residuals are independent. A more realistic choice model can be obtained by introducing a covariance between the random residuals of alternatives $B$ and $C$, as shown in the following sections. In general, as shown below, a multinomial logit model has the property that any variation in the choice probability of one alternative (resulting from a change in its attributes) leads to proportional variations in the choice probabilities of all other alternatives.
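The car and bus example can be reproduced numerically. The sketch below uses only the Python standard library; the function name `mnl`, the dictionary keys, and the common utility value of zero are illustrative assumptions (any equal value gives the same probabilities).

```python
import math

def mnl(V, theta=1.0):
    """Multinomial logit probabilities (3.3.6) for a dict {alternative: systematic utility}."""
    v_max = max(V.values())   # common shift for numerical stability
    w = {k: math.exp((v - v_max) / theta) for k, v in V.items()}
    total = sum(w.values())
    return {k: x / total for k, x in w.items()}

# Two alternatives with equal systematic utility.
before = mnl({"car": 0.0, "bus": 0.0})

# A near-duplicate bus line is added with the same systematic utility.
after = mnl({"car": 0.0, "bus": 0.0, "new bus line": 0.0})

print("before:", before)   # car and bus each 0.5
print("after :", after)    # each alternative 1/3

# The car/bus ratio (3.3.7) is unchanged by the added alternative.
ratio_before = before["car"] / before["bus"]
ratio_after = after["car"] / after["bus"]
print("car/bus ratio before and after:", ratio_before, ratio_after)
```

The model reduces the car share from 0.50 to 1/3 even though the new bus line is perceived the same as the existing one, which is the paradox described above.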
In applications, the multinomial logit model should be used with choice alternatives that are sufficiently distinct for the assumption of independent random residuals to be plausible.