
A practical technique to estimate multinomial probit models in transportation

Denis Bolduc *

Département d'économique, Université Laval, Sainte-Foy, Québec, Canada G1K 7P4

Received 14 September 1994; received in revised form 1 June 1998

Abstract

The Multinomial Probit (MNP) formulation provides a very general framework to allow for inter-dependent alternatives in discrete choice analysis. Until recently, its use was rather limited, mainly because of the computational difficulties associated with evaluating the choice probabilities, which are multidimensional normal integrals. In recent years, the econometric estimation of Multinomial Probit models has received considerable attention. Alternative simulation-based approaches have been suggested and compared. Most approaches exploit a conventional estimation technique in which easy-to-compute simulators replace the choice probabilities. For situations such as transportation demand modelling, where samples and choice sets are large, the existing literature clearly suggests the use of a maximum simulated likelihood (MSL) framework combined with a Geweke-Hajivassiliou-Keane (GHK) choice probability simulator. The present paper gives the computational details regarding the implementation of this practical estimation approach where the scores are computed analytically. This represents a contribution of the paper because numerical derivatives are usually used. The approach is tested on a 9-mode transportation choice model estimated with disaggregate data from Santiago, Chile.

Résumé

The multinomial probit (MNP) formulation makes it possible to analyse and describe, in a very flexible way, the choice made by an individual among a set of inter-dependent alternatives. The considerable progress made in recent years on the econometric estimation of MNP models now makes it possible to circumvent the problem of evaluating the multiple normal integrals that define the choice probabilities of the alternatives. The various approaches considered generally exploit efficient simulators that act as substitutes for the exact choice probabilities. The most widely favoured simulator is the GHK, suggested independently by Geweke, Hajivassiliou and Keane. For situations such as those generally encountered in transportation, where samples and choice sets are large, the literature very clearly suggests a maximum likelihood approach that uses the GHK simulator to approximate the choice probabilities. The present article provides the details on the use of this methodology within a maximum likelihood framework with analytical derivatives. The approach is then tested on a data set describing the choice among nine modes linking downtown Santiago to outlying areas.

Transportation Research Part B 33 (1999) 63-79

Keywords: Multinomial probit; Simulation based estimate; Discrete choice; Transportation demand modeling

1. Introduction

Since the introduction of discrete choice techniques to analyze transportation-related problems, hundreds of studies have focused on the behavioral aspects of the decision process of individuals choosing among a finite set of alternatives. The operational model most often used has the Multinomial Logit (MNL) form. Its major advantage over more general strategies is that its choice probabilities have a closed form that can be calculated easily. However, the assumption made by this model that the alternatives are mutually independent is often restrictive.

An attractive solution to this problem is to use the Multinomial Probit (MNP) framework. The inter-dependencies are then accounted for through the correlation structure of an error term assumed to be normally distributed. Any error correlation structure can be postulated. Until recently, its use was rather limited, mainly because of the computational difficulties associated with evaluating the choice probabilities, which are multidimensional normal integrals. Recently, alternative simulation-based approaches have been suggested. They are described and compared in Hajivassiliou et al. (1996). Most approaches exploit a conventional estimation technique in which easy-to-compute simulators replace the choice probabilities. For situations such as transportation demand modelling, where samples and choice sets are large, the maximum simulated likelihood framework combined with a Geweke-Hajivassiliou-Keane (GHK) choice probability simulator should be favoured. This paper gives the computational details regarding the implementation of this particular estimation strategy where, to speed computation, the score vector associated with the likelihood function is computed using analytic expressions. The approach is tested on a 9-mode transportation choice model using disaggregate data from Santiago, Chile.

2. The multinomial probit formulation

A typical transportation mode choice model concerns the choice by individual $n$, $n = 1, \ldots, N$, of the alternative $i$ in the set $C_n = \{1, \ldots, J_n\}$ which produces the highest utility level $V_{in}$, i.e. so that $V_{in} \geq V_{jn}$, $\forall j \in C_n$. In this notation, the choice set is allowed to differ across individuals, to account for their own specific travel mode availabilities. In estimation, it is very important to take this choice set variability into account. To present the proposed estimation approach intelligibly, it is easier to first assume that each individual faces the same choice set $C$ and then, in a second step, to extend the formulation to individual specific choice sets.

2.1. MNP with universal choice set

Assuming that each individual $n$ faces the same $J$ alternatives, an MNP model formulation based on linear-in-parameters utilities may be written as follows:

$$V_{in} = Z_{in}\beta + \varepsilon_{in},$$

with

$$y_{in} = \begin{cases} 1 & \text{if } V_{in} \geq V_{jn} \text{ for } j = 1, \ldots, J, \text{ and} \\ 0 & \text{otherwise.} \end{cases}$$

The variable $y_{in}$ designates the choice made by individual $n$, $V_{in}$ is the unobservable utility of alternative $i$ as perceived by individual $n$, and $Z_{in}$ is a $(1 \times K)$ vector of explanatory variables characterizing both the alternative $i$ and the individual $n$. $\beta$ is a $(K \times 1)$ vector of fixed parameters and, finally, $\varepsilon_{in}$ is a normally distributed random error term of mean zero assumed to be correlated with the errors associated with the other alternatives $j$, $j = 1, \ldots, J$, $j \neq i$. In vector form, one can write this relationship as:

$$V_n = Z_n\beta + \varepsilon_n, \qquad \varepsilon_n \sim N(0, \Omega), \tag{1}$$

where $V_n$ and $\varepsilon_n$ are $(J \times 1)$ vectors and $Z_n$ is a $(J \times K)$ matrix.

As is well known, the only identifiable parameters in the original model (1) are those that can be retrieved uniquely from the parameters of a scaled model differenced with respect to the utility of an arbitrary alternative. Below, we use the last alternative as the base, and the scaling is performed by fixing to one the variance of the first error term in the differenced model. This version, which is referred to here as the estimable model, can be written as:

$$\tilde{U}_n = \tilde{X}_n\beta + \tilde{\nu}_n, \qquad \tilde{\nu}_n \sim N(0, \tilde{\Sigma}), \tag{2}$$

which is the model in Eq. (1) written in deviation with respect to the utility of the last alternative $V_{Jn}$. More specifically, let $m = J - 1$; then $\tilde{U}_n$ is a $(m \times 1)$ vector with components $\tilde{U}_{in} = V_{in} - V_{Jn}$, $i = 1, \ldots, m$. The matrix $\tilde{X}_n$ and the vector $\tilde{\nu}_n$ are defined similarly. The scaling is such that $\mathrm{var}(\tilde{\nu}_{1n}) = \mathrm{var}(\varepsilon_{1n} - \varepsilon_{Jn}) = 1$.[1]

[1] To set $\mathrm{var}(\tilde{\nu}_{1n}) = 1$ is equivalent to dividing each row of the differenced model by the quantity $s = \sqrt{\mathrm{var}(\varepsilon_{1n} - \varepsilon_{Jn})}$. In that case, one would get $\tilde{U}_{in} = (V_{in} - V_{Jn})/s$ and the vector $\beta$ should be replaced with $\beta/s$.

To impose a positive definite error covariance matrix $\tilde{\Sigma}$, it is preferable to work with a formulation based on a Cholesky decomposition of $\tilde{\Sigma}$. Such an equivalent model is written as:

$$\tilde{U}_n = \tilde{X}_n\beta + S\tilde{w}_n, \qquad \tilde{w}_n \sim N(0, I_m), \tag{3}$$

where $S$ is a lower triangular matrix such that $SS' = \tilde{\Sigma}$, with $s_{11} = 1$ to be consistent with the scaling used. The parameters one is interested in are the $K$ parameters $\beta_k$ in the vector $\beta$ and the $p = [m(m+1)/2] - 1$ parameters $s_{21}, s_{31}, \ldots, s_{m1}, s_{22}, \ldots, s_{mm}$ that we incorporate in a $(p \times 1)$ vector denoted as $s$. To complete the notation, we call $\theta = (\beta', s')'$ the joint vector of parameters formed by the vertical concatenation of $\beta$ and $s$.
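The mapping from the parameter vector $s$ to the Cholesky matrix $S$ can be made concrete with a short sketch. The code below is illustrative only and assumes the column-by-column ordering $s_{21}, s_{31}, \ldots, s_{m1}, s_{22}, \ldots, s_{mm}$ given above; the function names `cholesky_from_params` and `covariance_from_cholesky` and the numerical values are hypothetical.

```python
def cholesky_from_params(s, m):
    """Fill a lower-triangular m x m matrix S column by column.

    s holds the p = m(m+1)/2 - 1 free elements in the order
    s21, s31, ..., sm1, s22, ..., smm; s11 is fixed to 1 (scaling).
    """
    expected = m * (m + 1) // 2 - 1
    if len(s) != expected:
        raise ValueError(f"expected {expected} parameters, got {len(s)}")
    S = [[0.0] * m for _ in range(m)]
    S[0][0] = 1.0
    k = 0
    for col in range(m):
        for row in range(col, m):
            if row == 0 and col == 0:
                continue  # s11 is not a free parameter
            S[row][col] = s[k]
            k += 1
    return S


def covariance_from_cholesky(S):
    """Return S S', which is positive semi-definite by construction."""
    m = len(S)
    return [[sum(S[i][k] * S[j][k] for k in range(m)) for j in range(m)]
            for i in range(m)]


# Illustrative values for m = 3 (J = 4 alternatives): s21, s31, s22, s32, s33.
m = 3
s = [0.5, -0.2, 1.2, 0.3, 0.8]
S = cholesky_from_params(s, m)
Sigma = covariance_from_cholesky(S)
```

Because $s_{11}$ is held at one, the first diagonal element of `Sigma` equals one, which is the scaling used in the estimable model.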

2.1.1. The utilities in deviation with respect to the chosen alternative

To evaluate the log-likelihood function requires calculating, for each individual in the sample, the probability associated with the choice made. To compute the GHK choice probability simulator for a given individual $n$, it is convenient to map the utilities into differences with respect to the utility of the choice made so that the probability $P_n(i)$ then becomes an integral over a non-positive orthant. This particular representation can be obtained by premultiplying the estimable model in Eq. (2) by a $(m \times m)$ linear operator that we call $M_i$. For any $i$ other than $J$, the fol[...]

Premultiplying Eq. (2) by the matrix $M_i$ gives:

$$U_n = M_i\tilde{U}_n = M_i\tilde{X}_n\beta + M_i\tilde{\nu}_n = X_n\beta + \nu_n, \tag{4}$$

where $\nu_n \sim N(0, \Sigma_i)$, with $\Sigma_i = M_i\tilde{\Sigma}M_i'$. Note that the $U_n$ vector thus obtained contains only negative components: $U_{1n} < 0, U_{2n} < 0, \ldots, U_{mn} < 0$. As seen below, the GHK simulator exploits this particular feature. The vector $\nu_n$ is defined similarly to $U_n$ and the matrix $X_n$ is of dimension $(m \times K)$ with rows $(Z_{1n} - Z_{in}), \ldots, (Z_{Jn} - Z_{in})$. Finally, note that because of Eq. (3), one may also write:

$$V(\nu_n) = \Sigma_i = M_i\tilde{\Sigma}M_i' = M_iSS'M_i'. \tag{5}$$

Eqs. (4) and (5) are particularly useful because they are expressed in terms of the estimable parameters in $\beta$ and the vector $s$ which, one can recall, is composed of the elements of the Cholesky matrix $S$ in Eq. (3).
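One way to build an operator with the properties required of $M_i$ is sketched below. This is a reconstruction from the definitions in the text (inputs $V_{jn} - V_{Jn}$, outputs $V_{jn} - V_{in}$ for $j \neq i$), not necessarily the author's exact matrix; in particular the ordering of the output rows by increasing $j$ is an assumption, and `deviation_operator` and `matvec` are hypothetical names.

```python
def deviation_operator(i, J):
    """Return an m x m matrix M_i (m = J - 1) for chosen alternative i (1-based).

    Input space : U~_j = V_j - V_J for j = 1..m.
    Output rows : V_j - V_i for every j != i, in increasing order of j
                  (this row ordering is an assumption).
    """
    m = J - 1
    others = [j for j in range(1, J + 1) if j != i]
    M = [[0.0] * m for _ in range(m)]
    for row, j in enumerate(others):
        if j != J:
            M[row][j - 1] += 1.0   # + (V_j - V_J)
        if i != J:
            M[row][i - 1] -= 1.0   # - (V_i - V_J)
    return M


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


# Illustrative utilities for J = 4 alternatives; alternative 2 is the best one.
V = [1.0, 3.5, 2.0, 0.5]
J, chosen = 4, 2
U_tilde = [V[j] - V[J - 1] for j in range(J - 1)]   # differences w.r.t. the last alternative
U = matvec(deviation_operator(chosen, J), U_tilde)  # differences w.r.t. the chosen one
```

With these values every component of `U` is negative, which is the non-positive orthant property the GHK simulator relies on. When the chosen alternative is the last one, the operator reduces to an identity matrix.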

2.2. MNP with individual specific choice set

To account for individual specific choice sets is relatively straightforward. Let $n$ denote an individual ($n = 1, \ldots, N$) who chooses that alternative $i$ in choice set $C_n = \{1, \ldots, J_n\}$ for which $V_{in} \geq V_{jn}$, $\forall j \in C_n$. The corresponding MNP model is obtained by excluding from Eq. (1) those rows associated with the alternatives that are not available. In terms of the estimable model, this amounts to removing the appropriate rows from Eq. (2). Such a formulation can be obtained with the use of an $(m_n \times m)$ mapping matrix $E_n$, where $m_n = J_n - 1$. It is defined as an identity matrix with the rows associated with the unavailable alternatives deleted. In the case of a universal choice set, $E_n$ would simply be an identity matrix of size $m$. With this in mind, one can therefore use a unique and general notation to refer to both cases of the MNP model formulation.

2.3. A general notation for the MNP model

Consider the previously introduced $(m \times m)$ mapping matrix $M_i$ that maps the $m = J - 1$ utility differences $\tilde{U}_n$ of the estimable model in Eq. (2) into a deviation with respect to the alternative $i$ chosen by individual $n$. Consider also the $(m_n \times m)$ mapping matrix $E_n$ which removes the utilities associated with the alternatives unavailable to $n$. Those two operations can be combined into the following operator:

$$M_{in} = E_nM_i. \tag{6}$$

This $(m_n \times m)$ mapping matrix, when premultiplying the utility vector $\tilde{U}_n$, gives:

$$U_n = M_{in}\tilde{U}_n.$$

This is the $(m_n \times 1)$ vector of the utilities associated with the alternatives available to individual $n$ expressed in deviation with respect to the chosen alternative. Note again that in the case of a universal choice set, $M_{in} = M_i$, $\forall n$, since $E_n$ would be an identity matrix. Using the so-called $M_{in}$ matrix and the estimable model in Eq. (2), the final notation that we exploit to compute the choice probabilities is:

$$U_n = M_{in}\tilde{U}_n = M_{in}\tilde{X}_n\beta + M_{in}\tilde{\nu}_n = X_n\beta + \nu_n, \tag{7}$$

where $\nu_n \sim N(0, \Sigma_{in})$, with $\Sigma_{in} = M_{in}SS'M_{in}'$. This notation should make clear that, depending on the mode availability and the choice made, the error covariance matrix varies between observations. However, it should be noted that a common Cholesky matrix $S$ appears in the covariance structure. The matrix $X_n$ represents a $(m_n \times K)$ matrix of the deviation of the explanatory variables of each available alternative $j$ other than alternative $i$ with respect to the explanatory variables of the chosen alternative $i$. Again, note that by definition, $U_n$ is a vector with non-positive components, i.e. $U_{1n} \leq 0, \ldots, U_{m_n,n} \leq 0$. To compute a given choice probability, we will also need to consider a Cholesky transformed version of this formulation. This is written as:

$$U_n = X_n\beta + S_nw_n, \qquad w_n \sim N(0, I_{m_n}), \tag{8}$$

where $S_n$ is the lower triangular Cholesky matrix satisfying:

$$S_nS_n' = \Sigma_{in} = M_{in}SS'M_{in}'. \tag{9}$$

For estimation, we will exploit this relationship between the individual specific Cholesky matrix $S_n$ and the unique Cholesky matrix $S$ containing the parameters to estimate. To clarify the distinction between those two matrices, note that $S_n$ refers to the formulation expressed in deviation with respect to the chosen alternative with only the available alternatives being considered, while $S$ refers to the estimable model expressed in deviation with respect to the last alternative with all the alternatives included, whether they are available or not. Obviously, a given observation $n$ contributes to the estimation of only those elements of $S$ referring to the available alternatives.
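The chain from the common matrix $S$ to an individual specific $S_n$ in Eq. (9) can be sketched as follows. This is a minimal illustration under stated assumptions: the row ordering of the deviation operator (increasing $j$, skipping the chosen alternative) is assumed rather than taken from the paper, all function names are hypothetical, and the values of $S$ are made up.

```python
import math


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def cholesky(A):
    """Lower-triangular L with L L' = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            acc = A[r][c] - sum(L[r][k] * L[c][k] for k in range(c))
            L[r][c] = math.sqrt(acc) if r == c else acc / L[c][c]
    return L


def deviation_operator(i, J):
    """Rows give V_j - V_i (j != i, increasing j) from inputs V_j - V_J (assumed ordering)."""
    others = [j for j in range(1, J + 1) if j != i]
    M = [[0.0] * (J - 1) for _ in range(J - 1)]
    for row, j in enumerate(others):
        if j != J:
            M[row][j - 1] += 1.0
        if i != J:
            M[row][i - 1] -= 1.0
    return M


def availability_operator(i, J, available):
    """E_n: an identity matrix with the rows of unavailable alternatives deleted."""
    others = [j for j in range(1, J + 1) if j != i]
    return [[1.0 if col == row else 0.0 for col in range(J - 1)]
            for row, j in enumerate(others) if j in available]


# Individual n chooses alternative 2 out of J = 4; alternative 3 is not available.
J, i, available = 4, 2, {1, 2, 4}
S = [[1.0, 0.0, 0.0], [0.5, 1.2, 0.0], [-0.2, 0.3, 0.8]]    # illustrative common Cholesky matrix
M_in = matmul(availability_operator(i, J, available), deviation_operator(i, J))
Sigma_in = matmul(matmul(M_in, matmul(S, transpose(S))), transpose(M_in))
S_n = cholesky(Sigma_in)                                    # Eq. (9): S_n S_n' = Sigma_in
```

Here `M_in` has $m_n = 2$ rows, so `S_n` is a $2 \times 2$ lower triangular matrix even though `S` is $3 \times 3$.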

3. Model estimation

3.1. The choice probabilities

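Denote $P_n(i)$ as the choice probability associated with the alternative $i$, $i \in C_n$, chosen by individual $n$. Given the formulations in Eqs. (7) or (8), this is also the probability of drawing $U_n$ with each component $U_{1n}, \ldots, U_{m_n,n}$ being non-positive. It can be computed as an $m_n$-dimensional integral of the form:

$$P_n(i) = \int_{-\infty}^{-X_{1n}\beta} \cdots \int_{-\infty}^{-X_{m_n,n}\beta} n(\nu_n; \Sigma_{in})\, d\nu_{1n} \cdots d\nu_{m_n,n}, \tag{10}$$

where $n(\nu_n; \Sigma_{in})$ is a multivariate normal density with mean zero and covariance matrix $\Sigma_{in}$, and $X_{ln}$ denotes the $l$th row of $X_n$. Unless $m_n$ is small, the choice probability in Eq. (10) cannot be computed using a numerical integration technique. One solution is to simulate $P_n(i)$. Many simulators with good properties have been suggested. The most useful ones are described and compared in Hajivassiliou et al. (1996). The GHK simulator, which is used here, was clearly found to be the one with the best theoretical and empirical properties. The theoretical and analytical details on this simulator may be found in Börsch-Supan and Hajivassiliou (1993), Hajivassiliou (1993) and Geweke et al. (1992). Still, we provide computational details regarding its implementation because we need specific expressions to derive analytical relationships to compute the scores. The GHK simulator exploits the recursive structure imposed by the Cholesky transformation present in Eq. (8). For a given observation $n$, using Eq. (8), one can write:

$$\begin{aligned}
U_{1n} \leq 0 &\;\Rightarrow\; w_{1n} \leq -\frac{X_{1n}\beta}{s_{11,n}}, \\
U_{2n} \leq 0 &\;\Rightarrow\; w_{2n} \leq -\frac{X_{2n}\beta + s_{21,n}w_{1n}}{s_{22,n}}, \\
&\;\;\vdots \\
U_{m_n,n} \leq 0 &\;\Rightarrow\; w_{m_n,n} \leq -\frac{X_{m_n,n}\beta + s_{m_n1,n}w_{1n} + \cdots + s_{m_n,m_n-1,n}w_{m_n-1,n}}{s_{m_nm_n,n}},
\end{aligned} \tag{11}$$

where $s_{lh,n}$ denotes the $(l, h)$ element of $S_n$. To write it in a more compact way, we use the notation:

$$w_{1n} \leq a_{1n}, \quad w_{2n} \leq a_{2n}(w_{1n}), \quad \ldots, \quad w_{m_n,n} \leq a_{m_n,n}(w_{1n}, \ldots, w_{m_n-1,n}). \tag{12}$$

Therefore, the choice probability $P_n(i)$ can be written as:

$$P_n(i) = \mathrm{pr}(U_n \leq 0) = \mathrm{pr}\big(w_{1n} \leq a_{1n},\; w_{2n} \leq a_{2n}(w_{1n}),\; \ldots,\; w_{m_n,n} \leq a_{m_n,n}(w_{1n}, \ldots, w_{m_n-1,n})\big).$$

By conditioning, one can also write:

$$\begin{aligned}
P_n(i) &= \mathrm{pr}(U_{1n} \leq 0)\, \mathrm{pr}(U_{2n} \leq 0 \mid U_{1n} \leq 0) \cdots \mathrm{pr}(U_{m_n,n} \leq 0 \mid U_{1n} \leq 0, U_{2n} \leq 0, \ldots, U_{m_n-1,n} \leq 0) \\
&= \mathrm{pr}(w_{1n} \leq a_{1n})\, \mathrm{pr}(w_{2n} \leq a_{2n}(w_{1n}) \mid w_{1n} \leq a_{1n}) \cdots
\end{aligned}$$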

3.2. The GHK simulator

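Let $r$ denote a draw. Call $w_{nr}$ a given realization $r$ of the vector $w_n$ such that Eq. (11) is satisfied. Based on $R$ such draws, an unbiased simulator for $P_n(i)$ is provided by the following expression:

$$\hat{P}_n(i) = \frac{1}{R}\sum_{r=1}^{R} f_{nr}(i), \tag{13}$$

where

$$f_{nr}(i) = \mathrm{pr}(w_{1n,r} \leq a_{1n,r})\, \mathrm{pr}(w_{2n,r} \leq a_{2n,r}(w_{1n,r}) \mid w_{1n,r} \leq a_{1n,r}) \cdots$$

is a choice probability conditional on $w_{nr}$. The GHK simulator defined in Eq. (13) is smooth with respect to the parameters $\beta_k$, $k = 1, \ldots, K$, and the $s_{11,n}, \ldots, s_{m_nm_n,n}$ elements forming the lower part of the Cholesky matrix $S_n$. It is also known to have good asymptotic properties. For proofs, refer to Hajivassiliou and McFadden (1990). For notational compactness and because of the normality assumption, we can write $f_{nr}(i)$ as follows:

$$f_{nr}(i) = \Phi(a_{1n,r})\, \Phi(a_{2n,r}) \cdots \Phi(a_{m_n,n,r}) = \Phi_{1nr}\, \Phi_{2nr} \cdots \Phi_{m_n,nr}, \tag{14}$$

where $\Phi$ denotes a standard normal cumulative distribution function. Using Eq. (12), the $a_{ln,r}$ are computed as:

$$a_{ln,r} = -\frac{X_{ln}\beta + \sum_{h=1}^{l-1} s_{lh,n}\, w_{hn,r}}{s_{ll,n}}, \qquad w_{hn,r} = \Phi^{-1}\big(u_{hn,r}\, \Phi(a_{hn,r})\big), \tag{15}$$

where $u_{hn,r}$ is a random uniform number taken from the (0,1) interval.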
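The recursion behind the GHK simulator can be sketched in a few lines. The code below is a minimal illustration, not the author's Fortran implementation: it assumes the standard form of the truncated draws, $w = \Phi^{-1}(u\,\Phi(a))$, and takes as inputs the systematic differences $X_n\beta$ and a lower triangular $S_n$ with positive diagonal. The function name `ghk_probability` and all numerical values are hypothetical; `NormalDist` comes from the Python standard library.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def ghk_probability(xb, S, R=100, seed=0):
    """Simulate P(U <= 0) for U = xb + S w, w ~ N(0, I), S lower triangular.

    xb : list with the m systematic utility differences (X_n beta).
    S  : m x m lower-triangular Cholesky factor with positive diagonal.
    R  : number of draws; seed fixes the uniform draws.
    """
    rng = random.Random(seed)
    m = len(xb)
    total = 0.0
    for _ in range(R):
        w = []
        f = 1.0
        for l in range(m):
            # Upper truncation point a_l, given the earlier draws w_1..w_{l-1}.
            a = -(xb[l] + sum(S[l][h] * w[h] for h in range(l))) / S[l][l]
            phi_a = _STD_NORMAL.cdf(a)
            f *= phi_a
            # Draw w_l from a standard normal truncated to (-inf, a].
            u = rng.random()
            p = min(max(u * phi_a, 1e-300), 1.0 - 1e-16)  # keep inv_cdf in (0, 1)
            w.append(_STD_NORMAL.inv_cdf(p))
        total += f
    return total / R


# Two utility differences with correlation 0.5 and zero systematic parts:
# the exact orthant probability is 1/4 + asin(0.5) / (2 pi) = 1/3.
rho = 0.5
S_n = [[1.0, 0.0], [rho, math.sqrt(1.0 - rho ** 2)]]
p_hat = ghk_probability([0.0, 0.0], S_n, R=4000, seed=3)
```

Keeping the uniform draws fixed across evaluations (here through `seed`) is what makes the simulated probability a smooth function of the parameters, which matters when derivatives are taken.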

3.3. The simulated likelihood function

The estimation method considered is based on the maximisation of the natural logarithm of the simulated likelihood function. Denoting as $\theta$ the joint vector of parameters to estimate (i.e. it contains $\beta$ and $s$), the simulated log-likelihood function, which is written as:

$$L = \sum_{n=1}^{N} \ln\left[\frac{1}{R}\sum_{r=1}^{R} f_{nr}(i)\right], \tag{16}$$

where $i$ is the alternative chosen by individual $n$, is maximized with respect to $\theta$. Technical details on the computation of the analytical first-order derivatives are provided in Appendix A. To our knowledge, this is the first implementation of the simulated likelihood framework based on the GHK probability simulator which uses analytical instead of numerical derivatives.[2]

[2] One of the referees claimed that numerical derivatives should be more reliable and as fast as suggested approach [...]

The computation of $\partial L/\partial\beta$ is quite straightforward when compared to the computation of $\partial L/\partial s$. The latter is computed using a chain rule which exploits a Jacobian transformation between $s$ and $s_n$. This transformation arises from the relationships in Eq. (9). Appendix A provides the details on the computation of $\partial L/\partial s_n$. The derivative $\partial L/\partial s$ of the log-likelihood function with respect to the vector $s$ of interest is evaluated as $(\partial s_n'/\partial s)(\partial L/\partial s_n)$, where $\partial s_n'/\partial s$ is a Jacobian matrix whose calculation is detailed in Appendix B. In the implementation of the estimation algorithm, we use a BHHH approach, which avoids the need to compute the second-order derivatives which are, in this case, rather involved. The computer program, written in Fortran 77, has gone through several stages of testing. Different Monte Carlo based tests were made on a SUN workstation. We now use it on some real data describing a choice situation among nine inter-related travel mode alternatives.
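The BHHH idea mentioned above replaces the Hessian with the sum of outer products of the per-observation scores, so only first derivatives are needed. The sketch below shows one search direction under that approximation; it is a generic illustration with made-up scores, and the names `solve_linear` and `bhhh_direction` are hypothetical, not taken from the author's program.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def bhhh_direction(scores):
    """scores: list of per-observation score vectors g_n, each of length p.

    Returns d = (sum_n g_n g_n')^{-1} (sum_n g_n).
    """
    p = len(scores[0])
    gradient = [sum(g[k] for g in scores) for k in range(p)]
    outer = [[sum(g[a] * g[b] for g in scores) for b in range(p)] for a in range(p)]
    return solve_linear(outer, gradient)


# Made-up scores for three observations and two parameters.
scores = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
direction = bhhh_direction(scores)
# One iteration would then update theta by theta + step * direction,
# with the step length chosen by a line search.
```

In the setting of this paper, each score vector would be the derivative of $\ln \hat{P}_n(i)$ with respect to $\theta$ for one observation.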

4. An application

To test the methodology, we use a data bank on the choice of mode for the morning peak journey to work to the central business district (CBD) of Santiago. This data bank has been described and employed several times in the past. Gaudry et al. (1989) focused on the estimation of the valuation of time savings by transportation mode users. They found that the values obtained with linear Logit or linear nested Logit specifications were particularly high. By using nonlinear specifications of Box-Cox Logit type, they were able to obtain more satisfactory results.[3] In our application, we use the same data and the same model specification in order to find out whether allowing for a flexible correlation structure of the utilities can lead to reasonable value of time (VOT) estimates. Details on the sample size, the different mode shares and their availability are displayed in Table 1.

We will find that doing so certainly helps to improve the VOT estimates. Still, there seems to remain some room for improvement. Implementing a Box-Cox technique within an MNP setting represents too formidable a task. We suspect that an MNP extension of the MNL formulation with lognormally distributed VOT implemented in Ben-Akiva et al. (1993) would be a reasonable approach to address this problem. Our exploratory analyses using the random VOT based MNL specification clearly point in this direction. Such a more general MNP model framework with randomly distributed VOT coefficients will be considered in later research.

[3] In order to implement their Box-Cox methodology, they removed observations that contained zero values for some [...]

The important variables entering the model specification are travel time $t_{in}$ and $c_{in}/w_n$, the cost for individual $n$ of travelling by mode $i$ as a proportion of his/her net personal income per min of work [Chilean pesos/min ($)]. Using cost as a proportion of income allows the VOT to vary deterministically with income. The other VOT related variables used in the specification are walking and in vehicle time for all modes and a waiting time variable for all modes other than car. Finally, in addition to eight alternative specific dummies, two other explanatory variables of socio-economic type appear. The first one, specific to car driver alternatives 1 and 6, indicates the number of cars in the household as a proportion of the number of driving permit holders. The last variable is a sex dummy included for modes 2, 3 and 7 listed in Table 1. The model estimates obtained using a basic MNP i.i.d. specification are displayed in column 1 of Table 2. An MNP specification with i.i.d. errors can easily be estimated using Gauss-Hermite quadrature to compute the choice probabilities entering the log-likelihood function. Only 12 quadrature points were used to perform the integrals. The solution was obtained in 14 iterations using 0 as the starting value for all the parameters. The implied VOT estimates as a percentage of net income are produced in the same column of Table 3. Those values are in the same range as the ones obtained using a linear logit specification of Gaudry et al. (1989). The value of time sensitivity in vehicle, as a percentage of net personal income, is computed as:

$$\frac{\partial (c_{in}/w_n)}{\partial t_{in}} = \frac{\partial V_{in}}{\partial t_{in}} \Big/ \frac{\partial V_{in}}{\partial (c_{in}/w_n)} = \frac{\beta_{t_{in}}}{\beta_{c_{in}/w_n}}.$$

As is well known, in a linear specification, the VOT measures are evaluated as ratios of parameters. The two other VOT estimates were produced similarly. Standard errors were computed using the delta method.
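The ratio and its delta-method standard error can be sketched as below. This is an illustration only: the coefficient values and their variances are hypothetical and are not the estimates behind Tables 2 and 3 (the table entries are rounded, so their ratios will not reproduce the reported VOT values), and the function name `vot_with_delta_se` is an assumption.

```python
import math


def vot_with_delta_se(b_time, b_cost, var_time, var_cost, cov_time_cost):
    """VOT = 100 * b_time / b_cost with a first-order (delta method) standard error."""
    vot = 100.0 * b_time / b_cost
    # Gradient of the ratio with respect to (b_time, b_cost).
    g_time = 100.0 / b_cost
    g_cost = -100.0 * b_time / b_cost ** 2
    variance = (g_time ** 2 * var_time
                + g_cost ** 2 * var_cost
                + 2.0 * g_time * g_cost * cov_time_cost)
    return vot, math.sqrt(variance)


# Hypothetical coefficient estimates and (co)variances.
vot, se = vot_with_delta_se(b_time=-0.05, b_cost=-0.025,
                            var_time=1e-4, var_cost=1e-5, cov_time_cost=0.0)
t_stat = vot / se
```

The delta method linearizes the ratio around the estimates, so the resulting standard error is an approximation that depends on the full covariance between the two coefficients.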

Table 1
Statistics on the sample used

Alternative               Chosen    Share      Availability
1. Car-driver             168       0.12933    681
2. Car-passenger          66        0.05081    730
3. Shared-taxi            58        0.04465    833
4. Metro                  295       0.22710    407
5. Bus                    430       0.33102    1287
6. Car-driver-metro       101       0.07775    530
7. Car-passenger-metro    41        0.03156    594
8. Shared-taxi-metro      65        0.05004    828
9. Bus-metro              75        0.05774    841

Table 2
Estimation results

Variables                                           MNP i.i.d.      SML MNP R50     SML MNP R50     SML MNP R250    SML MNP R250
                                                                    homoscedastic   unconstrained   unconstrained   constrained
[...]
3. Shared-taxi                                      0.61 (4.74)     0.82 (5.25)     0.61 (2.31)     0.51 (1.66)     0.27 (0.89)
4. Metro                                            3.11 (16.56)    3.10 (15.97)    2.81 (5.78)     2.73 (5.84)     2.88 (5.88)
5. Bus                                              1.49 (12.30)    1.65 (11.11)    1.47 (8.27)     1.45 (8.38)     1.49 (7.96)
6. Car-driver-metro                                 -0.02 (-0.01)   0.18 (0.80)     -0.59 (-1.15)   -0.49 (-1.01)   -0.59 (-1.20)
7. Car-passenger-metro                              0.36 (2.68)     0.58 (3.60)     -1.46 (-0.96)   -1.94 (-1.07)   -2.33 (-1.11)
8. Shared-taxi-metro                                0.66 (3.65)     0.90 (4.54)     -0.09 (-0.16)   -0.14 (-0.22)   -0.20 (-0.62)
9. Bus-metro                                        0.90 (5.02)     1.12 (5.78)     1.23 (6.60)     1.20 (6.74)     1.25 (7.72)
Other variables
Cost/income ($)                                     -0.02 (-8.61)   -0.02 (-7.59)   -0.02 (-3.89)   -0.02 (-3.81)   -0.02 (-3.83)
Walk time (min)                                     -0.08 (-9.37)   -0.08 (-9.24)   -0.07 (-4.04)   -0.06 (-3.98)   -0.07 (-4.01)
In vehicle time (min)                               -0.05 (-5.50)   -0.05 (-5.62)   -0.04 (-3.51)   -0.04 (-3.48)   -0.05 (-3.55)
Waiting time (min)                                  -0.25 (-5.28)   -0.18 (-6.15)   -0.13 (-3.56)   -0.13 (-3.54)   -0.13 (-3.64)
No. cars/no. permit holders (alternatives 1 and 6)  1.29 (5.61)     1.24 (5.44)     1.54 (3.16)     1.45 (3.13)     1.52 (3.14)
Sex dummies (male=1) (alternatives 2, 3 and 7)      -0.42 (-3.64)   -0.39 (-3.69)   -0.41 (-3.21)   -0.43 (-3.27)   -0.49 (-3.53)
Log-likelihood                                      -1482.39        -1473.60        -1447.40        -1442.83        -1443.86
Number of iterations                                14              7               20              22              18
Run time (min and s)/iteration on SUN UltraSparc 1  0.06            1.44            1.50            7.26            6.43

4.1. The model formulation with correlated utilities

To capture the similarities between the transportation modes and to keep a rather parsimonious parametric specification of the error covariance structure, we use the first-order Generalized Autoregressive [GAR(1)] process approach suggested in Bolduc (1992). In order to obtain such a formulation, the original model in Eq. (1) is replaced with:

$$V_n = Z_n\beta + TP^{-1}\zeta_n, \qquad \zeta_n \sim N(0, I_J), \tag{17}$$

where $T$ is a $(J \times J)$ diagonal matrix which contains the standard deviation $\sigma_i$ in the $i$th position, and $P$ is a matrix for capturing the covariance effects using functions based on a few underlying unknown parameters. It can be viewed as a restricted version of a saturated factor analytic formulation. This model can be obtained from an initial model $V_n = Z_n\beta + T\xi_n$, with $\xi_n = \rho W\xi_n + \zeta_n$ being assumed. This last autoregressive process is assumed to be based on a $(J \times J)$ Boolean (0-1) contiguity matrix $W$ which, in this particular application, relates the first two alternatives together and does the same with the last seven alternatives. The $W$ matrix used is as follows:

$$W = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0
\end{bmatrix}.$$

To ensure the invertibility of the $P = (I_J - \rho W)$ matrix, so that $\xi_n$ could be replaced with $P^{-1}\zeta_n$ for a value of $\rho$ defined on the $(-1, 1)$ interval, the $W$ matrix is normalized so that each row sums to one. In the postulated structure, the correlation coefficient $\rho$ and a maximum of $J - 1$ standard error terms can be estimated. [For more details on the use of GAR(1) processes to approximate the error covariance structure in discrete choice modelling and on parameter estimability issues, refer to Bolduc (1992).] In our application, the second standard deviation term $\sigma_2$ is set to 1.
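The covariance matrix implied by the GAR(1) structure of Eq. (17) is $TP^{-1}P^{-1\prime}T$, and it can be computed as in the sketch below. The block structure of $W$ follows the description in the text; the value of $\rho$ and the standard deviations are illustrative assumptions, and the function names `invert` and `gar1_covariance` are hypothetical.

```python
def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [1.0 if r == c else 0.0 for c in range(n)] for r, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [row[n:] for row in M]


def gar1_covariance(W, rho, sigmas):
    """Covariance of T P^{-1} zeta, with P = I - rho * W and W row-normalized."""
    J = len(W)
    Wn = [[w / sum(row) for w in row] for row in W]           # each row sums to one
    P = [[(1.0 if r == c else 0.0) - rho * Wn[r][c] for c in range(J)] for r in range(J)]
    Pinv = invert(P)
    A = [[sigmas[r] * Pinv[r][c] for c in range(J)] for r in range(J)]   # T P^{-1}
    return [[sum(A[r][k] * A[c][k] for k in range(J)) for c in range(J)] for r in range(J)]


# Contiguity matrix: alternatives 1-2 form one block, alternatives 3-9 another.
J = 9
W = [[1.0 if r != c and (r < 2) == (c < 2) else 0.0 for c in range(J)] for r in range(J)]
sigmas = [1.3, 1.0, 0.9, 1.1, 1.0, 1.2, 0.8, 1.0, 1.1]    # illustrative; sigma_2 = 1
Omega = gar1_covariance(W, rho=0.5, sigmas=sigmas)
```

With this $W$, errors are correlated within each of the two blocks and uncorrelated across blocks, and a single $\rho$ drives all the non-zero correlations.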

4.2. Estimation results

Table 3
Value of time as a percentage of net personal income

Variable          MNP i.i.d.      SML MNP R50     SML MNP R50     SML MNP R250    SML MNP R250
                                  homoscedastic   unconstrained   unconstrained   constrained
In vehicle time   201.7 (4.72)    211.3 (4.64)    184.0 (4.43)    198.6 (4.42)    214.5 (4.50)
Walking time      345.7 (6.46)    353.9 (6.05)    288.2 (5.16)    297.7 (5.04)    315.8 (5.01)
Waiting time      773.5 (4.74)    854.5 (4.83)    558.8 (4.03)    590.1 (3.99)    606.3 (4.02)

Estimation results of the different MNP versions considered are displayed in Table 2. Column 2 refers to the SML MNP solution based on 50 draws, assuming cross-correlation between the alternative specific errors and homoscedasticity. In other words, all sigmas are fixed to 1 and only $\rho$ is estimated. The results obtained clearly show that correlation is present between the utilities. The GAR(1) setting makes it possible to summarize the full correlation structure using a single correlation coefficient. The fit is better, but the estimated parameters are not so different from the MNP i.i.d. solution. Columns 3 and 4 of Table 2 refer to the model specification where seven standard deviation terms are estimated. Recall that for identification, the second standard deviation term is fixed to 1. Therefore, in this specification, referred to as unconstrained in the tables, each utility has its own heteroscedasticity effect. According to the estimation results obtained, heteroscedasticity is significantly present. The model fit is much better than with the homoscedastic structure. Value of time estimates, especially when 250 draws are used, are much lower than in the MNP i.i.d. case. One known problem associated with maximum simulated likelihood is the bias introduced by simulating. This is because the GHK is a technique to simulate the choice probability, not its natural log. The bias can be proved to be present using Jensen's inequality. It is also known that it becomes insignificant with a large number of simulation draws. This justifies our use of 250 draws for estimation. Note that with 50 draws, the results are very close to those obtained with a large number of draws. The improvement in the fit using a larger number of draws can be attributed to the presence of the bias. All this indicates that the GHK simulator performs very well.
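The Jensen's inequality argument can be written out briefly. The lines below are a standard second-order approximation rather than a result taken from the paper; they write $\hat{P}$ for the simulated probability of one observation, $P$ for the exact probability and $f_r$ for the contribution of one draw, and they assume $R$ independent draws.

```latex
\begin{aligned}
E[\hat{P}] = P \quad&\Rightarrow\quad E[\ln \hat{P}] \;\leq\; \ln E[\hat{P}] = \ln P
  \qquad \text{(concavity of } \ln\text{)}, \\
E[\ln \hat{P}] \;&\approx\; \ln P - \frac{\mathrm{var}(\hat{P})}{2P^{2}}
  \;=\; \ln P - \frac{\mathrm{var}(f_{r})}{2RP^{2}} .
\end{aligned}
```

Under these assumptions the simulated log-likelihood is biased downwards, with a bias that shrinks roughly in proportion to $1/R$, which is consistent with the fit improving slightly when moving from 50 to 250 draws.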

The last column of Table 2 refers to a constrained version of the model where the heteroscedastic structure is postulated to be consistent with the definition of the alternatives provided in Table 1. In this version, we are postulating homoscedasticity among groups of alternatives. The groups formed using the restrictions $\sigma_1 = \sigma_6$; $\sigma_2 = 1$; $\sigma_3 = \sigma_4 = \sigma_8$; $\sigma_5 = \sigma_9$, with $\sigma_7$ remaining free, are car-driver, car-passenger, taxi-metro, bus and car-passenger-metro. This is done to demonstrate the flexibility of the approach. In this case, the simulated log-likelihood value is -1443.86, which is marginally higher in absolute value than the one corresponding to the unrestricted version of the model, which indicates that the restrictions make sense. This is confirmed with a log-likelihood ratio test. All parameter estimates closely resemble the values obtained with the unconstrained model. The run time per iteration was approximately 6.4 min.
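The likelihood ratio comparison can be reproduced from the two log-likelihood values reported for the 250-draw models in Table 2. The sketch below is illustrative: the exact number of restrictions is not restated here, so the statistic is compared against the standard 5% chi-square critical values for one to four degrees of freedom, and the function name is hypothetical.

```python
def likelihood_ratio_statistic(loglik_restricted, loglik_unrestricted):
    """LR = 2 * (log-likelihood of unrestricted model - log-likelihood of restricted model)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


# Simulated log-likelihoods from Table 2 (R = 250): constrained and unconstrained.
lr = likelihood_ratio_statistic(-1443.86, -1442.83)

# Standard 5% critical values of the chi-square distribution, by degrees of freedom.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}
rejected = {df: lr > critical for df, critical in CHI2_CRITICAL_5PCT.items()}
```

The statistic is about 2.06, below every critical value listed, so the restrictions are not rejected at the 5% level for any of these degrees of freedom.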

The constrained estimation gains in terms of the number of iterations needed to reach the optimum, but the computation time per iteration is almost the same: since only mappings are applied in computing the derivatives, a comparable number of manipulations is required at a given iteration. The implied VOT estimates for in vehicle time, walking time and waiting time are not as good as those obtained with the more general error structure. The unconstrained version performs best in producing smaller VOT estimates. However, there appears to be some room for improvement. An MNP formulation with lognormally distributed VOT coefficients is a potentially good alternative for solving the problem of high VOT estimates. This is left for future research.

5. Conclusion

Our application demonstrated the feasibility of MNP estimation when applied to choice situations based on large choice sets. In the most general case that we considered in the application, each utility was characterized by a specific standard error and the correlation among two natural blocks of utilities was modelled using a generalized autoregressive error structure. Based on the observed improvements in the log-likelihood values, taking into account these inter-relationships between the model errors was important. The technique was applied to the 9-mode transportation choice model considered in Gaudry et al. (1989).

Acknowledgements

The author would like to thank Professor Marc Gaudry for kindly providing the Santiago data bank used for this study.

Appendix A. The first-order derivatives

A.1. The simulated likelihood function

The estimation method considered here maximises the natural logarithm of the simulated likelihood function. Noting that $\theta$ denotes the joint vector of parameters to estimate (i.e. it contains $\beta$ and $s$), the simulated log-likelihood function is written as:

$$L = \sum_{n=1}^{N} \ln\left[\frac{1}{R}\sum_{r=1}^{R} f_{nr}(i)\right], \tag{A1}$$

where $f_{nr}(i) = \prod_{l=1}^{m_n} \Phi_{lnr}$ is the conditional probability of choosing alternative $i$ given a particular draw $r$ of the vector $w_n$ in Eq. (8) of the text. Note that we use directly the conventions established in Eqs. (11)-(15). We now provide the details regarding the computation of the first-order derivatives.

A.2. First-order derivatives

Recall that the joint vector of parameters is denoted as $\theta$. Using Eq. (A1), we get:

$\partial L/\partial\theta$ [...]

[...] computation of $\partial a_{ln,r}/\partial\theta$, the following recursion:

[...]

needs to be taken into account. Recall that $u_{hn,r}$ denotes a particular draw from the random uniform distribution defined over the unit interval. We now give the explicit relationships for $\beta$ and $s$.

A.3. Derivatives with respect to $\beta$

$\partial L/\partial\beta$ [...]

and where, using Eq. (15) and the recursion in Eq. (A4), one has:

$\partial a_{ln,r}/\partial\beta$ [...]

A.4. Derivatives with respect to $s$

As previously mentioned, this derivative is computed using a chain rule linking $s_n$ to $s$. Below we provide $\partial L/\partial s_n$. Then, $\partial L/\partial s$ is computed as $(\partial s_n'/\partial s)(\partial L/\partial s_n)$, where the Jacobian matrix $\partial s_n'/\partial s$ is detailed in Appendix B. Now recall that the $s_n$ vector is formed by concatenating the elements $s_{ij,n}$, $i \geq j$, in a column vector. Then the elements of the $\partial L/\partial s_n$ vector are:

[...]

where, using Eq. (15), one can note that:

[...]
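Several displayed equations of this appendix are missing from the text above. As a hedged sketch only, and not necessarily in the author's exact form or numbering, the derivatives can be re-derived from the GHK recursion under the assumption that the truncation points and draws take the standard form $a_l = -(X_l\beta + \sum_{h<l} s_{lh}w_h)/s_{ll}$ and $w_h = \Phi^{-1}(u_h\Phi(a_h))$. The observation and draw subscripts $n$ and $r$ are dropped for readability, and $\phi$ denotes the standard normal density.

```latex
\begin{aligned}
\frac{\partial L}{\partial\theta}
  &= \sum_{n=1}^{N} \frac{\sum_{r=1}^{R} \partial f_{nr}(i)/\partial\theta}{\sum_{r=1}^{R} f_{nr}(i)},
\qquad
\frac{\partial f}{\partial\theta}
   = f \sum_{l=1}^{m_n} \frac{\phi(a_l)}{\Phi(a_l)}\,\frac{\partial a_l}{\partial\theta}, \\[4pt]
\frac{\partial w_h}{\partial\theta}
  &= \frac{u_h\,\phi(a_h)}{\phi(w_h)}\,\frac{\partial a_h}{\partial\theta}, \\[4pt]
\frac{\partial a_l}{\partial\beta}
  &= -\frac{1}{s_{ll}}\Big( X_l' + \sum_{h<l} s_{lh}\,\frac{\partial w_h}{\partial\beta} \Big), \\[4pt]
\frac{\partial a_l}{\partial s_{kj}}
  &= -\frac{1}{s_{ll}}\Big( \mathbf{1}[k=l,\ j<l]\; w_j \;+\; \mathbf{1}[k=l,\ j=l]\; a_l
        \;+\; \sum_{h<l} s_{lh}\,\frac{\partial w_h}{\partial s_{kj}} \Big), \qquad k \geq j .
\end{aligned}
```

The second line is the recursion that propagates the dependence of each truncated draw on the parameters: because $w_h$ depends on $a_h$, which in turn depends on the earlier draws, the derivatives have to be accumulated in the same order as the draws themselves.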

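Appendix B. The Jacobian matrix

In this appendix, we detail the relationship between $s_n$ and $s$. From Eq. (9), one can write:

$$\Sigma_{in} = S_nS_n' = M_{in}SS'M_{in}'. \tag{B1}$$

Let $L'$ be a matrix such that $\mathrm{vec}\,S = L's$ and let $K'$ be the matrix that maps $s$ to $\mathrm{vec}\,S'$. Then Eq. (B1) may be written as:

$$\mathrm{vec}\,\Sigma_{in} = (M_{in}S \otimes M_{in})\,\mathrm{vec}\,S = (M_{in}S \otimes M_{in})L's = (M_{in} \otimes M_{in}S)\,\mathrm{vec}\,S' = (M_{in} \otimes M_{in}S)K's.$$

Then this implies that

$$\frac{\partial\,\mathrm{vec}\,\Sigma_{in}}{\partial s'} = (M_{in}S \otimes M_{in})L' + (M_{in} \otimes M_{in}S)K' \equiv B_{in}'. \tag{B2}$$

Similarly, call $L_n'$ and $K_n'$ the matrices mapping $s_n$ to $\mathrm{vec}\,S_n$ and $\mathrm{vec}\,S_n'$, respectively. Then:

$$\mathrm{vec}\,\Sigma_{in} = \mathrm{vec}(S_nS_n') = (S_n \otimes I_{m_n})L_n's_n = (I_{m_n} \otimes S_n)K_n's_n,$$

and therefore:

$$\frac{\partial\,\mathrm{vec}\,\Sigma_{in}}{\partial s_n'} = (S_n \otimes I_{m_n})L_n' + (I_{m_n} \otimes S_n)K_n' \equiv A_n'. \tag{B3}$$

Taking a pseudo-inverse of the matrix in Eq. (B3), then:

$$\frac{\partial s_n'}{\partial\,\mathrm{vec}\,\Sigma_{in}} = A_n^{+},$$

where $A_n^{+}$ denotes the pseudo-inverse of $A_n$, i.e. the transpose of the pseudo-inverse of the matrix $A_n'$ in Eq. (B3). This finally implies that:

$$\frac{\partial s_n'}{\partial s} = \frac{\partial\,(\mathrm{vec}\,\Sigma_{in})'}{\partial s}\,\frac{\partial s_n'}{\partial\,\mathrm{vec}\,\Sigma_{in}} = B_{in}A_n^{+}, \tag{B4}$$

where $B_{in}$ and $A_n'$ are detailed in Eqs. (B2) and (B3), respectively.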
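The two Kronecker product forms of $\mathrm{vec}\,\Sigma_{in}$ used to obtain Eq. (B2) follow from the identity $\mathrm{vec}(ABC) = (C' \otimes A)\,\mathrm{vec}\,B$, and they can be checked numerically. The sketch below is illustrative: it assumes the column-stacking convention for vec, uses made-up matrices, and the helper names are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def vec(A):
    """Stack the columns of A into one list."""
    return [A[r][c] for c in range(len(A[0])) for r in range(len(A))]


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


# Illustrative mapping matrix (m_n = 2, m = 3) and common Cholesky matrix.
M_in = [[1.0, -1.0, 0.0], [0.0, -1.0, 0.0]]
S = [[1.0, 0.0, 0.0], [0.5, 1.2, 0.0], [-0.2, 0.3, 0.8]]

MS = matmul(M_in, S)
Sigma_in = matmul(MS, transpose(MS))                   # M_in S S' M_in'
lhs = vec(Sigma_in)
via_vec_S = matvec(kron(MS, M_in), vec(S))             # (M_in S (x) M_in) vec S
via_vec_St = matvec(kron(M_in, MS), vec(transpose(S))) # (M_in (x) M_in S) vec S'
```

Both products reproduce `lhs`. Differentiating each form with respect to $s$ while holding the other factor fixed, and adding the two results, is the product rule that gives the two terms of Eq. (B2).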

References

Bolduc, D., 1992. Generalized autoregressive errors in the multinomial probit model. Transportation Research Part B: Methodological 26B (2), 155-170.

Börsch-Supan, A., Hajivassiliou, V., 1993. Smooth unbiased multivariate probability simulators for maximum likelihood estimation of limited dependent variable models. Journal of Econometrics 58, 347-368.

Gaudry, M.J.I., Jara-Diaz, S.R., Ortuzar, J., 1989. Value of time sensitivity to model specification. Transportation Research Part B: Methodological 23B (2), 151-158.

Geweke, J., Keane, M., Runkle, D., 1992. Alternative computational approaches to inference in the multinomial probit model. Research Department, Federal Reserve Bank of Minneapolis.

Hajivassiliou, V.A., 1993. Simulation estimation methods for limited dependent variable models. In: Maddala, G.S., Rao, C.R., Vinod, H.D. (Eds.), Handbook of Statistics, Vol. 11. North Holland, Amsterdam, pp. 519-543.

Hajivassiliou, V.A., McFadden, D., 1990. The method of simulated scores for the estimation of LDV models with an application to external debt crises. Working paper, Cowles Foundation for Research in Economics, Yale University, Connecticut.