A numerically stable quadrature procedure for the one-factor random-component discrete choice model

Lung-fei Lee*

Department of Economics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong

Received 1 January 1997; received in revised form 1 December 1998; accepted 1 April 1999

* Tel.: 852-2358-7600; fax: 852-2358-2084. E-mail address: [email protected] (L.-f. Lee).

Abstract

The Gaussian quadrature formula was popularized by Butler and Moffitt (1982, Econometrica 50, 761-764) for the estimation of the error component probit panel model. Borjas and Sueyoshi (1994, Journal of Econometrics 64, 165-182) pointed out some numerical and statistical difficulties in applying it to models with group effects. With a moderate or large number of individuals in a group, the likelihood function of the model evaluated by the Gaussian quadrature formula can be numerically unstable and, at worst, impossible to evaluate. Statistical inference may also be inaccurate. We point out that some of these difficulties can be overcome with a carefully designed algorithm and the proper selection of the number of quadrature points. However, with a very large number of individuals in a group, the Gaussian quadrature formulation of the integral may have large numerical approximation errors. © 2000 Elsevier Science S.A. All rights reserved.

Keywords: Discrete choice; Random component; Quadrature

1. Introduction

Consider the following random-component, binary choice model:

$$ y^{*}_{ij} = x_{ij}\beta + \epsilon_{ij}, \qquad (1) $$

for groups $j = 1, \ldots, J$ and individuals $i = 1, \ldots, N_j$ in the $j$th group, where $x$ is a vector of exogenous variables. The disturbance in (1) is generated by an error component structure:

$$ \epsilon_{ij} = u_j + v_{ij}, \qquad (2) $$

where $u_j$ and $v_{ij}$ are mutually independent, $u_j$ are i.i.d. for all $j$, and $v_{ij}$ are i.i.d. for all $i$ and $j$. The sign of the latent $y^{*}$ determines the observed dichotomous dependent variable $I$ as $I = 1$ if $y^{*} > 0$; it is 0 otherwise. Let $F$ be the conditional distribution function of $\epsilon$ conditional on $u$, and $f$ be the density of $u$. The log-likelihood function for the model is

$$ L = \sum_{j=1}^{J} \ln \int_{-\infty}^{\infty} \prod_{i=1}^{N_j} \left[1 - F(-x_{ij}\beta \mid u)\right]^{I_{ij}} \left[F(-x_{ij}\beta \mid u)\right]^{1-I_{ij}} f(u)\, du. $$

For a probit model, both $u$ and $v$ are assumed to be normally distributed. A typical normalization for the probit model specifies a zero mean and a unit variance for the total disturbance $\epsilon$. Let $\rho$ be the correlation between individuals within a group. For this normalized random-component model, $\rho$ is also the variance of $u$. Therefore, $u_j$ is $N(0, \rho)$ and $v_{ij}$ is $N(0, 1-\rho)$. The log-likelihood function for this probit error component model can be compactly written by using the symmetry of a normal density as

$$ L = \sum_{j=1}^{J} \ln \int_{-\infty}^{\infty} \frac{e^{-z^2}}{\sqrt{\pi}} \prod_{i=1}^{N_j} \Phi\!\left[ \frac{D_{ij}\left(x_{ij}\beta + \sqrt{2\rho}\, z\right)}{\sqrt{1-\rho}} \right] dz, \qquad (3) $$

where $\Phi$ is the standard normal distribution function, and $D$ is a sign indicator such that $D_{ij} = 1$ if $I_{ij} = 1$ and $D_{ij} = -1$ otherwise.

The joint probability of individual responses within a group in (3) involves a single integral whose integrand is a product of univariate probability functions. Butler and Moffitt (1982) pointed out that these joint probabilities could be effectively evaluated by the Gauss-Hermite quadrature. The Gaussian quadrature evaluates the integral

$$ \int_{-\infty}^{\infty} e^{-z^2} g(z)\, dz \approx \sum_{m=1}^{M} w_m g(z_m), \qquad (4) $$

where $M$ is the designated number of points, and the $z_m$ and $w_m$ are the $M$-point Gauss-Hermite abscissas and weights. If $g$ in (4) is a polynomial of degree $2M-1$ or less, the Gauss-Hermite evaluation is exact. The theory is based on orthogonal polynomials (see, e.g., Press et al., 1992, Chapter 4). The abscissas and weights are available from Stroud and Secrest (1966) and Abramowitz and Stegun (1964). Press et al. (1992) provide computing code for generating the abscissas and weights for any specified number of points.
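This exactness property is easy to check numerically. The sketch below uses NumPy's Gauss-Hermite routine as a stand-in for the 'gauher' subroutine of Press et al. that is used later in the paper, and integrates $g(z) = z^6$ (degree $6 \leq 2M - 1$ for $M = 4$) against $e^{-z^2}$:

```python
import numpy as np

# M-point Gauss-Hermite rule for integrals of e^{-z^2} g(z) over the real line;
# it is exact whenever g is a polynomial of degree 2M - 1 or less.
M = 4
z, w = np.polynomial.hermite.hermgauss(M)  # abscissas z_m and weights w_m

quad = np.sum(w * z**6)                    # formula (4) with g(z) = z^6
exact = (15.0 / 8.0) * np.sqrt(np.pi)      # closed form of the integral of z^6 e^{-z^2}
print(quad, exact)                         # agree to machine precision
```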

Given the Gauss-Hermite points $z_m$ and weights $w_m$, the likelihood in (3) can be evaluated by the Gauss-Hermite formula as

$$ L = \sum_{j=1}^{J} \ln \left\{ \sum_{m=1}^{M} \frac{w_m}{\sqrt{\pi}} \prod_{i=1}^{N_j} \Phi\!\left[ \frac{D_{ij}\left(x_{ij}\beta + \sqrt{2\rho}\, z_m\right)}{\sqrt{1-\rho}} \right] \right\}. \qquad (5) $$
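For concreteness, here is a minimal Python sketch of the conventional evaluation (5). The data layout is a hypothetical convention of ours, not from the paper: `X_groups` holds one $(N_j \times K)$ regressor matrix per group and `D_groups` the sign indicators $D_{ij} = 2I_{ij} - 1$. The product over individuals is the step that can underflow.

```python
import numpy as np
from scipy.stats import norm

def loglik_conventional(beta, rho, X_groups, D_groups, M=8):
    """Log likelihood (5): quadrature sum of the product of individual
    probit probabilities for each group. Underflows when N_j is large."""
    z, w = np.polynomial.hermite.hermgauss(M)
    L = 0.0
    for X, D in zip(X_groups, D_groups):               # X: (N_j, K), D: (N_j,) of +/-1
        idx = (X @ beta)[:, None] + np.sqrt(2.0 * rho) * z[None, :]
        a = norm.cdf(D[:, None] * idx / np.sqrt(1.0 - rho))  # Phi[...] per (i, m)
        L += np.log(np.dot(w / np.sqrt(np.pi), np.prod(a, axis=0)))
    return L
```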

Butler and Moffitt (1982) illustrated the usefulness of the Gaussian quadrature with a panel data set consisting of 1550 cross-sectional units with a maximum of 11 periods each. Based on the stability of estimates, they suggested that two- or four-point quadratures would be sufficient. The Gaussian quadrature approach is computationally efficient relative to other quadrature techniques such as trapezoidal integration (Heckman and Willis, 1975). Subsequently, the Gaussian quadrature approach has been used often in the empirical econometrics literature.

In a recent publication, Borjas and Sueyoshi (1994) pointed out some numerical and statistical difficulties that occurred when they tried to apply the technique to study probit models with structural group effects where the number of individuals in a group was large. In their model, individuals belonging to a given group share a common component in the specification of a conditional mean, and there are many groups. The group effect specification is an error component probit model as in (1) and (2). They argued that a likelihood formulation with the Gaussian quadrature could be numerically unstable and, at worst, impossible to evaluate with computers if the number of individuals in some groups, i.e., $N_j$, was large. This is so because the integrand of the numerical integration in (3) involves the product of cumulative probabilities for all members in a group. With a hypothetical sample of 500 observations per group and assuming a likelihood contribution of 0.5 for each member of a group, the value of the integrand can be as small as $e^{500\ln(0.5)} \approx e^{-346.6}$, which is below the smallest positive number representable in single precision arithmetic; slightly smaller contributions or larger groups underflow double precision as well. The numerical problem occurs when one tries to evaluate a product consisting of many small terms. Based on Monte Carlo results, Borjas and Sueyoshi (1994) also pointed out that statistical inference based on the Gaussian quadrature likelihood function can be quite inaccurate. Nominal levels of significance can be much smaller than actual levels of significance in hypothesis testing.
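The underflow is easy to reproduce. A rough Python illustration, echoing the paper's thought experiment (the per-member contributions 0.5 and 0.2 are hypothetical):

```python
import numpy as np

p = np.full(500, 0.5)
print(np.prod(p))                      # ~3.1e-151: tiny, but representable in double
print(np.prod(p.astype(np.float32)))  # 0.0: underflows in IEEE single precision
print(np.prod(np.full(500, 0.2)))     # 0.0: ~1e-350 underflows even in double precision
```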


¹ This raises a challenging question of whether there are other, superior integration methods (numerical or stochastic) that can do better than the Gaussian quadrature formulation. For those possible formulations, our recommended algorithm will also be useful.

However, the Gaussian quadrature formulation of the integral may have large approximation errors when the number of individuals in a group is large. The consequence of the latter may be larger standard errors for maximum likelihood estimates (MLE) when the number of individuals in a group increases.¹

2. Gaussian quadrature and a numerically stable algorithm

We suggest an algorithm that can overcome the numerical problem. The numerical instability can be resolved if the summation and product operators behind the logarithmic transformation in (5) can somehow be interchanged. The possibility of interchanging summation and product was first discussed in Lee (1996) for some related problems in simulation estimation. It can be summarized in the following proposition.

Proposition. For any constants $a_{tr}$, $t = 1, \ldots, T$ and $r = 1, \ldots, R$, the following identity holds:

$$ \sum_{r=1}^{R} \prod_{t=1}^{T} a_{tr} = \prod_{t=1}^{T} \left( \sum_{r=1}^{R} a_{tr}\, \omega_{t-1,r} \right), $$

where $\omega_{tr}$ are weights for $t \geq 1$, which can be computed recursively as

$$ \omega_{tr} = \frac{a_{tr}\, \omega_{t-1,r}}{\sum_{s=1}^{R} a_{ts}\, \omega_{t-1,s}}, \qquad \text{starting with } \omega_{0r} = 1 \text{ for all } r. $$

The result follows by induction. ∎
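The identity can also be verified numerically. The following sketch checks the proposition as stated above (with initial weights $\omega_{0r} = 1$) on random positive constants:

```python
import numpy as np

rng = np.random.default_rng(0)
T, R = 6, 4
a = rng.uniform(0.1, 1.0, size=(T, R))   # arbitrary positive constants a_tr

lhs = np.sum(np.prod(a, axis=0))         # sum over r of prod over t of a_tr

omega = np.ones(R)                       # omega_{0r} = 1
rhs = 1.0
for t in range(T):
    s = a[t] * omega                     # a_tr * omega_{t-1,r}
    rhs *= s.sum()                       # factor sum_r a_tr omega_{t-1,r}
    omega = s / s.sum()                  # recursion for omega_{tr}

print(lhs, rhs)                          # equal up to rounding error
```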


The summation over the quadrature points and the product of individual probabilities in $L$ of (5) can be interchanged with a weight adjustment. The logarithmic transformation is then applied to the product of adjusted terms. In consequence, the log-likelihood function $L$ in (5) can be evaluated by the following iterative algorithm.

Algorithm. The log likelihood $L$ can be evaluated as

$$ L = \sum_{j=1}^{J} \sum_{i=1}^{N_j} \ln \left\{ \sum_{m=1}^{M} \Phi\!\left[ \frac{D_{ij}\left(x_{ij}\beta + \sqrt{2\rho}\, z_m\right)}{\sqrt{1-\rho}} \right] \omega_{i-1,jm} \right\}, \qquad (6) $$

where the weights $\omega_{ijm}$ can be computed recursively by

$$ \omega_{ijm} = \frac{ \Phi\!\left[ \dfrac{D_{ij}\left(x_{ij}\beta + \sqrt{2\rho}\, z_m\right)}{\sqrt{1-\rho}} \right] \omega_{i-1,jm} }{ \sum_{s=1}^{M} \Phi\!\left[ \dfrac{D_{ij}\left(x_{ij}\beta + \sqrt{2\rho}\, z_s\right)}{\sqrt{1-\rho}} \right] \omega_{i-1,js} }, \qquad i = 1, \ldots, N_j - 1;\; j = 1, \ldots, J, \qquad (7) $$

starting with $\omega_{0jm} = w_m/\sqrt{\pi}$ for $m = 1, \ldots, M$ and for all $j$.

In a group effect model, individuals in a group correspond to periods ('time'), and the number of groups is the number of cross-sectional units in a panel data model with cross-sectional time series. The recursive evaluation of the weights $\omega_{ijm}$ in (7) is over individuals $i$ within a group, where the ordering can be arbitrary. The $\omega_{ijm}$ for $m = 1, \ldots, M$ are weights; they are positive numbers and $\sum_{m=1}^{M} \omega_{ijm} = 1$. The product over $i$ in $L$ of (5) has been effectively taken out by the logarithmic transformation in (6). This formulation avoids the evaluation of the product of probabilities and can be numerically stable. Except for the weighting adjustment, the expression of $L$ in (6) resembles the log-likelihood function of a pooled probit model. The weighting has effectively corrected for the correlation of individuals within a group. The evaluation of the log-likelihood function in (6) may be slightly more complicated than the evaluation of the conventional one in (5), as the former involves updating the weights in (7) in a recursive fashion. However, the formulation of the log likelihood in (6) is a by-product of the weighting scheme, as the braced term in (6) is exactly the denominator in (7). In this regard, the updating of the weighting scheme in (7) does not impose much additional computational burden. In a subsequent Monte Carlo experiment, numerical evidence will be provided to demonstrate the effectiveness of this iterative formulation and compare it with the conventional algorithm where possible.


² The computational cost differences will be entirely due to these additional calculations when a common optimization subroutine is used.

Alternatively, the log likelihood $L$ in (5) can be rewritten as $L = \sum_{j=1}^{J} \ln\left[\sum_{m=1}^{M} \exp(h_{jm})\right]$, where

$$ h_{jm} = \ln\!\left(w_m/\sqrt{\pi}\right) + \sum_{i=1}^{N_j} \ln \Phi\!\left[ \frac{D_{ij}\left(x_{ij}\beta + \sqrt{2\rho}\, z_m\right)}{\sqrt{1-\rho}} \right]. $$

Denote $p_j = \max\{h_{jm}: m = 1, \ldots, M\}$. Then $L$ can be evaluated as

$$ L = \sum_{j=1}^{J} \left[ p_j + \ln\left( \sum_{m=1}^{M} \exp(h_{jm} - p_j) \right) \right]. \qquad (8) $$

This modification might be valuable if the $h_{jm}$, for all $m = 1, \ldots, M$, are not much less than $p_j$ for each $j$. This formulation may be more expensive than both the conventional and the recommended algorithms, as one has to sort out the maximum among the $h_{jm}$, $m = 1, \ldots, M$, and compute the differences $h_{jm} - p_j$ for each $j$.² In any case, we will compare this approach with our recommended approach in the subsequent Monte Carlo experiments.
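A sketch of (8) in Python, under the same hypothetical data layout as before; scipy's logsumexp performs the max-rescaling by $p_j$ internally, so the per-group sums are computed entirely on the log scale:

```python
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

def loglik_rescaled(beta, rho, X_groups, D_groups, M=8):
    """Log likelihood via (8): h_jm on the log scale, then a max-rescaled
    log-sum-exp over the quadrature nodes for each group."""
    z, w = np.polynomial.hermite.hermgauss(M)
    L = 0.0
    for X, D in zip(X_groups, D_groups):
        idx = (X @ beta)[:, None] + np.sqrt(2.0 * rho) * z[None, :]
        logphi = norm.logcdf(D[:, None] * idx / np.sqrt(1.0 - rho))  # ln Phi per (i, m)
        h = np.log(w / np.sqrt(np.pi)) + logphi.sum(axis=0)          # h_jm
        L += logsumexp(h)                                            # p_j + ln sum_m exp(h_jm - p_j)
    return L
```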

3. Monte Carlo results

Monte Carlo experiments are designed to investigate the numerical stability of the proposed algorithm and compare it with others in terms of computing time. In addition, estimation results may reveal the relevance of the number of Gaussian quadrature points and the performance of the MLE and related test statistics.

In the main design of our experiments, sample data are generated from the model

$$ y^{*}_{ij} = \beta_1 + \beta_2 x_{ij} + u_j + v_{ij}, \qquad (9) $$

where $u$ is $N(0, \rho)$ and $v$ is $N(0, 1-\rho)$. $x_{ij}$ is generated from an $N(0, 1)$ random generator with a 0.5 correlation coefficient for individuals in a group $j$. The true parameters are set to $\beta_1 = 0$, $\beta_2 = 1$, and $\rho = 0.3$. The underlying $R^2$ for $y^{*}$ is therefore 0.5. Since $\rho$ is the variance of $u$, its value is restricted between 0 and 1 in the estimation. We have experimented with samples with various numbers of groups ($J$), various numbers of individuals ($N$) in a group, and various numbers of Gaussian quadrature points ($M$). There are either 50 or 100 groups in a sample. The sample in the main design is balanced because the number of individuals in a group is the same for all groups, i.e., $N_j = N$ for all $j = 1, \ldots, J$.


We consider cases with both small and large $N$. For each case, the number of replications is 400. We report summary statistics on the empirical mean (Mean), the empirical standard deviation (Em.SD), the average maximized log-likelihood value (lnlk), and the average CPU time in seconds per replication. In addition to the main design, we have also experimented with designs with a larger proportion of the variance of the overall $\epsilon$ due to $u$, a larger number of regressors in the model, and panels with unbalanced observations. The MLE will also be compared with the fixed effect probit estimates (FEPE). The optimization subroutine is the DFP algorithm from the GQOPT package. All computations are performed on a cluster of SUN SPARCstation 20 workstations. The Gauss-Hermite points and weights are generated by the subroutine 'gauher' in Numerical Recipes by Press et al. (1992).
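As a sketch of one replication of this design: the equicorrelation construction for $x$ below, through a shared group-level draw, is our assumption about how a 0.5 within-group correlation would be generated; the paper does not spell out the mechanism.

```python
import numpy as np

def simulate_design(J=50, N=10, b1=0.0, b2=1.0, rho=0.3, seed=0):
    """One replication of design (9): x_ij is N(0, 1) with within-group
    correlation 0.5; u_j ~ N(0, rho); v_ij ~ N(0, 1 - rho)."""
    rng = np.random.default_rng(seed)
    c = rng.standard_normal((J, 1))                       # shared group draw for x
    x = np.sqrt(0.5) * c + np.sqrt(0.5) * rng.standard_normal((J, N))
    u = rng.normal(0.0, np.sqrt(rho), size=(J, 1))        # group effect u_j
    v = rng.normal(0.0, np.sqrt(1.0 - rho), size=(J, N))  # idiosyncratic v_ij
    ystar = b1 + b2 * x + u + v                           # latent index of (9)
    return x, (ystar > 0).astype(int)                     # regressor and response I_ij
```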

M"2}20 are tried. The numerical algorithm is stable as all replications con-verge. For cases withN"10, this is expected because there was no report on instability of the conventional algorithm in Butler and Mo$tt (1982) with small

&time' dimension in discrete panel data. All the estimates of b1 and b2 are unbiased. This is true for various numbers of Gaussian quadrature points and groups. There is some moderate amount of downward bias in the estimate of

o when only a two-point quadrature is used. The biases become small when four- or eight-point quadratures are used. The lnlk improves asM increases from two to four. The improvement in lnlk from M"4 to 8 is small. It is interesting to note that, withMgreater than 8, no improvements are observed as the likelihood function becomes stable. The CPU times are approximately linear inMandJ. For the cases withN"10, the four- or eight-point quadratures are su$cient. These results con"rm the suggestion in Butler and Mo$tt (1982) for the use of a four-point quadrature. Their suggestion was derived from a sample with large cross-sectional units but similar&time'dimensions. It is, however, by no means a universal rule. To compare time costs of the recommended algo-rithm with the conventional formulation by Butler and Mo$tt (1982) and the one in (8), we re-estimate a case (N"10, J"50 and M"4) with the two alternative formulations. All these three algorithms provide identical estimates but there are some di!erences in time cost. The conventional algorithm took 1.103 CPU seconds on average per replication to converge and the algorithm in (8) took 1.333 CPU seconds. Our iterative algorithm's time cost is 1.152 CPU seconds on average. Thus, our iterative algorithm is slightly more time consum-ing than the conventional one but is less so than that in (8). Subsequently, we have done more experiments (reported in Table 3). For a case with larger

N"50, J"50 and M"16, the conventional algorithm took 13.486 CPU seconds; our iterative algorithm took 13.623 CPU seconds; and the algorithm (8) took 19.483 CPU seconds.


Table 1
Error component group effect model: main design
True parameters: β₁ = 0, β₂ = 1, and ρ = 0.3; balanced panels
(Em.SD in parentheses; lnlk is the average maximized log likelihood; CPU in seconds per replication)

N = 10; J = 50
  M = 2:   β₁ −0.0003 (0.1125)   β₂ 1.0174 (0.1037)   ρ 0.2420 (0.0604)   lnlk −234.937   CPU 0.63
  M = 4:   β₁ −0.0046 (0.1067)   β₂ 1.0165 (0.1012)   ρ 0.2840 (0.0745)   lnlk −233.937   CPU 1.15
  M = 8:   β₁ −0.0052 (0.1007)   β₂ 1.0143 (0.1006)   ρ 0.2907 (0.0776)   lnlk −233.863   CPU 2.06
  M = 16:  β₁ −0.0052 (0.0999)   β₂ 1.0144 (0.1006)   ρ 0.2905 (0.0773)   lnlk −233.865   CPU 3.95
  M = 20:  β₁ −0.0051 (0.0999)   β₂ 1.0144 (0.1006)   ρ 0.2905 (0.0772)   lnlk −233.865   CPU 5.86
  M = 30:  β₁ −0.0051 (0.0999)   β₂ 1.0144 (0.1006)   ρ 0.2905 (0.0772)   lnlk −233.866   CPU 7.26

N = 10; J = 100
  M = 2:   β₁ −0.0027 (0.0768)   β₂ 1.0110 (0.0692)   ρ 0.2418 (0.0412)   lnlk −471.00    CPU 1.23
  M = 4:   β₁ −0.0027 (0.0711)   β₂ 1.0132 (0.0675)   ρ 0.2873 (0.0514)   lnlk −468.56    CPU 2.30
  M = 8:   β₁ −0.0028 (0.0692)   β₂ 1.0101 (0.0670)   ρ 0.2960 (0.0553)   lnlk −468.38    CPU 4.68
  M = 16:  β₁ −0.0026 (0.0692)   β₂ 1.0100 (0.0671)   ρ 0.2961 (0.0553)   lnlk −468.38    CPU 7.85

N = 100; J = 50
  M = 2:   β₁ −0.0043 (0.1208)   β₂ 1.0043 (0.0387)   ρ 0.2069 (0.0342)   lnlk −2256.81   CPU 9.03
  M = 4:   β₁ −0.0087 (0.1497)   β₂ 1.0560 (0.0392)   ρ 0.1970 (0.0370)   lnlk −2192.21   CPU 18.00
  M = 8:   β₁ −0.0087 (0.1389)   β₂ 1.0274 (0.0446)   ρ 0.2519 (0.0493)   lnlk −2180.86   CPU 28.84
  M = 16:  β₁ −0.0149 (0.1209)   β₂ 0.9967 (0.0494)   ρ 0.3009 (0.0590)   lnlk −2177.24   CPU 55.53
  M = 20:  β₁ −0.0077 (0.1050)   β₂ 0.9901 (0.0495)   ρ 0.3111 (0.0608)   lnlk −2176.87   CPU 69.09

N = 100; J = 100
  M = 2:   β₁ −0.0034 (0.0847)   β₂ 1.0052 (0.0270)   ρ 0.2038 (0.0213)   lnlk −4516.42   CPU 14.34
  M = 4:   β₁ −0.0057 (0.1144)   β₂ 1.0611 (0.0273)   ρ 0.1901 (0.0251)   lnlk −4382.59   CPU 30.26
  M = 8:   β₁ −0.0101 (0.1115)   β₂ 1.0402 (0.0297)   ρ 0.2354 (0.0307)   lnlk −4357.31   CPU 66.23
  M = 16:  β₁ −0.0031 (0.0960)   β₂ 1.0111 (0.0318)   ρ 0.2832 (0.0366)   lnlk −4348.72   CPU 112.42
  M = 20:  β₁ −0.0021 (0.0833)   β₂ 1.0043 (0.0320)   ρ 0.2938 (0.0375)   lnlk −4347.81   CPU 139.20


Table 2
Likelihood ratio test: levels of significance (in percent)
H₀: β₂ = 1

N     J    M    10%     5%      2%      1%
10    50   2    15.75   9.00    4.75    2.25
10    50   4    14.25   7.25    3.50    2.50
10    50   8    13.00   6.00    2.00    1.25
10    50   16   13.00   5.50    2.00    1.25
100   50   4    63.00   52.75   40.25   34.50
100   50   8    34.25   26.25   18.75   14.25
100   50   16   16.75   12.00   6.75    3.75
100   50   20   13.50   7.50    4.00    2.75

Borjas and Sueyoshi (1994) could not use the conventional Gaussian quadrature for the case with $N = 100$, as underflow problems occurred when $M$ greater than 4 was used (Borjas and Sueyoshi, 1994, Appendix A.2.2). On the contrary, our algorithm is stable for all replications with $M = 2$ to 20. The estimates of the $\beta$s with various $M$ and $J$ are all unbiased. There are downward biases in the estimate of $\rho$. The biases decrease as $M$ increases from 2 to 16 or 20. Except for $\beta_1$, the Em.SDs of the estimates of $\beta_2$ and $\rho$ tend to increase with $M$. For a larger number of groups $J$, the proper $M$ tends to be slightly larger so as to achieve a small bias. The lnlk values show better goodness of fit when $M = 16$ or 20 is used. It is evident from these Monte Carlo results that $M = 4$ is too small for $N = 100$, as the biases in $\rho$ are substantial. Borjas and Sueyoshi (1994) reported statistical inaccuracy in the level of significance in hypothesis testing based on random effect probit estimates with the conventional Gaussian quadrature algorithm. With a conventional 5 percent nominal level of significance, the actual level of significance can be more than 40 percent. This problem occurred because a four-point quadrature was used for their reported Monte Carlo results (Borjas and Sueyoshi, 1994, Tables 1 and 2). Table 2 reports results on the likelihood ratio test for the null hypothesis $\beta_2 = 1$ for various $M$ in our study. The inaccurate results in Borjas and Sueyoshi are reconfirmed when $M = 4$ is used for $N = 100$. But when $M$ increases, the degree of inaccuracy in the level of significance decreases. While there are still some discrepancies when $M = 20$ is used, the differences are reasonably small. The discrepancies are in general smaller for the case with $N = 10$ than for the case with $N = 100$. These results indicate the importance of the proper selection of $M$.


Table 3
Error component group effect model: additional designs
(Em.SD in parentheses; lnlk and CPU as in Table 1)

Larger ρ: true β₁ = 0, β₂ = 1, ρ = 0.6; N = 100; J = 50
  M = 20:  β₁ −0.0011 (0.1261)   β₂ 1.1011 (0.0553)   ρ 0.5088 (0.0433)   lnlk −1698.98    CPU 53.62
  M = 30:  β₁ 0.0049 (0.1237)    β₂ 1.0709 (0.0576)   ρ 0.5381 (0.0444)   lnlk −1696.54    CPU 81.63

More regressors: true β₁ = 0, β₂ = 1, β₃ = 0.5, β₄ = 0, β₅ = −0.5, β₆ = −1, ρ = 0.3
  N = 100; J = 50; M = 16 (lnlk −1985.04; CPU 161.72):
    β₁ −0.0486 (0.2188)   β₂ 0.9786 (0.0590)   β₃ 0.4853 (0.0737)   β₄ 0.0002 (0.0152)
    β₅ −0.4870 (0.0502)   β₆ −0.7052 (0.4808)  ρ 0.3288 (0.0754)
  N = 100; J = 100; M = 20 (lnlk −3966.58; CPU 361.02):
    β₁ −0.0246 (0.1696)   β₂ 0.9982 (0.0350)   β₃ 0.4989 (0.0500)   β₄ −0.0002 (0.0112)
    β₅ −0.4964 (0.0334)   β₆ −0.9167 (0.3108)  ρ 0.3034 (0.0433)

Unbalanced panels: true β₁ = 0, β₂ = 1, ρ = 0.3
  (N = 10, J = 25) and (N = 100, J = 25):
    M = 4:       β₁ −0.0068 (0.1648)   β₂ 1.0385 (0.0632)   ρ 0.2226 (0.0654)   lnlk −1215.31   CPU 8.43
    M = 8:       β₁ 0.0038 (0.1489)    β₂ 1.0115 (0.0622)   ρ 0.2753 (0.0703)   lnlk −1209.09   CPU 17.94
    M = 16:      β₁ 0.0054 (0.1158)    β₂ 0.9905 (0.0654)   ρ 0.3110 (0.0793)   lnlk −1207.22   CPU 34.79
    M = (4; 16): β₁ 0.0038 (0.1163)    β₂ 0.9945 (0.0620)   ρ 0.3053 (0.0734)   lnlk −1207.25   CPU 29.74
  (N = 10, J = 45) and (N = 100, J = 5):
    M = (4; 16): β₁ −0.0058 (0.1130)   β₂ 0.9977 (0.0828)   ρ 0.3030 (0.0827)   lnlk −428.65    CPU 6.32

Balanced, N = 50; J = 50: true β₁ = 0, β₂ = 1, ρ = 0.3
  M = 8:   β₁ −0.0036 (0.1178)   β₂ 1.0128 (0.0502)   ρ 0.2809 (0.0512)   lnlk −1107.56    CPU 13.62

N = 500; J = 50: true β₁ = 0, β₂ = 1, ρ = 0.3
  M = 4:   β₁ 0.0056 (0.1678)    β₂ 1.0723 (0.0349)   ρ 0.1647 (0.0413)   lnlk −10775.65   CPU 92.99
  M = 10:  β₁ 0.0066 (0.2123)    β₂ 1.0738 (0.0547)   ρ 0.1832 (0.0710)   lnlk −10667.62   CPU 201.39
  M = 20:  β₁ −0.0030 (0.1681)   β₂ 1.0112 (0.0716)   ρ 0.2754 (0.0950)   lnlk −10648.84   CPU 370.80
  M = 30:  β₁ 0.0103 (0.1340)    β₂ 0.9791 (0.0749)   ρ 0.3219 (0.0988)   lnlk −10639.28   CPU 607.10

N = 500; J = 100: true β₁ = 0, β₂ = 1, ρ = 0.3
  M = 36:  β₁ −0.0078 (0.1237)   β₂ 1.0115 (0.0577)   ρ 0.2798 (0.0769)   lnlk −21263.68   CPU 1373.98
  M = 42:  β₁ −0.0083 (0.1108)   β₂ 1.0018 (0.0601)   ρ 0.2937 (0.0800)   lnlk −21259.62   CPU 1572.32


In the design with the larger variance share $\rho = 0.6$ (the first part of Table 3), the estimates of $\rho$ are biased downward but have smaller variances than those with $\rho = 0.3$. The estimates of the $\beta$s have slightly larger upward biases. The recursive algorithm is numerically stable for models with large $\rho$ and for those with more regressors. For the latter, four more regressors, say $x_3$ to $x_6$, are introduced in addition to the constant term and the regressor $x$ in (9). $x_3$ is a uniform random variable. $x_4$ is an ordered discrete variable taking values from 1 to 5, with occurrence probabilities 0.1, 0.2, 0.2, 0.3, and 0.2, respectively. $x_5$ is a dichotomous indicator with equal probabilities for its two categories 0 and 1. These three additional regressors are i.i.d. for all $i$ and $j$. The fourth additional regressor, $x_6$, is a uniform random variable that is independent across groups but invariant across members within a group. The true coefficients of these additional regressors are set to $(\beta_3, \beta_4, \beta_5, \beta_6) = (0.5, 0.0, -0.5, -1.0)$. Except for the estimate of $\beta_6$, all the estimates have small biases. With $J = 50$, the estimate of $\beta_6$ has a 30% downward bias. When $J$ increases to 100, the downward bias is reduced to only 10%. For $x_6$, because it is a 'time'-invariant variable, the group dimension $J$ plays a crucial role.

All the preceding results are for balanced panels. It remains of interest to investigate the estimation of unbalanced panels. The preceding results in Table 1 indicate that, for $N = 10$, $M = 4$ is sufficient, but for $N = 100$, $M = 16$ will be appropriate. In an unbalanced sample, some panels might have small $N$ while others have large $N$. An issue for unbalanced panels is the selection of $M$. The third part of Table 3 reports Monte Carlo results on the estimation of models with unbalanced panels. The heading with $(N = 10, J = 25)$ and $(N = 100, J = 25)$ refers to unbalanced panels with a total of 50 groups; among them, half have $N = 10$ and the remaining half have $N = 100$. With these unbalanced panels, $M = 4$ is insufficient, as much downward bias appears in the estimate of $\rho$; $M = 16$ is needed. These results are expected, as the larger $M$ is needed for the long panels. With long panels in a sample, the presence of short panels will not ease the demand for the larger $M$, but it will not pose an additional burden either. However, the strategy of selecting a single, sufficiently large $M$ to accommodate panels of various lengths is conservative but expensive. A better strategy may be to select a varying $M$ for each group; a code sketch of this strategy follows below. In Table 3, $M = (4; 16)$ refers to the selection of $M = 4$ for groups with $N = 10$ but $M = 16$ for $N = 100$. The results indicate that the latter strategy is desirable. Its estimates are slightly more accurate, the time cost is less, and the lnlk value is similar to that of a constant $M = 16$. The unbalanced panels design with $(N = 10, J = 45)$ and $(N = 100, J = 5)$ provides additional evidence.

The remaining part of Table 3 reports estimates for samples with $N = 500$. These estimates provide more evidence of the need for a larger number of Gaussian points when the 'time' dimension $N$ becomes larger.


Table 4
Fixed effect model
True parameter: β₂ = 1
(J* is the average number of groups in which not all members share the same response; Iter is the average number of Newton-Raphson iterations; CPU in seconds per replication)

N           J           β₂ Mean   Em.SD    J*       Iter   CPU
10          50          1.1655    0.1395   45.37    5.84   0.05
10          100         1.1637    0.0997   90.97    5.98   0.12
50          50          1.0282    0.0409   49.99    5.51   0.27
100         50          1.0118    0.0256   50.00    5.25   0.62
100         100         1.0132    0.0190   100.00   5.44   1.29
500         50          1.0031    0.0127   50.00    5.06   2.58
500         100         1.0030    0.0087   100.00   5.15   5.38
(10; 100)   (45; 5)     1.0663    0.0799   46.02    5.67   0.11
(10; 100)   (25; 25)    1.0217    0.0374   47.80    5.44   0.32

³ We experimented with the algorithm in (8) for a case with N = 500, J = 50, and M = 20. That algorithm did not encounter any numerical underflow problem in that case. It provided similar coefficient and likelihood estimates, but its CPU time cost was 582.77 s per replication, much more than the 370.82 s for the recommended iterative algorithm.

⁴ It has been shown in Chamberlain (1980) that if N remains finite while J goes to infinity, the FEPE of β may not be consistent. The FEPE of β is consistent and distribution-free with respect to the distribution of u if N goes to infinity.

The case with $N = 500$ would be an evaluation impossibility with the conventional quadrature.³ The estimates of the $\beta$s are again unbiased for all $M$ from 4 to 30. With a small $M$, the magnitude of the bias in $\rho$ is larger than in Table 1 with small and moderate $N$s. With $M$ of 20 or 30 for $J = 50$, and $M$ of 36 or larger for $J = 100$, the biases in $\rho$ become reasonably small. However, their Em.SDs are larger than those for $N = 100$ in Table 1. The latter poorer statistical property must be due to the Gaussian quadrature approximation becoming poorer as $N$ becomes larger.

For comparison, some results on the estimates of the fixed effect probit panel model are provided in Table 4. The FEPE can be effectively derived from the Newton-Raphson algorithm as described in Hall (1978). Borjas and Sueyoshi (1994) compared the performance of the FEPE with the MLE of a random effect model with $N = 100$ and $M = 4$. Here we supplement their comparisons with a few of our Monte Carlo designs. The FEPE provides estimates of $\beta_2$ and the $u_j$s. In a group $j$, if all its members have the same discrete response, it is known that the FEPE of $u_j$ will be infinite. In Table 4, $J^{*}$ refers to the (average) number of groups in a sample where not all members of a group have the same response. The estimates are consistent only if $N$ goes to infinity (Chamberlain, 1980).⁴


The FEPE is computationally simple and inexpensive even with large $N$. The number of iterations (Iter) for convergence of the Newton-Raphson algorithm is almost invariant with respect to $N$ and $J$. The FEPEs of $\beta_2$ have larger biases and variances than those of the random effect estimates for models with small $N = 10$. But as $N$ increases to 50, the bias is reduced and is only slightly larger than that of the random effect estimate (in Table 3). For $N = 100$ or larger, the estimates of $\beta_2$ are unbiased. For $N = 50$ or larger, the Em.SDs of the FEPEs of $\beta_2$ can even be smaller than those of the random effect estimates. In conclusion, the FEPEs can be preferred to the Gaussian quadrature random effect MLEs when $N$ is large. But for small or moderate $N$, the FEPE would not be the better procedure. This comparison confirms once more the conclusion in Borjas and Sueyoshi (1994).

Acknowledgements

I appreciate the valuable comments and suggestions from two anonymous referees and an associate editor. Financial support from the RGC of Hong Kong under grant no. HKUST595/96H for my research is gratefully acknowledged.

References

Abramowitz, M., Stegun, I., 1964. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards Applied Mathematics Series No. 55, US Government Printing Office, Washington, DC.

Borjas, G.J., Sueyoshi, G.T., 1994. A two-stage estimator for probit models with structural group effects. Journal of Econometrics 64, 165-182.

Butler, J.S., Moffitt, R., 1982. A computationally efficient quadrature procedure for the one-factor multinomial probit model. Econometrica 50, 761-764.

Chamberlain, G., 1980. Analysis of covariance with qualitative data. Review of Economic Studies 47, 225-238.

Hall, B.H., 1978. A general framework for time series-cross section estimation. Annales de l'INSEE 30-31, 177-202.

Heckman, J.J., Willis, R.J., 1975. Estimation of a stochastic model of reproduction: an econometric approach. In: Terleckyj, N. (Ed.), Household Production and Consumption. Cambridge University Press, New York.

Lee, L.F., 1996. Estimation of dynamic and ARCH Tobit models. Department of Economics, HKUST Working Paper No. 96/97-2.

Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1992. Numerical Recipes, 2nd Edition. Cambridge University Press, New York.

Stroud, A.H., Secrest, D., 1966. Gaussian Quadrature Formulas. Prentice-Hall, Englewood Cliffs, NJ.
