(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
1
Binary Choice
!LPM
!logit
!logistic regresion
!probit
Multiple Choice
!Multinomial Logit
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
3
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
4
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
5
Question: What determines the choice selection?
Model to determine the probability of an event under a given condition (value of independent variables)
Pr(choice#j)=F j (X 1 ,X 2 ,1,X K )
where X3s are determinants for the
probability.
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
7
Note that
1)
2) function F j () must return a value between 0 and 1
∑ =
j
j ) 1
# Pr( choice
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
8
Example
JOIN=1 if the observation will join the government-run health insurance program
= 0, otherwise
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
9
JA=1 if the observation will join Plan A
= 0, otherwise
JB=1 if the observation will join Plan B
= 0, otherwise
JC=1 if the observation will join Plan C
= 0, otherwise
Note that JA+JB+JC=1 always.
General Structure
Note that
) ,...,
, ( )
1
Pr( JOIN = = F X 1 X 2 X K
) ,...,
, ( 1
) 0
Pr( JOIN = = − F X 1 X 2 X K
1 ) ,...,
, (
0 ≤ F X 1 X 2 X K ≤
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
11
Define
Assumption of LPM Linearity of F( . )
Note that there is no error term
K K X X
X
P = β 1 1 + β 2 2 + L + β ) 1
Pr( =
= JOIN P
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
12
Formulation of LPM
E(JOIN)=(1)P+(0)(1-P)=P
==> JOIN=P+v
where v is an error term. E(v)=0
=> OLS is valid but not the best. Why?
) 1
2 (
2 1
1 X X X v - - - -
JOIN = β + β + L + β K K +
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
13
Note that
V( ν ) = V(JOIN) but
V(JOIN) =(1-P) 2 P+(0-P) 2 (1-P)
= P(1-P)
==> V( ν ) is not constant. It depends on the independent variables (X3s)
==> Violation of a CLRM
assumption or ν is heteroscedastic
Define
where
) 1
( 1
P w P
= −
) 2
* (
*
* 2 2
* 1 1
*
- - - ν -
β β β
+ +
+
+
=
K K X
X X
JOIN
L
ν ν w
K k
wX X
wJOIN JOIN
k k
=
=
=
=
*
*
*
,...,
for 1
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
15
Note that
==> OLS is BLUE for Model (2)
1
) 1
) ( 1
( 1
) ( )
( * 2
=
− −
=
=
P P P
P V w
V ν ν
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
16
Estimation of LPM
Step 1 run OLS for unweighted model (1)
==> JOIN = X ββββ
Note that JOIN is the estimate for P
==> OLS is BLUE for Model (2)
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
17
Step 2 compute the weight
Step 3 compute
Step 4 estimate the weighted model (2) using OLS
) 1
( 1
JOIN JOIN
w
−
=
*
* 2
* 1
* , X , X ,..., X K JOIN
Step 5 re-compute JOIN using the new set of ββββ .
Note that LPM does not assure that
or 0 ( , ,..., ) 1
1 0
2
1 ≤
≤
≤
≤
X K
X X
F
JOIN
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
19
Correction
If X ββββ <0, set JOIN=0 If X ββββ >1, set JOIN=1
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
20
P
X β
1
1
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
21
Less expensive in computer time. No non-linear equations
is the effect of X on the probability. In general, the explanatory variables should be unitless or are
expressed in percentage
k
X k
P = β
∂
∂
Assumption of Logit F() is a logistic function
Note that always.
K K Z
X X
X Z
P e
β β
β + + +
=
= + −
2 L
2 1
1
1 1
1 ) (
0 ≤ F Z ≤
No error term
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
23
1.0
0.5
Z
e − Z
+ 1
1
0
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
24
Note that OLS does not apply ML Estimation of Logit model or
Note that Y=JOIN
∑
∏
=
=
−
−
− +
=
−
=
n
i
i i
i i
n
i
Y i
Y i
P Y
P Y
L
P P
L
i i1 1
) 1 (
)]
1 ln(
) 1
( ) ln(
[ ln
max
) 1
( ) ( max
β
β
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
25
Note that
First-order conditions For k=1,1,K
0 1 ]
) 1
( [
1 ] ln [
1 1
+ =
−
−
= +
∂
∂
∑
∑
=
= −
−
n
i
Z i
ki n
i
Z Z i
ki k
Zi i i i
e Y e
X
e Y e
L X β
e Z
P = +
− 1
1 1
Solving FOC for ML estimates.
Second-order Conditions
yields Variance-covariance matrix of
∑
∑
=
−
=
+
−−
−
− +
∂ =
∂
∂
n
i
Z i
ki ji n
i
Z Z i
ki ji k
j
Zi i i i
e Y e
X X
e Y e
X L X
1
2 1
2 2
] ) 1
( ) 1
( [
) ] 1
[ ( ln
β β
β ˆ
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
27
Variance-Covariance Matrix for
Note that it is not the estimated VC matrix.
Do Z-test or Chi-square test instead of t- test or F-test on parameters
2 1
) ln ( ˆ
−
∂
∂
− ∂
=
k j
V L
β β β
β ˆ
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
28
Interpretation
sign of β k ==>direction of the effect of X k on the probability to JOIN.
k k
Z
k
Zi i
e e X
P β { } β
) 1
( 2 = +
= +
∂
∂
−
−
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
29
No R 2 for a logit model since there is no error term.
Define
It is a measure for goodness-of-fit.
JOIN>0.5 ==> predict that JOIN=1 JOIN<0.5 ==> predict that JOIN=0
(n) size sample
n predictio correct
= #
− R 2 pseudo
Assumption of Logistic Regression F( . ) is a logistic function but the
observation(experiment) for each given set of independent variables (X) will be repeated several times.
Only the proportion of JOIN=1 can
be observed.
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
31
From Logit Model
Note that P is the expected proportion of population JOINing given X3s
K K X X
P X
P = β + β + + β
− 1 1 2 2 L
ln 1
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
32
Define
R i =observed proportion of observation with the same value of X i that JOIN.
Derived Model
i Ki K i
i i
i X X X
R
R = β + β + + β + ν
− 1 1 2 2 L
ln 1
) Why?
1 ( ) 1
(
i i
i
i N R R
V ν = −
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
33
Define Estimation where
*
*
* 2
* 1
*
2
1i
X
iX
Ki iX
R i = β + β + + β K + ν )
1
( i
i
i R R
N
w = −
i i i
ki i ki
i i i
w
K k
X w X
R w R
R
iν ν =
=
=
= −
*
*
*
,..., 1 ln 1
for
=> OLS is BLUE
Interpretation of the parameters same as those for logit model as the
underlying function is also logistic
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
35
Assumption of Probit
F() is a cumulative distribution function of a standard normal.
Note that always.
K K X X
X Z
Z P
β β
β + + +
= Φ
=
2 L
2 1
1
) (
1 )
(
0 ≤ Φ Z ≤
No error term
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
36
1.0
0.5
Z
e − Z
+ 1
1
0
)
( Z
Φ
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
37
Assumption of Multinomial Logit Define PA i = Pr(JA i =1)
PB i = Pr(JB i =1) PC i = Pr(JC i =1)
Choose the choice of plan C as the reference.
ZA
ii
i e
PC PA =
ZB
ii
i e
PC PB =
where where
Ki K
i i
i X X X
ZA = α 1 1 + α 2 2 + L + α
Ki K
i i
i X X X
ZB = β 1 1 + β 2 2 + L + β
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
39 i
i
i i
i i
ZB i ZA
ZB ZA
i i i
ZB ZA
i i i
e PC e
e PC e
PB PA
e PC e
PB PA
+
= +
+ +
+ = +
+ + =
1
1 1 1
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
40 i
i i
i i
i
ZB ZA
ZB i
ZB ZA
ZA i
e e
PB e
e e
PA e
+
= +
+
= +
1
1
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
41
ML Estimation of Multinomial Logit model
Solving FOC yields
)]
1 ln(
) 1
(
) ln(
) ln(
[ ln
max
) 1
( ) (
) (
max
1 1
) 1
(
i i
i i
n
i
i i
i i
n
i
JB JA i
i JB
i JA
i
PB PA
JB JA
PB JB
PA JA
L
PB PA
PB PA
L
i i i i−
−
−
− +
+
=
−
−
=
∑
∏
=
=
−
−
or
β β
β α ˆ , ˆ
Interpretation
ZA k ZA
ZA
ki i
i i
i
e e
e X
PA α
+
= +
∂
∂
1
( k )
ZB k
ZA ZA
ZA ZA
i i
i i
i
e e e
e
e α + β
+
− + 2
) 1
(
ZA k ZA
ZA ZA
ZA ZA
i i
i
i i
i
e e
e e
e
e α
+
− + +
= +
1 1 1
ZA k ZA
ZB ZA
ZA ZA
i i
i
i i
i
e e
e e
e
e β
+ +
+
− +
1
1
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
43
Interpretation
own-effect cross-effect
sign of α k ==>direction of the own-effect of X k on the probability to JOIN A.
sign of β k ==>direction of the cross-effect of X k on the probability to JOIN A.
k i i k
i i
ki
i PA PA PA PB
X
PA = − α − β
∂
∂ ( 1 )
+ +
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
44
Nested Logit /Serial Logit Ordered Logit
Generalized Extreme-Value (GEV)
(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University
45
Models for Limited Dependent Varaibles
Censored Regression
Tobit Models