Microeconometrics: Binary Dependent Variable

Academic year: 2018

(1)

Microeconometrics: Binary Dependent Variable

Department of Economics

Universitas Padjadjaran

(2)
(3)
(4)
(5)

Additional References

Dougherty, Introduction to Econometrics, 4th Ed, 2011 *best for basics*

Golder, M., Advanced Quantitative Analysis: Maximum Likelihood Estimation,

(6)

Estimators we (will) know

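Ordinary Least Squares (OLS) estimator

If we have a simple linear regression (SLR) of $y_i = \beta_1 + \beta_2 x_i + u_i$ and $x_i$ is exogenous, then we have

$$\hat{\beta}_2^{OLS} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$

Instrumental Variable (IV) estimator

If we have a SLR of $y_i = \beta_1 + \beta_2 x_i + u_i$ and $x_i$ is endogenous, then we have

$$\hat{\beta}_2^{IV} = \frac{\sum_i (z_i - \bar{z})(y_i - \bar{y})}{\sum_i (z_i - \bar{z})(x_i - \bar{x})}$$

where $z_i$ is an instrument for $x_i$

Maximum Likelihood (ML) estimator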

(7)

Why use a binary dependent variable?

Observed vs unobserved variables

Suppose we want to analyse the socioeconomic factors that lead some people to:

Engage in corruption

Smoke

Borrow money

Get a scholarship

(8)

Why use a binary dependent variable?

Observed vs unobserved variables

It would be best to know (observe)

The utility derived from corruption, smoking, borrowing money, or having a boy/girl-friend(s)…

The actual (factual) cash flow of families

A consistent way of measuring poverty

(9)


(10)

Why use a binary dependent variable?

Observed vs unobserved variables

What we observe is that

Some people engage in corruption

Some people smoke

Some people borrow money

Some people get scholarships

(11)

The mechanism

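Suppose:

$$y_i^* = \beta_1 + \beta_2 x_i + u_i$$

But $y_i^*$, the utility of smoking, is unobserved.

We, however, observe

$y_i = 1$ if the person smokes ($y_i^* > 0$)

and

$y_i = 0$ if the person does not smoke ($y_i^* \le 0$)

(12)

The mechanism

So we estimate

$$y_i = \beta_1 + \beta_2 x_i + u_i$$

We know the value of $y_i$, either 0 or 1

Because of this, we may think of $y_i$ as an event whose outcome is 0 or 1

Therefore, essentially what we want to know is

$$P(y_i = 1 \mid x_i)$$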

(13)

The Linear Probability Model

Using the formula for expected value:

$$E[y_i] = 1 \cdot P(y_i = 1) + 0 \cdot P(y_i = 0) = P(y_i = 1)$$

(14)

The Linear Probability Model

If we estimate

$$y_i = \beta_1 + \beta_2 x_i + u_i$$

where $y_i$ is either 0 or 1, using OLS, we have a Linear Probability Model (LPM)

(15)

The Linear Probability Model

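We know from previous lectures about OLS that:

$$E(y_i \mid x_i) = \beta_1 + \beta_2 x_i + E(u_i \mid x_i)$$

We assume $E(u_i \mid x_i) = 0$, and because $y_i$ is either 0 or 1, we can write $E(y_i \mid x_i)$ as $P(y_i = 1 \mid x_i)$

Therefore we can write our LPM

$$P(y_i = 1 \mid x_i) = \beta_1 + \beta_2 x_i$$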

(16)

LPM Interpretation

Suppose we have a more complete set of independent variables:

$$y_i = \beta_1 + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + u_i$$

We cannot interpret our $\beta$’s as usual, because $y_i$ changes ONLY from 0 to 1 (or vice versa)

(17)

LPM Interpretation

If $x_j$ is continuous:

“If $x_j$ increases/decreases by 1 (unit), the probability of $y = 1$ increases/decreases by $100 \cdot \beta_j$ percentage points, holding the other variables constant”

(18)

LPM Interpretation

If $x_j$ is a dummy variable (e.g. 1 = male):

“Suppose there are two individuals who are identical in every respect except that one is male and the other is female. The probability of $y = 1$ for the male is $100 \cdot \beta_j$ percentage points higher or lower than for the female”
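The dummy-variable interpretation can be illustrated with a minimal sketch. The data below are invented for illustration only (eight hypothetical individuals), and the helper `ols_simple` is a name introduced here, not something from the slides. With a single dummy regressor, the OLS slope of the LPM equals the difference between the two groups' shares of $y = 1$.

```python
# Hypothetical data: smokes = 1 if the person smokes, male = 1 if male.
male = [1, 1, 1, 1, 0, 0, 0, 0]
smokes = [1, 1, 1, 0, 1, 0, 0, 0]


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS fit y = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


b1, b2 = ols_simple(male, smokes)

# b1 is the fitted probability for females (male = 0);
# 100 * b2 is the male-female gap in percentage points.
print(f"P(smokes = 1 | female) = {b1:.2f}")
print(f"Gap for males = {100 * b2:.0f} percentage points")
```

In this made-up sample one of four women and three of four men smoke, so the fitted intercept is the female share and the slope is the gap between the two shares.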

(19)
(20)


(21)

Limitations of LPM

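The error term does not follow a Normal distribution, so the usual test statistics are not reliable

Suppose:

$$y_i = \beta_1 + \beta_2 x_i + u_i$$

When $y_i = 1$, which happens with probability $P_i$, the error term is $u_i = 1 - \beta_1 - \beta_2 x_i$

and when $y_i = 0$, which happens with probability $1 - P_i$, it is $u_i = -\beta_1 - \beta_2 x_i$

So for a given $x_i$ the error term takes only two values and cannot be normally distributed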

(22)

Limitations of LPM

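Heteroskedasticity

Since $y_i$ follows a Bernoulli distribution with $P(y_i = 1) = P_i$, the error term takes only two values

Variance of the error term:

$$Var(u_i) = P_i (1 - P_i)$$

which changes with $x_i$, because $P_i = \beta_1 + \beta_2 x_i$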

(23)

Limitations of LPM

Non-fulfillment of $0 \le E(y_i \mid x_i) \le 1$: Does it make sense to

(24)

What is a better model for estimating $E(y_i)$?

Since the probability of an event has to be between 0 and 1, a good model would be a nonlinear function of x whose value never falls below 0 or rises above 1.

A class of functions that we have already seen in statistics and that satisfies this

(25)

What is a better model for estimating $E(y_i)$?

(26)

What is a better model for $E(y_i)$?

We denote CDFs using the letter F

$$P(y_i = 1 \mid x_i) = F(\beta_1 + \beta_2 x_i)$$

where F is a CDF

Therefore, to model a binary dependent variable we need to choose a CDF and to have an estimation method appropriate for estimating $\beta_1$ and $\beta_2$

(27)

Solution

We need a math function for $E(y_i)$, or $P(y_i = 1)$, or $P_i$, that always results in values between 0 and 1

Whatever the values of the independent variables are (they can range from $-\infty$ to $+\infty$), the values of the dependent variable will be between 0 and 1

In general:

$$P_i = F(\beta_1 + \beta_2 x_i)$$

(28)

Solution 1: Logit Model

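F can be in the form of

$$P_i = \frac{1}{1 + e^{-(\beta_1 + \beta_2 x_i)}}$$

equivalently:

$$\frac{P_i}{1 - P_i} = e^{\beta_1 + \beta_2 x_i}$$

(29)

Solution 1: Logit Model

Taking the log of both sides of the odds ratio $P_i / (1 - P_i) = e^{\beta_1 + \beta_2 x_i}$

Hence

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = \beta_1 + \beta_2 x_i$$

We call $L_i$ the Logit model

We estimate the logit model using the Maximum Likelihood method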

(30)

Logit Model: Coefficients & Marginal Effects

Coefficients are not Marginal Effects (not directly interpretable)

Because of the non-linearity of the model

Therefore

$$\frac{\partial P_i}{\partial x_i} \neq \beta_2$$

(31)

Logit Model: Coefficients & Marginal Effects

To get the marginal effect, we need to differentiate:

$$\frac{\partial P_i}{\partial x_i} = \beta_2 P_i (1 - P_i)$$
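A small sketch may help show why a logit coefficient is not itself a marginal effect. The coefficient values below are hypothetical, chosen only for illustration, and the function names are introduced here. The marginal effect $\beta_2 P_i (1 - P_i)$ changes with $x$ and is largest where $P_i = 0.5$.

```python
import math


def logistic(z):
    """Logistic CDF: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_probability(b1, b2, x):
    """P(y = 1 | x) under the logit model."""
    return logistic(b1 + b2 * x)


def logit_marginal_effect(b1, b2, x):
    """dP/dx = b2 * P * (1 - P), evaluated at x."""
    p = logit_probability(b1, b2, x)
    return b2 * p * (1.0 - p)


# Hypothetical coefficients, for illustration only.
b1, b2 = -1.0, 0.5

for x in (0.0, 2.0, 6.0):
    p = logit_probability(b1, b2, x)
    me = logit_marginal_effect(b1, b2, x)
    print(f"x = {x:.1f}  P = {p:.3f}  marginal effect = {me:.3f}")
```

Because the effect depends on where it is evaluated, applied work usually reports it at chosen values of $x$ (for example the sample means) or averaged over the sample.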

(32)

Solution 2: Probit Model

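Suppose we have an equation:

$$y_i^* = \beta_1 + \beta_2 x_i + u_i$$

But $y_i^*$ is unobservable

What we observe is actually $y_i$, which takes the value of 1 if $y_i^* > 0$ and 0 otherwise

(33)

Solution 2: Probit Model

Hence

$$P(y_i = 1) = P(y_i^* > 0) = P(u_i > -(\beta_1 + \beta_2 x_i))$$

The distribution of $u_i$ is standard normal

(34)

Solution 2: Probit Model

Since the normal distribution is symmetric, we can write

$$P(y_i = 1) = P(u_i < \beta_1 + \beta_2 x_i) = \Phi(\beta_1 + \beta_2 x_i)$$

where $\Phi$ is the standard normal CDF. And $\beta_1$ and $\beta_2$ may be estimated using ML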

(35)

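Probit Model: Coefficients & Marginal Effects

Coefficients are not Marginal Effects (not directly interpretable)

Because of the non-linearity of the model

Therefore

$$\frac{\partial P_i}{\partial x_i} \neq \beta_2$$

(36)

Marginal Effects

To get the marginal effect, we need to differentiate:

$$\frac{\partial P_i}{\partial x_i} = \beta_2 \, \phi(\beta_1 + \beta_2 x_i)$$

where $\phi$ is the standard normal density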
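The probit counterpart can be sketched with the standard library alone, using `math.erf` to build the standard normal CDF. The coefficients are again hypothetical and the function names are introduced here for illustration. The sketch also checks the symmetry step used in the probit derivation, $P(u > -z) = P(u < z)$.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_probability(b1, b2, x):
    """P(y = 1 | x) under the probit model."""
    return std_normal_cdf(b1 + b2 * x)


def probit_marginal_effect(b1, b2, x):
    """dP/dx = b2 * pdf(b1 + b2 * x)."""
    return b2 * std_normal_pdf(b1 + b2 * x)


# Symmetry of the normal distribution: P(u > -z) equals P(u < z).
z = 0.7
print(1.0 - std_normal_cdf(-z), std_normal_cdf(z))

# Hypothetical coefficients, for illustration only.
b1, b2 = -1.0, 0.5
for x in (0.0, 2.0, 6.0):
    print(x, probit_probability(b1, b2, x), probit_marginal_effect(b1, b2, x))
```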

(37)
(38)
(39)
(40)
(41)

Gender Inequality and Poverty in Indonesia: Evidence from Household Data

Kinanti Z. Patria

(42)
(43)

Estimation of Logit and Probit Models

We do not use OLS; rather, we use the Maximum Likelihood Method

The MLE (Maximum Likelihood Estimator) of the unknown parameters is the value of the parameters that maximizes the likelihood function
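To make the idea concrete, here is a minimal sketch of maximum likelihood for a logit model with one regressor. The eight data points are invented, and Newton-Raphson is one common way to maximize the log-likelihood; the slides do not say which algorithm is used. In practice a statistical package does this step.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b1, b2, x, y):
    """Sum over observations of y*log(P) + (1 - y)*log(1 - P)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b1 + b2 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logit(x, y, iterations=25):
    """Newton-Raphson for the two-parameter logit model."""
    b1, b2 = 0.0, 0.0
    for _ in range(iterations):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b1 + b2 * xi)
            w = p * (1.0 - p)
            g1 += yi - p            # score with respect to b1
            g2 += (yi - p) * xi     # score with respect to b2
            h11 += w                # entries of minus the Hessian
            h12 += w * xi
            h22 += w * xi * xi
        det = h11 * h22 - h12 * h12
        # Newton step: solve the 2x2 system (minus Hessian) * step = score.
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
    return b1, b2


# Hypothetical data: the outcomes overlap in x, so the MLE exists.
x_data = [1, 2, 3, 4, 5, 6, 7, 8]
y_data = [0, 0, 1, 0, 1, 0, 1, 1]

b1_hat, b2_hat = fit_logit(x_data, y_data)
print("estimates:", b1_hat, b2_hat)
print("log-likelihood at estimates:", log_likelihood(b1_hat, b2_hat, x_data, y_data))
print("log-likelihood at (0, 0):", log_likelihood(0.0, 0.0, x_data, y_data))
```

At the estimates both scores are (numerically) zero, which is the first order condition for a maximum of the likelihood.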

(44)

MAXIMUM LIKELIHOOD ESTIMATOR

(45)

Maximum Likelihood Estimator

Remember that our data are random variables

They follow a certain probability density function (pdf) or probability distribution

Suppose we have 5 observations of variable Y

How likely is it that we would obtain these observations from a normal distribution with given parameters (mean and standard deviation)?

(46)


(47)

Maximum Likelihood Estimator

“Maximum Likelihood is just a systematic way of searching for the parameter values of our chosen distribution that maximize the

(48)

a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section R.2 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press.

Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre

http://www.oup.com/uk/orc/bin/9780199567089/.


(49)

Method of ML

The method of maximum likelihood is intuitively appealing, because we attempt to find the values of the true parameters that would have most likely produced the data that we in fact observed.

For most cases of practical interest, the performance of maximum

(50)

some simple examples.

Suppose that you have a normally-distributed random variable X with unknown population mean $\mu$ and standard deviation $\sigma$, and that you have a sample of two observations, 4 and 6. For the time being, we will assume that $\sigma$ is equal to 1.

[Figure: upper chart, probability density p; lower chart, joint density L; horizontal axes from 0 to 8]

(51)

$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x - \mu)^2}$$

Note constants:

$\pi$ = 3.14159

e = 2.71828

(52)

Suppose initially you consider the hypothesis $\mu$ = 3.5. Under this hypothesis the probability density at 4 would be 0.3521 and that at 6 would be 0.0175.

(53)

$\mu$     p(4)     p(6)     L
3.5   0.3521   0.0175   0.0062

The joint probability density, shown in the bottom chart, is the product of these, 0.0062.

(54)

$\mu$     p(4)     p(6)     L
4.0   0.3989   0.0540   0.0215

Next consider the hypothesis $\mu$ = 4.0. Under this hypothesis the probability densities associated with the two observations are 0.3989 and 0.0540, and the joint probability density is 0.0215.

(55)

$\mu$     p(4)     p(6)     L
3.5   0.3521   0.0175   0.0062
4.0   0.3989   0.0540   0.0215
4.5   0.3521   0.1295   0.0456

Next under the hypothesis $\mu$ = 4.5, the probability densities are 0.3521 and 0.1295, and the joint probability density is 0.0456.

(56)

$\mu$     p(4)     p(6)     L
4.0   0.3989   0.0540   0.0215
4.5   0.3521   0.1295   0.0456
5.0   0.2420   0.2420   0.0585

Under the hypothesis $\mu$ = 5.0, the probability densities are both 0.2420 and the joint probability density is 0.0585.

(57)

$\mu$     p(4)     p(6)     L
3.5   0.3521   0.0175   0.0062
4.0   0.3989   0.0540   0.0215
4.5   0.3521   0.1295   0.0456
5.0   0.2420   0.2420   0.0585
5.5   0.1295   0.3521   0.0456

Under the hypothesis $\mu$ = 5.5, the probability densities are 0.1295 and 0.3521 and the joint probability density is 0.0456.

(58)

$\mu$     p(4)     p(6)     L
4.0   0.3989   0.0540   0.0215
4.5   0.3521   0.1295   0.0456
5.0   0.2420   0.2420   0.0585
5.5   0.1295   0.3521   0.0456

The complete joint density function for all values of $\mu$ has now been plotted in the lower diagram. We see that it peaks at $\mu$ = 5.
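The table values can be reproduced with a few lines of Python. This is only a sketch: the function names are introduced here, and the grid of candidate means is the one used in the slides. It evaluates the normal density (with standard deviation 1) at the two observations, 4 and 6, and multiplies the results.

```python
import math


def normal_pdf(x, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and std. deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


sample = [4.0, 6.0]


def joint_density(mu):
    """Product of the individual densities of the sample observations."""
    product = 1.0
    for x in sample:
        product *= normal_pdf(x, mu)
    return product


candidates = [3.5, 4.0, 4.5, 5.0, 5.5]
for mu in candidates:
    p4 = normal_pdf(4.0, mu)
    p6 = normal_pdf(6.0, mu)
    print(f"{mu:.1f}  {p4:.4f}  {p6:.4f}  {joint_density(mu):.4f}")

# The candidate with the greatest joint density is the ML estimate on this grid.
best_mu = max(candidates, key=joint_density)
print("joint density is greatest at mu =", best_mu)
```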

(59)

Now we will look at the mathematics of the example. If X is normally distributed with mean $\mu$ and standard deviation $\sigma$, its density function is as shown.

$$f(X) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^2}$$

(60)

For the time being, we are assuming $\sigma$ is equal to 1, so the density function simplifies to the second expression.

$$f(X) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(X - \mu)^2}$$

(61)

Hence we obtain the probability densities for the observations where X = 4 and X = 6.

$$f(4) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(4 - \mu)^2}, \qquad f(6) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(6 - \mu)^2}$$

(62)

The joint probability density for the two observations in the sample is just the product of their individual densities.

$$\text{joint density} = \left(\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(4 - \mu)^2}\right)\left(\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(6 - \mu)^2}\right)$$

(63)

In maximum likelihood estimation we choose as our estimate of $\mu$ the value that gives us the greatest joint density for the observations in our sample. This value is associated with the greatest probability, or maximum likelihood, of obtaining the observations in the sample.
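As a check on the graphical result, a short derivation shows why the joint density of the observations 4 and 6 peaks at 5. This is a sketch under the same assumption that the standard deviation is 1; the symbol $\hat{\mu}$ for the estimate is introduced here. Taking the log of the joint density gives

$$\log L(\mu) = -\log(2\pi) - \frac{1}{2}(4 - \mu)^2 - \frac{1}{2}(6 - \mu)^2$$

Differentiating with respect to $\mu$ and setting the result to zero:

$$\frac{d \log L}{d \mu} = (4 - \mu) + (6 - \mu) = 10 - 2\mu = 0 \quad \Rightarrow \quad \hat{\mu} = 5$$

The second derivative is $-2 < 0$, so this is a maximum, and the estimate is simply the sample mean of the two observations.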

(64)

MLE AND REGRESSION ANALYSIS

(65)

[Figure: the line $Y = \beta_1 + \beta_2 X$, with the point $\beta_1 + \beta_2 X_i$ marked at $X = X_i$]

(66)

(67)

Potential values of Y close to $\beta_1 + \beta_2 X_i$ will have relatively large densities ...

(68)

... while potential values of Y relatively far from $\beta_1 + \beta_2 X_i$ will have small

(69)

The mean value of the distribution of $Y_i$ is $\beta_1 + \beta_2 X_i$. Its standard deviation is $\sigma$, the standard deviation of the disturbance term.

(70)

Hence the density function for the ex ante distribution of $Y_i$ is as shown.

$$f(Y_i) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{Y_i - \beta_1 - \beta_2 X_i}{\sigma}\right)^2}$$

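The joint density function for the observations on Y is the product of their individual densities.

$$f(Y_1) \times \dots \times f(Y_n) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{Y_i - \beta_1 - \beta_2 X_i}{\sigma}\right)^2}$$

(72)

Now, taking $\beta_1$, $\beta_2$ and $\sigma$ as our choice variables, and taking the data on Y and X as given, we can re-interpret this function as the likelihood function for $\beta_1$, $\beta_2$, and $\sigma$. REMEMBER THIS

$$L(\beta_1, \beta_2, \sigma \mid Y_1, \dots, Y_n) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{Y_i - \beta_1 - \beta_2 X_i}{\sigma}\right)^2}$$

(73)

We will choose $\beta_1$, $\beta_2$, and $\sigma$ so as to maximize the likelihood, given the data on Y and X. As usual, it is easier to do this indirectly, maximizing the log-likelihood instead.

$$\log L = \log \left[ \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{Y_i - \beta_1 - \beta_2 X_i}{\sigma}\right)^2} \right]$$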

(74)

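As usual, the first step is to decompose the expression as the sum of the logarithms of the factors.

$$\log L = \sum_{i=1}^{n} \log \left[ \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{Y_i - \beta_1 - \beta_2 X_i}{\sigma}\right)^2} \right]$$

(75)

Then we split the logarithm of each factor into two components. The first component is the same in each case.

$$\log L = n \log \left( \frac{1}{\sigma\sqrt{2\pi}} \right) - \frac{1}{2}\left(\frac{Y_1 - \beta_1 - \beta_2 X_1}{\sigma}\right)^2 - \dots - \frac{1}{2}\left(\frac{Y_n - \beta_1 - \beta_2 X_n}{\sigma}\right)^2$$

(76)

Hence the log-likelihood simplifies as shown.

$$\log L = n \log \left( \frac{1}{\sigma\sqrt{2\pi}} \right) - \frac{1}{2\sigma^2} Z, \qquad \text{where } Z = \sum_{i=1}^{n} (Y_i - \beta_1 - \beta_2 X_i)^2$$

(77)

To maximize the log-likelihood, we need to minimize Z. But choosing estimators of $\beta_1$ and $\beta_2$ to minimize Z is exactly what we did when we derived the least squares regression coefficients.

(78)

Thus, for this regression model, the maximum likelihood estimators of $\beta_1$ and $\beta_2$ are identical to the least squares estimators.

(79)

As a consequence, Z will be the sum of the squares of the least squares residuals.

$$Z = \sum_{i=1}^{n} e_i^2, \qquad \text{where } e_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$$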

(80)

To obtain the maximum likelihood estimator of $\sigma$, it is convenient to rearrange the log-likelihood function as shown, where $Z = \sum_{i=1}^{n} e_i^2$ is the sum of the squares of the least squares residuals.

$$\log L = n \log \frac{1}{\sigma} + n \log \frac{1}{\sqrt{2\pi}} - \frac{Z}{2\sigma^2} = -n \log \sigma - \frac{n}{2} \log 2\pi - \frac{Z}{2\sigma^2}$$

(81)

Differentiating it with respect to $\sigma$, we obtain the expression shown.

$$\frac{\partial \log L}{\partial \sigma} = -\frac{n}{\sigma} + \frac{Z}{\sigma^3}$$

(82)

The first order condition for a maximum requires this to be equal to zero. Hence the maximum likelihood estimator of the variance is the sum of the squares of the residuals divided by n.

$$\hat{\sigma}^2 = \frac{Z}{n} = \frac{1}{n} \sum_{i=1}^{n} e_i^2$$

(83)

Note that this is biased for finite samples. To obtain an unbiased estimator, we should divide by n − k, where k is the number of parameters, in this case 2. However, the bias disappears as the sample size becomes large.
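The link between least squares and maximum likelihood can be checked numerically. The sketch below uses five invented data points, and the names `ols` and `log_likelihood` are introduced here for illustration. It computes the least squares coefficients, the ML variance estimate Z/n and the unbiased alternative Z/(n − 2), and confirms that moving a parameter away from these values lowers the log-likelihood.

```python
import math

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)


def ols(x, y):
    """Least squares intercept and slope for Y = b1 + b2 * X."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    return y_bar - b2 * x_bar, b2


def log_likelihood(b1, b2, sigma, x, y):
    """log L = -n log(sigma) - (n/2) log(2 pi) - Z / (2 sigma^2)."""
    z = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    m = len(x)
    return -m * math.log(sigma) - 0.5 * m * math.log(2.0 * math.pi) - z / (2.0 * sigma ** 2)


b1_hat, b2_hat = ols(x, y)
z_hat = sum((yi - b1_hat - b2_hat * xi) ** 2 for xi, yi in zip(x, y))

sigma2_ml = z_hat / n              # ML estimator: biased in finite samples
sigma2_unbiased = z_hat / (n - 2)  # divides by n - k with k = 2

print("b1, b2:", b1_hat, b2_hat)
print("sigma^2 (ML):", sigma2_ml, " sigma^2 (unbiased):", sigma2_unbiased)

sigma_ml = math.sqrt(sigma2_ml)
print("log L at the estimates:", log_likelihood(b1_hat, b2_hat, sigma_ml, x, y))
print("log L with b1 shifted:", log_likelihood(b1_hat + 0.1, b2_hat, sigma_ml, x, y))
print("log L with sigma scaled:", log_likelihood(b1_hat, b2_hat, 1.2 * sigma_ml, x, y))
```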
