
APPENDIX 4A


10. In this text, we will largely rely on the OLS method for practical reasons: (a) compared to ML, the OLS method is easy to apply; (b) the ML and OLS estimators of β1 and β2 are identical (which is true of multiple regressions too); and (c) even in moderately large samples the OLS and ML estimators of σ2 do not differ vastly.

However, for the benefit of the mathematically inclined reader, a brief introduction to ML is given in the appendix to this chapter and also in Appendix A.

4A.1 MAXIMUM LIKELIHOOD ESTIMATION OF TWO-VARIABLE REGRESSION MODEL

Under the normality assumption, the joint density of the sample Y's, viewed as a function of the unknown parameters β1, β2, and σ2, is called the likelihood function, denoted by LF(β1, β2, σ2), and written as1

$$\mathrm{LF}(\beta_1,\beta_2,\sigma^2)=\frac{1}{\sigma^{n}\left(\sqrt{2\pi}\right)^{n}}\exp\left\{-\frac{1}{2}\sum\frac{(Y_i-\beta_1-\beta_2X_i)^2}{\sigma^2}\right\} \tag{4}$$

The method of maximum likelihood, as the name indicates, consists in estimating the unknown parameters in such a manner that the probability of observing the given Y's is as high (or maximum) as possible. Therefore, we have to find the maximum of the function (4). This is a straightforward exercise in differential calculus. For differentiation it is easier to express (4) in log form as follows.2 (Note: ln = natural log.)

$$\begin{aligned}
\ln \mathrm{LF} &= -n\ln\sigma-\frac{n}{2}\ln(2\pi)-\frac{1}{2}\sum\frac{(Y_i-\beta_1-\beta_2X_i)^2}{\sigma^2}\\
&= -\frac{n}{2}\ln\sigma^2-\frac{n}{2}\ln(2\pi)-\frac{1}{2}\sum\frac{(Y_i-\beta_1-\beta_2X_i)^2}{\sigma^2}
\end{aligned} \tag{5}$$
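To make the maximization of (5) concrete, here is a minimal numerical sketch (an illustration added here, not part of the original text): it codes ln LF as a Python function and evaluates it on synthetic data, showing that parameter values near the data-generating ones give a larger log-likelihood than values far from them. The function name log_likelihood and the simulated numbers are assumptions made for this sketch.

```python
import numpy as np

def log_likelihood(beta1, beta2, sigma2, x, y):
    """ln LF of Eq. (5) for the two-variable model Y_i = beta1 + beta2*X_i + u_i."""
    n = len(y)
    resid = y - beta1 - beta2 * x
    return (-0.5 * n * np.log(sigma2)
            - 0.5 * n * np.log(2 * np.pi)
            - 0.5 * np.sum(resid ** 2) / sigma2)

# Synthetic sample generated with beta1 = 2, beta2 = 0.5, sigma2 = 1
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=50)

print(log_likelihood(2.0, 0.5, 1.0, x, y))  # near the true values: larger ln LF
print(log_likelihood(0.0, 0.0, 1.0, x, y))  # far from the true values: much smaller ln LF
```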

Differentiating (5) partially with respect to β1, β2, and σ2, we obtain

$$\frac{\partial\ln \mathrm{LF}}{\partial\beta_1}=-\frac{1}{\sigma^2}\sum(Y_i-\beta_1-\beta_2X_i)(-1) \tag{6}$$

$$\frac{\partial\ln \mathrm{LF}}{\partial\beta_2}=-\frac{1}{\sigma^2}\sum(Y_i-\beta_1-\beta_2X_i)(-X_i) \tag{7}$$

$$\frac{\partial\ln \mathrm{LF}}{\partial\sigma^2}=-\frac{n}{2\sigma^2}+\frac{1}{2\sigma^4}\sum(Y_i-\beta_1-\beta_2X_i)^2 \tag{8}$$

Setting these equations equal to zero (the first-order condition for optimization) and letting β̃1, β̃2, and σ̃2 denote the ML estimators, we obtain3

$$\frac{1}{\tilde\sigma^2}\sum(Y_i-\tilde\beta_1-\tilde\beta_2X_i)=0 \tag{9}$$

$$\frac{1}{\tilde\sigma^2}\sum(Y_i-\tilde\beta_1-\tilde\beta_2X_i)X_i=0 \tag{10}$$

$$-\frac{n}{2\tilde\sigma^2}+\frac{1}{2\tilde\sigma^4}\sum(Y_i-\tilde\beta_1-\tilde\beta_2X_i)^2=0 \tag{11}$$

1Of course, if β1, β2, and σ2 are known but the Yi are not known, (4) represents the joint probability density function, that is, the probability of jointly observing the Yi.

2Since a log function is a monotonic function, ln LF will attain its maximum value at the same point as LF.

3We use ˜ (tilde) for ML estimators and ˆ (caret, or hat) for OLS estimators.

After simplifying, Eqs. (9) and (10) yield

$$\sum Y_i=n\tilde\beta_1+\tilde\beta_2\sum X_i \tag{12}$$

$$\sum Y_iX_i=\tilde\beta_1\sum X_i+\tilde\beta_2\sum X_i^2 \tag{13}$$

which are precisely the normal equations of the least-squares theory obtained in (3.1.4) and (3.1.5). Therefore, the ML estimators, the β̃'s, are the same as the OLS estimators, the β̂'s, given in (3.1.6) and (3.1.7). This equality is not accidental. Examining the log-likelihood (5), we see that the last term enters with a negative sign. Therefore, maximizing (5) amounts to minimizing this term, which is precisely the least-squares approach, as can be seen from (3.1.2).
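This equivalence can be cross-checked numerically. The sketch below (not part of the original appendix) maximizes ln LF with a general-purpose optimizer and compares the result with the closed-form OLS estimates from (3.1.6) and (3.1.7); the data are simulated, and the reparameterization of σ2 through its logarithm is only a convenience to keep the variance positive.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=60)
y = 4.0 + 1.5 * x + rng.normal(0, 2.0, size=60)   # simulated sample
n = len(y)

def neg_log_likelihood(params):
    b1, b2, log_sigma2 = params          # work with ln(sigma2) so sigma2 stays positive
    sigma2 = np.exp(log_sigma2)
    resid = y - b1 - b2 * x
    return 0.5 * n * np.log(sigma2) + 0.5 * n * np.log(2 * np.pi) + 0.5 * resid @ resid / sigma2

# ML: maximize ln LF, i.e., minimize its negative
start = np.array([y.mean(), 0.0, np.log(y.var())])
res = minimize(neg_log_likelihood, x0=start, method="BFGS")
b1_ml, b2_ml = res.x[:2]

# OLS closed form, as in (3.1.6)-(3.1.7)
b2_ols = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1_ols = y.mean() - b2_ols * x.mean()

print(b1_ml, b1_ols)  # intercepts agree up to optimizer tolerance
print(b2_ml, b2_ols)  # slopes agree up to optimizer tolerance
```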

Substituting the ML (= OLS) estimators into (11) and simplifying, we obtain the ML estimator of σ2 as

$$\begin{aligned}
\tilde\sigma^2 &=\frac{1}{n}\sum(Y_i-\tilde\beta_1-\tilde\beta_2X_i)^2\\
&=\frac{1}{n}\sum(Y_i-\hat\beta_1-\hat\beta_2X_i)^2\\
&=\frac{1}{n}\sum\hat u_i^2
\end{aligned} \tag{14}$$

From (14) it is obvious that the ML estimator σ̃2 differs from the OLS estimator σ̂2 = [1/(n − 2)] Σ û2i, which was shown to be an unbiased estimator of σ2 in Appendix 3A, Section 3A.5. Thus, the ML estimator of σ2 is biased. The magnitude of this bias can be easily determined as follows.

Taking the mathematical expectation of (14) on both sides, we obtain

$$\begin{aligned}
E(\tilde\sigma^2)&=\frac{1}{n}E\left(\sum\hat u_i^2\right)\\
&=\frac{n-2}{n}\,\sigma^2 \qquad\text{using Eq. (16) of Appendix 3A, Section 3A.5}\\
&=\sigma^2-\frac{2}{n}\,\sigma^2
\end{aligned} \tag{15}$$

which shows that σ̃2 is biased downward (i.e., it underestimates the true σ2) in small samples. But notice that as n, the sample size, increases indefinitely, the second term in (15), the bias factor, tends to zero. Therefore, asymptotically (i.e., in a very large sample), σ̃2 is unbiased too, that is, lim E(σ̃2) = σ2 as n → ∞. It can further be proved that σ̃2 is also a consistent estimator4; that is, as n increases indefinitely, σ̃2 converges to its true value σ2.
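The downward bias and its disappearance in large samples can also be seen by simulation. The sketch below (an illustration added here, not part of the original text) draws repeated samples from a known model, computes both variance estimators, and compares their averages with the factor (n − 2)/n of Eq. (15); the chosen parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
beta1, beta2, sigma2 = 1.0, 0.8, 4.0        # arbitrary "true" values for the simulation

def average_estimates(n, reps=5000):
    """Average ML and OLS estimates of sigma^2 over many simulated samples of size n."""
    ml_vals, ols_vals = [], []
    for _ in range(reps):
        x = rng.uniform(0, 10, size=n)
        y = beta1 + beta2 * x + rng.normal(0, np.sqrt(sigma2), size=n)
        b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b1 = y.mean() - b2 * x.mean()
        rss = np.sum((y - b1 - b2 * x) ** 2)
        ml_vals.append(rss / n)          # ML estimator, Eq. (14)
        ols_vals.append(rss / (n - 2))   # unbiased OLS estimator
    return np.mean(ml_vals), np.mean(ols_vals)

for n in (10, 50, 500):
    avg_ml, avg_ols = average_estimates(n)
    # avg_ml is close to ((n - 2)/n)*sigma2, per Eq. (15); avg_ols stays near sigma2
    print(n, round(avg_ml, 3), round(avg_ols, 3), round((n - 2) / n * sigma2, 3))
```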

4A.2 MAXIMUM LIKELIHOOD ESTIMATION OF FOOD EXPENDITURE IN INDIA

Return to Example 3.2 and regression (3.7.2), which gives the regression of food expenditure on total expenditure for 55 rural households in India.

Since under the normality assumption the OLS and ML estimators of the regression coefficients are the same, we obtain the ML estimators as β̃1 = β̂1 = 94.2087 and β̃2 = β̂2 = 0.4386. The OLS estimator of σ2 is σ̂2 = 4469.6913, but the ML estimator is σ̃2 = 4307.1563, which is smaller than the OLS estimator. As noted, in small samples the ML estimator is biased downward; that is, on average it underestimates the true variance σ2. Of course, as you would expect, as the sample size gets bigger, the difference between the two estimators will narrow. Putting the values of these estimators into the log-likelihood function, we obtain a value of −308.1625. If you want the maximum value of the LF itself, just take the antilog of −308.1625. No other values of the parameters will give you a higher probability of obtaining the sample that you have used in the analysis.
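The maximized log-likelihood can be obtained without rerunning any optimization: substituting σ̃2 from (14) back into (5) reduces the last term to n/2, so the maximum equals −(n/2)(1 + ln 2π + ln σ̃2), and the two variance estimates are linked by σ̃2 = [(n − 2)/n] σ̂2. The sketch below illustrates this shortcut; because the Indian household data are not reproduced here, it runs on synthetic data of the same sample size, and the helper name ml_summary is an assumption of this sketch.

```python
import numpy as np

def ml_summary(rss, n):
    """Given the residual sum of squares from a fitted two-variable regression,
    return the OLS variance estimate, the ML variance estimate (Eq. (14)),
    and the maximized log-likelihood -(n/2)*(1 + ln(2*pi) + ln(sigma2_ml))."""
    sigma2_ols = rss / (n - 2)
    sigma2_ml = rss / n
    max_loglik = -0.5 * n * (1 + np.log(2 * np.pi) + np.log(sigma2_ml))
    return sigma2_ols, sigma2_ml, max_loglik

# Illustration on synthetic data with the same sample size as the Indian example (n = 55)
rng = np.random.default_rng(3)
x = rng.uniform(100, 900, size=55)
y = 94.0 + 0.44 * x + rng.normal(0, 65.0, size=55)
b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()
rss = np.sum((y - b1 - b2 * x) ** 2)

print(ml_summary(rss, 55))   # sigma2_ols > sigma2_ml, and their ratio is n/(n - 2)
```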

APPENDIX 4A EXERCISES

4.1. "If two random variables are statistically independent, the coefficient of correlation between the two is zero. But the converse is not necessarily true; that is, zero correlation does not imply statistical independence. However, if two variables are normally distributed, zero correlation necessarily implies statistical independence." Verify this statement for the following joint probability density function of two normally distributed variables Y1 and Y2 (this joint probability density function is known as the bivariate normal probability density function):

$$f(Y_1,Y_2)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{Y_1-\mu_1}{\sigma_1}\right)^2-\frac{2\rho(Y_1-\mu_1)(Y_2-\mu_2)}{\sigma_1\sigma_2}+\left(\frac{Y_2-\mu_2}{\sigma_2}\right)^2\right]\right\}$$

4See App. A for a general discussion of the properties of the maximum likelihood estimators as well as for the distinction between asymptotic unbiasedness and consistency. Roughly speaking, in asymptotic unbiasedness we try to find out lim E(σ̃n2) as n tends to infinity, where n is the sample size on which the estimator is based, whereas in consistency we try to find out how σ̃n2 behaves as n increases indefinitely. Notice that the unbiasedness property is a repeated sampling property of an estimator based on a sample of given size, whereas in consistency we are concerned with the behavior of an estimator as the sample size increases indefinitely.

where μ1 = mean of Y1
μ2 = mean of Y2
σ1 = standard deviation of Y1
σ2 = standard deviation of Y2
ρ = coefficient of correlation between Y1 and Y2

4.2. By applying the second-order conditions for optimization (i.e., the second-derivative test), show that the ML estimators of β1, β2, and σ2 obtained by solving Eqs. (9), (10), and (11) do in fact maximize the likelihood function (4).

4.3. A random variable X follows the exponential distribution if it has the following probability density function (PDF):

$$f(X)=\begin{cases}\dfrac{1}{\theta}\,e^{-X/\theta} & \text{for } X>0\\[4pt] 0 & \text{elsewhere}\end{cases}$$

where θ > 0 is the parameter of the distribution. Using the ML method, show that the ML estimator of θ is θ̂ = Σ Xi/n, where n is the sample size. That is, show that the ML estimator of θ is the sample mean X̄.

5 TWO-VARIABLE REGRESSION: INTERVAL ESTIMATION AND HYPOTHESIS TESTING

Beware of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confession obtained under duress may not be admissible in the court of scientific opinion.1

As pointed out in Chapter 4, estimation and hypothesis testing constitute the two major branches of classical statistics. The theory of estimation consists of two parts: point estimation and interval estimation. We have discussed point estimation thoroughly in the previous two chapters, where we introduced the OLS and ML methods of point estimation. In this chapter we first consider interval estimation and then take up the topic of hypothesis testing, a topic intimately related to interval estimation.

5.1 STATISTICAL PREREQUISITES

Before we demonstrate the actual mechanics of establishing confidence intervals and testing statistical hypotheses, it is assumed that the reader is familiar with the fundamental concepts of probability and statistics. Although not a substitute for a basic course in statistics, Appendix A provides the essentials of statistics with which the reader should be totally familiar.

Key concepts such as probability, probability distributions, Type I and Type II errors, level of significance, power of a statistical test, and confidence interval are crucial for understanding the material covered in this and the following chapters.

1Stephen M. Stigler, "Testing Hypotheses or Fitting Models? Another Look at Mass Extinctions," in Matthew H. Nitecki and Antoni Hoffman, eds., Neutral Models in Biology, Oxford University Press, Oxford, 1987, p. 148.

