Sampling Variance of the OLS estimator

Introduction to Econometrics

Ekki Syamsulhakim, Undergraduate Program, Department of Economics

Sampling Variance of the OLS estimator

• We know that, under the standard assumptions, the OLS estimator $\hat{\beta}_j$ is not biased

• The variance of $\hat{\beta}_j$ can be computed using the formula:
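For reference, the standard textbook expression for this variance can be sketched as follows. It assumes the Gauss-Markov assumptions (including homoskedasticity); the symbols $\sigma^2$, $SST_j$ and $R_j^2$ are assumptions of this sketch and are not defined on the slide.

```latex
\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SST_j \,(1 - R_j^2)},
\qquad
SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
```

Here $\sigma^2$ is the error variance, $SST_j$ is the total sample variation in $x_j$, and $R_j^2$ is the R-squared from regressing $x_j$ on the other independent variables. As $R_j^2$ approaches one (near-perfect collinearity), the variance grows without bound.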


Estimator Variance, Perfect


Inference

• We assume that the unobserved error is normally distributed in the population

Hypothesis Testing

• t-test or (later) F-test (individual coefficient vs overall model tests)

• Two-sided vs one-sided test

– Your hypothesis

– Check the theory

– Our research question

• 2 methods

– t-stat method

– p-value method

Hypothesis Testing

• The long steps:

– State the null and alternative hypotheses

– Choose the level of significance

– For the t-test method: observe the t-statistic and compute t-critical

– For the p-value method: compute the p-value

– State the decision rule

Regression

reg rent room sqrm if rent<4000000 & sqrm<3000 & room<30


Compute t-crit

t-crit (α = 0.05, df = n-k-1 = 11043-2-1 = 11040) = 1.960179

Rejection criteria:

Reject H0 if |t-stat| > |t-crit|

Conclusion:

Since |t-stat| > |t-crit| (22.57 > 1.960179), we reject H0.

Therefore we have sufficient evidence that the number of rooms has an impact on rent
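The two-sided decision rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the critical value is approximated with the standard normal distribution, which is an assumption that is reasonable here because df = 11040 is very large.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha: float) -> float:
    """Large-sample (standard normal) approximation to the two-sided t critical value."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def reject_two_sided(t_stat: float, t_crit: float) -> bool:
    """Decision rule from the slide: reject H0 if |t-stat| > |t-crit|."""
    return abs(t_stat) > abs(t_crit)


t_stat = 22.57            # t-stat quoted on the slide
t_crit_slide = 1.960179   # t-crit for df = 11040 quoted on the slide
z_crit = two_sided_critical_value(0.05)  # normal approximation (assumption)

print(f"normal approximation: {z_crit:.4f}, slide value: {t_crit_slide}")
print("reject H0:", reject_two_sided(t_stat, t_crit_slide))
```

With a sample this large the normal approximation and the exact t critical value differ only in the fourth decimal place, so the decision is the same either way.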

Rejection criteria:

Reject H0 if p-value < α

Conclusion:

Since our p-value = 0.0000… is less than α = 0.05, we reject H0.

Therefore we have sufficient evidence that the number of rooms has an impact on rent
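A minimal sketch of the p-value rule, assuming a standard normal approximation to the t distribution (reasonable only because df = 11040 is large). The function name is hypothetical; econometric software would use the exact t distribution instead.

```python
import math


def two_sided_p_value_normal(t_stat: float) -> float:
    """Two-sided p-value, P(|Z| > |t|), under a standard normal approximation.

    erfc is used so that very small tail probabilities do not round to exactly zero.
    """
    return math.erfc(abs(t_stat) / math.sqrt(2))


alpha = 0.05
p_value = two_sided_p_value_normal(22.57)  # t-stat quoted on the slides

print(f"approximate p-value: {p_value:.3e}")
print("reject H0:", p_value < alpha)
```

The result is a positive but extremely small number, which is why the software output displays it as 0.0000….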

One-sided t-test (ex: t-stat approach)

• As the number of rooms increases, it is sensible to think that the rent also increases (probably based on theory)

• We can (should) use a 1-tail test

– We must compute a new t-critical as the output of STATA /

One-sided t-test (ex: t-stat approach)

Compute t-crit for the 1-sided test

t-crit (one-sided α = 0.05, i.e. the two-sided value at 2α = 0.10, df = n-k-1 = 11043-2-1 = 11040) = 1.645 (positive side)

Rejection criteria:

Reject H0 if t-stat > t-crit

Conclusion:

Since t-stat > t-crit (22.57 > 1.645), we reject H0.

One-sided t-test (ex: p-value approach)

• As the number of rooms increases, it is sensible to think that the rent also increases

• We can (should) use a 1-tail test

Because we are doing a 1-tail test, the p-value given by the econometric software must be divided by 2;

Hence the calculated p-value = 0.0000…

Example:

Therefore we have sufficient evidence that the number of rooms has a positive impact on rent
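The one-sided adjustments can be sketched as follows. Two points are assumptions of this sketch rather than statements from the slides: the critical value uses a standard normal approximation (df is very large), and halving the software's two-sided p-value is treated as valid only when the t-stat has the sign stated in the alternative. The p-value of 0.08 and the t-stats of ±1.75 are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist


def one_sided_p_value(two_sided_p: float, t_stat: float) -> float:
    """One-sided p-value for H1: beta > 0, from a two-sided p-value and the t-stat.

    Halve the two-sided p-value when the t-stat is positive (agrees with H1);
    otherwise the one-sided p-value is 1 - p/2.
    """
    if t_stat > 0:
        return two_sided_p / 2
    return 1 - two_sided_p / 2


alpha = 0.05
one_sided_crit = NormalDist().inv_cdf(1 - alpha)  # normal approximation to t-crit

hypothetical_two_sided_p = 0.08  # illustrative value, not from the slides
p_agree = one_sided_p_value(hypothetical_two_sided_p, t_stat=1.75)
p_disagree = one_sided_p_value(hypothetical_two_sided_p, t_stat=-1.75)

print(f"one-sided critical value: {one_sided_crit:.4f}")
print("p-value when the sign agrees with H1:", p_agree)
print("p-value when the sign disagrees with H1:", p_disagree)
print("slide example rejects H0:", 22.57 > one_sided_crit)
```

In the hypothetical case, a two-sided p-value of 0.08 would not reject at the 5% level, while the one-sided version would, provided the estimate has the expected sign.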

Testing Other Hypotheses About $\beta_j$

• Consider a simple model relating the annual number of crimes on college campuses (crime) to student enrollment (enroll):

$\log(\mathit{crime}) = \beta_0 + \beta_1 \log(\mathit{enroll}) + u$

• This is a constant elasticity model, where $\beta_1$ is the elasticity of crime with respect to enrollment

Testing Other Hypotheses About $\beta_j$

• It is not much use to test H0: $\beta_1 = 0$, as we expect the total number of crimes to increase as the size of the campus increases

• A more interesting hypothesis to test would be that the elasticity of crime with respect to enrollment is one

H0: $\beta_1 = 1$

– This means that a 1% increase in enrollment leads, on average, to a 1% increase in crime

Testing Other Hypotheses About $\beta_j$

• A noteworthy alternative is H1: $\beta_1 > 1$, which implies that a 1% increase in enrollment increases campus crime by more than 1%

• If $\beta_1 > 1$, then, in a relative sense and not just an absolute sense, crime is more of a problem on larger campuses.

Testing Other Hypotheses About $\beta_j$

• The estimated elasticity of crime with respect to enroll, 1.27, is in the direction of the alternative $\beta_1 > 1$.

• But is there enough evidence to conclude that $\beta_1 > 1$?

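Testing Other Hypotheses About $\beta_j$

• If the null is stated as H0: $\beta_j = a_j$,

• where $a_j$ is our hypothesized value of $\beta_j$, then the appropriate t statistic is

$t = \dfrac{\hat{\beta}_j - a_j}{\mathrm{se}(\hat{\beta}_j)}$

• The usual t statistic is obtained when $a_j = 0$.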

Testing Other Hypotheses About $\beta_j$

• The correct t statistic is $t = (\hat{\beta}_1 - 1)/\mathrm{se}(\hat{\beta}_1)$

• The one-sided 5% critical value for a t distribution with $n - k - 1$ df is about 1.66

• So we clearly reject H0 in favor of H1 at the 5% level
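As a rough illustration of testing against a nonzero hypothesized value, the sketch below computes the t statistic for H0: $\beta_1 = 1$. The estimate 1.27 and the critical value 1.66 come from the slides; the standard error of 0.11 is an assumption (it does not appear on the slides), and the function name is hypothetical.

```python
def t_stat_against(beta_hat: float, hypothesized: float, se: float) -> float:
    """t statistic for H0: beta_j = a_j, i.e. (estimate - hypothesized value) / standard error."""
    return (beta_hat - hypothesized) / se


beta_hat = 1.27   # estimated elasticity, from the slide
a = 1.0           # hypothesized value under H0
se = 0.11         # ASSUMPTION: standard error is not shown on the slide
t_crit = 1.66     # one-sided 5% critical value, from the slide

t = t_stat_against(beta_hat, a, se)
print(f"t statistic: {t:.2f}")
print("reject H0 in favor of H1 (beta_1 > 1):", t > t_crit)
```

Note that the t statistic reported by default in regression output tests H0: $\beta_1 = 0$, so it cannot be used directly for this hypothesis.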

F-test (F-stat approach)

H0: $\beta_1 = \beta_2 = 0$, all coefficients are zero (or: none of the independent variables affects the dependent variable; or: room and sqrm do not affect rent)

HA: At least one of the $\beta_i$ is NOT zero (or: at least one independent variable has a nonzero effect)

F-stat = 434.33

F-crit (α = 0.05, k = 2, n-k-1 = 11040) = 2.99

Because F-stat > F-crit, reject H0

Conclusion: we have sufficient evidence that at least one of our independent variables is useful in explaining house rent
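The critical value for this particular test can be checked without statistical tables. The sketch below relies on a closed form that holds only when the numerator has exactly 2 degrees of freedom (as here, with k = 2); the function names are hypothetical, and the denominator df of 11040 follows from n = 11043 in the regression above.

```python
def f_crit_two_restrictions(alpha: float, df_denominator: int) -> float:
    """Critical value of an F(2, df_denominator) distribution.

    With exactly 2 numerator df, P(F > f) = (1 + 2*f/d) ** (-d/2),
    which is inverted here to solve P(F > f) = alpha for f.
    """
    d = df_denominator
    return (d / 2) * (alpha ** (-2 / d) - 1)


def f_p_value_two_restrictions(f_stat: float, df_denominator: int) -> float:
    """Upper-tail probability P(F > f_stat) for an F(2, df_denominator) distribution."""
    d = df_denominator
    return (1 + 2 * f_stat / d) ** (-d / 2)


f_stat = 434.33                               # F-stat quoted on the slide
f_crit = f_crit_two_restrictions(0.05, 11040)
p_value = f_p_value_two_restrictions(f_stat, 11040)

print(f"F-crit: {f_crit:.4f}")
print(f"p-value: {p_value:.3e}")
print("reject H0:", f_stat > f_crit)
```

For other numbers of restrictions the closed form does not apply, and a statistical package or an F table is needed.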

F-test (p-value approach)

H0: $\beta_1 = \beta_2 = 0$, all coefficients are zero

HA: At least one of the $\beta_i$ is NOT zero

Using the p-value approach, we can see that our p-value for the F-test is 0.000…, which is less than our (default) α = 0.05. Hence, reject H0

Joint / Multiple hypothesis test

• We often test hypotheses involving more than one of the population parameters.

– test a single hypothesis involving more than one of the $\beta_j$.

– test multiple hypotheses (multiple linear restrictions – the F-test)

Testing Multiple Linear Restrictions: The F-Test

• We begin with the leading case of testing whether a set of independent variables has no partial effect on a dependent variable

– we want to test whether a group of variables has no effect on the dependent variable.

– the null hypothesis is that a set of variables has no effect on the dependent variable once another set of variables has been controlled for

Testing Multiple Linear Restrictions: The F-Test

• Consider the following model that explains major league baseball players' salaries:

$\log(\mathit{salary}) = \beta_0 + \beta_1 \mathit{years} + \beta_2 \mathit{gamesyr} + \beta_3 \mathit{bavg} + \beta_4 \mathit{hrunsyr} + \beta_5 \mathit{rbisyr} + u$    (4.28)

salary is the 1993 total salary, years is years in the league, gamesyr is average games played per year, bavg is career batting average (for example, bavg = 250), hrunsyr is home runs per year, and rbisyr is runs batted in per year.

Testing Multiple Linear Restrictions: The F-Test

• Suppose we want to test the null hypothesis that, once years in the league and games per year have been controlled for, the statistics measuring performance (bavg, hrunsyr, and rbisyr) have no effect on salary.

• Essentially, the null hypothesis states that productivity, as measured by these performance statistics, has no effect on salary

Testing Multiple Linear Restrictions: The F-Test

• In terms of the parameters of the model, the null hypothesis is stated as

H0: $\beta_3 = 0,\ \beta_4 = 0,\ \beta_5 = 0$    (4.29)

The null (4.29) constitutes three exclusion restrictions: if (4.29) is true, then bavg, hrunsyr, and rbisyr have no effect on log(salary) after years and gamesyr have been controlled for, and therefore should be excluded from the model.

Testing Multiple Linear Restrictions: The F-Test

• What should be the alternative to (4.29)? If what we have in mind is that “performance statistics matter, even after controlling for years in the league and games per year,” then the appropriate alternative is simply

H1: H0 is not true    (4.30)

The alternative (4.30) holds if at least one of $\beta_3$, $\beta_4$, or $\beta_5$ is different from zero. (Any or all could be different from zero.)

Testing Multiple Linear Restrictions: The F-Test

• The steps to be done:

1. Conduct a regression for the unrestricted model (in the example above, the model with all performance variables included)

• Note the SSR and $R^2$

2. Conduct a regression for the restricted model (in the example above, the model with none of the performance variables included)

• Note the SSR and $R^2$

3. Calculate the F-statistic, that is

$F = \dfrac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}$

where $q$ is the numerator degrees of freedom $= df_r - df_{ur}$ and $n-k-1$ is the denominator degrees of freedom $= df_{ur}$
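Step 3 can be sketched as a small function. The function name is hypothetical, and the SSR values, sample size, and number of regressors used in the example are illustrative assumptions, not figures taken from the slides.

```python
def f_stat_from_ssr(ssr_restricted: float, ssr_unrestricted: float,
                    q: int, df_unrestricted: int) -> float:
    """F statistic for q exclusion restrictions, computed from the two SSRs.

    F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_unrestricted
    return numerator / denominator


# Illustrative inputs (assumptions, not values shown on the slides)
n, k, q = 353, 5, 3          # sample size, regressors in the unrestricted model, restrictions
ssr_ur = 183.186             # SSR from the unrestricted regression
ssr_r = 198.311              # SSR from the restricted regression

f_stat = f_stat_from_ssr(ssr_r, ssr_ur, q, n - k - 1)
print(f"F statistic: {f_stat:.2f}")
```

Because the restricted model can never fit better than the unrestricted one, SSR_r is at least as large as SSR_ur, so the F statistic is never negative.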

Testing Multiple Linear Restrictions: The F-Test

• The outcome of the joint test may seem surprising in light of the insignificant t-statistics for the three variables.

• What is happening is that the two variables hrunsyr and rbisyr are highly correlated, and this multicollinearity makes it difficult to uncover the partial effect of each variable

The R-Squared Form of the F Statistic

• It is often more convenient to have a form of the F statistic that can be computed using the R-squareds from the restricted and unrestricted models.

• One reason for this is that the R-squared is always between zero and one, whereas the SSRs can be very large depending on the unit of