Week 9 - Heteroskedasticity and Generalised Least Squares (no Week 8 due to the midsem)
Heteroskedasticity - variance of the error term is not constant, but varies across observations
(violating an MLR assumption): Var(u_i | x_1, ..., x_k) = σ_i².
E.g. the variance of wages may depend on education.
With heteroskedasticity, the OLS estimators are still unbiased and consistent, but the OLS standard errors of the estimators are biased.
This invalidates t and F significance tests, and
means that OLS is not the “best” (i.e. most efficient) estimator.
But, we can adjust the t and F stats so they are robust to heteroskedasticity. The robust t-test is valid asymptotically. The usual F-test no longer works, but heteroskedasticity-robust versions are available in econometric software (regress y x1 x2, robust).
SLR with heteroskedasticity
● β̂_1 = β_1 + Σ(x_i − x̄)u_i / SST_x, remembering that SST_x = Σ(x_i − x̄)².
● Var(β̂_1) = Σ(x_i − x̄)² σ_i² / SST_x², remembering that σ_i² = Var(u_i).
● Consistent estimator (i.e. approaches the true value as n approaches infinity) for Var(β̂_1) under heteroskedasticity (replace σ_i² with û_i²): V̂ar(β̂_1) = Σ(x_i − x̄)² û_i² / SST_x².
○ This estimator applies under homoskedasticity as well.
○ If σ_i² = σ² (constant), the formula simplifies to the usual: Var(β̂_1) = σ²/SST_x.
MLR with heteroskedasticity
Consistent estimator in MLR for Var(β̂_j) under heteroskedasticity: V̂ar(β̂_j) = [Σ_i r̂_ij² û_i² / SSR_j²] × n/(n−k−1), where r̂_ij is the i-th residual and SSR_j is the sum of squared residuals from regressing x_j on all the other independent variables.
● √V̂ar(β̂_j) = “heteroskedasticity-robust standard error for β̂_j” (a.k.a. White/Huber/Eicker standard errors)
● Used for inference
Considerations for heteroskedasticity
● All formulas are only valid in large samples.
● Heteroskedasticity robust standard errors may be larger or smaller than their non-robust counterparts. The differences are often small in practice.
● We can get better estimates if the form of heteroskedasticity is known
○ Use prior research (e.g. studies suggest that high-wage, highly educated individuals may also have more variability in wages than individuals with lower wages & education)
○ Plot the residuals
● Using the logarithmic transformation for the dependent variable often reduces heteroskedasticity.
[Figure: residual plots - left, log wages; right, level wages]
Formal tests for heteroskedasticity
Test for a null of homoskedasticity: H0: Var(u|X) = σ². Since Var(u|X) = E(u²|X) − [E(u|X)]² = E(u²|X) [since E(u|X) = 0, by the zero conditional mean assumption (MLR Assumption 4)], this is equivalent to H0: E(u²|X) = E(u²) = σ². Essentially, we’re testing whether u² is related to any of the x’s.
If we are prepared to assume the relationship between u² and each x is linear, we can test for linear heteroskedasticity using the Breusch-Pagan test: given u² = δ_0 + δ_1 x_1 + ... + δ_k x_k + v, test H0: δ_1 = δ_2 = ... = δ_k = 0. We can’t observe the true errors, so we substitute the squared residuals, û², for u².
Breusch-Pagan test
1. Outline the test: û² = δ_0 + δ_1 x_1 + ... + δ_k x_k + v; H0: δ_1 = δ_2 = ... = δ_k = 0.
2. Regress y on the x’s and obtain the residuals, û (predict [variable name for residuals], residuals).
3. Regress û² on all the x_j’s to get R²_û² (i.e. how well our model, û² = δ_0 + δ_1 x_1 + ... + δ_k x_k + v, explains variation in the squared residuals).
4(a). Obtain the F-statistic: F = (R²_û² / k) / ((1 − R²_û²) / (n − k − 1)) (a large test statistic / a high R-squared is evidence against the null hypothesis). OR:
4(b). Obtain the Lagrange multiplier statistic: LM = n·R²_û² ~ χ²_k (“chi-squared” distribution with k degrees of freedom).
5(a). Reject the null hypothesis if the F-statistic is greater than the critical F-value with (k, n-k-1) degrees of freedom. OR:
5(b). Reject the null hypothesis if the LM statistic exceeds the critical value of the χ² distribution with k degrees of freedom at the desired significance level.
Rejecting the null indicates the presence of linear heteroskedasticity.
Example. Regressing û² on the x’s to get R²_û²: regress uhat2 x1 x2 ...
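The Breusch-Pagan steps above can be sketched end-to-end in plain numpy; this is a hypothetical simulated example (the data-generating process and the helper name ols_resid_r2 are invented), using step 4(b), the LM version:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * np.exp(0.5 * x1)   # error variance depends on x1
y = 0.5 + 1.5 * x1 + 0.8 * x2 + u

def ols_resid_r2(X, y):
    """Return residuals and R-squared from an OLS fit of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return resid, r2

X = np.column_stack([np.ones(n), x1, x2])

# Step 2: residuals from the original regression of y on the x's
uhat, _ = ols_resid_r2(X, y)

# Step 3: regress the squared residuals on the x's
_, r2_aux = ols_resid_r2(X, uhat ** 2)

# Step 4(b): LM statistic, chi-squared with k = 2 degrees of freedom
k = 2
LM = n * r2_aux
crit_5pct = 5.991  # chi-squared(2) critical value at the 5% level
print(f"LM = {LM:.2f}, reject H0 (homoskedasticity)? {LM > crit_5pct}")
```

Because the simulated error variance rises with x1, the auxiliary R² is well away from zero and the test rejects the null.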
Using the log form for y in this case reduces heteroskedasticity.
The White test allows us to detect non-linear heteroskedasticity by using squares and cross products of all the x’s, e.g. û² = δ_0 + δ_1 x_1 + δ_2 x_2 + δ_3 x_1² + δ_4 x_2² + δ_5 x_1 x_2 + v. However, the test could involve a huge number of regressors, using up degrees of freedom and making it hard to apply with small sample sizes.
The modified White test uses the fact that the fitted values are linear functions of the x’s: ŷ_i = β̂_0 + β̂_1 x_1i + ... + β̂_k x_ki. So, if we square the fitted values, we have a function of all the squares and cross-products of the x’s!
Modified White test
1. Outline the test: û_i² = δ_0 + δ_1 ŷ_i + δ_2 ŷ_i² + v; H0: δ_1 = δ_2 = 0; Ha: heteroskedasticity of an unknown form.
2. Estimate the original regression of y on the x’s and obtain the residuals and fitted values.
3. Conduct the auxiliary regression û_i² = δ_0 + δ_1 ŷ_i + δ_2 ŷ_i² + v, and obtain the R²_û².
4a. Form the F-statistic, and reject if it exceeds the critical value of the F distribution with k (= 2) numerator DoF and n−k−1 (= n−3) denominator DoF. OR:
4b. Form the LM statistic, and reject if it exceeds the critical value of the χ² distribution with k (= 2) DoF at the desired significance level.
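The modified White steps can be sketched under the same caveats as before (simulated data and illustrative names; nothing beyond the auxiliary regression itself comes from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * (1 + x1 ** 2)      # non-linear heteroskedasticity
y = 1.0 + x1 + x2 + u

# Step 2: original regression, residuals and fitted values
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ beta
uhat2 = (y - yhat) ** 2

# Step 3: auxiliary regression of uhat^2 on yhat and yhat^2
Z = np.column_stack([np.ones(n), yhat, yhat ** 2])
gamma = np.linalg.lstsq(Z, uhat2, rcond=None)[0]
resid = uhat2 - Z @ gamma
r2_aux = 1 - resid @ resid / ((uhat2 - uhat2.mean()) @ (uhat2 - uhat2.mean()))

# Step 4b: LM statistic with k = 2 DoF (delta_1 = delta_2 = 0 under H0)
LM = n * r2_aux
crit_5pct = 5.991  # chi-squared(2) critical value at the 5% level
print(f"LM = {LM:.2f}, reject H0? {LM > crit_5pct}")
```

Because the only auxiliary regressors are ŷ and ŷ², the test always costs just 2 degrees of freedom, no matter how many x’s the original model has.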
How to deal with heteroskedasticity? (1) Estimate the model by OLS and calculate robust standard errors (OLS is still unbiased and consistent, but inefficient); (2) Use an alternative estimator that directly accounts for the heteroskedasticity.
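The notes do not spell out option (2) here, but the standard such estimator is weighted least squares (WLS), a special case of GLS: if Var(u|x) = σ²·h(x) with h known, dividing every variable (including the constant) by √h(x) makes the transformed error homoskedastic, and OLS on the transformed data is efficient. A minimal sketch assuming h(x) = x (an invented example):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.uniform(1, 5, size=n)
# Suppose Var(u|x) = sigma^2 * x: variance proportional to x, form known
u = rng.normal(size=n) * np.sqrt(x)
y = 2.0 + 3.0 * x + u

# WLS: divide every variable (including the constant) by sqrt(h(x)) = sqrt(x),
# so the transformed error u / sqrt(x) is homoskedastic; then run OLS.
w = 1.0 / np.sqrt(x)
X = np.column_stack([np.ones(n), x])
Xw = X * w[:, None]
yw = y * w

beta_wls = np.linalg.lstsq(Xw, yw, rcond=None)[0]
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("WLS:", beta_wls, "OLS:", beta_ols)
```

OLS remains unbiased and consistent here, but WLS is more efficient because it down-weights the high-variance observations.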