Estimation, inference and testing - Essays on Impulse Response Inference in Vector Autoregressi

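Now the response of $y_t$ to the structural shocks under the alternative set of parameters is

$$\Phi^*_y(h) = \begin{bmatrix}\Phi^*_{g_1}(h) & \Phi^*_{g_2}(h)\end{bmatrix} = \begin{bmatrix}\Phi_{g_1}(h) & \Phi_{g_2}(h)\end{bmatrix} = \Phi_y(h) \tag{2.10}$$

so that $\Phi_y(h)$ is point identified. In addition, we have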

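$$\Phi^*_x(h) = \Lambda H^{-1}\begin{bmatrix}\Phi^*_{f_1}(h) & \Phi^*_{f_2}(h)\end{bmatrix} + \Gamma\begin{bmatrix}\Phi^*_{g_1}(h) & \Phi^*_{g_2}(h)\end{bmatrix} = \Lambda\begin{bmatrix}\Phi_{f_1}(h) & \Phi_{f_2}(h)\end{bmatrix} + \Gamma\begin{bmatrix}\Phi_{g_1}(h) & \Phi_{g_2}(h)\end{bmatrix} = \Phi_x(h) \tag{2.11}$$

so that the approximate response $\Phi_x(h)$ is point identified.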

The above result shows that the structural impulse response of the observed factor $y_t$, along with the approximate responses of the information variables $X_t$, is affected by neither the rotation introduced in the principal component first step nor the potential permutation and scaling arising from identification through non-Gaussianity. However, we stress that this is a statistical identification result rather than an economic one, as the structural shocks do not necessarily carry any specific economic meaning. To properly label the structural shocks, as suggested in Lanne et al. (2017), one could inspect the shapes and signs of the estimated impulse responses and refer to economic theory and intuition. This process is further discussed and demonstrated in Section 2.5.
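The rotation part of this invariance can be checked numerically. The following is a minimal sketch, assuming a VAR(1) with randomly drawn (hypothetical) parameters; it covers only the rotation $Q = \mathrm{diag}(H, I_{r_2})$ and not the permutation and scaling of the shocks. The names `irf`, `Lam`, `Gam` and the dimensions are illustrative choices, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(1)
r1, r2, n_x, horizon = 2, 1, 5, 8   # latent factors, observed factors, info variables
r = r1 + r2

# Hypothetical "true" parameters of a VAR(1) in g_t = (f_t', y_t')'.
A = 0.5 * rng.standard_normal((r, r)) / np.sqrt(r)
B = np.eye(r) + 0.3 * rng.standard_normal((r, r))
Lam = rng.standard_normal((n_x, r1))   # loadings on the latent factors
Gam = rng.standard_normal((n_x, r2))   # loadings on the observed factors

# Rotation of the latent block only: Q = blockdiag(H, I_{r2}).
H = rng.standard_normal((r1, r1)) + 2.0 * np.eye(r1)
Q = np.block([[H, np.zeros((r1, r2))],
              [np.zeros((r2, r1)), np.eye(r2)]])

def irf(A, B, h):
    """Response of g_{t+h} to the structural shocks in a VAR(1): A^h B."""
    return np.linalg.matrix_power(A, h) @ B

# Parameters implied by the rotated factors g*_t = Q g_t.
A_star = Q @ A @ np.linalg.inv(Q)
B_star = Q @ B
Lam_star = Lam @ np.linalg.inv(H)

max_gap_y = 0.0
max_gap_x = 0.0
for h in range(horizon + 1):
    Phi, Phi_star = irf(A, B, h), irf(A_star, B_star, h)
    # Rows r1: are the responses of the observed factor y_t.
    max_gap_y = max(max_gap_y, np.abs(Phi_star[r1:] - Phi[r1:]).max())
    # Approximate responses of the information variables X_t.
    resp_x = Lam @ Phi[:r1] + Gam @ Phi[r1:]
    resp_x_star = Lam_star @ Phi_star[:r1] + Gam @ Phi_star[r1:]
    max_gap_x = max(max_gap_x, np.abs(resp_x_star - resp_x).max())

print(max_gap_y, max_gap_x)
```

Both gaps should be zero up to floating-point error, because the lower block of $Q$ is the identity and $\Lambda H^{-1}$ undoes the rotation of the latent block.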

arranged in descending order, where the normalization $T^{-1}\tilde F'\tilde F = I_{r_1}$ is used. It is commonly known in the factor model literature (see Bai and Ng (2006) and Gonçalves and Perron (2014)) that $\tilde f_t$ actually estimates a rotation of the true factors, i.e. $H f_t$, where

$$H = \tilde V^{-1}\,\frac{\tilde F' M_Y F}{T}\,\frac{\Lambda'\Lambda}{N} \tag{2.13}$$

where $\tilde V$ is the $r_1\times r_1$ diagonal matrix whose diagonal elements are the $r_1$ largest eigenvalues of the matrix $T^{-1}N^{-1}M_Y X X' M_Y'$, in descending order. Correspondingly, let $\tilde g_t = (\tilde f_t', y_t')'$ and $\tilde G = (\tilde g_1,\dots,\tilde g_T)' = (\tilde F, Y)$. Then $\tilde g_t = Q g_t$, where

$$Q = \begin{bmatrix} H & 0 \\ 0 & I_{r_2} \end{bmatrix} \tag{2.14}$$

with $H$ defined above.
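To make the rotation concrete, here is a minimal sketch of principal component factor estimation under the normalization $T^{-1}\tilde F'\tilde F = I$. It makes two simplifying assumptions that are not in the text: there are no observed factors (so $M_Y = I$), and the panel is noise-free, in which case $\tilde F = F H'$ holds exactly rather than asymptotically. The function name `pc_factors` and the simulated data are hypothetical.

```python
import numpy as np

def pc_factors(X, r):
    """Principal component factors from a T x N panel X.

    Returns F_tilde (T x r, normalized so F_tilde' F_tilde / T = I) and the
    diagonal matrix V of the r largest eigenvalues of X X' / (N T).
    """
    T, N = X.shape
    eigval, eigvec = np.linalg.eigh(X @ X.T / (N * T))
    top = np.argsort(eigval)[::-1][:r]      # descending order
    F_tilde = np.sqrt(T) * eigvec[:, top]
    V = np.diag(eigval[top])
    return F_tilde, V

rng = np.random.default_rng(0)
T, N, r = 200, 50, 2
F = rng.standard_normal((T, r))             # true factors (simulated)
Lam = rng.standard_normal((N, r))           # true loadings (simulated)
X = F @ Lam.T                               # noise-free panel, an assumption

F_tilde, V = pc_factors(X, r)
# Rotation matrix as in eq. (2.13), with M_Y = I.
H = np.linalg.inv(V) @ (F_tilde.T @ F / T) @ (Lam.T @ Lam / N)

print(np.abs(F_tilde - F @ H.T).max())
```

With idiosyncratic noise added to `X`, the equality becomes an approximation whose error shrinks as $N$ and $T$ grow.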

After obtaining an estimate of the unobserved factors, following Lanne et al. (2017), we consider maximum likelihood estimation of the VAR model. To this end, assume that $\varepsilon_{i,t}$ has density $\sigma_i^{-1}f_i(\sigma_i^{-1}x;\lambda_i)$, where $\lambda_i$ is a vector of parameters.

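Let $\theta = (\sigma',\lambda',\pi',\beta')'$, where $\sigma = (\sigma_1,\dots,\sigma_r)'$, $\lambda = (\lambda_1',\dots,\lambda_r')'$, $\pi = \mathrm{vec}(A_1,\dots,A_p)$ and $\beta = \mathrm{vecd}^\circ(B)$, where $\mathrm{vecd}^\circ$ denotes the vector obtained by removing the diagonal elements of $B$ from $\mathrm{vec}(B)$. We consider the VAR parameter space $\Theta = \Theta_\sigma\times\Theta_\lambda\times\Theta_\pi\times\Theta_\beta$, where

1. $\Theta_\sigma = \mathbb{R}^r_+$.

2. $\Theta_\lambda = \Theta_{\lambda_1}\times\dots\times\Theta_{\lambda_r}$, with each $\Theta_{\lambda_i}$ being an open subset of $\mathbb{R}^{d_i}$. Let $d = \sum_{i=1}^r d_i$.

3. $\Theta_\pi\subset\mathbb{R}^{n^2p}$ is an open set such that Assumption 2.4 is satisfied. In addition, for any $\pi\in\Theta_\pi$, the corresponding $\pi^* = \mathrm{vec}(A_1^*,\dots,A_p^*)$ with $A_i^* = QA_iQ^{-1}$ satisfies $\pi^*\in\Theta_\pi$ almost surely.

4. $\Theta_\beta = \mathrm{vecd}^\circ(\mathcal{B})$, where $\mathcal{B}$ is an open set of structural matrices satisfying a normalization scheme under which the diagonal elements equal 1. For an example of such a set (and the corresponding normalization scheme), see Lanne et al. (2017).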
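The operator $\mathrm{vecd}^\circ$ used to define $\beta$ can be illustrated with a short sketch. The helper name `vecd_off` and the example matrix are hypothetical; the only assumption is that $\mathrm{vec}$ stacks columns, as is conventional.

```python
import numpy as np

def vecd_off(B):
    """Stack the columns of a square matrix B and drop its diagonal entries."""
    B = np.asarray(B)
    r = B.shape[0]
    off_diagonal = ~np.eye(r, dtype=bool)
    # Column-major ("F") order matches the usual vec operator.
    return B.flatten(order="F")[off_diagonal.flatten(order="F")]

# A structural matrix normalized to have unit diagonal.
B = np.array([[1.0, 0.2, 0.3],
              [0.4, 1.0, 0.5],
              [0.6, 0.7, 1.0]])
beta = vecd_off(B)
print(beta)
```

For an $r\times r$ matrix the result has $r(r-1)$ elements, which is the number of free parameters in $B$ once the diagonal is fixed at 1.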

The parameter space assumptions guarantee that $\theta_0$ is an interior point. To guarantee the validity of maximum likelihood estimation, we need to be more specific about the distribution of the structural shocks, on which we impose the following assumption:

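Assumption 2.5. (Density) The following conditions hold for $f_1,\dots,f_r$:

(1) $f_i(x;\lambda_i) > 0$, and $f_i$ is twice continuously differentiable with respect to $(x;\lambda_i)$.

(2) $f_{i,x}(x;\lambda_{i,0}) > 0$, $f_{i,xx}(x;\lambda_{i,0})$ and $f_{i,x\lambda}(x;\lambda_{i,0})$ are integrable with respect to $x$.

(3) $\int \sup_{\lambda_i\in\Theta_{\lambda_i}}\|f_{i,\lambda_i}(x;\lambda_i)\|\,dx < \infty$ and $\int \sup_{\lambda_i\in\Theta_{\lambda_i}}\|f_{i,\lambda\lambda}(x;\lambda_i)\|\,dx < \infty$.

(4) The matrix $E[l_{\theta,t}(\theta_0,G)\,l_{\theta,t}'(\theta_0,G)]$ is positive definite.

(5) For all $x\in\mathbb{R}$,

$$\frac{x^2 f_{i,x}^2(x;\lambda_{i,0})}{f_i^2(x;\lambda_{i,0})} \quad\text{and}\quad \frac{\|f_{i,\lambda}(x;\lambda_{i,0})\|^2}{f_i^2(x;\lambda_{i,0})}$$

are dominated by $c_0(1+|x|^{c_1})$ with $c_0\ge 0$, $0\le c_1\le 2$ and $\int |x|^{c_1} f_i(x;\lambda_{i,0})\,dx < \infty$.

(6) For all $x\in\mathbb{R}$ and $\lambda_i\in\Theta_{\lambda_i}$,

$$\frac{f_{i,x}^2(x;\lambda_i)}{f_i^2(x;\lambda_i)} \quad\text{and}\quad \frac{f_{i,xx}(x;\lambda_i)}{f_i(x;\lambda_i)}$$

are dominated by $a_0(1+|x|^{a_1})$,

$$\frac{f_{i,x\lambda}(x;\lambda_i)}{f_i(x;\lambda_i)} \quad\text{and}\quad \frac{f_{i,x}(x;\lambda_i)}{f_i(x;\lambda_i)}\,\frac{f_{i,\lambda}(x;\lambda_i)}{f_i(x;\lambda_i)}$$

are dominated by $a_0(1+|x|^{a_2})$, and

$$\frac{f_{i,\lambda}(x;\lambda_i)}{f_i^2(x;\lambda_i)} \quad\text{and}\quad \frac{f_{i,\lambda\lambda}(x;\lambda_i)}{f_i(x;\lambda_i)}$$

are dominated by $a_0(1+|x|^{a_3})$, with $a_0\ge 0$, $0\le a_1,a_2,a_3\le 2$, and

$$\int \left(|x|^{2+a_1}+|x|^{1+a_2}+|x|^{a_3}\right) f_i(x;\lambda_{i,0})\,dx < \infty.$$

(7) For all $x\in\mathbb{R}$ and $\lambda_i\in\Theta_{\lambda_i}$,

$$\frac{f_{i,xxx}(x;\lambda_i)}{f_i(x;\lambda_i)},\quad \frac{f_{i,xx}(x;\lambda_i)\,f_{i,x}(x;\lambda_i)}{f_i^2(x;\lambda_i)} \quad\text{and}\quad \frac{f_{i,x}(x;\lambda_i)}{f_i(x;\lambda_i)}$$

are dominated by $b_0(1+|x|^{b_1})$, and

$$\frac{f_{i,\lambda\lambda x}(x;\lambda_i)}{f_i(x;\lambda_i)},\quad \frac{f_{i,x}(x;\lambda_i)\,f_{i,\lambda\lambda}(x;\lambda_i)}{f_i^2(x;\lambda_i)},\quad \frac{f_{i,\lambda}(x;\lambda_i)\,f_{i,x\lambda}'(x;\lambda_i)}{f_i^2(x;\lambda_i)} \quad\text{and}\quad \frac{f_{i,x}(x;\lambda_i)\,f_{i,\lambda}(x;\lambda_i)\,f_{i,\lambda}(x;\lambda_i)'}{f_i^3(x;\lambda_i)}$$

are dominated by $b_0(1+|x|^{b_2})$, with $b_0\ge 0$, $0\le b_1,b_2\le 2$, and

$$\int \left(|x|^{b_1}+|x|^{b_2}\right) f_i(x;\lambda_{i,0})\,dx < \infty.$$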

These assumptions impose restrictions on the smoothness and tail behavior of the density function, and are similar to Assumptions 4 and 5 in Lanne et al. (2017), with the additional requirement that $a_1,a_2,a_3,b_1,b_2,c_1\le 2$. More specifically, (1), (2), (3) and (4) guarantee differentiability and the information matrix equality. Conditions (5), (6) and (7) ensure the proper asymptotic behavior of the score vector and Hessian matrix, and also allow for bounding the estimation error introduced by the principal component estimation of the factors. While the additional requirement on the magnitude of the parameters might seem restrictive, notice that these parameters represent a trade-off between smoothness and heavy tails. Larger parameters allow the density and its derivatives to fluctuate more, but at the price of more restrictive tail behavior. In macroeconomic applications, most non-Gaussian distributional assumptions are used to accommodate heavy tails of the variables and shocks, which implies that the magnitude of the aforementioned parameters should not be too large.

In fact, Assumption 2.5 is satisfied by many commonly used non-Gaussian density functions, such as the Student's t-distribution (with degrees of freedom larger than 4) and the logistic density.
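As a small illustration of what the domination conditions look like for the Student's t case, the sketch below checks numerically that the squared score ratio $f_{i,x}^2/f_i^2$ is bounded by a constant, which corresponds to taking $a_1 = 0$ in condition (6). It uses the standard (unscaled) t density with a fixed degrees-of-freedom value, both assumptions made only for this example, and it does not verify the remaining conditions of Assumption 2.5.

```python
import numpy as np

def t_score(x, nu):
    """f_x(x) / f(x) for a Student's t density with nu degrees of freedom."""
    return -(nu + 1.0) * x / (nu + x**2)

nu = 5.0
x = np.linspace(-50.0, 50.0, 100001)
ratio_sq = t_score(x, nu) ** 2

# |x / (nu + x^2)| is maximized at x = sqrt(nu), giving this analytic bound.
bound = (nu + 1.0) ** 2 / (4.0 * nu)
print(ratio_sq.max(), bound)
```

The ratio stays below the analytic bound on the whole grid, so a dominating function of the form $a_0(1+|x|^{a_1})$ with $a_1 = 0$ exists for this particular ratio.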

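We now introduce some additional notation required by the principal component estimation. Taking into account the rotation introduced in factor estimation, let

$$V = \operatorname{plim}\tilde V,\qquad \Sigma_{F\tilde F} = \operatorname{plim}\frac{\tilde F' M_Y F}{T},\qquad H_0 = \operatorname{plim} H = V^{-1}\Sigma_{F\tilde F}\Sigma_\Lambda,\qquad Q_0 = \operatorname{plim} Q = \begin{bmatrix} H_0 & 0 \\ 0 & I_{r_2}\end{bmatrix} \tag{2.15}$$

Let $A^*_{i,0} = Q_0A_{i,0}Q_0^{-1}$. While $Q_0B_0$ may not be an element of $\mathcal{B}$, by Propositions 1 and 2 in Lanne et al. (2017) there must exist diagonal matrices $D_1$, $D_2$ and a permutation matrix $P$ such that $B_0^* = Q_0B_0D_1PD_2\in\mathcal{B}$. Correspondingly, let $\sigma_0^* = D_2^{-1}P'D_1^{-1}\sigma_0$ denote the standard error of $D_2^{-1}P'D_1^{-1}\varepsilon_t$, and let $\lambda_0^*$ denote the parameter vector reassembled after the permutation $P'$. Now let $\pi_0^* = \mathrm{vec}(A^*_{1,0},\dots,A^*_{p,0})$, $\beta_0^* = \mathrm{vecd}^\circ(B_0^*)$ and $\theta_0^* = (\sigma_0^{*\prime},\lambda_0^{*\prime},\pi_0^{*\prime},\beta_0^{*\prime})'$.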

We now show that the MLE exists and consistently estimates this transformation of the true VAR parameters. The log-likelihood function is

$$L_T(\theta,G) = \frac{1}{T}\sum_{t=p+1}^{T} l_t(\theta,\mathbf{g}_t) = \frac{1}{T}\sum_{t=p+1}^{T}\left[\sum_{i=1}^{r}\log f_i\!\left(\sigma_i^{-1}\iota_iB^{-1}e_t(\theta,\mathbf{g}_t);\lambda_i\right) - \log(\det(B)) - \sum_{i=1}^{r}\log\sigma_i\right] \tag{2.16}$$

where $\mathbf{g}_t = [g_t',\dots,g_{t-p}']'$, $\iota_i$ is a $1\times r$ vector with its $i$-th element equal to 1 and all other elements equal to 0, and

$$e_t(\theta,\mathbf{g}_t) = g_t - A_1g_{t-1} - \dots - A_pg_{t-p} \tag{2.17}$$

For simplicity of notation, here we ignore the fact that $e_t$ does not depend on the elements of $\theta$ other than $\pi$.
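The objective in (2.16) can be sketched in code as follows. This is only an illustration under stated assumptions: the shock densities $f_i$ are taken to be standard Student's t densities with known degrees of freedom, the determinant term is evaluated as $\log|\det(B)|$, and the function names `t_logpdf` and `avg_loglik` are hypothetical.

```python
import math
import numpy as np

def t_logpdf(x, nu):
    """Log density of Student's t; column i of x uses nu[i] degrees of freedom."""
    nu = np.asarray(nu, dtype=float)
    const = np.array([math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                      - 0.5 * math.log(v * math.pi) for v in nu])
    return const - (nu + 1) / 2 * np.log1p(x**2 / nu)

def avg_loglik(G, A_list, B, sigma, nu):
    """Average log-likelihood in the spirit of eq. (2.16).

    G is T x r with rows g_t'; A_list holds A_1, ..., A_p; sigma and nu are
    length-r arrays of scales and (assumed) t degrees of freedom.
    """
    T, r = G.shape
    p = len(A_list)
    # Reduced-form residuals e_t = g_t - A_1 g_{t-1} - ... - A_p g_{t-p}.
    E = G[p:].copy()
    for j, A in enumerate(A_list, start=1):
        E -= G[p - j:T - j] @ A.T
    # Row t holds sigma_i^{-1} iota_i B^{-1} e_t for i = 1, ..., r.
    eps = np.linalg.solve(B, E.T).T / sigma
    _, logabsdet = np.linalg.slogdet(B)
    per_t = t_logpdf(eps, nu).sum(axis=1) - logabsdet - np.log(sigma).sum()
    return per_t.sum() / T

rng = np.random.default_rng(0)
G = rng.standard_normal((60, 3))
A_list = [0.3 * rng.standard_normal((3, 3)), 0.1 * rng.standard_normal((3, 3))]
B = np.eye(3) + 0.2 * rng.standard_normal((3, 3))
sigma = np.array([1.0, 0.5, 2.0])
nu = np.array([5.0, 6.0, 7.0])
print(avg_loglik(G, A_list, B, sigma, nu))
```

A useful sanity check is that rotating the data to $GQ'$ while mapping $A_i \mapsto QA_iQ^{-1}$ and $B \mapsto QB$ changes the value only by a constant, $\tfrac{T-p}{T}\log|\det Q|$, which does not depend on the remaining parameters.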

Theorem 2.2. (Consistency) Under Assumptions 2.1, 2.2, 2.3, 2.4 and 2.5, there exists a double-indexed sequence of solutions $\hat\theta_{N,T}$ to the FOCs $L_{\theta,T}(\theta,\tilde G) = 0$ such that $\hat\theta_{N,T}-\theta_0^*\xrightarrow{p}0$ as $N,T\to\infty$.

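Proof. Consider a sphere $\Theta^*_\varepsilon = \{\theta^*: \|\theta^*-\theta_0^*\| = \varepsilon\}$ for some sufficiently small $\varepsilon$. We have

$$L_T(\theta^*,\tilde G) - L_T(\theta_0^*,\tilde G) = \left[L_T(\theta^*,\tilde G) - L_T(\theta^*,GQ_0')\right] + \left[L_T(\theta_0^*,GQ_0') - L_T(\theta_0^*,\tilde G)\right] + \left[L_T(\theta^*,GQ_0') - L_T(\theta_0^*,GQ_0')\right] \tag{2.18}$$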

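Let $\tilde{\mathbf{g}}_t = [\tilde g_t',\dots,\tilde g_{t-p}']'$ and $\mathbf{g}_t^* = [(Q_0g_t)',\dots,(Q_0g_{t-p})']'$.

(i) For the first term, a mean value expansion gives

$$L_T(\theta^*,\tilde G) - L_T(\theta^*,GQ_0') = \frac{1}{T}\sum_{t=p+1}^{T}\left[\mathrm{vec}(\tilde{\mathbf{g}}_t-\mathbf{g}_t^*)'\,l_{g,t}(\theta^*,\mathbf{g}_t^\dagger)\right] \tag{2.19}$$

where $\mathbf{g}_t^\dagger$ is between $\tilde{\mathbf{g}}_t$ and $\mathbf{g}_t^*$. Let

$$x_i(\theta,\mathbf{g}_t) = \sigma_i(\theta)^{-1}\iota_iB(\theta)^{-1}e_t(\pi(\theta),\mathbf{g}_t) \tag{2.20}$$

Then

$$\mathrm{vec}(\tilde{\mathbf{g}}_t-\mathbf{g}_t^*)'\,l_{g,t}(\theta^*,\mathbf{g}_t^\dagger) = (\tilde{\mathbf{g}}_t-\mathbf{g}_t^*)'\begin{bmatrix} I_r \\ -A_1^{*\prime} \\ \vdots \\ -A_p^{*\prime}\end{bmatrix} B^{*-1\prime}\begin{bmatrix} \sigma_1^{*-1}\dfrac{f_{1,x}(x_1(\theta^*,\mathbf{g}_t^\dagger);\lambda_1^*)}{f_1(x_1(\theta^*,\mathbf{g}_t^\dagger);\lambda_1^*)} \\ \vdots \\ \sigma_r^{*-1}\dfrac{f_{r,x}(x_r(\theta^*,\mathbf{g}_t^\dagger);\lambda_r^*)}{f_r(x_r(\theta^*,\mathbf{g}_t^\dagger);\lambda_r^*)}\end{bmatrix} \tag{2.21}$$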

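We have

$$\left|L_T(\theta^*,\tilde G) - L_T(\theta^*,GQ_0')\right| \le \left(\frac{1}{T}\sum_{t=p+1}^{T}\left(\sum_{s=t-p}^{t}\|\tilde g_s-Q_0g_s\|^2\right)\right)^{1/2}\left(\frac{1}{T}\sum_{t=p+1}^{T}\|l_{g,t}(\theta^*,\mathbf{g}_t^\dagger)\|^2\right)^{1/2} \tag{2.22}$$

By Lemma A.1 in Bai (2003), we have

$$\frac{1}{T}\sum_{t=1}^{T}\|\tilde g_t-Q_0g_t\|^2 = O_p(\delta_{NT}^{-2}) \tag{2.23}$$

where $\delta_{NT} = \min\{\sqrt{N},\sqrt{T}\}$. Meanwhile, by Assumption 2.5, for any $t$, we have

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\|l_{g,t}(\theta^*,\mathbf{g}_t^\dagger)\|^2 \le \mathrm{Const}\cdot\sup_{\theta^*\in\Theta^*_\varepsilon}\max_{1\le i\le r}\left(1+|x_i(\theta^*,\mathbf{g}_t^\dagger)|^{a_1}\right) \tag{2.24}$$

Notice that

$$x_i(\theta^*,\mathbf{g}_t^\dagger) = x_i(\theta^*,\mathbf{g}_t^*) + \sigma_i^{*-1}\iota_iB^{*-1}\left(g_t^\dagger-Q_0g_t-A_1^*(g_{t-1}^\dagger-Q_0g_{t-1})-\dots-A_p^*(g_{t-p}^\dagger-Q_0g_{t-p})\right) \tag{2.25}$$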

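Then by the Minkowski inequality we have

$$\begin{aligned}\sup_{\theta^*\in\Theta^*_\varepsilon}\frac{1}{T}\sum_{t=1}^{T}|x_i(\theta^*,\mathbf{g}_t^\dagger)|^2 &\le \sup_{\theta^*\in\Theta^*_\varepsilon}\left[\left(\frac{1}{T}\sum_{t=1}^{T}|x_i(\theta^*,\mathbf{g}_t^*)|^2\right)^{1/2} + \mathrm{Const}\cdot\left(\frac{1}{T}\sum_{t=1}^{T}\|g_t^\dagger-Q_0g_t\|^2\right)^{1/2}\right]^2 \\ &\le \sup_{\theta^*\in\Theta^*_\varepsilon}\left[\left(\frac{1}{T}\sum_{t=1}^{T}|x_i(\theta^*,\mathbf{g}_t^*)|^2\right)^{1/2} + \mathrm{Const}\cdot\left(\frac{1}{T}\sum_{t=1}^{T}\|\tilde g_t-Q_0g_t\|^2\right)^{1/2}\right]^2\end{aligned} \tag{2.26}$$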

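From the expression for $x_i(\theta^*,\mathbf{g}_t^*)$, due to the boundedness of the coefficients on $\mathbf{g}_t^*$ when $\theta^*\in\Theta^*_\varepsilon$, it is straightforward to verify that $\{x_i(\theta^*,\mathbf{g}_t^*)\,|\,\theta^*\in\Theta^*_\varepsilon\}$ is equicontinuous (as functions of $\mathbf{g}_t^*$). Then by Assumption 5 and Theorem 3.2 in Rao (1962), we can establish that

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\frac{1}{T}\sum_{t=1}^{T}|x_i(\theta^*,\mathbf{g}_t^*)|^2 = O_p(1) \tag{2.27}$$

Combining the above, we have

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\frac{1}{T}\sum_{t=p+1}^{T}\|l_{g,t}(\theta^*,\mathbf{g}_t^\dagger)\|^2 = O_p(1) \tag{2.28}$$

Therefore

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\left|L_T(\theta^*,\tilde G) - L_T(\theta^*,GQ_0')\right| = O_p(\delta_{NT}^{-2}) = o_p(1) \tag{2.29}$$

as $N,T\to\infty$.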
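The rate in (2.23), on which part (i) relies, can be illustrated with a small simulation. The sketch below is not part of the proof; it assumes no observed factors (so $M_Y = I$), i.i.d. standard normal factors, loadings and idiosyncratic errors, and a hypothetical helper name `factor_error`. It reports the average squared distance between the estimated factors and the rotated true factors for growing $N = T$.

```python
import numpy as np

def factor_error(N, T, r=2, seed=0):
    """Average squared error (1/T) sum_t ||f_tilde_t - H f_t||^2 in one simulated panel."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((T, r))
    Lam = rng.standard_normal((N, r))
    X = F @ Lam.T + rng.standard_normal((T, N))   # panel with idiosyncratic noise
    eigval, eigvec = np.linalg.eigh(X @ X.T / (N * T))
    top = np.argsort(eigval)[::-1][:r]
    F_tilde = np.sqrt(T) * eigvec[:, top]
    V = np.diag(eigval[top])
    # Rotation matrix as in eq. (2.13), with M_Y = I.
    H = np.linalg.inv(V) @ (F_tilde.T @ F / T) @ (Lam.T @ Lam / N)
    return np.mean(np.sum((F_tilde - F @ H.T) ** 2, axis=1))

errors = {n: factor_error(n, n) for n in (20, 80, 320)}
for n, err in errors.items():
    print(n, err)
```

Under these assumptions the error should fall roughly in proportion to $1/\min\{N,T\}$, in line with $\delta_{NT}^{-2}$.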

(ii) Similarly, for the second term we have

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\left|L_T(\theta_0^*,GQ_0') - L_T(\theta_0^*,\tilde G)\right| = o_p(1) \tag{2.30}$$

as $N,T\to\infty$.

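(iii) For the third term, notice that this term is not affected by the estimation of the factors and does not depend on $N$. Then, following the proof of Theorem 1 in Lanne et al. (2017), we have the Taylor expansion

$$\begin{aligned} L_T(\theta^*,GQ_0') - L_T(\theta_0^*,GQ_0') ={}& (\theta^*-\theta_0^*)'L_{\theta,T}(\theta_0^*,GQ_0') \\ &+ \tfrac{1}{2}(\theta^*-\theta_0^*)'\left[L_{\theta\theta,T}(\theta^\dagger,\mathbf{g}_t^*) - E\,l_{\theta\theta,t}(\theta^\dagger,\mathbf{g}_t^*)\right](\theta^*-\theta_0^*) \\ &+ \tfrac{1}{2}(\theta^*-\theta_0^*)'\left[E\,l_{\theta\theta,t}(\theta^\dagger,\mathbf{g}_t^*) - E\,l_{\theta\theta,t}(\theta_0^*,\mathbf{g}_t^*)\right](\theta^*-\theta_0^*) \\ &+ \tfrac{1}{2}(\theta^*-\theta_0^*)'E\left[l_{\theta\theta,t}(\theta_0^*,\mathbf{g}_t^*)\right](\theta^*-\theta_0^*) \\ ={}& S_1+S_2+S_3+S_4 \end{aligned} \tag{2.31}$$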

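For $S_1$, notice that

$$e_t(\pi_0^*,\mathbf{g}_t^*) = Q_0g_t - A^*_{1,0}Q_0g_{t-1} - \dots - A^*_{p,0}Q_0g_{t-p} = Q_0e_t(\pi_0,\mathbf{g}_t) \tag{2.32}$$

so we have

$$\begin{aligned} L_T(\theta_0^*,GQ_0') &= \frac{1}{T}\sum_{t=p+1}^{T}l_t(\theta_0^*,\mathbf{g}_t^*) \\ &= \frac{1}{T}\sum_{t=p+1}^{T}\left[\sum_{i=1}^{r}\log f_i\!\left(\sigma_i^{*-1}\iota_iD_2^{-1}P'D_1^{-1}B_0^{-1}Q_0^{-1}Q_0e_t(\pi_0,G);\lambda_i^*\right) - \log|\det(Q_0B_0D_1PD_2)| - \sum_{i=1}^{r}\log\sigma_i^*\right] \\ &= \frac{1}{T}\sum_{t=p+1}^{T}\left(l_t(\theta_0,\mathbf{g}_t) - \mathrm{Const}\right) \\ &= L_T(\theta_0,G) - \frac{T-p}{T}\cdot\mathrm{Const} \end{aligned} \tag{2.33}$$

which implies that $l_{\theta,t}(\theta_0^*,\mathbf{g}_t^*) = l_{\theta,t}(\theta_0,\mathbf{g}_t)$ and $L_{\theta,T}(\theta_0^*,GQ_0') = L_{\theta,T}(\theta_0,G)$. Hence, we have

$$L_{\theta,T}(\theta_0^*,GQ_0') = L_{\theta,T}(\theta_0,G)\xrightarrow{p}0 \tag{2.34}$$

which gives $S_1\xrightarrow{p}0$ and thus

$$\sup_{\theta^*\in\Theta^*_\varepsilon}S_1\xrightarrow{p}0 \tag{2.35}$$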

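For $S_2$, for a compact and convex set $\Theta_0^*\subset\Theta$ that contains $\theta_0^*$ as an interior point, similar to Lemma 2 in Lanne et al. (2017), we can establish that

$$\sup_{\theta^*\in\Theta_0^*}\left\|\frac{1}{T}\sum_{t=p+1}^{T}l_{\theta\theta,t}(\theta^*,\mathbf{g}_t^*) - E[l_{\theta\theta,t}(\theta^*,\mathbf{g}_t^*)]\right\|\xrightarrow{p}0 \tag{2.36}$$

and that $E[l_{\theta\theta,t}(\theta^*,\mathbf{g}_t^*)]$ is continuous in $\theta^*$ at $\theta_0^*$. This implies that for $\varepsilon$ small enough that $\Theta^*_\varepsilon\subset\Theta_0^*$, we have

$$\sup_{\theta^*\in\Theta^*_\varepsilon}S_2\xrightarrow{p}0 \tag{2.37}$$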

For $S_3$ and $S_4$, we use the continuity of $E[l_{\theta\theta,t}(\theta^*,\mathbf{g}_t^*)]$ established above and the negative definiteness of $E[l_{\theta\theta,t}(\theta_0^*,\mathbf{g}_t^*)]$. Then we have

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\left(S_3+S_4\right) < -\mathrm{Const}\cdot\varepsilon^2 \tag{2.38}$$

Combining (i), (ii) and (iii), we have

$$P\left(\sup_{\theta^*\in\Theta^*_\varepsilon}L_T(\theta^*,\tilde G) < L_T(\theta_0^*,\tilde G)\right)\to 1 \tag{2.39}$$

as $N,T\to\infty$, which implies the existence and the consistency of a solution sequence; see Serfling (1980), pp. 147-148 and Shao (2003), p. 290.

To perform inference, we now derive the asymptotic distribution of the parameter estimates.

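Theorem 2.3. (Asymptotic Normality) Under Assumptions 2.1, 2.2, 2.3, 2.4 and 2.5, we have

$$\sqrt{T}(\hat\theta^*-\theta_0^*)\xrightarrow{d}N\!\left(0,\left(-E[l_{\theta\theta,t}(\theta_0,\mathbf{g}_t)]\right)^{-1}\right) \tag{2.40}$$

as $N,T\to\infty$ and $\sqrt{T}/N\to 0$. In addition, $\left(-L_{\theta\theta}(\tilde\theta^*,\tilde G)\right)^{-1}$ is a consistent estimator of $\left(-E[l_{\theta\theta,t}(\theta_0,\mathbf{g}_t)]\right)^{-1}$.

Proof. Let $\hat\theta^*$ be the MLE. We have

$$\sqrt{T}L_{\theta,T}(\hat\theta^*,\tilde G) = \sqrt{T}L_{\theta,T}(\theta_0^*,\tilde G) + \sqrt{T}L_{\theta\theta,T}(\tilde\theta^*,\tilde G)(\hat\theta^*-\theta_0^*) \tag{2.41}$$

Since $\hat\theta^*$ solves the first-order conditions, the left-hand side is zero, so

$$\sqrt{T}(\hat\theta^*-\theta_0^*) = \left(-L_{\theta\theta,T}(\tilde\theta^*,\tilde G)\right)^{-1}\sqrt{T}L_{\theta,T}(\theta_0^*,\tilde G) \tag{2.42}$$

For $L_{\theta\theta,T}(\tilde\theta^*,\tilde G)$, we have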

$$L_{\theta\theta,T}(\tilde\theta^*,\tilde G) = \frac{1}{T}\sum_{t=p+1}^{T}l_{\theta\theta,t}(\theta^*,\tilde{\mathbf{g}}_t) = \frac{1}{T}\sum_{t=p+1}^{T}\begin{bmatrix} l_{\sigma\sigma,t} & l_{\sigma\lambda,t} & l_{\sigma\pi,t} & l_{\sigma\beta,t} \\ l_{\sigma\lambda,t}' & l_{\lambda\lambda,t} & l_{\lambda\pi,t} & l_{\lambda\beta,t} \\ l_{\sigma\pi,t}' & l_{\lambda\pi,t}' & l_{\pi\pi,t} & l_{\pi\beta,t} \\ l_{\sigma\beta,t}' & l_{\lambda\beta,t}' & l_{\pi\beta,t}' & l_{\beta\beta,t}\end{bmatrix} \tag{2.43}$$

where every block is evaluated at $(\theta^*,\tilde{\mathbf{g}}_t)$.

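Following Lanne et al. (2017), let

$$\Sigma = \mathrm{diag}(\sigma_1,\dots,\sigma_r) \tag{2.44}$$

$$g_t = (1,g_{t-1}',\dots,g_{t-p}')' \tag{2.45}$$

and

$$e_t(\theta,\mathbf{g}_t) = \mathrm{diag}(e_{1,t}(\theta,\mathbf{g}_t),\dots,e_{r,t}(\theta,\mathbf{g}_t)) \tag{2.46}$$

$$e_{x,t}(\theta,\mathbf{g}_t) = \mathrm{diag}(e_{1,x,t}(\theta,\mathbf{g}_t),\dots,e_{r,x,t}(\theta,\mathbf{g}_t)) \tag{2.47}$$

where

$$e_{i,t}(\theta,\mathbf{g}_t) = \iota_i'B^{-1}e_t(\theta,\mathbf{g}_t) \tag{2.48}$$

$$e_{i,x,t}(\theta,\mathbf{g}_t) = \frac{f_{i,x}(\sigma_i^{-1}e_{i,t}(\theta,\mathbf{g}_t);\lambda_i)}{f_i(\sigma_i^{-1}e_{i,t}(\theta,\mathbf{g}_t);\lambda_i)} \tag{2.49}$$

In addition, define the block diagonal matrices

$$e_{xx,t}(\theta,\mathbf{g}_t) = \mathrm{diag}(e_{1,xx,t}(\theta,\mathbf{g}_t),\dots,e_{r,xx,t}(\theta,\mathbf{g}_t)) \tag{2.50}$$

$$e_{\lambda\lambda,t}(\theta,\mathbf{g}_t) = \mathrm{diag}(e_{1,\lambda\lambda,t}(\theta,\mathbf{g}_t),\dots,e_{r,\lambda\lambda,t}(\theta,\mathbf{g}_t)) \tag{2.51}$$

$$e_{x\lambda,t}(\theta,\mathbf{g}_t) = \mathrm{diag}(e_{1,x\lambda,t}(\theta,\mathbf{g}_t),\dots,e_{r,x\lambda,t}(\theta,\mathbf{g}_t)) \tag{2.52}$$

where, writing $f_i$, $f_{i,x}$, $f_{i,xx}$, $f_{i,\lambda}$, $f_{i,x\lambda}$ and $f_{i,\lambda\lambda}$ for these functions evaluated at $(\sigma_i^{-1}e_{i,t}(\theta,\mathbf{g}_t);\lambda_i)$,

$$e_{i,xx,t}(\theta,\mathbf{g}_t) = \frac{f_{i,xx}}{f_i} - \left(\frac{f_{i,x}}{f_i}\right)^2 \tag{2.53}$$

$$e_{i,x\lambda,t}(\theta,\mathbf{g}_t) = \frac{f_{i,x\lambda}'}{f_i} - \frac{f_{i,x}}{f_i}\,\frac{f_{i,\lambda}'}{f_i} \tag{2.54}$$

$$e_{i,\lambda\lambda,t}(\theta,\mathbf{g}_t) = \frac{f_{i,\lambda_i\lambda_i}}{f_i} - \frac{f_{i,\lambda}\,f_{i,\lambda}'}{f_i^2} \tag{2.55}$$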

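Then the blocks of the Hessian are as follows, with $e_t$, $e_{x,t}$, $e_{xx,t}$, $e_{x\lambda,t}$ and $e_{\lambda\lambda,t}$ all evaluated at $(\theta,\mathbf{g}_t)$:

$$l_{\sigma\sigma,t} = \Sigma^{-2} + 2\Sigma^{-3}e_te_{x,t} + \Sigma^{-4}e_t^2e_{xx,t} \tag{2.56}$$

$$l_{\sigma\lambda,t} = -\Sigma^{-2}e_te_{x\lambda,t} \tag{2.57}$$

$$l_{\sigma\pi,t} = g_t\otimes B^{-1\prime}\left(\Sigma^{-2}e_{x,t} + \Sigma^{-3}e_te_{xx,t}\right) \tag{2.58}$$

$$l_{\sigma\beta,t} = H'(B^{-1}\otimes B^{-1\prime})\left[e_t\otimes\left(\Sigma^{-2}e_{x,t} + \Sigma^{-3}e_te_{xx,t}\right)\right] \tag{2.59}$$

$$l_{\lambda\lambda,t} = e_{\lambda\lambda,t} \tag{2.60}$$

$$l_{\lambda\pi,t} = -(I_{rp+1}\otimes B^{-1\prime}\Sigma^{-1})(g_t\otimes e_{x\lambda,t}) \tag{2.61}$$

$$l_{\lambda\beta,t} = -H'(B^{-1}\otimes B^{-1\prime}\Sigma^{-1})(e_t\otimes e_{x\lambda,t}) \tag{2.62}$$

$$l_{\pi\pi,t} = (I_r\otimes B^{-1\prime}\Sigma^{-1})(g_tg_t'\otimes e_{xx,t})(I_r\otimes B^{-1\prime}\Sigma^{-1})' \tag{2.63}$$

$$l_{\pi\beta,t} = g_t\otimes\left[\left(I_r\otimes\mathrm{diag}(e_{x,t})'\right)(B^{-1\prime}\otimes\Sigma^{-1}B^{-1})H\right] + g_t\otimes\left[B^{-1\prime}\Sigma^{-1}\left(e_t'\otimes e_{xx,t}\right)(B^{-1\prime}\otimes\Sigma^{-1}B^{-1})H\right] \tag{2.64}$$

$$\begin{aligned} l_{\beta\beta,t} ={}& H'(B^{-1}\otimes B^{-1\prime})(e_te_t'\otimes e_{xx,t})(B^{-1\prime}\otimes\Sigma^{-1}B^{-1})H \\ &+ H'(B^{-1}\otimes I_r)\left[\left(e_t\,\mathrm{diag}(e_{x,t})'\right)\otimes I_r\right](\Sigma^{-1}B^{-1}\otimes B^{-1\prime}K_{rr}H) \\ &+ H'K_{rr}(B^{-1\prime}\Sigma^{-1}\otimes B^{-1})\left[\left(\mathrm{diag}(e_{x,t})\,e_t\right)\otimes I_r\right](B^{-1\prime}\otimes I_r)H \\ &+ H'(B^{-1}\otimes B^{-1\prime})K_{rr}H \end{aligned} \tag{2.65}$$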

where $H$ is a special elimination matrix defined following footnote 8 of Lanne et al. (2017), and $K_{rr}$ is the commutation matrix such that $K_{rr}\mathrm{vec}(A) = \mathrm{vec}(A')$ for any $r\times r$ matrix $A$.
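The defining property of the commutation matrix is easy to verify numerically. The sketch below builds $K_{rr}$ directly from its definition, assuming the column-stacking convention for $\mathrm{vec}$; the helper names `commutation_matrix` and `vec` are illustrative. The elimination matrix $H$ is not reproduced here because its definition is given only by reference.

```python
import numpy as np

def vec(A):
    """Column-stacking vec operator."""
    return np.asarray(A).flatten(order="F")

def commutation_matrix(r):
    """K_rr such that K_rr @ vec(A) == vec(A.T) for any r x r matrix A."""
    K = np.zeros((r * r, r * r))
    for i in range(r):
        for j in range(r):
            # vec(A.T)[i*r + j] = A[i, j] = vec(A)[j*r + i]
            K[i * r + j, j * r + i] = 1.0
    return K

r = 3
K = commutation_matrix(r)
A = np.arange(1.0, r * r + 1).reshape(r, r)
print(np.array_equal(K @ vec(A), vec(A.T)))
```

Because transposing twice returns the original matrix, $K_{rr}$ is its own inverse, and it is also symmetric.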

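Now notice that

$$\mathrm{vech}\!\left(L_{\theta\theta,T}(\tilde\theta^*,\tilde G) - L_{\theta\theta,T}(\tilde\theta^*,GQ_0')\right) = \frac{1}{T}\sum_{t=p+1}^{T}\left[\mathrm{vec}(\tilde{\mathbf{g}}_t-Q_0\mathbf{g}_t)'\,\frac{\partial\,\mathrm{vech}(l_{\theta\theta,t}(\theta^*,\mathbf{g}_t^\dagger))}{\partial\mathbf{g}}\right] \tag{2.66}$$

To characterize $\partial\,\mathrm{vech}(l_{\theta\theta,t}(\theta^*,\mathbf{g}_t^\dagger))/\partial\mathbf{g}$, consider the derivative of an arbitrary element from each block.

First, for $l_{\sigma\sigma,t}$, let

$$e_{i,xxx,t}(\theta,\mathbf{g}_t) = \frac{f_{i,xxx}}{f_i} + 2\left(\frac{f_{i,x}}{f_i}\right)^3 - 3\,\frac{f_{i,xx}\,f_{i,x}}{f_i^2} \tag{2.67}$$

where $f_i$ and its derivatives are evaluated at $(\sigma_i^{-1}e_{i,t}(\theta,\mathbf{g}_t);\lambda_i)$.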

Then consider the $i$-th diagonal element of $l_{\sigma\sigma,t}(\theta^*,\mathbf{g}_t^\dagger)$. With $e_{i,t}$, $e_{i,x,t}$, $e_{i,xx,t}$ and $e_{i,xxx,t}$ all evaluated at $(\theta^*,\mathbf{g}_t^\dagger)$, we have

$$\frac{\partial}{\partial g_{t-s}}\left[\sigma_i^{*-2} + 2\sigma_i^{*-3}e_{i,t}e_{i,x,t} + \sigma_i^{*-4}e_{i,t}^2e_{i,xx,t}\right] = \left[2\sigma_i^{*-3}e_{i,x,t} + 2\sigma_i^{*-4}e_{i,t}e_{i,xx,t} + 2\sigma_i^{*-4}e_{i,t}e_{i,xx,t} + \sigma_i^{*-5}e_{i,t}^2e_{i,xxx,t}\right]\iota_i'B^{*-1}(-A_s^*) \tag{2.68}$$

For $l_{\sigma\lambda,t}$, consider the $j$-th element of its $i$-th diagonal block. With $e_{i,t}$ and $e^{(j)}_{i,x\lambda_i,t}$ evaluated at $(\theta^*,\mathbf{g}_t^\dagger)$, and $f_i$ and its derivatives evaluated at $(\sigma_i^{*-1}e_{i,t}(\theta^*,\mathbf{g}_t^\dagger);\lambda_i)$, we have

$$\begin{aligned}\frac{\partial}{\partial g_{t-s}}\left[-\sigma_i^{*-2}e_{i,t}\,e^{(j)}_{i,x\lambda_i,t}\right] &= \left[e^{(j)}_{i,x\lambda_i,t} - e_{i,t}\left(\frac{f^{(j)}_{i,x\lambda}}{f_i} - \frac{f_{i,xx}\,f^{(j)}_{i,\lambda}}{f_i^2} + 2\,\frac{f_{i,x}^2\,f^{(j)}_{i,\lambda}}{f_i^3}\right)\right]\sigma^{*-3}\iota_i'B^*A_s^* \\ &\equiv \left[e^{(j)}_{i,x\lambda_i,t} - e_{i,t}\,e^{(j)}_{i,xx\lambda}\right]\sigma^{*-3}\iota_i'B^*A_s^* \end{aligned} \tag{2.69}$$

Now $l_{\sigma\pi,t}$ is a Kronecker product of two matrices, so we consider the product of arbitrary elements of each matrix. With $e_{n,t}$, $e_{n,x,t}$, $e_{n,xx,t}$ and $e_{n,xxx,t}$ evaluated at $(\theta^*,\mathbf{g}_t^\dagger)$, we have

$$\begin{aligned}\frac{\partial}{\partial g_{t-s}}\left[g^{\dagger(m)}_{t-j}(B^{*-1})^{(n,m)}\left(\sigma_n^{*-2}e_{n,x,t} + \sigma_n^{*-3}e_{n,t}e_{n,xx,t}\right)\right] ={}& (B^{*-1})^{(n,m)}\left(\sigma_n^{*-2}e_{n,x,t} + \sigma_n^{*-3}e_{n,t}e_{n,xx,t}\right)\iota_m \\ &+ g^{\dagger(m)}_{t-j}\left(2\sigma_n^{*-3}e_{n,xx,t} + \sigma_n^{*-4}e_{n,t}e_{n,xxx,t}\right)\iota_n'B^{*-1}(-A_s^*) \end{aligned} \tag{2.70}$$

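For $l_{\sigma\beta,t}$, notice that $H'(B^{*-1}\otimes B^{*-1\prime})$ does not depend on $G$, so we consider an element from

$$\left[e_t(\theta^*,\mathbf{g}_t^\dagger)\otimes\left(\Sigma^{*-2}e_{x,t}(\theta^*,\mathbf{g}_t^\dagger) + \Sigma^{*-3}e_t(\theta^*,\mathbf{g}_t^\dagger)e_{xx,t}(\theta^*,\mathbf{g}_t^\dagger)\right)\right].$$

Similarly to the above, with all terms evaluated at $(\theta^*,\mathbf{g}_t^\dagger)$, we obtain

$$\begin{aligned}\frac{\partial}{\partial g_{t-s}}\left[e_{i,t}\left(\sigma_j^{*-2}e_{j,x,t} + \sigma_j^{*-3}e_{j,t}e_{j,xx,t}\right)\right] ={}& \left(\sigma_j^{*-2}e_{j,x,t} + \sigma_j^{*-3}e_{j,t}e_{j,xx,t}\right)\iota_i'(-A_s^*) \\ &+ e_{i,t}\left(2\sigma_j^{*-3}e_{j,xx,t} + \sigma_j^{*-4}e_{j,t}e_{j,xxx,t}\right)\iota_j'B^{*-1}(-A_s^*) \end{aligned} \tag{2.71}$$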

For $l_{\lambda\lambda,t}$, consider the $(j,k)$-th element of its $i$-th diagonal block. With $f_i$ and its derivatives evaluated at $(\sigma_i^{*-1}e_{i,t}(\theta^*,\mathbf{g}_t^\dagger);\lambda_i)$, we have

$$\begin{aligned}\frac{\partial}{\partial g_{t-s}}e^{(j,k)}_{i,\lambda_i\lambda_i,t}(\theta^*,\mathbf{g}_t^\dagger) &= \left(\frac{f^{(j,k)}_{i,x\lambda\lambda}}{f_i} - \frac{f^{(j,k)}_{i,\lambda\lambda}\,f_{i,x}}{f_i^2} - \frac{1}{f_i^2}\left[f^{(j)}_{i,\lambda}f^{(k)}_{i,x\lambda} + f^{(j)}_{i,x\lambda}f^{(k)}_{i,\lambda}\right] - \frac{f^{(j)}_{i,\lambda}f^{(k)}_{i,\lambda}f_{i,x}}{f_i^3}\right)\cdot\sigma^{*-1}\iota_i'B^{*-1}(-A_s^*) \\ &\equiv e^{(j,k)}_{i,\lambda_i\lambda_ix}\,\sigma^{*-1}\iota_i'B^{*-1}(-A_s^*) \end{aligned} \tag{2.72}$$

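For $l_{\lambda\pi,t}$, the first term $(I_{rp+1}\otimes B^{*-1}\Sigma^{*-1})$ does not depend on $G$, thus we consider an element of the second term $g_t\otimes e_{x\lambda,t}(\theta^*,\mathbf{g}_t^\dagger)$:

$$\frac{\partial}{\partial g_{t-s}}\left[g^{\dagger(m)}_{t-i}\,e^{(n)}_{j,x\lambda_j,t}(\theta^*,\mathbf{g}_t^\dagger)\right] = 1\{i=s\}\,e^{(n)}_{j,x\lambda_j,t}(\theta^*,\mathbf{g}_t^\dagger)\,\iota_m + \sigma_j^{*-1}g^{\dagger(m)}_{t-i}\,e^{(n)}_{j,xx\lambda_j,t}(\theta^*,\mathbf{g}_t^\dagger)\,\iota_j'B^{*-1}(-A_s^*) \tag{2.73}$$

where $1\{\cdot\}$ is the indicator function.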

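For $l_{\lambda\beta,t}$, again $-H'(B^{*-1}\otimes B^{*-1\prime}\Sigma^{*-1})$ does not depend on $G$, so we consider an element of $e_t(\theta^*,\mathbf{g}_t^\dagger)\otimes e_{x\lambda,t}(\theta^*,\mathbf{g}_t^\dagger)$.

For $l_{\pi\pi,t}$, notice that $(I_r\otimes B^{*-1\prime}\Sigma^{*-1})$ does not depend on $G$, so we consider an element from the middle component $(g_tg_t'\otimes e_{xx,t}(\theta,\mathbf{g}_t))$:

$$\frac{\partial}{\partial g_{t-s}}\left[g^{\dagger(m)}_{t-i}g^{\dagger(n)}_{t-j}e_{i,xx,t}(\theta^*,\mathbf{g}_t^\dagger)\right] = e_{i,xx,t}(\theta^*,\mathbf{g}_t^\dagger)\left(1\{i=s\}\iota_m + 1\{j=s\}\iota_n\right) + g^{\dagger(m)}_{t-i}g^{\dagger(n)}_{t-j}e_{i,xxx,t}(\theta^*,\mathbf{g}_t^\dagger)\,\sigma^{*-1}\iota_i'B^{*-1}(-A_s^*) \tag{2.75}$$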

For $l_{\pi\beta,t}$, again $(B^{*-1\prime}\otimes\Sigma^{*-1}B^{*-1})H$ and $B^{*-1\prime}\Sigma^{*-1\prime}$ do not depend on $G$. Then for the first term, with all terms evaluated at $(\theta^*,\mathbf{g}_t^\dagger)$, we have

$$\frac{\partial}{\partial g_{t-s}}\left[g^{\dagger(m)}_{t-i}e_{j,x,t}\right] = e_{j,x,t}\,1\{i=s\}\,\iota_m + g^{\dagger(m)}_{t-i}e_{j,xx,t}\,\sigma^{*-1}\iota_j'B^{*-1}(-A_s^*) \tag{2.76}$$

For the second term, we have

$$\frac{\partial}{\partial g_{t-s}}\left[g^{\dagger(m)}_{t-i}e_{j,t}e_{k,xx,t}\right] = e_{j,t}e_{k,xx,t}\,1\{i=s\}\,\iota_m + g^{\dagger(m)}_{t-i}\left[e_{k,xx,t}\,\iota_j' + e_{j,t}e_{k,xxx,t}\,\sigma^{*-1}\iota_k'\right]B^{*-1}(-A_s^*) \tag{2.77}$$

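Now that we have the above characterization of $\partial\,\mathrm{vech}(l_{\theta\theta,t}(\theta^*,\mathbf{g}_t^\dagger))/\partial\mathbf{g}$, we notice that Assumption 2.5 implies bounds on

$$\|e_{i,t}(\theta^*,\mathbf{g}_t^\dagger)\|,\ \|e_{i,x,t}(\theta^*,\mathbf{g}_t^\dagger)\|,\ \|e_{i,xx,t}(\theta^*,\mathbf{g}_t^\dagger)\|,\ \|e_{i,x\lambda_i,t}(\theta^*,\mathbf{g}_t^\dagger)\|,\ \|e_{i,xxx,t}(\theta^*,\mathbf{g}_t^\dagger)\|,\ \|e_{i,xx\lambda_i,t}(\theta^*,\mathbf{g}_t^\dagger)\|\ \text{and}\ \|e_{i,x\lambda_i\lambda_i}(\theta^*,\mathbf{g}_t^\dagger)\| \tag{2.78}$$

of the form

$$\mathrm{Const}\cdot\sup_{\theta^*\in\Theta^*_\varepsilon}\left(1+|x_i(\theta^*,\mathbf{g}_t^\dagger)|^c\right) \tag{2.79}$$

where $c\in\{c_1,a_1,a_2,a_3,b_1,b_2\}$ is the corresponding power.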

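Then, following the same arguments as in eq. (2.22) to eq. (2.29), we obtain

$$L_{\theta\theta,T}(\tilde\theta^*,\tilde G) - L_{\theta\theta,T}(\tilde\theta^*,GQ_0')\xrightarrow{p}0 \tag{2.80}$$

Combined with eq. (2.36), we have

$$L_{\theta\theta,T}(\tilde\theta^*,\tilde G)\xrightarrow{p}E(l_{\theta\theta,t}(\theta_0^*,\mathbf{g}_t^*)) = E(l_{\theta\theta,t}(\theta_0,\mathbf{g}_t)) \tag{2.81}$$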

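Now notice that

$$\sqrt{T}L_{\theta,T}(\theta_0^*,\tilde G) = \left(\sqrt{T}L_{\theta,T}(\theta_0^*,\tilde G) - \sqrt{T}L_{\theta,T}(\theta_0^*,GQ_0')\right) + \sqrt{T}L_{\theta,T}(\theta_0^*,GQ_0') \tag{2.82}$$

For the second term, by Theorem 1 in Lanne et al. (2017), we have

$$\sqrt{T}L_{\theta,T}(\theta_0^*,GQ_0')\xrightarrow{d}N\!\left(0,\left(-E[l_{\theta\theta,t}(\theta_0,\mathbf{g}_t)]\right)^{-1}\right) \tag{2.83}$$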

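Meanwhile, for the first term, recall that we previously showed that

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\left|L_T(\theta^*,\tilde G) - L_T(\theta^*,GQ_0')\right| = O_p(\delta_{NT}^{-2}) \tag{2.84}$$

which implies that

$$\sup_{\theta^*\in\Theta^*_\varepsilon}\left|\sqrt{T}\left(L_T(\theta^*,\tilde G) - L_T(\theta^*,GQ_0')\right)\right| = O_p(\sqrt{T}\delta_{NT}^{-2}) = o_p(1) \tag{2.85}$$

as $N,T\to\infty$ when $\sqrt{T}/N\to 0$. Combining all of the above, we obtain that

$$\sqrt{T}(\hat\theta^*-\theta_0^*)\xrightarrow{d}N\!\left(0,\left(-E[l_{\theta\theta,t}(\theta_0,\mathbf{g}_t)]\right)^{-1}\right) \tag{2.86}$$

as $N,T\to\infty$ and $\sqrt{T}/N\to 0$. To estimate the asymptotic variance, notice that eq. (2.81) implies that

$$\left(-L_{\theta\theta}(\tilde\theta^*,\tilde G)\right)^{-1}\xrightarrow{p}\left(-E(l_{\theta\theta,t}(\theta_0,\mathbf{g}_t))\right)^{-1} \tag{2.87}$$

Therefore $\left(-L_{\theta\theta}(\tilde\theta^*,\tilde G)\right)^{-1}$ is a consistent estimator of the asymptotic variance.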
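In practice the variance estimator in (2.87) is the inverse of the negative Hessian of the average log-likelihood at the estimate. The sketch below approximates that Hessian by central finite differences rather than using the analytic blocks (2.56) to (2.65); this substitution, the step size, and the function names `numerical_hessian` and `asymptotic_cov` are assumptions made for illustration. It is demonstrated on a simple quadratic objective whose Hessian is known.

```python
import numpy as np

def numerical_hessian(fun, theta, step=1e-4):
    """Central finite-difference Hessian of a scalar function fun at theta."""
    theta = np.asarray(theta, dtype=float)
    k = theta.size
    hess = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = step
            ej[j] = step
            hess[i, j] = (fun(theta + ei + ej) - fun(theta + ei - ej)
                          - fun(theta - ei + ej) + fun(theta - ei - ej)) / (4 * step**2)
    return hess

def asymptotic_cov(avg_loglik, theta_hat):
    """Estimate of the asymptotic variance of sqrt(T) (theta_hat - theta_0)."""
    return np.linalg.inv(-numerical_hessian(avg_loglik, theta_hat))

# Toy objective standing in for the average log-likelihood: Hessian is -M.
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])
toy_loglik = lambda th: -0.5 * th @ M @ th
cov = asymptotic_cov(toy_loglik, np.array([0.3, -0.2]))
print(cov)
```

Standard errors for the individual parameters would then be the square roots of the diagonal of `cov / T`.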

The above result, Theorem 2.3, implies that the first step principal component estimation does not affect the asymptotic variance of the second step estimates. This is consistent with the result

shows that under the assumption $\sqrt{T}/N\to c\neq 0$, this result no longer holds and an asymptotic bias is introduced in the least squares second step. For our maximum likelihood second step, a similar result might also hold, but we leave this issue for future work. In our simulations, we evaluate the impact of this assumption in finite samples.

Beyond inference, the above asymptotic normality result also implies that standard testing procedures can be used to test many conventional identification assumptions, as discussed in Lanne et al. (2017). For example, short-run restrictions, which normally involve assuming some elements of $B$ to be zero, can be tested via standard Wald or likelihood ratio tests. Although already stressed in the literature, we reiterate the following remarks because of their importance in empirical research. First, not all restrictions can be tested, since we restrict $B$ to have unit diagonal elements. Second, the test results should be interpreted as holding under a specific ordering, rather than under all orderings. To be more specific, when testing $B(1,3) = B(2,3) = 0$ with three factors in total, the restriction should be interpreted as stating that the first two factors do not respond to the third shock contemporaneously, rather than that there exists a shock that does not affect two of the factors. These problems are no different from those of testing identification restrictions in structural VARs, which are further discussed in Section 4.6 of Lanne et al. (2017). In a FAVAR context, the identification assumptions, while often mathematically identical or similar to the structural VAR identification restrictions, sometimes involve a slightly different economic interpretation due to the factor estimation, and the researcher should treat them with care. We will discuss this point further in Section 2.5.
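A Wald test of two zero restrictions, such as $B(1,3) = B(2,3) = 0$, can be sketched as follows. The sketch assumes that `theta_hat` stacks the estimated parameters, that `cov_hat` estimates the asymptotic variance of $\sqrt{T}(\hat\theta-\theta_0)$, and that `idx` gives the positions of the two restricted elements of $\beta$ within $\theta$; all three names and the numbers in the example are hypothetical. With two restrictions the chi-square survival function has the closed form $\exp(-W/2)$, which avoids any dependence on a statistics library.

```python
import math
import numpy as np

def wald_test_two_restrictions(theta_hat, cov_hat, T, idx):
    """Wald statistic and p-value for H0: theta[idx[0]] = theta[idx[1]] = 0."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    R = np.zeros((2, theta_hat.size))      # selection matrix for the restrictions
    R[0, idx[0]] = 1.0
    R[1, idx[1]] = 1.0
    r_theta = R @ theta_hat
    # Var(theta_hat) is approximately cov_hat / T.
    W = T * r_theta @ np.linalg.solve(R @ cov_hat @ R.T, r_theta)
    p_value = math.exp(-W / 2.0)           # chi-square(2) survival function
    return W, p_value

theta_hat = np.array([0.1, 0.0, -0.2])     # made-up estimates
cov_hat = np.eye(3)                        # made-up asymptotic variance
W, p_value = wald_test_two_restrictions(theta_hat, cov_hat, T=100, idx=(0, 2))
print(W, p_value)
```

As noted in the text, the outcome of such a test is specific to the ordering of the shocks implied by the normalization of $B$.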

From the document Essays on Impulse Response Inference in Vector Autoregressive Models (pages 55-71)