
3.8 Supplementary Appendix

3.8.2 Results on Asymptotic Linear Expansion of Local Polynomial Estimators

3.8.2.2 Asymptotic Linear Expansion of Local Polynomial Estimators

where $H(h)$ is a diagonal matrix whose main diagonal entries are $h^{-|k|}$, for lexicographically ordered $k$ with $0\le|k|\le p$. Here, we have dropped the subscript of $\bar{X}$ to ease the notational burden. We let $\boldsymbol{\iota}(\{S_{d,t}\}_{(d,t)\in\mathcal{S}})=(S_{1,0}',S_{0,1}',S_{0,0}')'$. The local Fisher information matrix evaluated at $x$ can be approximated as
\[
I(x)=\mathrm{diag}(p(x))-p(x)p(x)', \tag{3.8.25}
\]
where $p(x)=(p(1,0,x),p(0,1,x),p(0,0,x))'$. In addition, we define the local Hessian as
\[
\Sigma^{ps}(x)=E\bigl[I(X)\otimes H(h)\bar{X}(x_c)\bar{X}(x_c)'H(h)\tilde{K}^{ps}(X;x,h,\lambda)\bigr].
\]
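To make (3.8.25) concrete, the following minimal Python sketch (our own illustration; the values in `probs` are hypothetical) computes the local Fisher information implied by the three non-base group probabilities:

```python
import numpy as np

# Hypothetical conditional group probabilities p(x) = (p(1,0,x), p(0,1,x), p(0,0,x))'.
# The remaining mass, 1 - probs.sum() = 0.1, belongs to the base (1,1) group.
probs = np.array([0.3, 0.2, 0.4])

# Local Fisher information I(x) = diag(p(x)) - p(x) p(x)', as in (3.8.25).
info = np.diag(probs) - np.outer(probs, probs)

# I(x) is symmetric positive definite whenever all four group probabilities
# are strictly positive, which is what the lower bound (3.8.34) below exploits.
print(np.linalg.eigvalsh(info))  # all eigenvalues strictly positive
```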

With these notations in hand, we can introduce several quantities associated with the linear expansion of the PS estimator. For each $(d,t)\in\mathcal{S}$,
\[
A_{d,t}(W,x)=(e_{3,\iota(d,t)}\otimes e_{N_p,1})'\Sigma^{ps}(x)^{-1}\tilde{\boldsymbol{A}}(W,x,\gamma^*(x)),\qquad
G^{(ps)}_{d,t}(W,x)=e_{3,\iota(d,t)}'I(x)\boldsymbol{A}(W,x),
\]
where $\tilde{\boldsymbol{A}}(W,x,\gamma)=\boldsymbol{\iota}(\{\tilde{A}_{d,t}(W,x,\gamma)\}_{(d,t)\in\mathcal{S}})$ and $\boldsymbol{A}(W,x)=\boldsymbol{\iota}(\{A_{d,t}(W,x)\}_{(d,t)\in\mathcal{S}})$. For the treated group in $t=1$, let $G^{(ps)}_{1,1}(W,x)=-\sum_{(d,t)\in\mathcal{S}}G^{(ps)}_{d,t}(W,x)$. Additionally, for a given observation $X_j$, we define

\[
\begin{aligned}
B^{(ps)}_{n,d,t}(X_j)&=E\bigl[G^{(ps)}_{d,t}(W_i,X_j)\,\big|\,X_j\bigr],\qquad(3.8.26)\\
S^{(ps)}_{n,d,t}(X_j)&=\frac{1}{n-1}\sum_{i\neq j}\Bigl\{G^{(ps)}_{d,t}(W_i,X_j)-E\bigl[G^{(ps)}_{d,t}(W_i,X_j)\,\big|\,X_j\bigr]\Bigr\},\\
R^{(ps)}_{n,d,t}(X_j)&=\hat{p}(d,t,X_j)-p(d,t,X_j)-B^{(ps)}_{n,d,t}(X_j)-S^{(ps)}_{n,d,t}(X_j).
\end{aligned}
\]

The three quantities represent the bias, the first-order stochastic part, and the remainder derived from the decomposition of the PS estimator, respectively.

Focusing on the OR model, for $(d,t)\in\mathcal{S}$, the leave-one-out local polynomial estimator has a closed-form solution given by
\[
\hat{m}_{d,t}(X_j)=\frac{1}{n-1}\sum_{i\neq j}e_{N_q,1}'\hat{\Sigma}^{or}_{d,t}(X_j)^{-1}H(b_{d,t})\bar{X}_i(X_j)I_{d,t,i}Y_i\tilde{K}^{or}(X_i;X_j,b_{d,t},\lambda_{d,t}),
\]
where $\hat{\Sigma}^{or}_{d,t}(X_j)=\frac{1}{n-1}\sum_{i\neq j}I_{d,t,i}H(b_{d,t})\bar{X}_i(X_j)\bar{X}_i(X_j)'H(b_{d,t})\tilde{K}^{or}(X_i;X_j,b_{d,t},\lambda_{d,t})$.
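For intuition, the following self-contained sketch implements the leave-one-out closed form above in a deliberately simplified setting: one continuous covariate, no discrete components (so the $\lambda$-smoothing in $\tilde{K}^{or}$ drops out), a local linear fit ($q=1$), and an Epanechnikov kernel. The data are simulated and the function names are ours, not the chapter's.

```python
import numpy as np

def loo_local_linear(j, X, Y, group, b):
    """Leave-one-out local linear estimate of m_{d,t}(X_j): a kernel-weighted
    least squares fit on the subsample with I_{d,t,i} = 1, excluding i = j.
    Left-multiplying by e_{N_q,1}' extracts the intercept, i.e. the fit at X_j."""
    mask = group & (np.arange(len(X)) != j)
    u = (X[mask] - X[j]) / b
    w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)  # Epanechnikov weights
    Xbar = np.column_stack([np.ones(mask.sum()), u])          # scaled design H(b) Xbar_i
    S = Xbar.T @ (w[:, None] * Xbar)                          # sample analogue of Sigma^or
    s = Xbar.T @ (w * Y[mask])
    return np.linalg.solve(S, s)[0]                           # intercept = m-hat at X_j

rng = np.random.default_rng(0)
n = 500
X = rng.uniform(-1.0, 1.0, n)
group = rng.random(n) < 0.5                                   # stands in for I_{d,t,i}
Y = np.sin(2.0 * X) + 0.1 * rng.standard_normal(n)
print(loo_local_linear(0, X, Y, group, b=0.3), np.sin(2.0 * X[0]))
```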

Analogous to the PS case, we use $B^{(or)}_{n,d,t}$, $S^{(or)}_{n,d,t}$, and $R^{(or)}_{n,d,t}$ to denote the bias, the first-order stochastic part, and the remainder, respectively. For a given observation $X_j$, these quantities are specified as
\[
\begin{aligned}
B^{(or)}_{n,d,t}(X_j)&=E\bigl[G^{(or)}_{d,t}(W_i,X_j)\,\big|\,X_j\bigr],\\
S^{(or)}_{n,d,t}(X_j)&=\frac{1}{n-1}\sum_{i\neq j}\Bigl\{G^{(or)}_{d,t}(W_i,X_j)-E\bigl[G^{(or)}_{d,t}(W_i,X_j)\,\big|\,X_j\bigr]\Bigr\},\\
R^{(or)}_{n,d,t}(X_j)&=\hat{m}_{d,t}(X_j)-m_{d,t}(X_j)-B^{(or)}_{n,d,t}(X_j)-S^{(or)}_{n,d,t}(X_j),
\end{aligned}
\]

where
\[
\begin{aligned}
G^{(or)}_{d,t}(W_i,X_j)&=e_{N_q,1}'\Sigma^{or}_{d,t}(X_j)^{-1}H(b_{d,t})\bar{X}_i(X_j)I_{d,t,i}\,\xi^{or}_{d,t,i}(X_j)\,\tilde{K}^{or}(X_i;X_j,b_{d,t},\lambda_{d,t}),\\
\Sigma^{or}_{d,t}(x)&=E\bigl[I_{d,t}H(b_{d,t})\bar{X}(x_c)\bar{X}(x_c)'H(b_{d,t})\tilde{K}^{or}(X;x,b_{d,t},\lambda_{d,t})\bigr],\\
\xi^{or}_{d,t}(x)&=I_{d,t}\bigl(Y-\bar{X}(x)'\beta_{d,t}\bigr).
\end{aligned}
\]

Lemma 3.4 Suppose Assumptions 3.1, 3.2, and 3.5 are satisfied. In addition, Assumptions 3.5.2(ii) and 3.5.5(iv)–(vii) hold for $(d,t)=(1,1)$. Then, for $(d,t)\in\mathcal{S}$,
\[
\begin{aligned}
\sup_{j\in\mathbb{N}_n}\bigl\|B^{(ps)}_{n,d,t}(X_j)\bigr\|&=O_p\bigl(h^{p+1}+\lambda_o+\lambda_u\bigr), &&(3.8.27)\\
\sup_{j\in\mathbb{N}_n}\bigl\|S^{(ps)}_{n,d,t}(X_j)\bigr\|&=O_p\Bigl(\sqrt{\log n/(nh^{\upsilon_c})}\Bigr), &&(3.8.28)\\
\sup_{j\in\mathbb{N}_n}\bigl\|R^{(ps)}_{n,d,t}(X_j)\bigr\|&=O_p\Bigl(\bigl(h^{p+1}+\lambda_o+\lambda_u+\sqrt{\log n/(nh^{\upsilon_c})}\bigr)^2\Bigr), &&(3.8.29)\\
\sup_{j\in\mathbb{N}_n}\bigl\|B^{(or)}_{n,d,t}(X_j)\bigr\|&=O_p\bigl(b_{d,t}^{q+1}+\lambda_{d,t,o}+\lambda_{d,t,u}\bigr),\\
\sup_{j\in\mathbb{N}_n}\bigl\|S^{(or)}_{n,d,t}(X_j)\bigr\|&=O_p\Bigl(\sqrt{\log n/(nb_{d,t}^{\upsilon_c})}\Bigr),\\
\sup_{j\in\mathbb{N}_n}\bigl\|R^{(or)}_{n,d,t}(X_j)\bigr\|&=O_p\Bigl(\bigl(b_{d,t}^{q+1}+\lambda_{d,t,o}+\lambda_{d,t,u}+\sqrt{\log n/(nb_{d,t}^{\upsilon_c})}\bigr)^2\Bigr).
\end{aligned}
\]

Before stating the proof, we introduce some additional notation. Since the kernel functions $K$ and $L$ are supported on $[-1,1]^{\upsilon_c}$, the effective support of $K((\cdot-x_c)/h)$ is $S_{x_c,h}=\{z:x_c+hz\in\mathcal{X}\}\cap[-1,1]^{\upsilon_c}$. When $S_{x_c,h}=[-1,1]^{\upsilon_c}$, $x$ is an interior point; otherwise, $x$ lies close to the boundary. For any measurable set $S\subset[-1,1]^{\upsilon_c}$, let $\nu_k(S)=\int_S u^kK(u)\,du$ and $\kappa_k(S)=\int_S u^kK^2(u)\,du$. Now we let the $N_\ell\times N_\ell$ matrices $Q_\ell(x_c)$ and $T_\ell(x_c)$, and the $N_\ell\times n_k$ matrix $M_{\ell,k}(x_c)$, be defined as
\[
Q_\ell(x_c)=\begin{pmatrix}Q^{(0,0)}(S_{x_c,h})&\cdots&Q^{(0,\ell)}(S_{x_c,h})\\\vdots&\ddots&\vdots\\Q^{(\ell,0)}(S_{x_c,h})&\cdots&Q^{(\ell,\ell)}(S_{x_c,h})\end{pmatrix},\tag{3.8.30}
\]
\[
T_\ell(x_c)=\begin{pmatrix}T^{(0,0)}(S_{x_c,h})&\cdots&T^{(0,\ell)}(S_{x_c,h})\\\vdots&\ddots&\vdots\\T^{(\ell,0)}(S_{x_c,h})&\cdots&T^{(\ell,\ell)}(S_{x_c,h})\end{pmatrix},\qquad
M_{\ell,k}(x_c)=\begin{pmatrix}Q^{(0,k)}(S_{x_c,h})\\\vdots\\Q^{(\ell,k)}(S_{x_c,h})\end{pmatrix},
\]
where $Q^{(i,j)}(S)$ and $T^{(i,j)}(S)$ are $n_i\times n_j$ matrices whose $(l,m)$-th elements are given by $\nu_{\pi_i(l)+\pi_j(m)}(S)$ and $\kappa_{\pi_i(l)+\pi_j(m)}(S)$, respectively. When $x$ is a boundary point, these quantities are not invariant to $x$ and thus capture the boundary effects.
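The role of $S_{x_c,h}$ is easy to see numerically. The sketch below (our own illustration, with an Epanechnikov kernel standing in for $K$) computes $\nu_k(S)$ and the analogue of $Q_p(x_c)$ for an interior point, where $S_{x_c,h}=[-1,1]$ and odd moments vanish, and for a point at the left edge of the support, where the truncation of $S_{x_c,h}$ breaks that symmetry; this asymmetry is precisely the boundary effect mentioned above.

```python
import numpy as np

def nu_k(k, lo, hi, m=20001):
    """nu_k(S): integral over S = [lo, hi] within [-1, 1] of u^k K(u) du,
    with K the Epanechnikov kernel; plain trapezoidal rule."""
    u = np.linspace(lo, hi, m)
    f = u**k * 0.75 * (1.0 - u**2)
    return float((f[:-1] + f[1:]).sum() * (hi - lo) / (2 * (m - 1)))

def Q(lo, hi, p=1):
    """(p+1) x (p+1) analogue of Q_p(x_c) in (3.8.30): entry (l, m) is nu_{l+m}(S)."""
    return np.array([[nu_k(l + m, lo, hi) for m in range(p + 1)] for l in range(p + 1)])

# Interior point: S_{x_c,h} = [-1, 1]; Q is diagonal because odd moments vanish.
print(Q(-1.0, 1.0))
# Left-boundary point: S_{x_c,h} = [0, 1]; the symmetry (and the zeros) disappear.
print(Q(0.0, 1.0))
```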

Proof of Lemma 3.4:

Given that our data are a random sample, it is straightforward to show that the "leave-one-out" estimators considered in the lemma are asymptotically equivalent to the usual "leave-in" estimators; see Rothe and Firpo (2019) for a detailed exposition. We therefore focus on the "leave-in" versions in what follows.

We prove the results for the PS only. The case of the OR follows by generalizing Proposition 7 of Fan and Guerre (2016) to accommodate discrete covariates, which can be achieved by employing techniques similar to those presented here.

For (3.8.27), we have
\[
\begin{aligned}
\sup_{x\in\mathcal{X}}\bigl\|B^{(ps)}_{n,d,t}(x)\bigr\|
&=\sup_{x\in\mathcal{X}}\bigl\|e_{3,\iota(d,t)}'I(x)(I_3\otimes e_{N_p,1}')\Sigma^{ps}(x)^{-1}E[\tilde{\boldsymbol{A}}(W,x,\gamma^*(x))]\bigr\|\\
&\le\sup_{x\in\mathcal{X}}\bigl\|e_{3,\iota(d,t)}'I(x)\bigr\|\cdot\sup_{x\in\mathcal{X}}\bigl\|(I_3\otimes e_{N_p,1}')\Sigma^{ps}(x)^{-1}\bigr\|\cdot\sup_{x\in\mathcal{X}}\bigl\|E[\tilde{\boldsymbol{A}}(W,x,\gamma^*(x))]\bigr\|.
\end{aligned}
\]
By definition, $\sup_{x\in\mathcal{X}}\|I(x)\|=O(1)$. A standard change of variables gives
\[
\Sigma^{ps}(x)=I(x)\otimes Q_p(x_c)f_X(x)+O(h+\lambda_o+\lambda_u).\tag{3.8.31}
\]
Since $\inf_{x\in\mathcal{X}}\lambda_{\min}(I(x)\otimes Q_p(x_c))=\inf_{x\in\mathcal{X}}\lambda_{\min}(I(x))\cdot\inf_{x_c\in\mathcal{X}_c}\lambda_{\min}(Q_p(x_c))>0$ and $\inf_{x\in\mathcal{X}}f_X(x)>0$ under Assumptions 3.2(iii), 3.5.6, and 3.5.1, we get
\[
\sup_{x\in\mathcal{X}}\bigl\|I(x)^{-1}\otimes Q_p(x_c)^{-1}f_X(x)^{-1}\bigr\|=O(1),\tag{3.8.32}
\]
and thus $\sup_{x\in\mathcal{X}}\|\Sigma^{ps}(x)^{-1}\|=O(1)$. Now, from Lemma 3.5, we conclude that $\sup_{x\in\mathcal{X}}\|B^{(ps)}_{n,d,t}(x)\|=O(h^{p+1}+\lambda_o+\lambda_u)$.

Having just demonstrated that $\Sigma^{ps}(x)^{-1}$ is uniformly bounded over $\mathcal{X}$, we can now apply Lemma 3.5 and the CMT to deduce (3.8.28).

To establish (3.8.29), the proof proceeds in three steps. First, we demonstrate the existence of a global maximizer of the local log-likelihood function defined in (3.3.6). Next, we obtain the uniform asymptotic linear expansion of the local maximum likelihood estimator. Finally, we apply the delta method to verify that the remainder term exhibits the required rate.

Step 1: Define $\bar{\gamma}=(I_3\otimes H(h)^{-1})\gamma$ and $\bar{\gamma}^*(\cdot)=(I_3\otimes H(h)^{-1})\gamma^*(\cdot)$. Using the scaled parameters, we rewrite the likelihood as
\[
L_n^{ps}(\bar{\gamma};x)=\frac{1}{n}\sum_{i=1}^{n}\Biggl[\sum_{(d,t)\in\mathcal{S}}I_{d,t,i}\bigl(H(h)\bar{X}_i(x_c)\bigr)'\bar{\gamma}_{d,t}-\log\Biggl(1+\sum_{(d',t')\in\mathcal{S}}\exp\Bigl(\bigl(H(h)\bar{X}_i(x_c)\bigr)'\bar{\gamma}_{d',t'}\Bigr)\Biggr)\Biggr]\tilde{K}^{ps}(X_i;x,h,\lambda).\tag{3.8.33}
\]

The gradient and Hessian of $L_n^{ps}(\bar{\gamma};x)$ with respect to $\bar{\gamma}$ are given by
\[
\nabla_{\bar{\gamma}}L_n^{ps}(\bar{\gamma};x)=\frac{1}{n}\sum_{i=1}^{n}\tilde{\boldsymbol{A}}(W_i,x,\gamma),\qquad
\nabla^2_{\bar{\gamma}\bar{\gamma}'}L_n^{ps}(\bar{\gamma};x)=-\frac{1}{n}\sum_{i=1}^{n}H(W_i,x,\gamma),
\]
where
\[
\begin{aligned}
H(X,x,\gamma)&=I(X_c,x_c,\gamma)\otimes\tilde{H}(X,x,h,\lambda),\qquad
\tilde{H}(X,x,h,\lambda)=H(h)\bar{X}(x_c)\bar{X}(x_c)'H(h)\tilde{K}^{ps}(X;x,h,\lambda),\\
I(X_c,x_c,\gamma)&=\mathrm{diag}(\boldsymbol{\Psi}(X_c,x_c,\gamma))-\boldsymbol{\Psi}(X_c,x_c,\gamma)\boldsymbol{\Psi}(X_c,x_c,\gamma)',\qquad
\boldsymbol{\Psi}(X,x,\gamma)=\boldsymbol{\iota}(\{\Psi_{d,t}(\bar{X}(x),\gamma)\}_{(d,t)\in\mathcal{S}}),\\
\Psi_{d,t}(x,\gamma)&=\frac{\exp(x'\gamma_{d,t})}{1+\sum_{(d',t')\in\mathcal{S}}\exp(x'\gamma_{d',t'})}.
\end{aligned}
\]
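The algebra behind $H(W,x,\gamma)$ can be verified numerically. The sketch below is our own check, with hypothetical dimensions: a length-2 vector standing in for $H(h)\bar{X}_i(x_c)$ and the three non-base groups in $\mathcal{S}$. It confirms that the Hessian of the log-partition term $\log(1+\sum\exp(x'\gamma_{d,t}))$ equals the Kronecker product of $\bar{x}\bar{x}'$ with $I(X_c,x_c,\gamma)$, matching the $-I\otimes\tilde{H}$ structure of the per-observation Hessian up to the kernel weight and the ordering of the Kronecker factors (which is fixed by the chosen vectorization).

```python
import numpy as np

def psi(xbar, gam):
    """Multinomial-logit probabilities Psi_{d,t}: one column of `gam` per
    group in S, with the (1,1) group as the omitted base category."""
    e = np.exp(xbar @ gam)                 # length-3: one entry per (d,t) in S
    return e / (1.0 + e.sum())

rng = np.random.default_rng(1)
xbar = rng.standard_normal(2)              # stands in for H(h) Xbar_i(x_c)
gam = rng.standard_normal((2, 3))          # gamma_{d,t}, stacked column-wise

p = psi(xbar, gam)
info = np.diag(p) - np.outer(p, p)         # I(X_c, x_c, gamma)

def logpart(g):
    return np.log1p(np.exp(xbar @ g.reshape(2, 3)).sum())

# Central-difference Hessian of the log-partition term.
g0, eps = gam.ravel(), 1e-4
H_num = np.empty((6, 6))
for a in range(6):
    for b in range(6):
        ea, eb = np.eye(6)[a] * eps, np.eye(6)[b] * eps
        H_num[a, b] = (logpart(g0 + ea + eb) - logpart(g0 + ea - eb)
                       - logpart(g0 - ea + eb) + logpart(g0 - ea - eb)) / (4 * eps**2)

# With this vectorization the Hessian is (xbar xbar') kron I(X_c, x_c, gamma).
print(np.abs(H_num - np.kron(np.outer(xbar, xbar), info)).max())  # ~ 1e-7
```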

Next, we define the following two events:
\[
\mathcal{E}_{1n}(c)=\Biggl\{\sup_{x\in\mathcal{X}}\Bigl\|\frac{1}{n}\sum_{i=1}^{n}\tilde{\boldsymbol{A}}(W_i,x,\gamma^*(x))\Bigr\|<c\,\kappa_n\Biggr\},\qquad
\mathcal{E}_{2n}(c)=\Biggl\{\inf_{x\in\mathcal{X}}\lambda_{\min}\Bigl(\frac{1}{n}\sum_{i=1}^{n}\tilde{H}(X_i;x,h,\lambda)\Bigr)>c\Biggr\},
\]
for $c>0$ and $\kappa_n=\sqrt{\log n/(nh^{\upsilon_c})}+h^{p+1}+\lambda_u+\lambda_o$.

By Lemma 3.5, we deduce that $P(\mathcal{E}_{1n}(c_1))\to1$ for any fixed $c_1>0$.

Now, a standard change-of-variables analysis gives
\[
E[\tilde{H}(X;x,h,\lambda)]=Q_p(x_c)f_X(x)+O(h+\lambda_o+\lambda_u).
\]
Under Assumptions 3.5.1 and 3.5.6, $\inf_{x\in\mathcal{X}}f_X(x)>0$ and $\inf_{x_c\in\mathcal{X}_c}\lambda_{\min}(Q_p(x_c))>0$. As a result, there exists $c_2>0$ such that $\inf_{x\in\mathcal{X}}\lambda_{\min}(E[\tilde{H}(X;x,h,\lambda)])\ge c_2$ when $n$ is sufficiently large. Coupled with the fact that

\[
\sup_{x\in\mathcal{X}}\Bigl\|\frac{1}{n}\sum_{i=1}^{n}\tilde{H}(X_i;x,h,\lambda)-E[\tilde{H}(X;x,h,\lambda)]\Bigr\|=O_p\Bigl(\sqrt{\log n/(nh^{\upsilon_c})}\Bigr),
\]
which is a consequence of Lemma 5 of Fan and Guerre (2016), we deduce that $P(\mathcal{E}_{2n}(c))\to1$ for any $c<c_2$. Next, we define a neighborhood of $\bar{\gamma}^*(\cdot)$,
\[
\Gamma(\delta)=\Bigl\{\gamma(\cdot):\sup_{x\in\mathcal{X}}\|\bar{\gamma}(x)-\bar{\gamma}^*(x)\|\le\delta\kappa_n\Bigr\}.
\]

Theorem 1 in Tanabe and Sagae (1992) implies that
\[
\inf_{x,y\in\mathcal{X}}I(x,y,\gamma(y))>\inf_{x,y\in\mathcal{X}}\Biggl\{\prod_{(d,t)\in\mathcal{S}}\Psi_{d,t}\bigl(\{\bar{x}(y)'\gamma_{d,t}(y)\}_{(d,t)\in\mathcal{S}}\bigr)\Biggr\}\cdot I_3,\tag{3.8.34}
\]
in the sense that their difference is positive definite. For any $\delta>0$, if $\gamma\in\Gamma(\delta)$, Assumption 3.5.5(ii) implies that $\|\gamma(\cdot)-\gamma^*(\cdot)\|=o(1)$. This further suggests that, when $n$ is sufficiently large, the right-hand side of (3.8.34) is bounded from below by $c_3I_3$, for some positive constant $c_3$.

The analysis up to this point demonstrates that, for a given $\varepsilon>0$, it is possible to select $n$ large enough that $P(\mathcal{E}_{1n}(c_1))>1-\varepsilon/2$, $P(\mathcal{E}_{2n}(c_2))>1-\varepsilon/2$, and (3.8.34) is satisfied. Now, set $\delta_0>2c_1c_2^{-1}c_3^{-1}$. Then, for any $\gamma(\cdot)\in\partial\Gamma(\delta_0)$, i.e., $\|\bar{\gamma}(x)-\bar{\gamma}^*(x)\|=\delta_0\kappa_n$ for all $x\in\mathcal{X}$, we have $\sup_{x\in\mathcal{X}}\{L_n^{ps}(\bar{\gamma}(x);x)-L_n^{ps}(\bar{\gamma}^*(x);x)\}<0$ with probability at least $1-\varepsilon$. This is because
\[
\begin{aligned}
&\sup_{x\in\mathcal{X}}\bigl\{L_n^{ps}(\bar{\gamma}(x);x)-L_n^{ps}(\bar{\gamma}^*(x);x)\bigr\}\\
&\quad=\sup_{x\in\mathcal{X}}\Bigl\{\nabla_{\bar{\gamma}}L_n^{ps}(\bar{\gamma}^*(x);x)'(\bar{\gamma}(x)-\bar{\gamma}^*(x))-(\bar{\gamma}(x)-\bar{\gamma}^*(x))'\bigl(-\nabla^2_{\bar{\gamma}\bar{\gamma}'}L_n^{ps}(\tilde{\bar{\gamma}};x)\bigr)(\bar{\gamma}(x)-\bar{\gamma}^*(x))/2\Bigr\}\\
&\quad\le\Biggl(\sup_{x\in\mathcal{X}}\Bigl\|\frac{1}{n}\sum_{i=1}^{n}\tilde{\boldsymbol{A}}(W_i,x,\gamma^*(x))\Bigr\|-c_2c_3\delta_0\kappa_n/2\Biggr)\cdot\delta_0\kappa_n
<(c_1-c_2c_3\delta_0/2)\,\delta_0\kappa_n^2<0,
\end{aligned}
\]
where $\tilde{\bar{\gamma}}$, dependent on $x$, lies between $\bar{\gamma}(x)$ and $\bar{\gamma}^*(x)$; the second inequality holds on $\mathcal{E}_{1n}(c_1)\cap\mathcal{E}_{2n}(c_2)$ combined with (3.8.34), and the final inequality follows from the choice of $\delta_0$. Since $L_n^{ps}(\bar{\gamma};x)$ is continuous, a local maximum, denoted by $\hat{\bar{\gamma}}(x)$, exists within the compact set $\{\bar{\gamma}:\|\bar{\gamma}-\bar{\gamma}^*(x)\|\le\delta_0\kappa_n\}$, for any $x\in\mathcal{X}$. Furthermore, due to the concavity of $L_n^{ps}(\cdot;x)$, $\hat{\bar{\gamma}}(x)$ maximizes $L_n^{ps}(\cdot;x)$ over $\mathbb{R}^{3N_p}$ for any $x\in\mathcal{X}$. Hence, $\hat{\bar{\gamma}}(\cdot)$ is the global maximizer of $L_n^{ps}(\bar{\gamma}(\cdot);\cdot)$ with probability exceeding $1-\varepsilon$. As $\varepsilon$ is arbitrary and $\delta_0$ is independent of $x$, it can be inferred that
\[
\sup_{x\in\mathcal{X}}\bigl\|\hat{\bar{\gamma}}(x)-\bar{\gamma}^*(x)\bigr\|=O_p(\kappa_n).
\]
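Computationally, the maximizer whose existence Step 1 establishes can be found by Newton iterations, since $L_n^{ps}(\cdot;x)$ is concave. The sketch below is our own illustration, not the chapter's code: a local constant fit ($p=0$), one continuous covariate, an Epanechnikov kernel, and simulated group memberships. It maximizes the kernel-weighted multinomial log-likelihood (3.8.33) at a point $x_0$.

```python
import numpy as np

def local_multilogit(x0, Xc, Dgrp, h, iters=25):
    """Local-constant multinomial logit: Newton iterations on the kernel-weighted
    log-likelihood. Dgrp[i] in {0, 1, 2, 3}, with 3 the omitted (1,1) base group."""
    w = np.maximum(0.75 * (1.0 - ((Xc - x0) / h) ** 2), 0.0)  # kernel weights
    Ind = np.eye(4)[Dgrp][:, :3]                              # I_{d,t,i}, non-base groups
    g = np.zeros(3)                                           # local-constant gamma
    for _ in range(iters):
        e = np.exp(g)
        p = e / (1.0 + e.sum())                               # Psi_{d,t}
        score = w @ (Ind - p)                                 # gradient of the likelihood
        neg_hess = w.sum() * (np.diag(p) - np.outer(p, p))    # minus the Hessian
        g = g + np.linalg.solve(neg_hess, score)              # Newton step (concave)
    e = np.exp(g)
    return np.append(e, 1.0) / (1.0 + e.sum())                # fitted p(d,t | x0)

rng = np.random.default_rng(3)
n = 4000
Xc = rng.uniform(-1.0, 1.0, n)
true_p = lambda x: np.array([0.2 + 0.1 * x, 0.25, 0.25, 0.3 - 0.1 * x])
Dgrp = np.array([rng.choice(4, p=true_p(x)) for x in Xc])
print(local_multilogit(0.5, Xc, Dgrp, h=0.25), true_p(0.5))
```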

Step 2: We proceed to derive the uniform asymptotic linear expansion of $\hat{\bar{\gamma}}(\cdot)-\bar{\gamma}^*(\cdot)$. Expanding $L_n^{ps}(\bar{\gamma};x)$ using a third-order Taylor series and rearranging terms leads to
\[
\hat{\bar{\gamma}}(x)-\bar{\gamma}^*(x)=\frac{1}{n}\sum_{i=1}^{n}\Sigma^{ps}(x)^{-1}\tilde{\boldsymbol{A}}(W_i,x,\gamma^*(x))+R_\gamma(x),
\]

where
\[
\begin{aligned}
R_\gamma(x)&=-\bigl(\Sigma_n^{ps}(x)^{-1}-\Sigma^{ps}(x)^{-1}\bigr)\cdot\frac{1}{n}\sum_{i=1}^{n}\tilde{\boldsymbol{A}}(W_i,x,\gamma^*(x))-\Sigma_n^{ps}(x)^{-1}C_n(x),\\
C_n(x)&=\frac{1}{2n}\sum_{i=1}^{n}\sum_{(d,t)\in\mathcal{S}}\sum_{(d',t')\in\mathcal{S}}(\hat{\bar{\gamma}}_{d,t}(x)-\bar{\gamma}^*_{d,t}(x))'H(h)\bar{X}_i(x_c)\bar{X}_i(x_c)'H(h)(\hat{\bar{\gamma}}_{d',t'}(x)-\bar{\gamma}^*_{d',t'}(x))\\
&\qquad\cdot\dot{\boldsymbol{I}}_{\iota(d,t),\iota(d',t')}(X_{c,i},x_c,\tilde{\gamma})\otimes H(h)\bar{X}_i(x_c)\tilde{K}^{ps}(X_i;x,h,\lambda),
\end{aligned}
\]
for an intermediate point $\tilde{\gamma}$ lying between $\hat{\gamma}(x)$ and $\gamma^*(x)$, $\Sigma_n^{ps}(\cdot)=\frac{1}{n}\sum_{i=1}^{n}H(W_i,\cdot,\gamma^*(\cdot))$, and
\[
\begin{aligned}
\dot{\boldsymbol{I}}_{\iota(d_1,t_1),\iota(d_2,t_2)}&=\boldsymbol{\iota}\Bigl(\bigl\{\dot{I}^{(d_3,t_3)}_{\iota(d_1,t_1),\iota(d_2,t_2)}\bigr\}_{(d_3,t_3)\in\mathcal{S}}\Bigr),\\
\dot{I}^{(d_3,t_3)}_{\iota(d_1,t_1),\iota(d_2,t_2)}(X_c,x_c,\gamma)&=1\{(d_1,t_1)=(d_2,t_2)\}\Psi_{d_1,t_1}(\bar{X}(x_c),\gamma)\bigl(1\{(d_1,t_1)=(d_3,t_3)\}-\Psi_{d_3,t_3}(\bar{X}(x_c),\gamma)\bigr)\\
&\quad+\sum_{\ell_1,\ell_2\in\{1,2\},\,\ell_1\neq\ell_2}\Psi_{d_{\ell_1},t_{\ell_1}}(\bar{X}(x_c),\gamma)\Psi_{d_{\ell_2},t_{\ell_2}}(\bar{X}(x_c),\gamma)\bigl(1\{(d_{\ell_2},t_{\ell_2})=(d_3,t_3)\}-\Psi_{d_3,t_3}(\bar{X}(x_c),\gamma)\bigr).
\end{aligned}
\]

In view of (3.8.31) and (3.8.32), $\sup_{x\in\mathcal{X}}\|\Sigma^{ps}(x)^{-1}\|=O(1)$. Taking this into account, along with Lemma 3.5, we obtain
\[
\sup_{x\in\mathcal{X}}\Bigl\|\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{A}(W_i,x,\gamma^*(x))-E[\boldsymbol{A}(W,x,\gamma^*(x))]\Bigr\|=O_p\Bigl(\sqrt{\log n/(nh^{\upsilon_c})}\Bigr),\qquad
\sup_{x\in\mathcal{X}}\bigl\|E[\boldsymbol{A}(W,x,\gamma^*(x))]\bigr\|=O_p\bigl(h^{p+1}+\lambda_o+\lambda_u\bigr).
\]

Furthermore,
\[
\begin{aligned}
\sup_{x\in\mathcal{X}}\bigl\|\Sigma_n^{ps}(x)^{-1}-\Sigma^{ps}(x)^{-1}\bigr\|
&\le\sup_{x\in\mathcal{X}}\|\Sigma_n^{ps}(x)^{-1}\|\cdot\sup_{x\in\mathcal{X}}\|\Sigma_n^{ps}(x)-\Sigma^{ps}(x)\|\cdot\sup_{x\in\mathcal{X}}\|\Sigma^{ps}(x)^{-1}\|\\
&=O_p(1)\cdot O_p\Bigl(\sqrt{\log n/(nh^{\upsilon_c})}\Bigr)\cdot O(1)
=O_p\Bigl(\sqrt{\log n/(nh^{\upsilon_c})}\Bigr),
\end{aligned}
\]
where the first inequality results from the identity $A^{-1}-B^{-1}=-A^{-1}(A-B)B^{-1}$ and the Cauchy–Schwarz inequality. The rates in the second line are derived from (3.8.31) and (3.8.32), and from arguments similar to those employed in the proof of Lemma 5 in Fan and Guerre (2016).

By the triangle inequality and the Cauchy–Schwarz inequality,
\[
\begin{aligned}
\sup_{x\in\mathcal{X}}\|C_n(x)\|
&\le\frac{1}{2n}\sum_{i=1}^{n}\sum_{(d,t)\in\mathcal{S}}\sum_{(d',t')\in\mathcal{S}}\bigl\|\dot{\boldsymbol{I}}_{\iota(d,t),\iota(d',t')}(X_{c,i},x_c,\tilde{\gamma}(x))\bigr\|\cdot\bigl\|\hat{\bar{\gamma}}_{d,t}(x)-\bar{\gamma}^*_{d,t}(x)\bigr\|\cdot\bigl\|\hat{\bar{\gamma}}_{d',t'}(x)-\bar{\gamma}^*_{d',t'}(x)\bigr\|\\
&\qquad\cdot\bigl\|H(h)\bar{X}_i(x_c)\bigr\|^3\cdot\bigl|\tilde{K}^{ps}(X_i;x,h,\lambda)\bigr|\\
&\lesssim\max_{(d,t),(d',t')\in\mathcal{S}}\ \sup_{x,z\in\mathcal{X}}\Bigl\{\bigl\|\dot{\boldsymbol{I}}_{\iota(d,t),\iota(d',t')}(z_c,x_c,\tilde{\gamma}(x))\bigr\|\cdot\bigl\|\hat{\bar{\gamma}}_{d,t}(x)-\bar{\gamma}^*_{d,t}(x)\bigr\|\cdot\bigl\|\hat{\bar{\gamma}}_{d',t'}(x)-\bar{\gamma}^*_{d',t'}(x)\bigr\|\Bigr\}\qquad(3.8.35)\\
&\qquad\cdot\frac{1}{n}\sum_{i=1}^{n}\sup_{x\in\mathcal{X}}\Bigl\{\bigl|K_h(\bar{X}^{(1)}_i(x_c))\bigr|\cdot\bigl\|H(h)\bar{X}_i(x_c)\bigr\|^3\Bigr\}.\qquad(3.8.36)
\end{aligned}
\]
Since $\tilde{\gamma}$ converges uniformly to $\gamma^*$, as established in the first step, $\|\dot{\boldsymbol{I}}_{\iota(d,t),\iota(d',t')}(z_c,x_c,\tilde{\gamma}(x))\|$ in (3.8.35) is asymptotically bounded, uniformly in $x,z\in\mathcal{X}$ and for each $(d,t),(d',t')\in\mathcal{S}$. In addition, we can deduce from a standard change-of-variables argument that (3.8.36) is $O_p(1)$. Hence, it can be concluded that $\sup_{x\in\mathcal{X}}\|C_n(x)\|=O_p(\kappa_n^2)$. As a result, we obtain $\sup_{x\in\mathcal{X}}\|R_\gamma(x)\|=O_p(\kappa_n^2)$.

Step 3: We note that $\hat{p}(d,t,x)-p(d,t,x)=\Psi_{d,t}(e_{N_p,1},\hat{\gamma}(x))-\Psi_{d,t}(e_{N_p,1},\gamma^*(x))$ and $\nabla_{\gamma}\Psi_{d,t}(e_{N_p,1},\gamma^*(x))=e_{3,\iota(d,t)}'I(x)$. Utilizing the delta method in conjunction with the uniform expansion obtained in Step 2 then establishes (3.8.29). This completes the proof of the lemma. $\blacksquare$

Lemma 3.5 Suppose that the conditions of Lemma 3.4 hold. Then
\[
\sup_{x\in\mathcal{X}}\Bigl\|\frac{1}{n}\sum_{i=1}^{n}\tilde{\boldsymbol{A}}(W_i,x,\gamma^*(x))-E[\tilde{\boldsymbol{A}}(W,x,\gamma^*(x))]\Bigr\|=O_p\bigl((\log n/(nh^{\upsilon_c}))^{1/2}\bigr),\tag{3.8.37}
\]
\[
\sup_{x\in\mathcal{X}}\bigl\|E[\tilde{\boldsymbol{A}}(W,x,\gamma^*(x))]\bigr\|=O\bigl(h^{p+1}+\lambda_o+\lambda_u\bigr).\tag{3.8.38}
\]

Proof of Lemma 3.5:

The proof of (3.8.37) proceeds along lines similar to Lemma 5 of Fan and Guerre (2016). For any given vector $k$ with $0\le|k|\le p$, define
\[
\begin{aligned}
\tilde{A}^{(k)}_{d,t}(W,x,\gamma)&=\bigl(I_{d,t}-\Psi_{d,t}(\bar{X}(x_c),\gamma)\bigr)h^{-|k|}(X_c-x_c)^{k}\tilde{K}(X;x,h,\lambda),\\
\tilde{A}^{\dagger,(k)}_{d,t}(W,x_c,\gamma)&=\bigl(I_{d,t}-\Psi_{d,t}(\bar{X}(x_c),\gamma)\bigr)h^{-|k|}(X_c-x_c)^{k}K\Bigl(\frac{X_c-x_c}{h}\Bigr),
\end{aligned}
\]
for $(d,t)\in\mathcal{S}$, and let $\kappa_n=(\log n/(nh^{\upsilon_c}))^{1/2}$. Assumption 3.5.5 implies that $\kappa_n\to0$. Moreover, under Assumptions 3.5.1, 3.5.2, and 3.5.4, we have that, for any $\varepsilon>0$, there exists $\delta_n=n^{-\kappa_a}$, for some $\kappa_a>0$, such that (i)
\[
\max_{i\in\mathbb{N}_n}\bigl|\tilde{A}^{\dagger,(k)}_{d,t}(W_i,x_c,\gamma^*(x))-\tilde{A}^{\dagger,(k)}_{d,t}(W_i,x_c',\gamma^*(x'))\bigr|\le h^{\upsilon_c}\kappa_n\varepsilon/3,\tag{3.8.39}
\]
\[
\bigl|E\bigl[\tilde{A}^{\dagger,(k)}_{d,t}(W,x_c,\gamma^*(x))\bigr]-E\bigl[\tilde{A}^{\dagger,(k)}_{d,t}(W,x_c',\gamma^*(x'))\bigr]\bigr|\le h^{\upsilon_c}\kappa_n\varepsilon/3,\tag{3.8.40}
\]
for $(d,t)\in\mathcal{S}$ and for all $x,x'\in\mathcal{X}$ such that $x_d=x_d'$ and $\|x_c-x_c'\|\le\delta_n$; and (ii) there is a positive integer $J_n=O(n^{\kappa_b})$, $\kappa_b>0$, and a set $\{x_j\}_{j=1}^{J_n}\subset\mathcal{X}$ such that for every $x\in\mathcal{X}$ there exists a $j$ satisfying $x\in B(x_j,\delta_n)\cap\mathcal{X}$, and for all $x'\in B(x_j,\delta_n)$, $x_d'=x_{d,j}$. As a result, $\mathcal{X}=\bigcup_{j=1}^{J_n}(B(x_j,\delta_n)\cap\mathcal{X})$; a small numerical sketch of this covering is given below.
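The covering step in (ii) is standard but easy to visualize: with $\delta_n=n^{-\kappa_a}$, a grid of spacing at most $\delta_n$ covers the continuous support with $J_n=O(\delta_n^{-\upsilon_c})=O(n^{\kappa_a\upsilon_c})$ balls, and crossing the grid with the finitely many discrete cells keeps $J_n$ polynomial in $n$. A minimal sketch, assuming a continuous support of $[0,1]^2$ and the illustrative values $\kappa_a=0.4$ and $n=10^4$:

```python
import numpy as np

v_c, kappa_a, n = 2, 0.4, 10_000
delta_n = n ** (-kappa_a)                              # radius of the covering balls

# 1-D grid with spacing <= delta_n, so every coordinate is within delta_n / 2
# of some grid point; the v_c-fold product grid then covers [0, 1]^2.
m = int(np.ceil(1.0 / delta_n)) + 1
grid = np.linspace(0.0, 1.0, m)
centers = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, v_c)
J_n = len(centers)
print(J_n, delta_n, J_n * delta_n**v_c)                # J_n scales like delta_n^{-v_c}

# Every x in [0, 1]^2 lies in some (sup-norm) ball B(x_j, delta_n):
x = np.random.default_rng(4).uniform(0.0, 1.0, v_c)
print(np.max(np.abs(centers - x), axis=1).min() <= delta_n / 2)
```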

Now, observe that, for $(d,t)\in\mathcal{S}$,
\[
\begin{aligned}
\sup_{x\in\mathcal{X}}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\tilde{A}^{(k)}_{d,t}(W_i,x,\gamma^*(x))-E[\tilde{A}^{(k)}_{d,t}(W,x,\gamma^*(x))]\Bigr|
&\le\max_{j\in\mathbb{N}_{J_n}}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\tilde{A}^{(k)}_{d,t}(W_i,x_j,\gamma^*(x_j))-E[\tilde{A}^{(k)}_{d,t}(W,x_j,\gamma^*(x_j))]\Bigr|&&(3.8.41)\\
&\quad+\max_{j\in\mathbb{N}_{J_n}}\sup_{x\in B(x_j,\delta_n)\cap\mathcal{X}}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\bigl(\tilde{A}^{(k)}_{d,t}(W_i,x,\gamma^*(x))-\tilde{A}^{(k)}_{d,t}(W_i,x_j,\gamma^*(x_j))\bigr)\Bigr|&&(3.8.42)\\
&\quad+\max_{j\in\mathbb{N}_{J_n}}\sup_{x\in B(x_j,\delta_n)\cap\mathcal{X}}\bigl|E[\tilde{A}^{(k)}_{d,t}(W,x,\gamma^*(x))]-E[\tilde{A}^{(k)}_{d,t}(W,x_j,\gamma^*(x_j))]\bigr|.&&(3.8.43)
\end{aligned}
\]
In view of (3.8.39), (3.8.42) is bounded from above by
\[
\max_{i\in\mathbb{N}_n,\,j\in\mathbb{N}_{J_n}}\ \sup_{x\in B(x_j,\delta_n)\cap\mathcal{X}}h^{-\upsilon_c}\bigl|\tilde{A}^{\dagger,(k)}_{d,t}(W_i,x_c,\gamma^*(x))-\tilde{A}^{\dagger,(k)}_{d,t}(W_i,x_{c,j},\gamma^*(x_j))\bigr|\le\kappa_n\varepsilon/3.
\]
Meanwhile, since $x_d=x_{d,j}$ whenever $x\in B(x_j,\delta_n)$, (3.8.40) then implies that (3.8.43) $\le\kappa_n\varepsilon/3$.

To bound (3.8.41), we apply Bernstein's inequality.⁸ Since the support of $K$ is bounded, we have $|\tilde{A}^{(k)}_{d,t}(W,x,\gamma^*(x))|\le C\|K\|_\infty$ for a sufficiently large positive constant $C$. Additionally, a standard calculation gives
\[
\begin{aligned}
\mathrm{Var}\bigl[\tilde{\boldsymbol{A}}_{d,t}(W,x,\gamma^*(x))\bigr]
&=E\bigl[(I_{d,t}-p(d,t,(X_c,x_d)))^2H(h)\bar{X}(x_c)\bar{X}(x_c)'H(h)K_h(\bar{X}^{(1)}(x_c))^2\,1\{X_d=x_d\}\bigr]+o\bigl(h^{-\upsilon_c}\bigr)\\
&=h^{-\upsilon_c}I(x)_{\iota(d,t),\iota(d,t)}T_p(x_c)f_X(x)+o\bigl(h^{-\upsilon_c}\bigr).
\end{aligned}
\]
Hence, $\mathrm{Var}[\tilde{A}^{(k)}_{d,t}(W,x,\gamma^*(x))]\le Ch^{-\upsilon_c}$ under Assumption 3.5.4.

⁸Let $\{X_i\}_{i=1}^{n}$ be independent zero-mean random variables, and suppose $|X_i|\le M$ almost surely for $i\in\mathbb{N}_n$. Then Bernstein's inequality states that, for all $t\ge0$,
\[
P\Bigl(\sum_{i=1}^{n}X_i\ge t\Bigr)\le\exp\Bigl(-\frac{t^2/2}{\sum_{i=1}^{n}E[X_i^2]+Mt/3}\Bigr).
\]
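Footnote 8 is the only probabilistic tool used here, and it is easy to check by simulation. A small Monte Carlo sketch (our own, with $X_i$ uniform on $[-M,M]$ as an arbitrary bounded, centered choice) compares the empirical tail of $\sum_i X_i$ with the Bernstein bound:

```python
import numpy as np

rng = np.random.default_rng(2)
n, M, reps = 1000, 1.0, 5_000
X = rng.uniform(-M, M, size=(reps, n))                 # bounded, mean-zero draws
sums = X.sum(axis=1)
var_sum = n * M**2 / 3.0                               # sum of E[X_i^2] for U[-M, M]

for t in (35.0, 55.0, 70.0):
    emp = (sums >= t).mean()                           # empirical tail probability
    bern = np.exp(-(t**2 / 2.0) / (var_sum + M * t / 3.0))
    print(f"t = {t:5.1f}   empirical = {emp:.5f}   Bernstein bound = {bern:.5f}")
```

In every row the empirical tail sits below the bound, as it must; the bound is conservative but, crucially for the argument above, it decays exponentially in $t^2$.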

With these two results in hand, we have
\[
\begin{aligned}
&P\Biggl(\max_{j\in\mathbb{N}_{J_n}}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\tilde{A}^{(k)}_{d,t}(W_i,x_j,\gamma^*(x_j))-E[\tilde{A}^{(k)}_{d,t}(W,x_j,\gamma^*(x_j))]\Bigr|\ge\kappa_n\varepsilon/3\Biggr)\\
&\quad\le\sum_{j=1}^{J_n}P\Biggl(\Bigl|\frac{1}{n}\sum_{i=1}^{n}\tilde{A}^{(k)}_{d,t}(W_i,x_j,\gamma^*(x_j))-E[\tilde{A}^{(k)}_{d,t}(W,x_j,\gamma^*(x_j))]\Bigr|\ge\kappa_n\varepsilon/3\Biggr)\\
&\quad\le2J_n\exp\Biggl(-\frac{\varepsilon^2\log n}{C+C(\varepsilon\log n\cdot n^{-1}h^{-\upsilon_c})^{1/2}}\Biggr)
\le\exp\Biggl(-\frac{(\varepsilon^2-\kappa_b)\log n}{C}\Biggr),
\end{aligned}
\]
where the first inequality is due to the Bonferroni inequality and the second is by Bernstein's inequality. The far right-hand side goes to 0 when $\varepsilon^2>\kappa_b$. Hence, (3.8.41) $\le\kappa_n\varepsilon/3$ with probability approaching one.

Combining the bounds on (3.8.41)–(3.8.43) gives
\[
P\Biggl(\sup_{x\in\mathcal{X}}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\tilde{A}^{(k)}_{d,t}(W_i,x,\gamma^*(x))-E[\tilde{A}^{(k)}_{d,t}(W,x,\gamma^*(x))]\Bigr|\ge\kappa_n\varepsilon\Biggr)\to0.\tag{3.8.44}
\]
This completes the proof of (3.8.37).

Next, we establish (3.8.38). Define
\[
I_o(x_d,z_d)=\sum_{s=1}^{\upsilon_o}1\{|x_{o,s}-z_{o,s}|=1\}\prod_{l\neq s}1\{x_{o,l}=z_{o,l}\},\qquad
I_u(x_d,z_d)=\sum_{s=1}^{\upsilon_u}1\{x_{u,s}\neq z_{u,s}\}\prod_{l\neq s}1\{x_{u,l}=z_{u,l}\}.
\]
From a Taylor expansion of order $p+1$, we deduce that, uniformly in $x\in\mathcal{X}$,
\[
\begin{aligned}
E[\tilde{\boldsymbol{A}}_{d,t}(W,x,\gamma^*(x))]
&=\frac{1}{(p+1)!}\sum_{(d',t')\in\mathcal{S}}E\Bigl[I(X_c,x_d)_{\iota(d,t),\iota(d',t')}\,g^{(p+1)}_{d',t'}(X_c,x_d)'\bar{X}^{(p+1)}(x_c)H(h)\bar{X}(x_c)K_h(\bar{X}^{(1)}(x_c))1\{X_d=x_d\}\Bigr]\\
&\quad+\sum_{z_d\in\mathcal{X}_d\setminus x_d}\sum_{j=o,u}\lambda_jI_j(x_d,z_d)\bigl(p(d,t,x)-p(d,t,(x_c,z_d))\bigr)E\bigl[H(h)\bar{X}(x_c)K_h(\bar{X}^{(1)}(x_c))1\{X_d=z_d\}\bigr]+(s.o.)\\
&=\frac{h^{p+1}}{(p+1)!}\sum_{(d',t')\in\mathcal{S}}I(x)_{\iota(d,t),\iota(d',t')}M_{p,p+1}(x_c)g^{(p+1)}_{d',t'}(x)f_X(x)\\
&\quad+\sum_{z_d\in\mathcal{X}_d\setminus x_d}\sum_{j=o,u}\lambda_jI_j(x_d,z_d)\bigl(p(d,t,x)-p(d,t,(x_c,z_d))\bigr)M_{p,0}(x_c)f_X(x_c,z_d)+o(h^{p+1}+\lambda_o+\lambda_u)\\
&=O(h^{p+1}+\lambda_o+\lambda_u),
\end{aligned}
\]
where $(s.o.)$ stands for smaller-order terms. The last equality is due to Assumptions 3.5.2 and 3.5.4. $\blacksquare$

3.8.3 Auxiliary Lemmas and Results