www.elsevier.com/locate/spa
Uniform iterated logarithm laws for martingales
and their application to functional estimation
in controlled Markov chains
R. Senoussi
∗INRA, Laboratoire de Biométrie, Domaine St. Paul, Site Agroparc, 84914 Avignon Cedex 9, France
Received 27 March 1995; received in revised form 28 February 2000; accepted 28 February 2000
Abstract
In the first part, we establish an upper bound of iterated logarithm type for a sequence of processes M_n(·) ∈ C(R^d; R^p) endowed with the topology of uniform convergence on compacts, where M_n(x) is a square-integrable martingale for each x in R^d. In the second part we present an iterative kernel estimator of the driving function f of the regression model

X_{n+1} = f(X_n) + ε_{n+1}.

Strong convergence and CLT results are proved for this estimator and then extended to controlled Markov models.
Résumé: The first part establishes an upper bound of iterated logarithm type for a sequence of stochastic processes M_n(x) ∈ C(R^d; R^p), endowed with the topology of uniform convergence on compacts, when M_n(x) is a square-integrable martingale for each x in R^d. The second part treats, by the kernel method, the problem of iterative estimation of the function f of the regression model

X_{n+1} = f(X_n) + ε_{n+1}.

We prove strong consistency and establish various convergence rates for this estimator. These results are then generalized to other examples, in particular to the controlled Markov model. © 2000 Elsevier Science B.V. All rights reserved.
MSC: primary 60F15; 62G05; secondary 60G42; 62M05
Keywords: Iterated logarithm law; Autoregressive model; Controlled model; Markov chain; Kernel estimator
0. Introduction
Part I of the paper proves a lim sup version of an iterated logarithm law for a sequence of random processes (M_n(·))_{n≥1} with values in R^p and arguments, or indices, in R^d. The processes (M_n(x))_{n≥1} are assumed to be square-integrable martingales for all
x ∈ R^d and to have almost surely continuous paths for all n ∈ N. Strong laws on general Banach spaces have been established already (Mourier, 1953; Kuelbs, 1976), but in our case the space C(R^d; R^p), endowed with the topology of uniform convergence on compacts, is not a Banach space. We first give simple conditions that ensure strong uniform convergence, and then strengthen this result into an iterated logarithm law. Unlike Strassen's law (Heyde and Scott, 1973) for sequences of real random variables, the result presented here is not an invariance principle but only a strong law for variables taking values in a function space.
Part II develops some aspects of function estimation in the context of autoregressive models. Most studies of density or regression estimators use an L^p criterion, but for controlled models almost sure convergence is crucial in order to adapt an optimal control process. For this reason, we prove the a.s. uniform convergence (Devroye, 1988; Hernandez-Lerma, 1991), an iterated logarithm law and the pointwise weak convergence of the estimator of the regression function f of the following controlled Markov model:

X_{n+1} = f(X_n) + C(X_n, U_n) + ε_{n+1}.

Our results are quite similar to those for classical density and regression kernel estimators of i.i.d. real sequences.
We now specify some notation that will be used intensively in this paper.
B_d(x, R) is the ball centered at x ∈ R^d with radius R in the Euclidean norm ||x|| = (Σ_i x_i²)^{1/2}. D generally denotes a countable dense subset of R^d, for instance D = ∪_{m≥0} D_m where D_m = Z^d/2^m. The function h(t) = (2t LL(t))^{1/2}, where LL(t) = log(log(t)), is used throughout the paper. We recall that C(R^d; R^p) is the metrisable space of continuous functions from R^d to R^p, endowed with the topology of uniform convergence on compacts. The modulus of continuity of a function f on [−N, +N]^d is denoted ω(f, N, δ) = sup(||f(x) − f(y)||; ||x − y|| ≤ δ, ||x|| ≤ N, ||y|| ≤ N). For the probability part, the existence of a stochastic basis (Ω, A, F = (F_n)_{n≥0}, P) satisfying the usual conditions is always assumed; that is, F is a P-complete, increasing and right-continuous family of sigma-fields.
The increasing process of an F-adapted, square-integrable vector martingale is the predictable, increasing sequence of positive semi-definite matrices

⟨M, M⟩_n = Σ_{k=1}^n E(ΔM_k · ᵗΔM_k | F_{k−1})  (also written ⟨M⟩_n),

where ΔM_{n+1} = M_{n+1} − M_n stands for the martingale difference. More generally, we write ⟨M(x)⟩_n for a sequence (M_n(·))_{n≥1} of random functions of C(R^d; R^p) such that (M_n(x))_{n≥1} is a discrete, F-adapted, square-integrable vector martingale for each x. Sometimes, with no loss of rigor, we use the same notation for different constants to avoid their profusion. Finally, we simply refer to Duflo (1990) and Iosifescu and Grigorescu (1990) each time we need to recall classical results on martingales.
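Since the modulus of continuity ω(f, N, δ) and the dyadic grids D_m recur throughout the proofs, a small numerical sketch may help fix ideas. The following Python snippet is our illustration, not part of the paper: it approximates ω in dimension d = 1 by restricting the supremum to the grid D_m (the name `modulus` is ours).

```python
import numpy as np

def modulus(f, N, delta, m=8):
    """Approximate omega(f, N, delta) = sup{|f(x) - f(y)| : |x - y| <= delta,
    |x| <= N, |y| <= N} by restricting the supremum to the dyadic grid
    D_m = Z / 2^m (dimension d = 1)."""
    xs = np.arange(-N, N + 2.0**-m, 2.0**-m)   # grid points of D_m inside [-N, N]
    vals = f(xs)
    w = 0.0
    for j, x in enumerate(xs):
        near = np.abs(xs - x) <= delta          # points y with |x - y| <= delta
        w = max(w, float(np.max(np.abs(vals[near] - vals[j]))))
    return w
```

For f(x) = x on [−1, 1] and δ = 1/4, the grid value coincides with ω(f, 1, 1/4) = 1/4, since 1/4 is itself a dyadic point; finer grids (larger m) tighten the approximation for general f, in the spirit of inequality (1) below.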
Part A. Uniform strong laws
1. Strong laws in C(Rd;Rp)
limit in each point under the assumption E(sup_{0≤s≤1} ||X_n(s)||) < ∞. We give below a comparable result for non-stationary sequences of random functions of martingale type under Lipschitzian conditions. We first define a function on R^d to specify the Lipschitzian conditions we need in the multivariate case: if x = (x_1, …, x_d),

Π(x) = ∏_{i=1}^d (|x_i| + 𝟙_{{x_i = 0}}).
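In code, the product weight displayed above (call it Π, our notation for the extraction-damaged symbol) is a one-liner; a minimal Python sketch:

```python
import numpy as np

def Pi(x):
    """Pi(x) = prod_{i=1}^d (|x_i| + 1_{x_i = 0}): each zero coordinate
    contributes a factor 1 instead of 0, so Pi never vanishes."""
    x = np.asarray(x, dtype=float)
    return float(np.prod(np.abs(x) + (x == 0.0)))
```

Note that Π(0) = 1 and, for x with all coordinates nonzero, Π(x) = |x_1 ⋯ x_d|.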
Theorem 1.1. Let M_n(x) be a family of discrete martingales indexed by x ∈ R^d, with values in R^p, and assume that for some continuous increasing function a(·) on R^+ and constants α > 0, λ > 0:
(a) E(||M_n(0)||²) = O(n^α);
(b) for all integers N and x, y ∈ B_d(0, N),

E(||M_n(x) − M_n(y)||²) ≤ a(N) n^α ||x − y||^λ Π(x − y).

Then, for all γ > α/2, the sequence n^{−γ} M_n(·) converges a.s. to zero, uniformly on compacts of R^d.
Proof. First, conditions (a) and (b) imply that M_n(·) has a.s. continuous paths and that the strong law for square-integrable martingales applies (Neveu, 1964): lim_n n^{−γ} M_n(x) = 0 a.s. for all x. Hence, a.s., n^{−γ} M_n(x) converges to zero on every countable dense set D.
Second, by Ascoli's lemma we only have to prove that n^{−γ} M_n(·) is a.s. an equicontinuous sequence. If we consider the partial oscillation W(f, N, 2^{−m}) = sup(||f(x) − f(y)||; x, y ∈ B_d(0, N) ∩ D_m, ||x − y|| ≤ 2^{−m}) of f on the grid D_m = Z^d/2^m, we get

ω(f, N, 2^{−m}) ≤ C Σ_{r≥m} W(f, N, 2^{−r}).   (1)
Next, for N > 0, ε > 0 and ||x||, ||y|| ≤ N, let us define the events

A(n, x, y, ε) = {sup(k^{−γ} ||M_k(x) − M_k(y)||; 2^n ≤ k ≤ 2^{n+1}) ≥ ε}

and

B(n, m, N, ε) = ∪(A(n, x, y, ε); x, y ∈ B_d(0, N) ∩ D_m, ||x − y|| ≤ 2^{−m}).

On the one hand, by Kolmogorov's inequality for martingales, it follows that

P(A(n, x, y, ε)) ≤ a_1(N) ||x − y||^λ Π(x − y) ε^{−2} (2^n)^{−2γ} (2^{n+1})^α.

On the other hand, since the number of neighbors y ∈ D_m of x with ||y − x|| ≤ 2^{−m} is less than C 2^{md} (a factor compensated by Π(x − y) ≤ C 2^{−md} for such pairs), we get

P(B(n, m, N, ε)) ≤ a_2(N) ε^{−2} 2^{−mλ + n(α−2γ)}.
If δ < λ/2 and C(n, N) = ∪_{m≥1} B(n, m, N, 2^{−mδ}), then

P(C(n, N)) ≤ a_3(N) 2^{n(α−2γ)} Σ_{m≥1} 2^{m(2δ−λ)} ≤ a_4(N) 2^{n(α−2γ)}.

Since γ > α/2, Σ_n P(C(n, N)) < ∞, and it follows by the Borel–Cantelli lemma that, from some rank n*, we have, for all m ∈ N and x, y ∈ B_d(0, N) ∩ D_m with ||x − y|| ≤ 2^{−m},

n^{−γ} ||M_n(x) − M_n(y)|| ≤ 2^{−mδ}.

Setting D = ∪_m D_m, by (1) we obtain, for n ≥ n*, x, y ∈ B_d(0, N) ∩ D and ||x − y|| ≤ 2^{−k},

n^{−γ} ||M_n(x) − M_n(y)|| ≤ C Σ_{m≥k} 2^{−mδ} ≤ C 2^{−kδ}.

Since the paths are a.s. continuous, this inequality still holds on B_d(0, N), and this proves the a.s. equicontinuity of the sequence n^{−γ} M_n(·).
Corollary 1.1. Let (Z_i)_{i≥1} be an i.i.d. sequence of square-integrable r.v. in R^s, ρ a positive continuous function on R^d × R^d, and F a mapping from R^s × R^d to R^d which meets the following conditions for some constants λ, C_1, …, C_4:
(i) ||F(z, 0)|| ≤ C_1 ||z|| + C_2;
(ii) ||F(z, x) − F(z, y)||² ≤ C_3 ||x − y||^λ Π(x − y)(C_4 ||z||² + ρ(x, y)).
Then, if Δ_n(x) = Σ_{i=1}^n (F(Z_i, x) − E(F(Z_i, x))), the sequence n^{−γ} Δ_n(·) converges a.s. and uniformly on compacts to zero, for all γ > 1/2.
Proof. The square-integrable martingale Δ_n(0) satisfies assumption (a) of Theorem 1.1 with α = 1. Moreover, we have

E||Σ_{i=1}^n (F(Z_i, x) − F(Z_i, y))||² ≤ C n ||x − y||^λ Π(x − y)(C E||Z||² + ρ(x, y)).
2. Iterated logarithm law
Heyde and Scott (1973) generalized the invariance principle of Strassen's log-log law to discrete martingales and then to ergodic stationary sequences of r.v. by the Skorokhod representation method. However, for our purpose, we follow in this paper the classical approach via the exponential inequalities of Kolmogorov, adapted to randomly normed partial sums (Stout, 1970). Although we deal with the function space C(R^d; R^p) (Ledoux and Talagrand, 1986), we recall that the result proved below is not an invariance principle. Its proof relies on the following ILL for martingales, adapted from Stout (1970) and proved in Duflo (1990).
Theorem 2.1. (a) If M_n is an F-adapted real martingale and s_n² is an adapted sequence converging a.s. to +∞ that satisfy, for some F_0-measurable r.v. C < ∞:
(i) |ΔM_{n+1}| = |M_{n+1} − M_n| ≤ C s_n²/h(s_n²) and
(ii) ⟨M⟩_n ≤ s_{n−1}²; then limsup_n |M_n|/h(s_{n−1}²) ≤ 1 + C/2 a.s.
(b) This inequality continues to hold if the condition |ΔM_{n+1}| ≤ C_n s_n²/h(s_n²) is substituted for (i).
The interesting case is C = 0. This result can be extended to the topological space C(R^d; R^p).
Proof. The proofs of Theorems 1.1 and 2.2 are based on the maximal inequality for positive supermartingales, under the conditions of the theorem. Without loss of generality, we may assume that a ≥ 0, b ≥ 0 and bound sup(a, b) by a + b. Inequality (2) follows in this second case, since ε is arbitrary.
Inequality (2) holds a.s. for each x. Once equicontinuity is established, each cluster point ξ(·) is continuous, and then (2) holds a.s. on R^d.
(3) To prove the equicontinuity on compacts, it is enough to show that lim_{δ→0} sup_n ω(ξ_n, N, δ) = 0 a.s.
The F-martingale H_n has bounded increments, where the function C(N) depends only on N (but may vary below) and W(M_n, N, 2^{−m}) = sup{|ΔM_n(x, y)|: x, y ∈ D_m(0, N), y ∈ V_m(x)}.
We have, for all integers k and m, the events

B_k^m = ∪_{r≥m} {sup_{t_k ≤ n < t_{k+1}} W(ξ_n, N, 2^{−r}) ≥ 2^{−rτ} η(N)},

and

P(B_k^m) ≤ C(N) Σ_{r≥m} exp{r d log(2) − 2^{r(λ−q)} LL(t_k)}.
For large m, k, say m ≥ m*, k ≥ k*, and r ≥ m, we have

2^{r(λ−q)} LL(t_k) − r d log(2) ≥ r LL(t_k) = r u_k.

If m ≥ m*, then P(B_k^m) ≤ C(N) Σ_{r≥m} e^{−r u_k} ≤ C(N) e^{−m u_k}, and then

Σ_{k≥k*} P(B_k^m) ≤ C(N) Σ_{k≥k*} (k LL(θ))^{−m} < ∞.
By the Borel–Cantelli lemma, P(limsup_k B_k^m) = 0. We have thus proved that sup_{t_k ≤ n < t_{k+1}} W(ξ_n, N, 2^{−r}) ≤ η(N) 2^{−rτ} for large k, m and all r ≥ m.
Finally, part (2) of the proof implies that

sup_{t_k ≤ n} ω(ξ_n, N, 2^{−m}) ≤ η(N) Σ_{r≥m} 2^{−rτ} ≤ C(N) 2^{−mτ} a.s.

This proves lim_{δ→0} sup_n ω(ξ_n, N, δ) = 0 and the equicontinuity of the sequence ξ_n(·).
3. Examples
3.1. Usual rates
If the increments of the martingale family behave well, i.e., s_n² is of order n^α, the convergence rate of Theorem 2.2 can be made explicit.
Corollary 3.1. Let α > 0 and λ > 0 be constants such that:
(i) |ΔM_{n+1}(0)| ≤ b n^{α/2}/(LL(n))^{1/2}, tr⟨M(0)⟩_n ≤ a n^α;
(ii) for any integer N and x, y ∈ B_d(0, N),

||ΔM_n(x) − ΔM_n(y)|| ≤ ||x − y||^λ b(N) n^{α/2}/(LL(n))^{1/2},
tr⟨M(x) − M(y)⟩_n ≤ ||x − y||^λ a(N) n^α.

Then, for all δ > 0, sup_{||x||≤N} ||M_n(x)||/(n^{α/2}(LL(n))^{(1+δ)/2}) → 0 a.s.
3.2. Regression models
By a zero-mean noise (ε_n)_{n≥1} with finite conditional moment of order > 2, we mean an F-adapted sequence of r.v. such that:
H1 (i) E(ε_{n+1} | F_n) = 0 and E(ᵗε_{n+1} ε_{n+1} | F_n) = Γ;
(ii) for some δ > 0, sup_n E(||ε_{n+1}||^{2+2δ} | F_n) < ∞.
We also consider a sequence of processes Y_n(x), F-adapted for all x, and an increasing continuous function b(·) from R^+ to R^+, and assume there exist λ > 0 and an adapted sequence of r.v. η_n such that, a.s.:
H2 (i) |Y_n(0)| ≤ η_n;
(ii) |Y_n(x) − Y_n(y)| ≤ b(N) ||x − y||^λ η_n, ∀x, y ∈ B_d(0, N).
We study below the asymptotic behavior of the martingale family

M_n(x) = Σ_{k=1}^n Y_{k−1}(x) ε_k.
Proposition 3.1. Under assumptions H1, H2, and if, a.s.,

s_n² = Σ_{k=1}^n η_k² → ∞  and  Σ_{k=1}^∞ (η_k²)^{1+δ} (LL(s_k²))^δ/(s_k²)^{1+δ} < ∞,

the sequence M_n(·)/h(s_{n−1}²) is a.s. relatively compact in C(R^d; R^p).
Remark 3.1. If Σ_{k=1}^∞ (η_k²/s_k²)^{1+δ} is substituted for the above second series, we obtain the pointwise convergence result of Duflo et al. (1990).
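As a numerical sanity check (ours, not in the paper), take the simplest ingredients Y_k ≡ 1 and η_k ≡ 1 with i.i.d. N(0, 1) noise, so that s_n² = n and H1, H2 hold trivially; the normalized martingale M_n/h(s_n²) should then remain bounded, in line with Proposition 3.1 and the classical LIL:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
eps = rng.standard_normal(n)            # i.i.d. N(0, 1): H1 holds with Gamma = 1
M = np.cumsum(eps)                      # M_n = sum_{k<=n} Y_{k-1} eps_k with Y == 1
s2 = np.arange(1, n + 1, dtype=float)   # s_n^2 = n since eta_k == 1

def h(t):
    """LIL normalization h(t) = sqrt(2 t LL(t)), LL(t) = log(log(t))."""
    return np.sqrt(2.0 * t * np.log(np.log(t)))

idx = s2 >= 10                          # avoid small n where log(log(n)) <= 0
ratio = np.abs(M[idx]) / h(s2[idx])
# The classical LIL predicts limsup ratio = 1; over a finite run the
# running maximum stays bounded and of order 1.
print(ratio.max())
```

The sketch uses h(s_n²) rather than h(s_{n−1}²), an immaterial shift for this illustration.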
Proof. (a) If ||x|| ≤ N and a(N) = 1 + b(N)N^λ, we get |Y_n(x)| ≤ a(N) η_n. Next, let us define the events A_{n+1} = {a(N) η_n ||ε_{n+1}|| ≤ s_n²/h(s_n²)}, their complements A_{n+1}^c, and put

ε'_n = ε_n 𝟙_{A_n} − E(ε_n 𝟙_{A_n} | F_{n−1}),
ε''_n = ε_n 𝟙_{A_n^c} − E(ε_n 𝟙_{A_n^c} | F_{n−1}).

Since E(ε_n 𝟙_{A_n} | F_{n−1}) = −E(ε_n 𝟙_{A_n^c} | F_{n−1}), the martingale M_n(·) splits into two martingales M_n(·) = M'_n(·) + M''_n(·), where

M'_n(x) = Σ_{k=1}^n Y_{k−1}(x) ε'_k  and  M''_n(x) = Σ_{k=1}^n Y_{k−1}(x) ε''_k.

Next, set ξ_n(x) = M'_n(x)/h(s_{n−1}²) and ζ_n(x) = M''_n(x)/h(s_{n−1}²).
(b) Theorem 2.2 applies to the family M'_n(·). Since ||ε'_{n+1}|| ≤ 2 s_n²/(a(N) η_n h(s_n²)) and tr(E(ᵗε'_n ε'_n | F_n)) ≤ tr(Γ), we get
(i) ||ΔM'_{n+1}(0)|| ≤ |Y_n(0)| ||ε'_{n+1}|| ≤ b* s_n²/h(s_n²),

tr(⟨M'(0)⟩_n) ≤ tr(Γ) Σ_{k=1}^n Y²_{k−1}(0) ≤ a* s_{n−1}²;

(ii) ||ΔM'_{n+1}(x) − ΔM'_{n+1}(y)|| ≤ |Y_n(x) − Y_n(y)| ||ε'_{n+1}|| ≤ b*(N) ||x − y||^λ s_n²/h(s_n²).
(c)(i) The increasing process (in the semi-definite sense) of the martingale N_n(x) = Σ_{k=1}^n
On the other hand, the moment assumption yields the Chebychev-type inequality
E(||ε_k||² 𝟙
The previous moment inequality proves that
tr⟨W⟩_n ≤ C(N)
Proof. The asymptotic equivalence s_n²
Part B. Functional estimation
This part deals with the kernel estimation of the unknown but smooth regression function f from R^d to R^d that drives a controlled Markov model of the type (Duflo, 1990)

X_{n+1} = f(X_n) + C(X_n, U_n) + ε_{n+1}.   (3)

The control C is assumed to be known, and the sequence (ε_n) to be a white noise with respect to some filtration F = (F_n)_{n≥0}; i.e., all ε_i have the same distribution and ε_{n+1} is independent of F_n for all n. Model (3) extends the classical linear regression model X_{n+1} = A X_n + ε_{n+1} and the nonlinear model of Hernandez-Lerma (1991),

X_{n+1} = f(X_n) + ε_{n+1}.   (4)

In the sequel, we content ourselves with a smooth kernel K and a bandwidth well adapted to iterative computations and tracking (Masry and Györfi, 1987). We also assume that K is Lipschitzian with order λ and coefficient k, that is:
H3: K is a nonnegative and compactly supported function on R^d which satisfies

∫ K(z) dz = 1  and  |K(u) − K(v)| ≤ k ||u − v||^λ.
If the dynamical system (4) is stable and if the stationary distribution has a density h, a kernel estimator of h is defined, for all γ > 0, as follows:

ĥ_n(x) = (1/n) Σ_{i=1}^n i^{γd} K(i^γ(X_i − x));   (5)

next, the function f of model (3) (or (4) if C ≡ 0) can be estimated by

f̂_{n+1}(x) = [Σ_{i=1}^n i^{γd} K(i^γ(X_i − x))(X_{i+1} − C(X_i, U_i))] / [Σ_{i=1}^n i^{γd} K(i^γ(X_i − x))].   (6)

We assume that f̂_{n+1}(x) = 0 whenever ĥ_n(x) = 0.
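To make (5) and (6) concrete, here is a minimal Python sketch (ours, not the paper's) in dimension d = 1 for the non-controlled model (4). The triangular kernel is nonnegative, compactly supported, integrates to 1 and is Lipschitz, hence satisfies H3 with λ = 1; the map f(x) = x/2 + sin(x)/4 is an illustrative choice with limsup |f(x)|/|x| = 1/2 < 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def K(u):
    """Triangular kernel on [-1, 1]: satisfies H3 with lambda = 1."""
    return np.maximum(1.0 - np.abs(u), 0.0)

def f(x):
    # Illustrative driving function with a single stable fixed point at 0.
    return 0.5 * x + 0.25 * np.sin(x)

gamma, n = 0.2, 20_000            # bandwidth exponent (d = 1, gamma < 1/(2d))
X = np.zeros(n + 1)
for k in range(n):                # simulate model (4): X_{k+1} = f(X_k) + eps_{k+1}
    X[k + 1] = f(X[k]) + 0.5 * rng.standard_normal()

i = np.arange(1, n, dtype=float)  # indices i = 1 .. n-1
Xi, Xnext = X[1:n], X[2:n + 1]    # pairs (X_i, X_{i+1})

def h_hat(x):
    """Density estimator (5): (1/n) sum_i i^{gamma d} K(i^gamma (X_i - x))."""
    return float(np.mean(i**gamma * K(i**gamma * (Xi - x))))

def f_hat(x):
    """Regression estimator (6) with C == 0; returns 0 where the denominator vanishes."""
    w = i**gamma * K(i**gamma * (Xi - x))
    return float(np.dot(w, Xnext) / w.sum()) if w.sum() > 0 else 0.0

print(h_hat(0.5), f_hat(0.5), f(0.5))   # f_hat(0.5) should be close to f(0.5)
```

The increasing weights i^{γd} and shrinking bandwidths i^{−γ} are exactly what makes the estimator iterative: updating from step n to n + 1 only requires adding one term to the running sums.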
The a.s. and weak convergence rates, as well as the iterated logarithm laws, of these estimators are quite similar to those obtained in the i.i.d. case (Hall, 1981; Mack and Silverman, 1982; Devroye and Penrod, 1984; Liero, 1989). The results rely on stability criteria of Lyapounov type presented in Duflo (1990). Note that Iosifescu and Grigorescu (1990) present a wide range of pointwise a.s. log-log laws, weak convergence results and some invariance principles for dependent sequences (called random systems with complete connections). However, our proofs seem to have no counterparts in their framework.
We now briefly recall the main definitions of Duflo (1990) used in the sequel. A sequence (X_n, U_n) of r.v. adapted to a filtration F, with values in (E × U, E ⊗ U), is a controlled Markov chain if, for some transition probability π(x, u, dy) from E × U to E, the distribution of X_{n+1} conditionally on F_n is π(X_n, U_n, dx). E is called the state space and U the control space. Any sequence (d_n) of measurable functions d_n from E^{n+1} to U is called a strategy. The strategy determines the control at any time: U_n = d_n(X_0, …, X_n). If, for a fixed state x, the control d_n(x_0, …, X_{n−1}, x) belongs
every sequence (X_n) of r.v. gives rise to a sequence of empirical distributions λ_n(B) = 1/(n+1) Σ_{i=0}^n 𝟙_{X_i ∈ B}, B ∈ B(R^d). The sequence is said to be stable if the sequence λ_n converges weakly a.s. to a stationary distribution λ. In the controlled case (3), a class D of strategies stabilizes the sequence if a.s., for any admissible strategy in D, any initial distribution and all ε > 0, there exists a compact C such that liminf_n λ_n(C) ≥ 1 − ε.
4. Non-controlled models
4.1. Strong convergence
Theorem 4.1. For the autoregressive model (4), assume that:
1. f is continuous and limsup_{||x||→∞} ||f(x)||/||x|| < 1;
2. ε_n is a white noise with a density p of class C¹, and p and its gradient are bounded;
3. Model (4) is stable.
Then:
(A) The stationary distribution has a bounded density h of class C¹ which satisfies h(x) = ∫ p(x − f(z)) h(z) dz. Moreover, for all 0 < γ < 1/d and all initial distributions, ĥ_n(x) → h(x) a.s., uniformly on compacts.
(B) For all x ∈ S = {x: h(x) > 0}, f̂_n(x) converges pointwise to f(x) a.s. If the noise has a moment of order m > 2 (assumption H1) and if 0 < γ < 1/(2d), the pointwise convergence strengthens to uniform convergence on compacts.
Finally, if f is of class C¹, then for all δ > 0, N < ∞ and β = inf(γ, 1/2 − γ(d + δ)):

sup_{||x||≤N} ||f̂_n(x) − f(x)|| = O(n^{−β}(LL(n))^{1/2}) a.s.
We first prove a lemma which enables us to convert arithmetic means into weighted means of a certain type.
Let (x_n) be a sequence of real numbers, or elements of any normed space, and put S_n(α) = Σ_{i=1}^n i^α x_i, S_n = Σ_{i=1}^n x_i.
Lemma 4.1. If S_n/n → s, then:
(i) n^{−(1+α)} S_n(α) → s/(1 + α) if α > 0;
(ii) ||S_n(α)|| = O(n^{1+α}) if −1 < α < 0;
(iii) ||S_n(α)|| = O(log(n)) if α = −1;
(iv) ||S_n(α)|| = O(1) if α < −1.
Proof. If a_i = i((i + 1)^α − i^α), put σ_n = Σ_{i=1}^n a_i; then σ_n = (n + 1)^{α+1} − Σ_{i=1}^{n+1} i^α, and

S_n(α) = Σ_{i=1}^n i^α (S_i − S_{i−1}) = n^{α+1}(S_n/n) − Σ_{i=1}^{n−1} a_i (S_i/i).

(i) If α > 0, the result follows from S_i/i → s and the equivalence σ_n ~ α n^{α+1}/(1 + α); the last equivalence follows from the Taylor formula for the function x^α.
If −1 < α < 0, we get ||S_n(α)|| ≤ C n^{1+α}. Statements (iii) and (iv) follow by similar arguments.
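A quick numerical check of statement (i) (our sketch, with x_i constant equal to 3 so that S_n/n → s = 3):

```python
import numpy as np

# Lemma 4.1(i): with x_i = 3 and alpha = 2, the weighted sum
# S_n(alpha) = sum_i i^alpha x_i should satisfy n^{-(1+alpha)} S_n(alpha)
# -> s/(1+alpha) = 3/3 = 1.
alpha, s, n = 2.0, 3.0, 100_000
i = np.arange(1, n + 1, dtype=float)
S_n_alpha = np.sum(i**alpha * s)
print(S_n_alpha / n**(1 + alpha))   # close to 1 for large n (error of order 1/n)
```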
Proof of Theorem 4.1. Since the noise has a density p and f is continuous, the transition probability is strongly Fellerian, and in this case stability is equivalent to positive recurrence. We recall also that the assumption p > 0 (together with condition 1) is sufficient to ensure the stability of the chain (Duflo, 1990).
A.1. Properties of the stationary distribution. Thus (X_n) is positive recurrent, with an invariant distribution whose density satisfies h(x) = ∫ p(x − f(z)) h(z) dz.
Next, we can find constants a, b such that

|ΔM_n(x)| ≤ b n^{γd},  ⟨M(x)⟩_n ≤ a n^{1+γd},

and easily prove that |K(i^γ(z − x)) − K(i^γ(z − y))| ≤ C n^{γλ} ||x − y||^λ for all i ≤ n and some constant C, and thus

|ΔM_n(x) − ΔM_n(y)| = n^{γd} |K(n^γ(X_n − x)) − K(n^γ(X_n − y))| ≤ C n^{γ(d+λ)} ||x − y||^λ.
Since the chain is stable and p is bounded and continuous, we first get the a.s. equicontinuity of the sequence n^{−γ'} H'_n(·).
Finally, since grad p is bounded and K has compact support, we have

sup_{x∈R^d} |H̃_n(x) − H'_n(x)| ≤ C Σ_{i=1}^n i^{γd−γ} = o(n^{γ'}) a.s., if γ > 0.

To summarize, we have proved that, for all N < ∞ and 0 < γ < 1/d,

sup_{||x||≤N} |γ' n^{−γ'} H_n(x) − h(x) ∫ φ(z) dz| → 0 a.s.

Taking β = γd (so that γ' = 1) and φ = K ends the proof of statement A.
B. Study of f̂_n(x). We decompose the bias into
f̂_{n+1}(x) − f(x) = (n ĥ_n(x))^{−1}(W_{n+1}(x) + R_{n+1}(x)),   (7)

where K_i(x) = K(i^γ(X_i − x)), and

W_{n+1}(x) = Σ_{i=1}^n i^{γd} K_i(x) ε_{i+1},  R_{n+1}(x) = Σ_{i=1}^n i^{γd} K_i(x)(f(X_i) − f(x)).
B.1. Uniform convergence of R_n(x). Let N < ∞ and assume, with no loss of generality, that supp(K) ⊂ B_d(0, R). Since f is continuous, for all ε > 0 there exists δ = δ(ε) > 0 such that ||f(x) − f(y)|| ≤ ε for all ||x|| ≤ R + N, ||y|| ≤ R + N and ||x − y|| ≤ δ. Next, observe that if x ∈ B_d(0, N), we have either ||X_i − x|| > R i^{−γ}, and then K_i(x) = 0, or ||X_i − x|| ≤ R i^{−γ}, and then ||X_i|| ≤ R + N. In this last case, the first alternative R i^{−γ} ≤ δ yields ||f(X_i) − f(x)|| ≤ ε, and the second alternative R i^{−γ} > δ yields ||f(X_i) − f(x)|| ≤ 2 sup_{||z||≤R+N} ||f(z)|| = L.
We put n_1 = inf(n: R n^{−γ} ≤ δ) and get the inequalities

||R_n(x)|| ≤ C + ε Σ_{i=n_1}^{n−1} i^{γd} K_i(x) ≤ C + ε (n − 1) ĥ_{n−1}(x),

which prove the a.s. uniform convergence on compacts of R_n(·)/n; i.e., for all ε > 0, limsup_n sup_{||x||≤N} ||R_n(x)/n|| ≤ ε sup_{||x||≤N} h(x) a.s.
B.2. If f is of class C¹ and C = sup_{x∈B_d(0,R+N)} ||grad f(x)||, we have

K_i(x) ||f(X_i) − f(x)|| ≤ C R i^{−γ} K_i(x),

and ||R_n(x)|| ≤ C Σ_{i=1}^{n−1} i^{γ(d−1)} K_i(x). Then, Lemma 4.1 and the uniform convergence of ĥ_n(·) (part A) yield

sup_{||x||≤N} ||R_n(x)|| = O(n^{1−γ}) a.s.   (8)
B.3. Pointwise convergence of f̂_n(·). We have

⟨W(x)⟩_n = Γ Σ_{i=1}^{n−1} i^{2γd} K_i²(x).

Since K is Lipschitzian of order λ, the kernel K² is also Lipschitzian with the same order λ. If we take φ = K², β = 2γd, γ' = β − γd + 1 = 1 + γd, and proceed in the same way as in part A, we get

sup_{||x||≤N} ||(1 + γd) n^{−(1+γd)} ⟨W(x)⟩_n − Γ h(x) ∫ K²(z) dz|| → 0 a.s.   (9)
B.4. Uniform convergence of f̂_n(·). Clearly, if the noise has a moment of order m > 2, W_n meets assumption H1 of Proposition 3.1, and it is then enough to verify assumption H2.
Indeed, the inequality |Y_n(0)| ≤ C n^{γd} K_n(0) and arguments as in part A.2 show that |Y_n(x) − Y_n(y)| ≤ C n^{γ(d+λ)} ||x − y||^λ, λ > 0. Therefore, Corollary 3.1 applies and says that the sequence W_n(·)/(n^{1+2γ(d+λ)} LL(n))^{1/2} is a.s. relatively compact.
Since, for all β > (1 + 2γd)/2, there exists λ > 0 such that β > (1 + 2γ(d + λ))/2, we get sup_{||x||≤N} n^{−β} W_n(x) → 0 a.s.
In particular, the value β = 1 is possible if γ < 1/(2d). Summing up, we have proved that, if δ > 0 and 1/(2(d + δ + 1)) ≤ γ < 1/(2(d + δ)), then

sup_{||x||≤N} ||f̂_n(x) − f(x)|| = O((n^{2γ(d+δ)−1} LL(n))^{1/2}).

Note that the case γ ≥ 1/(2(d + δ)) is useless, since the uniform convergence of the estimator is not ensured, and that the other case 0 < γ ≤ 1/(2(d + δ + 1)) yields

sup_{||x||≤N} ||f̂_n(x) − f(x)|| = O(n^{−γ}(LL(n))^{1/2}).

We do not know yet if the value γ = 1/(2(d + 1)), which gives the best rate, is attainable.
4.2. Pointwise CLT and ILL
Theorem 4.2. If the assumptions of Theorem 4.1 hold, if f is of class C¹ and if 1/(d + 2) < γ < 1/d, then for x_1, …, x_q ∈ S:
(A) (Z_n(x_1), …, Z_n(x_q)), where Z_n(x_j) = n^{(1−γd)/2}(f̂_n(x_j) − f(x_j)), converges weakly to a Gaussian distribution in R^{d×q} which has q independent components N_d(0, (Γ/((1 + γd) h(x_j))) ∫ K²(z) dz), j = 1, …, q.
(B) Moreover, if the noise has a finite conditional moment of order m > 2, a pointwise iterated logarithm law holds on S:

limsup_n (n^{1−γd}/(2 LL(n))) ||f̂_n(x) − f(x)||² ≤ (tr Γ/((1 + γd) h(x))) ∫ K²(z) dz a.s.
Proof. Considering the decomposition (7), we have already proved that ĥ_n(x) → h(x) > 0 a.s. on S if γ < 1/d. Now, if f is C¹ and γ > 1/(d + 2), the upper bound (8) improves into sup_{||x||≤N} ||R_n(x)|| = o(n^{(1+γd)/2}) a.s., and it remains only to study the asymptotic behavior (CLT and ILL) of W_n(x).
(A) We start by checking the CLT assumptions for martingales (Duflo, 1990). By (9), we get (1 + γd) n^{−(1+γd)} ⟨W(x)⟩_n → Γ h(x) ∫ K²(z) dz a.s. For Lindeberg's condition, we note that V(t) = E(||ε||² 𝟙_{||ε||>t}) → 0 as t → ∞, and that for ε̄ > 0 and ||K|| = sup(K(u): u ∈ R^d), we have

Σ_{i=1}^n E(||ΔW_i(x)||² 𝟙_{||ΔW_i(x)|| > ε̄ n^{(1+γd)/2}}) ≤ Σ_{i=1}^n i^{2γd} K_i²(x) V(ε̄ n^{(1−γd)/2}/||K||) = o(n^{1+γd}).

This is enough to prove the weak convergence of each Z_n(x_i).
For the independence of the components, it is enough to prove that, a.s.,

lim_n ⟨W(x), W(y)⟩_n = lim_n Σ_{i=1}^n i^{2γd} K_i(x) K_i(y) < ∞  if x ≠ y.   (10)

Considering the events A_i = {X_i ∈ B_d(x, R i^{−γ}) ∩ B_d(y, R i^{−γ})}, we define the martingale M_n = Σ_{i=1}^n i^{2γd}(𝟙_{A_i} − P(A_i | F_{i−1})) and its increasing process ⟨M⟩_n. Since the density is bounded, it follows by integration on R^d that

P(A_i | F_{i−1}) = P(ε_i ∈ B_d(x − f(X_{i−1}), R i^{−γ}) ∩ B_d(y − f(X_{i−1}), R i^{−γ}) | F_{i−1})
≤ 𝟙_{||x−y||≤2Ri^{−γ}} P(ε_i ∈ B_d(x − f(X_{i−1}), 2R i^{−γ}) | F_{i−1})
≤ C i^{−γd} 𝟙_{||x−y||≤2Ri^{−γ}}.
Put N(x, y) = inf(i: ||x − y|| > 2R i^{−γ}) and observe that

⟨M⟩_n ≤ Σ_{i=1}^n i^{4γd} P(A_i | F_{i−1}) ≤ C Σ_{i=1}^{N(x,y)} i^{3γd} < ∞.

Thus, M_n converges a.s. to a finite r.v. M_∞. Moreover, since

Σ_{i=1}^n i^{2γd} P(A_i | F_{i−1}) ≤ C Σ_{i=1}^n i^{γd} 𝟙_{||x−y||≤2Ri^{−γ}} ≤ C N(x, y)^{1+γd} < ∞ a.s.,

the bound K_i(x) K_i(y) ≤ ||K||² 𝟙_{A_i} implies that (10) holds.
(B) The second part of the theorem is a simple consequence of Proposition 3.1 and Remark 3.1, if we take m = 2 + 2δ, η_i = i^{γd} K_i(x), s_n² = Σ_{i=1}^n η_i², and prove that Σ_{i=1}^∞ (η_i²/s_i²)^{1+δ} converges a.s.
Since lim_n (1 + γd) n^{−(1+γd)} s_n² = h(x) ∫ K²(z) dz, it is enough to show the convergence of Σ_{i=1}^∞ i^{−ρ} Z_i, where Z_i = i^{γd} K_i^{2(1+δ)}(x) and ρ = (1 − γd)(1 + δ) + γd. First, observe that Σ_{i=1}^n Z_i/n converges if γ < 1/d, and then apply Lemma 4.1 to Σ_{i=1}^∞ i^{−ρ} Z_i, since ρ > 1. Next, note that limsup_n (2 s_n² LL(s_n²))^{−1} ||W_n(x)||² ≤ tr Γ a.s. to complete the proof.
4.3. Noise density estimator
If, in addition to the functions f and h of the non-controlled model (4), we need to estimate the noise density p, we consider on R^d × R^d the autoregressive model Z_{n+1} = F(Z_n) + ε*_{n+1}, where

Z_{n+1} = (X_{n+1}, X_n),  F(x, y) = (f(x), f(y))  and  ε*_{n+1} = (ε_{n+1}, ε_n).

Put K* = K ⊗ K, choose a point x_0 with h(x_0) > 0, and define the following kernel estimate:

p̂_n(y) = (ĥ_n(x_0))^{−1} ĥ*_n(x_0, y + f̂_n(x_0)),
where ĥ*_n(x, y) = n^{−1} Σ_{i=1}^n i^{2γd} K*(i^γ(X_{i−1} − x), i^γ(X_i − y)).
Corollary 4.1. Under the assumptions of Theorem 4.1, p̂_n(·) converges a.s. to p(·) uniformly on S ∩ C, for every compact C, for all γ < 1/(2d).
Had f been known at some point x_0 with h(x_0) > 0, it would have been advantageous to substitute the value f(x_0) for f̂_n(x_0). Another method is to vary x_0, taking for example x_0 = y.
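Continuing the illustrative d = 1 code style used for (5) and (6), the two-step construction of p̂_n can be sketched as follows (our code, with assumed names; f(x) = x/2, Gaussian noise and x_0 = 0 are illustrative choices, x_0 = 0 being a point where h > 0):

```python
import numpy as np

rng = np.random.default_rng(2)

def K(u):                         # triangular kernel, satisfies H3
    return np.maximum(1.0 - np.abs(u), 0.0)

def f(x):                         # illustrative stable map (d = 1)
    return 0.5 * x

sigma, gamma, n = 0.5, 0.15, 40_000
X = np.zeros(n + 1)
for k in range(n):                # model (4): X_{k+1} = f(X_k) + eps_{k+1}
    X[k + 1] = f(X[k]) + sigma * rng.standard_normal()

ii = np.arange(1, n, dtype=float)
Xi, Xnext = X[1:n], X[2:n + 1]    # consecutive pairs of the chain

def h_hat(x):                     # estimator (5)
    return float(np.mean(ii**gamma * K(ii**gamma * (Xi - x))))

def f_hat(x):                     # estimator (6) with C == 0
    w = ii**gamma * K(ii**gamma * (Xi - x))
    return float(np.dot(w, Xnext) / w.sum())

def h_star(x, y):                 # joint estimator on R^2, product kernel K* = K (x) K
    w = ii**(2 * gamma) * K(ii**gamma * (Xi - x)) * K(ii**gamma * (Xnext - y))
    return float(np.mean(w))

x0 = 0.0
def p_hat(y):                     # p_hat_n(y) = h*_n(x0, y + f_hat(x0)) / h_hat(x0)
    return h_star(x0, y + f_hat(x0)) / h_hat(x0)

# p_hat(0) should approximate the true noise density N(0, sigma^2) at 0,
# i.e. roughly 1/(sigma sqrt(2 pi)), up to smoothing bias.
print(p_hat(0.0))
```

The two-step structure is visible in `p_hat`: the joint estimate of h(x) p(y − f(x)) is evaluated at (x_0, y + f̂_n(x_0)) and then divided by the marginal estimate ĥ_n(x_0).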
Sketch of the proof. (1) Note first that if K is of order λ and meets assumption H3, so does K* on R^{2d}. Note also that the noise ε* is no longer white, so the first part of the proof of Theorem 4.1 demands a slight modification. However, the stability of the chain (X_n) implies the stability of (Z_n), which has a stationary distribution λ* with density h*(x, y) = h(x) p(y − f(x)).
(2) Split H*_n(x, y) = Σ_{i=1}^n i^β K*(i^γ(X_{i−1} − x), i^γ(X_i − y)) into

H*_n = (H*_n − H̃*_n) + (H̃*_n − Ĥ*_n) + (Ĥ*_n − H̄*_n) + H̄*_n,

where H̄*_n(x, y) = p(y − f(x)) Σ_{i=1}^n i^{β−2γd} p(x − f(X_{i−2})),

H̃*_n(x, y) = Σ_{i=1}^n i^{β−γd} K_{i−1}(x) ∫ K(v) p(i^{−γ} v + y − f(X_{i−1})) dv,

Ĥ*_n(x, y) = Σ_{i=1}^n i^β ∫ K*(i^γ u, i^γ v) p(u + x − f(X_{i−2})) p(v + y − f(u + x)) du dv.

We first prove that lim_{n→∞} γ' n^{−γ'} H̄*_n(x, y) = h(x) p(y − f(x)) for all β > 2γd and γ' = β − 2γd + 1.
Note that H̃*_n is the F-compensator of H*_n, and that Ĥ*_n is the F*-compensator of H̃*_n, where F* = (F_{n−1})_{n≥1}.
(i) Following the proof of Theorem 4.1, we readily obtain the a.s. uniform convergence on compacts of γ' n^{−γ'} H̄*_n to h(x) p(y − f(x)).
(ii) Put e_n = sup(ω_n, 2R n^{−γ}), where ω_n = ω(f, R + N, R n^{−γ}) is the continuity modulus of f. Since p and its gradient are bounded and continuous, K has its support included in B_d(0, R), and ω_n → 0, we get for all x, y with ||x|| ≤ N, ||y|| ≤ N:

K*(u, v) |p(i^{−γ} u + x − f(X_{i−2})) p(i^{−γ} v + y − f(i^{−γ} u + x)) − p(x − f(X_{i−2})) p(y − f(x))|
≤ C(i^{−γ}(||u|| + ||v||) + ||f(i^{−γ} u + x) − f(x)||) 𝟙_{||u||≤R, ||v||≤R} ≤ C e_i.

Then, applying Lemma 4.1 to the sequence e_i, we obtain

sup_{||x||≤N, ||y||≤N} n^{−γ'} |Ĥ*_n − H̄*_n|(x, y) ≤ C n^{−γ'} Σ_{i=1}^n i^{β−2γd} e_i → 0.
5. Controlled model
The controlled models (3) have no stationary distribution in general, and the statistic ĥ_n(x) does not estimate anything actual. However, f̂_n(x) continues to estimate the regression function f, as shown below.
Theorem 5.1. Assume that:
(1) C is known and f is unknown but continuous;
(2) the noise (ε_n) is white, with a strictly positive C¹ density p such that p and its gradient are bounded;
(3) u(x) = sup_{u∈A(x)} ||C(x, u)|| is bounded on compacts and

limsup_{||x||→+∞} sup_{u∈A(x)} ||f(x) + C(x, u)||/||x|| < 1.

Then, for all initial distributions and admissible strategies, statement B of Theorem 4.1 continues to hold on R^d.
Proof. Only minor modifications of the proof of Theorem 4.1 are needed. First, there is nothing to change in the study of M_n = H_n − H̃_n, since if F(x, u) = f(x) + C(x, u), the process H_n(x) = n ĥ_n(x) has compensator

H̃_n = Σ_{i=1}^n ∫ K(z) p(i^{−γ} z + x − F(X_{i−1}, U_{i−1})) dz.
Next, we note that we need only bound ĥ_n(·) on compacts (and not deal with its convergence); the Lyapounov condition (3) enables us to stabilize the chain (Duflo, 1990) and then to get a constant M such that

limsup_n (1/n) Σ_{i=1}^n ||X_i||² ≤ M  and, for r² > M,  liminf_n λ_n(B_d(0, r)) ≥ 1 − M/r².
Set d(r) = sup(||F(x, u)||; u ∈ A(x), ||x|| ≤ r). It follows that m(r) = inf(p(z): ||z|| ≤ R + N + d(r)) > 0, since p > 0 and p is continuous.
For large r such that d(r) ≤ θr with θ < 1, we obtain

||p||_∞ ≥ ∫ K(z) p(i^{−γ} z + x − F(X_{i−1}, U_{i−1})) dz ≥ m(r) 𝟙_{||x||≤N, ||X_{i−1}||≤r} ∫ K(z) dz = m(r) 𝟙_{||x||≤N, ||X_{i−1}||≤r},

i.e., ||p||_∞ ≥ n^{−1} H̃_n(x) ≥ m(r) λ_n(B_d(0, r)) for all ||x|| ≤ N.
Thus, if r > √(2M), the Lyapounov condition gives the bounds

0 < m(r)/2 ≤ liminf_n inf_{||x||≤N} h̃_n(x) ≤ limsup_n sup_{||x||≤N} h̃_n(x) ≤ ||p||_∞ < ∞.
Since all the terms in Lemma 4.1 are positive, we get

n^{−(1+α)} S_n(α) = n^{−1} S_n − n^{−(1+α)} Σ_{i=1}^{n−1} a_i (S_i/i) ≤ S_n/n.
6. For further reading
The following reference is also of interest to the reader: Stute, 1982.
References
Devroye, L., 1988. An equivalent theorem for L1 convergence of the kernel regression estimate. J. Statist. Plann. Inference 18.
Devroye, L., Penrod, C., 1984. The consistency of automatic kernel density estimates. Ann. Statist. 12 (4).
Duflo, M., 1990. Méthodes récursives aléatoires. Masson, Paris.
Duflo, M., Senoussi, R., Touati, A., 1990. Sur la loi des grands nombres pour les martingales vectorielles et l'estimateur des moindres carrés du modèle de régression. Ann. Inst. H. Poincaré 26 (4).
Hall, P., 1981. Laws of the logarithm for nonparametric density estimators. Z. Wahrsch. Verw. Geb. 56.
Hernandez-Lerma, O., 1991. On integrated square errors of recursive nonparametric estimates of nonstationary Markov processes. Probab. Math. Statist. 12 (1).
Heyde, C., Scott, D., 1973. Invariance principles for the law of the iterated logarithm for martingales and processes with stationary increments. Ann. Probab. 1 (3).
Iosifescu, M., Grigorescu, S., 1990. Dependence with Complete Connections and its Applications. Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge.
Kuelbs, J., 1976. A strong convergence theorem for Banach space valued random variables. Ann. Probab. 4.
Ledoux, M., Talagrand, M., 1986. La loi du logarithme itéré dans les espaces de Banach. C.R. Acad. Sci. Paris 303 (2).
Liero, H., 1989. Strong uniform consistency of nonparametric regression function estimates. Probab. Theory Related Fields 82.
Mack, Y.P., Silverman, B.W., 1982. Weak and strong uniform consistency of kernel regression estimates. Z. Wahrsch. Verw. Geb. 61.
Masry, E., Györfi, L., 1987. Strong consistency and rates for recursive probability density estimators of stationary processes. J. Multivariate Anal. 22.
Mourier, E., 1953. Éléments aléatoires dans des espaces de Banach. Ann. Inst. H. Poincaré 13.
Neveu, J., 1964. Bases mathématiques du calcul des probabilités. Masson, Paris.
Rao, R.R., 1963. The law of large numbers for D([0,1]; R)-valued random variables. Theory Probab. Appl. 8.
Stout, W.F., 1970. A martingale analogue of Kolmogorov's law of the iterated logarithm. Z. Wahrsch. Verw. Geb. 15.