Ridge-Parameter Regularization to Deconvolution Problem with Unknown Error Distribution


Vietnam J. Math. (2015) 43:239-256, DOI 10.1007/s10013-015-0119-1


Dang Duc Trong · Cao Xuan Phuong

Received: 14 October 2013 / Accepted: 27 June 2014 / Published online: 25 January 2015

Abstract Our aim in this article is to estimate a density function $f$ of i.i.d. random variables $X_1, \ldots, X_n$ from a noise model $Y_j = X_j + Z_j$, $j = 1, 2, \ldots, n$. Here, $(Z_j)_{1 \le j \le n}$ is independent of $(X_j)_{1 \le j \le n}$ and is a finite sequence of i.i.d. noise random variables distributed with an unknown density function $g$. This problem is known as the deconvolution problem in nonparametric statistics. The general case in which the error density function $g$ is unknown and its Fourier transform $g^{\mathrm{ft}}$ can vanish on a subset of $\mathbb{R}$ has still not been considered much. In the present article, we consider this case. Using direct i.i.d. data $Z'_1, \ldots, Z'_m$, which are collected in separate independent experiments, we propose an estimator $\hat g$ for the unknown density function $g$. After that, applying a ridge-parameter regularization method and an estimate of the Lebesgue measure of low level sets of $g^{\mathrm{ft}}$, we give an estimator $\hat f$ for the target density function $f$ and evaluate the rate of convergence of the quantity $\mathbb{E}\|\hat f - f\|_2^2$.

Keywords Deconvolution · Density function · Fourier transform · Estimator · Mean integrated squared error

Mathematics Subject Classification (2010) 62F12 · 62G07

Dedicated to the 65th birthday of Professor Nguyen Khoa Son

This article is supported by the National Foundation of Scientific and Technology Development (NAFOSTED), Project 101.01-2012.07.

D. D. Trong (✉)

Faculty of Mathematics and Computer Science,

Ho Chi Minh City National University,

No. 227, Nguyen Van Cu Street, Ward 4, District 5, Ho Chi Minh City, Vietnam e-mail: [email protected] vn

C. X. Phuong

Faculty of Mathematics and Statistics, Ton Duc Thang University,

No. 19, Nguyen Huu Tho Street, Tan Phong Ward, District 7, Ho Chi Minh City, Vietnam e-mail: cao\[email protected]


1 Introduction

In this article, our goal is to estimate a density function $f$ of independent and identically distributed (i.i.d.) random variables $X_1, \ldots, X_n$ from a noise model

$$Y_j = X_j + Z_j, \quad j = 1, 2, \ldots, n. \tag{1}$$

Here, $(Z_j)_{1 \le j \le n}$ is a finite sequence of i.i.d. noise random variables which have an unknown density function $g$. Moreover, the sequence is independent of $(X_j)_{1 \le j \le n}$. According to the basic theory of probability, the relationship between the density function $h$ of $Y_1$ and the functions $f$ and $g$ is represented by the convolution equation

$$h = f * g, \tag{2}$$

where

$$(f * g)(x) = \int_{-\infty}^{\infty} f(x - u)\, g(u)\, du, \quad x \in \mathbb{R},$$

denotes the convolution of two functions $f$, $g$.

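For convenience, we introduce some notations. For the function $\varphi \in L^2(\mathbb{R})$, we denote by $\|\varphi\|_2$ the $L^2$-norm of $\varphi$, that is, $\|\varphi\|_2^2 = \int_{-\infty}^{\infty} |\varphi(x)|^2\, dx$. For $\varphi \in L^1(\mathbb{R})$, the Fourier transform $\varphi^{\mathrm{ft}}$ of $\varphi$ is defined by

$$\varphi^{\mathrm{ft}}(t) = \int_{-\infty}^{\infty} \varphi(x)\, e^{itx}\, dx, \quad t \in \mathbb{R}.$$

From the Fourier analysis, we know that if $\varphi^{\mathrm{ft}} \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then

$$\varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi^{\mathrm{ft}}(t)\, e^{-itx}\, dt, \quad x \in \mathbb{R}.$$

In addition, $\|\varphi^{\mathrm{ft}}\|_2 = \sqrt{2\pi}\, \|\varphi\|_2$, which is known as the Parseval identity. For two sequences of positive numbers $(a_n)$ and $(b_n)$, we use the notation $a_n = O(b_n)$ to represent $a_n \le \mathrm{const} \cdot b_n$ for all $n$ large enough. Finally, $\lambda(A)$ denotes the Lebesgue measure of a set $A \subset \mathbb{R}$ and $Z(\varphi) = \{t \in \mathbb{R} : \varphi(t) = 0\}$.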

In the deterministic context, the functions $h$ and $g$ are known and the problem of solving $f$ from (2) is an ill-posed problem in mathematical analysis. However, in the statistical context, instead of knowing the density function $h$, we have just the contaminated sample $(Y_j)_{1 \le j \le n}$. Thus, naturally, we have to build an estimator for $h$ from these observations and use it to find $f$. In almost all papers on deconvolution, the authors assumed that $g$ was known and $Z(g^{\mathrm{ft}}) = \emptyset$. The problem of estimating $f$ based on the sample $(Y_j)_{1 \le j \le n}$ and on the previous assumptions can be called the classical deconvolution problem in statistics. It has been studied for more than two decades, and now, there are a large number of works on this problem. The most common approach to estimate $f$ is the deconvolution kernel method. In this method, by using a kernel function $K$ with compactly supported Fourier transform $K^{\mathrm{ft}}$ and a smoothing parameter $\rho > 0$ called bandwidth, one approximates $f$ by the following estimator

$$\hat f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \frac{K^{\mathrm{ft}}(\rho t)}{g^{\mathrm{ft}}(t)}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt, \quad x \in \mathbb{R}. \tag{3}$$

This method was used by Carroll and Hall [1], Devroye [3], Stefanski and Carroll [17], Zhang [19], and Fan [5-7]. Besides the kernel estimation, the wavelet-based method is also an approach to solve the deconvolution problem. The reader can find some important contributions related to this method in [11, 16]. In fact, in the classical deconvolution problem, the difficulty of establishing the convergence rate of estimators depends heavily on the smoothness of the error density. Therefore, we often consider the error density $g$ in the form

$$c_1 (1 + |t|^2)^{-r/2} e^{-d_1 |t|^s} \le |g^{\mathrm{ft}}(t)| \le c_2 (1 + |t|^2)^{-r/2} e^{-d_2 |t|^s},$$

where $0 < c_1 < c_2$, $0 < d_2 < d_1$, $r \in \mathbb{R}$, $s \ge 0$ ($r > 0$ if $s = 0$). We recall that the density $g$ is called ordinary smooth if $s = 0$ and supersmooth otherwise. In [5], Fan established the optimal rate of convergence of the estimator in (3) for both the ordinary smooth case and the supersmooth case. He proved that the optimal rate of convergence of the mean squared error as well as the mean integrated squared error (MISE) attains a polynomial rate in the ordinary smooth error density case and a logarithmic rate in the supersmooth one.

Although the kernel method as well as the wavelet-based method is very popular, there are still some restrictions in application. The first drawback is the assumption that the error density $g$ must be known. This is unrealistic in many practical applications. The second drawback is the assumption $Z(g^{\mathrm{ft}}) = \emptyset$. The latter assumption does not hold even for the simplest densities, e.g., that of the uniform distribution on $[-a, a]$, $a > 0$.
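To make the second drawback concrete, the following minimal Python sketch evaluates the characteristic function of the uniform distribution on $[-a, a]$, which is $\sin(at)/(at)$ under the Fourier convention used here, and checks that it vanishes at $t = k\pi/a$. The function names and the value $a = 2$ are illustrative assumptions, not part of the article.

```python
import math
from typing import List


def uniform_cf(t: float, a: float) -> float:
    """Characteristic function of the uniform law on [-a, a]: sin(a t) / (a t)."""
    if t == 0.0:
        return 1.0  # limit value at the origin
    return math.sin(a * t) / (a * t)


def uniform_cf_zeros(a: float, count: int) -> List[float]:
    """First `count` positive zeros of the characteristic function, t = k * pi / a."""
    return [k * math.pi / a for k in range(1, count + 1)]


# Illustrative half-width (an assumption made for this example only).
a = 2.0
zeros = uniform_cf_zeros(a, 3)
values = [uniform_cf(t, a) for t in zeros]
```

Since the zero set is an infinite discrete set, it is non-empty but has Lebesgue measure zero, which is the situation the ridge-parameter approach is meant to handle.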

Recently, for known $g$, Hall and Meister [8] have given a ridge-parameter approach for estimating $f$ to overcome the case $Z(g^{\mathrm{ft}}) \ne \emptyset$. This approach does not require the use of a kernel function and bandwidth parameter in its estimation. In fact, if the quantity $g^{\mathrm{ft}}(t)$ in the denominator of (3) gets too close to zero, then it is replaced by the positive function $h_n(t) = n^{-\zeta} |t|^{\xi}$, $\zeta > 0$, $\xi \ge 0$, called the ridge function. More precisely, they suggest an estimator for $f$ in the form

$$\hat f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \frac{g^{\mathrm{ft}}(-t)\, |g^{\mathrm{ft}}(t)|^{r}}{\max\{|g^{\mathrm{ft}}(t)|;\ n^{-\zeta} |t|^{\xi}\}^{r+2}}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt \tag{4}$$

with $r \ge 0$. The method of using the estimator in (4) is known as the ridge-parameter regularization. The optimal rate of convergence of the MISE in [8] is established under the condition

$$M_1 |\sin(\kappa t)|^{\mu} |t|^{-\nu} \le |g^{\mathrm{ft}}(t)| \le M_2 |\sin(\kappa t)|^{\mu} |t|^{-\nu} \quad \text{for all } |t| > T \tag{5}$$

and $g^{\mathrm{ft}}(t) \ne 0$ for $|t| \le T$, where $\mu \ge 1$, $\nu > 0$, $\kappa > 0$, $T > 0$, $0 < M_1 < M_2$. In fact, $Z(g^{\mathrm{ft}})$ in (5) is known and the condition (5) is satisfied in a narrow class of distributions, e.g., the uniform one.

In the present article, we focus on the deconvolution model (1) with the unknown $g$. Obviously, this is in marked contrast to the classical deconvolution problem.

In this case, we estimate the unknown error density $g$ from an additional noise sample $(Z'_1, \ldots, Z'_m)$. This sample is independent of $(Y_1, \ldots, Y_n)$ and is collected in separate independent experiments. The data $Z'_1, \ldots, Z'_m$ are i.i.d. with the density $g$. In the case of ordinary smooth error density, Diggle and Hall [4] studied the asymptotic MISE of the truncated estimator

$$\hat f_1(x) = \frac{1}{2\pi} \int_{-p}^{p} e^{-itx}\, \frac{1}{\hat g^{\mathrm{ft}}(t)}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt,$$

where $\hat g^{\mathrm{ft}}$ denotes the empirical characteristic function of the sample $(Z'_1, \ldots, Z'_m)$,

$$\hat g^{\mathrm{ft}}(t) = \frac{1}{m} \sum_{j=1}^{m} e^{itZ'_j}, \tag{6}$$

and $p > 0$ is the parameter to be selected. Following Diggle and Hall [4], some authors (see, e.g., [2, 9, 15, 18]) considered the model (1) by using the estimator of general form

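$$\hat f_2(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \frac{K^{\mathrm{ft}}(\rho t)\, I_A(t)}{\hat g^{\mathrm{ft}}(t)}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt,$$

where $A$ is a subset of $\mathbb{R}$ and $I_A$ is its indicator function. In Neumann [15], for $A = \{t \in \mathbb{R} : |\hat g^{\mathrm{ft}}(t)|^2 \ge m^{-1}\}$, the author showed that the rate of convergence of the MISE over the Bessel-potential space is optimal when $g$ is ordinary smooth. Johannes [9] studied the estimator $\hat f_2$ for $A = \{t \in \mathbb{R} : |\hat g^{\mathrm{ft}}(t)|^2 \ge a (1 + t^2)^s\}$, $a = a(n, m)$, $s \in \mathbb{R}$. Comte and Lacour [2] introduced an approach of data-driven density estimation in the presence of error with unknown distribution. They studied the estimator $\hat f_2$ with $K^{\mathrm{ft}}(t) = \chi_{[-\pi, \pi]}(t)$ and $A = \{t \in \mathbb{R} : |\hat g^{\mathrm{ft}}(t)|^2 \ge m^{-1}\}$. Wang and Ye [18] have recently suggested a ridge-based kernel deconvolution estimator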

1 r -n. KHpt)gH-t) 1-A „v .

• ^ ^ ^ - ^ ' • - ' ^ - ^ ' ^ ' " ' - Y n i j m a x ( | | ^ K O P ; . - M i ^ g ^ ' "

and studied effects of error magnitude on the estimator. They also established asymptotic estimates in the two cases of ordinary smooth and supersmooth noise.

To the best of our knowledge, although there are many articles related to the deconvolution model (1) with unknown error distribution, we did not find any article dealing with the convergence rate of the MISE for the case of the unknown $Z(g^{\mathrm{ft}})$, so we will study the problem in our article. Using $\hat g^{\mathrm{ft}}$ given in (6) as an estimator for $g^{\mathrm{ft}}$ and applying the ridge-parameter regularization, we introduce an estimation procedure for the target density $f$. In view of properties of entire functions and some results from harmonic analysis, we establish the upper bound for the Lebesgue measure of low level sets of the function $g^{\mathrm{ft}}$ and give the convergence rate of our procedure.

The rest of our article consists of two sections. In Section 2, we propose an estimator $\hat f$ for $f$ based on the ridge-parameter regularization and use it to give a result of mean consistency and estimates of the convergence rate of the MISE. The proofs of the results in Section 2 are provided in Section 3.

2 Approximation Results

In this section, from the contaminated sample $(Y_1, \ldots, Y_n)$ as well as the additional noise sample $(Z'_1, \ldots, Z'_m)$, we propose an estimator for the target density $f$ and then establish the uniform rate of convergence of the MISE on a class of target densities that we give. The estimator chosen in the present article is an improved version of the ridge-parameter estimator in the study of Hall and Meister [8] in which we replace the function $g^{\mathrm{ft}}$ by the estimator $\hat g^{\mathrm{ft}}$ as in (6). More precisely, for $\frac{1}{2} < b < 1$, we consider the following estimator for $f$,

$$\hat f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \frac{\hat g^{\mathrm{ft}}(-t)}{\max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt, \tag{7}$$

where $\alpha = \alpha(n, m)$ is an appropriate positive parameter depending on the sample sizes $n$, $m$ such that $\alpha \to 0$ as $n, m \to \infty$.
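The estimator (7) can be sketched numerically as follows. This is only an illustration under stated assumptions: the integral over $\mathbb{R}$ is replaced by a trapezoidal rule on $[-t_{\max}, t_{\max}]$ (the cutoff `t_max`, the step count, the sample sizes, the simulated distributions, and the choice $\alpha = (1/n + 1/m)^{0.3}$, $b = 0.75$ are all choices made for this example, not prescriptions from the article). All function names are hypothetical.

```python
import cmath
import math
import random
from typing import Sequence


def empirical_cf(sample: Sequence[float], t: float) -> complex:
    """Empirical characteristic function (1/N) * sum_j exp(i t s_j)."""
    return sum(cmath.exp(1j * t * s) for s in sample) / len(sample)


def ridge_multiplier(g_hat: complex, t: float, alpha: float, b: float) -> complex:
    """g_hat(-t) / max{|g_hat(t)|^2; alpha |t|^b}, using g_hat(-t) = conj(g_hat(t))."""
    return g_hat.conjugate() / max(abs(g_hat) ** 2, alpha * abs(t) ** b)


def ridge_density_estimate(x: float, ys: Sequence[float], zs: Sequence[float],
                           alpha: float, b: float,
                           t_max: float = 15.0, steps: int = 400) -> float:
    """Trapezoidal approximation of (7), truncated to [-t_max, t_max] (an assumption)."""
    h = 2.0 * t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = -t_max + i * h
        value = (cmath.exp(-1j * t * x)
                 * ridge_multiplier(empirical_cf(zs, t), t, alpha, b)
                 * empirical_cf(ys, t))
        # The integrand is Hermitian in t, so only its real part contributes.
        total += (0.5 if i in (0, steps) else 1.0) * value.real
    return total * h / (2.0 * math.pi)


# Simulated data (illustrative): X ~ N(0, 1), noise uniform on [-1, 1].
random.seed(0)
n = m = 100
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x + random.uniform(-1.0, 1.0) for x in xs]   # contaminated sample Y_j
zs = [random.uniform(-1.0, 1.0) for _ in range(m)]  # additional noise sample Z'_j
alpha = (1.0 / n + 1.0 / m) ** 0.3
estimate_at_zero = ridge_density_estimate(0.0, ys, zs, alpha, b=0.75)
```

The ridge term guarantees that the modulus of the multiplier never exceeds $(\alpha |t|^b)^{-1/2}$, so zeros of $\hat g^{\mathrm{ft}}$ cannot make the integrand blow up away from the origin.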


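The condition $\frac{1}{2} < b < 1$ ensures that the estimator $\hat f$ is well defined. To see this clearly, we consider the function

$$\Phi(t) = \frac{\hat g^{\mathrm{ft}}(-t)}{\max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}, \quad t \in \mathbb{R}.$$

Since the function $\hat g^{\mathrm{ft}}$ is continuous on $\mathbb{R}$ and $\hat g^{\mathrm{ft}}(0) = 1$, there exists a random number $l > 0$ such that $|\hat g^{\mathrm{ft}}(t)| \ge \frac{1}{2}$ for all $t \in [-l, l]$. It follows that

$$\int_{-\infty}^{\infty} |\Phi(t)|^2\, dt \le \int_{-l}^{l} \frac{dt}{\max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2} + \int_{|t| > l} \frac{dt}{\max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2}$$

$$\le \int_{-l}^{l} \frac{dt}{|\hat g^{\mathrm{ft}}(t)|^4} + \int_{|t| > l} \frac{dt}{\alpha^2 |t|^{2b}} \le 32 l + \frac{1}{\alpha^2} \int_{|t| > l} \frac{dt}{|t|^{2b}} < \infty.$$

That means $\Phi \in L^2(\mathbb{R})$, and so $\hat f$ is defined for almost all values of $x$.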

Recall that, for the target density $f \in L^2(\mathbb{R})$, the MISE between the estimator $\hat f$ and $f$ is defined by $\mathbb{E}\|\hat f - f\|_2^2$. Concerning a general upper bound for this quantity, we have the following result.

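Proposition 1 Let $\frac{1}{n} + \frac{1}{m} < \alpha < 1$, $\frac{1}{2} < b < 1$. Assume that $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ is the target density and $g \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ is the density of the error random variables $Z_j$'s. Then, there exists a constant $L > 0$ such that

$$\mathbb{E}\|\hat f - f\|_2^2 \le k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha} + 2 \int_{-\infty}^{\infty} \left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2\, dt + \frac{2}{n} \int_{-\infty}^{\infty} \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2}\, dt,$$

where $\hat f$ is defined as in (7) and $k$ is a positive constant that does not depend on $n$, $m$, or $\alpha$.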

The next theorem will give the mean consistency with respect to the $L^2(\mathbb{R})$-norm (see the definition on page 24, Section 2.3, [13]) of the estimator $\hat f$ in case $\lambda(Z(g^{\mathrm{ft}})) = 0$.

Theorem 1 Let the assumptions of Proposition 1 hold. Suppose that $\alpha := \alpha(n, m) \to 0$ such that $n\alpha^2 \to \infty$, $m\alpha \to \infty$ as $n, m \to \infty$. In addition, we assume the Lebesgue measure $\lambda(Z(g^{\mathrm{ft}})) = 0$. Then, $\mathbb{E}\|\hat f - f\|_2^2 \to 0$ as $n, m \to \infty$.

Actually, the problem of finding a consistent estimator in density deconvolution in which $g$ is known and $g^{\mathrm{ft}}$ can vanish on $\mathbb{R}$ has also been considered by a few authors.

In the study of Devroye [3], the author showed that it is possible to construct an estimator $\hat f_n = \hat f_n(x; Y_1, \ldots, Y_n)$ such that $\mathbb{E} \int_{-\infty}^{\infty} |\hat f_n(x) - f(x)|\, dx \to 0$ as $n \to \infty$ if $\lambda(Z(g^{\mathrm{ft}})) = 0$. Meister [12] also gave a consistency result in a weighted $L^2(\mathbb{R})$-norm in a more general case in which the set $Z(g^{\mathrm{ft}})$ does not contain any open, non-empty interval.

However, in those papers, no convergence rates are provided. An interesting work related to this problem is by Meister and Neumann [14], in which the authors studied the same problem as Hall and Meister [8] in the replicated measurement case. Using the condition $\sup_{\tau} |g^{\mathrm{ft}}(\tau)\, g^{\mathrm{ft}}(t - \tau)| \ge \mathrm{const} \cdot (1 + |t|)^{-\omega}$ for $\omega > 0$, which contains some densities whose Fourier transform has zeros, their estimator attains the same convergence rates as those obtained for ordinary smooth error densities described in the study of Fan [5].

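In this part, we concentrate on establishing the rate of convergence of the MISE. Of course, this would not be possible without further assumptions on $f$ and $g$. Therefore, some information about $f$ and $g$ must be given. First, we suppose that the target density $f$ is in the Sobolev class

$$\mathcal{F}_{\beta, C} = \left\{ \text{density } f \text{ on } \mathbb{R} : f \in L^2(\mathbb{R}) \text{ and } \int_{-\infty}^{\infty} |f^{\mathrm{ft}}(t)|^2 (1 + t^2)^{\beta}\, dt \le C \right\} \tag{8}$$

with $\beta > \frac{1}{2}$, $C > 0$. Concerning the error density $g$, we consider three classes

$$\mathcal{G}_{C_1, C_2, \gamma} = \left\{ \text{density } g \text{ on } \mathbb{R} : C_1 (1 + t^2)^{-\gamma/2} \le |g^{\mathrm{ft}}(t)| \le C_2 (1 + t^2)^{-\gamma/2} \ \ \forall t \in \mathbb{R} \right\},$$

$$\mathcal{G}_{D, d, \eta} = \left\{ \text{density } g \text{ on } \mathbb{R} : |g^{\mathrm{ft}}(t)| \ge D e^{-d |t|^{\eta}} \ \ \forall t \in \mathbb{R} \right\},$$

$$\mathcal{G}_{M} = \left\{ \text{density } g \text{ on } \mathbb{R} : g \in L^2(\mathbb{R}),\ \operatorname{supp} g \subset [-M, M] \right\}, \tag{9}$$

where $C_1$, $C_2$, $\gamma$, $D$, $d$, $\eta$, and $M$ are positive constants. The notation $\operatorname{supp} g$ in (9) denotes the closure in $\mathbb{R}$ of the set $\{x \in \mathbb{R} : g(x) \ne 0\}$ and is called the support of the density $g$.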

The condition imposed on $f$ as in (8) is quite natural. In fact, for $\beta$ integer, if the density $f$ has the weak derivatives $f^{(k)}$ in $L^2(\mathbb{R})$ ($k = 0, \ldots, \beta$), then $f$ is in $\mathcal{F}_{\beta, C}$ for $C$ sufficiently large. The class $\mathcal{F}_{\beta, C}$ is also considered in many articles on deconvolution such as [8, 15]. The class $\mathcal{G}_{C_1, C_2, \gamma}$ contains the density of the Laplace distribution and the gamma distribution, while the class $\mathcal{G}_{D, d, \eta}$ includes the density of the normal distribution and the Cauchy distribution. In deconvolution topics, the class $\mathcal{G}_{M}$ has been less studied. This class contains the densities of the uniform distributions.
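The class memberships named above can be checked numerically on a grid. The sketch below assumes the polynomial envelope $(1 + t^2)^{-\gamma/2}$ for the ordinary smooth class and uses the standard characteristic functions of the Laplace, normal, and Cauchy laws with unit scale. The function names, the grid, and the constants passed in are illustrative assumptions, and a grid check is of course not a proof.

```python
import math
from typing import Callable, Sequence


def laplace_cf(t: float) -> float:
    """Characteristic function of the Laplace law with unit scale."""
    return 1.0 / (1.0 + t * t)


def normal_cf(t: float) -> float:
    """Characteristic function of the standard normal law."""
    return math.exp(-0.5 * t * t)


def cauchy_cf(t: float) -> float:
    """Characteristic function of the standard Cauchy law."""
    return math.exp(-abs(t))


def in_polynomial_band(cf: Callable[[float], float], c1: float, c2: float,
                       gamma: float, grid: Sequence[float]) -> bool:
    """Check c1 (1+t^2)^(-gamma/2) <= |cf(t)| <= c2 (1+t^2)^(-gamma/2) on the grid."""
    for t in grid:
        envelope = (1.0 + t * t) ** (-gamma / 2.0)
        if not (c1 * envelope <= abs(cf(t)) <= c2 * envelope):
            return False
    return True


def above_exponential_floor(cf: Callable[[float], float], big_d: float, d: float,
                            eta: float, grid: Sequence[float]) -> bool:
    """Check |cf(t)| >= D exp(-d |t|^eta) on the grid."""
    return all(abs(cf(t)) >= big_d * math.exp(-d * abs(t) ** eta) for t in grid)


# Grid on [-20, 20]; the slack constants 0.5 and 2.0 avoid rounding at equality.
grid = [k * 0.1 for k in range(-200, 201)]
laplace_ok = in_polynomial_band(laplace_cf, 0.5, 2.0, 2.0, grid)
normal_ok = above_exponential_floor(normal_cf, 0.5, 0.5, 2.0, grid)
cauchy_ok = above_exponential_floor(cauchy_cf, 0.5, 1.0, 1.0, grid)
```

With this envelope the Laplace law corresponds to $\gamma = 2$, the normal law to $\eta = 2$, and the Cauchy law to $\eta = 1$.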

The main purpose of our article is to approximate $f$ by $\hat f$ in the case that the error density $g$ is compactly supported. However, to verify that the estimator $\hat f$ is well chosen, we first want to show that, with this estimator, the classical optimal rates are preserved in the ordinary smooth and supersmooth cases of $g$. In fact, we have the following result.

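Theorem 2 Let $\frac{1}{2} < b < 1$, $\beta > \frac{1}{2}$, $C > 0$, $C_1 > 0$, $C_2 > 0$, $\gamma > 0$, $D > 0$, $d > 0$, $\eta > 0$.

i) Suppose that $g \in \mathcal{G}_{C_1, C_2, \gamma}$. Choosing $\alpha = \left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{2\gamma + b}{2\beta + 2\gamma + 1}}$, we get

$$\sup_{f \in \mathcal{F}_{\beta, C}} \mathbb{E}\|\hat f - f\|_2^2 = O\left( \left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{2\beta}{2\beta + 2\gamma + 1}} \right).$$

ii) Suppose that $g \in \mathcal{G}_{D, d, \eta}$. Then, choosing $\alpha = \left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{1 + 4d\theta}{4}} \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{-\frac{b}{2\eta}}$ for $0 < \theta < \frac{1}{4d}$, we get

$$\sup_{f \in \mathcal{F}_{\beta, C}} \mathbb{E}\|\hat f - f\|_2^2 = O\left( (\ln n)^{-\frac{2\beta}{\eta}} + (\ln m)^{-\frac{2\beta}{\eta}} \right).$$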

When $g$ is known, Fan [5] proved that the optimal minimax rates of the MISE with respect to the cases $g \in \mathcal{G}_{C_1, C_2, \gamma}$, $g \in \mathcal{G}_{D, d, \eta}$ are $n^{-\frac{2\beta}{2\beta + 2\gamma + 1}}$, $(\ln n)^{-\frac{2\beta}{\eta}}$. We see from Theorem 2 that these optimal rates are preserved with the unknown $g$ as long as $m \ge n$.

For $g \in \mathcal{G}_{C_1, C_2, \gamma}$, Neumann [15] as well as Comte and Lacour [2] proved that the optimal rate of the MISE is of order $n^{-\frac{2\beta}{2\beta + 2\gamma + 1}} + m^{-\min\{1, \beta/\gamma\}}$. It is preserved here in case $m \ge n$. In case $n \ge m$, the convergence rate in Theorem 2 i) is of order $m^{-\frac{2\beta}{2\beta + 2\gamma + 1}}$ while the rate in the study of Neumann [15] and Comte and Lacour [2] is of order $m^{-\frac{2\beta}{2\beta + 2\gamma + 1}} + m^{-\min\{1, \beta/\gamma\}}$, and we see that these convergence rates are the same in case $n \ge m$ since we have $m^{-\min\{1, \beta/\gamma\}} \le m^{-\frac{2\beta}{2\beta + 2\gamma + 1}}$. For $g \in \mathcal{G}_{D, d, \eta}$, Theorem 2 ii) shows that the convergence rate of the MISE is of order $(\ln n)^{-\frac{2\beta}{\eta}} + (\ln m)^{-\frac{2\beta}{\eta}}$, and this rate coincides with the rate for supersmooth error densities in the study of Comte and Lacour [2].

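From now on, we consider the rate of convergence of the MISE under the condition $g \in \mathcal{G}_{M}$, $f \in \mathcal{F}_{\beta, C}$. We first have

$$\int_{-\infty}^{\infty} \left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2\, dt \le \int_{|g^{\mathrm{ft}}(t)|^2 < \alpha |t|^b} |f^{\mathrm{ft}}(t)|^2\, dt. \tag{10}$$

Moreover, since $|g^{\mathrm{ft}}(t)| \ge \frac{1}{2}$ for all $t \in [-L, L]$, we have the estimate

$$\frac{1}{n} \int_{-\infty}^{\infty} \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2}\, dt \le \frac{1}{n} \left( \int_{-L}^{L} 16\, dt + \frac{1}{\alpha^2} \int_{|t| > L} \frac{dt}{|t|^{2b}} \right) \le \left( 32 L + \int_{|t| > L} \frac{dt}{|t|^{2b}} \right) \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2}. \tag{11}$$

Combining the estimates (10) and (11) with Proposition 1, we get

$$\mathbb{E}\|\hat f - f\|_2^2 \le \tilde k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + 2 \int_{|g^{\mathrm{ft}}(t)|^2 < \alpha |t|^b} |f^{\mathrm{ft}}(t)|^2\, dt, \tag{12}$$

where $\tilde k = k + 64 L + 2 \int_{|t| > L} |t|^{-2b}\, dt$.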

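Now, let $\varepsilon > 0$, $R > 0$, $\nu > 0$. For the error density $g$, we denote

$$B_{R, \varepsilon^{\nu}} = \{ t \in \mathbb{R} : |t| \le R,\ |g^{\mathrm{ft}}(t)| \le \varepsilon^{\nu} \}. \tag{13}$$

Choose $\alpha > 0$ such that

$$\frac{\varepsilon^{2\nu}}{\alpha} \ge R^b. \tag{14}$$

Then,

$$\left\{ t \in \mathbb{R} : |t| \le R,\ \varepsilon^{\nu} < |g^{\mathrm{ft}}(t)| < \sqrt{\alpha |t|^b} \right\} = \emptyset,$$

which yields

$$\{ t \in \mathbb{R} : |g^{\mathrm{ft}}(t)|^2 < \alpha |t|^b \} \subset B_{R, \varepsilon^{\nu}} \cup \{ t \in \mathbb{R} : |t| > R \}.$$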

Hence, from (12), we obtain

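$$\mathbb{E}\|\hat f - f\|_2^2 \le \tilde k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + 2 \int_{B_{R, \varepsilon^{\nu}}} |f^{\mathrm{ft}}(t)|^2\, dt + 2 \int_{|t| > R} |f^{\mathrm{ft}}(t)|^2\, dt$$

$$\le \tilde k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + 2 \lambda(B_{R, \varepsilon^{\nu}}) + 2 \int_{|t| > R} |f^{\mathrm{ft}}(t)|^2\, dt.$$

Note that when $f \in \mathcal{F}_{\beta, C}$, we have

$$\int_{|t| > R} |f^{\mathrm{ft}}(t)|^2\, dt = \int_{|t| > R} |f^{\mathrm{ft}}(t)|^2 (1 + t^2)^{\beta} (1 + t^2)^{-\beta}\, dt \le C R^{-2\beta},$$

and so

$$\mathbb{E}\|\hat f - f\|_2^2 \le \tilde k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + 2 \lambda(B_{R, \varepsilon^{\nu}}) + 2 C R^{-2\beta}. \tag{15}$$

The sets $B_{R, \varepsilon^{\nu}}$ as in (13) are called the low level sets of $g^{\mathrm{ft}}$. In the case that the error density $g$ is compactly supported, the set of zeros of $g^{\mathrm{ft}}(t)$ as well as the set $B_{R, \varepsilon^{\nu}}$ heavily affect the recovery of $f$ from its Fourier transform. To obtain the convergence rate of the MISE, we must know upper bounds of the Lebesgue measure of $B_{R, \varepsilon^{\nu}}$. The following theorem answers this question.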

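Theorem 3 Let $\beta > \frac{1}{2}$, $\nu > 0$, $M > 0$ and $g \in \mathcal{G}_{M}$. For $\varepsilon > 0$ small enough, we choose $R$ to satisfy

$$2 e M R \left[ (2\beta + 1) \ln(R) + \ln(15 e^3) \right] = \ln(\varepsilon^{-\nu}). \tag{16}$$

Then, for $\varepsilon > 0$ small enough, we have

$$\left( 30 (2\beta + 1) M e^4 \right)^{-\frac{1}{2}} \left( \ln(\varepsilon^{-\nu}) \right)^{\frac{1}{2}} \le R \le (2 e M)^{-1} \ln(\varepsilon^{-\nu}) \tag{17}$$

and

$$\lambda(B_{R, \varepsilon^{\nu}}) \le 2 R^{-2\beta}, \tag{18}$$

where $B_{R, \varepsilon^{\nu}}$ is defined as in (13).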
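Equation (16) has no closed-form solution for $R$, but it is easy to solve numerically. The sketch below uses bisection on the region where the bracket $(2\beta + 1)\ln R + \ln(15e^3)$ is positive, where the left-hand side of (16) is increasing, and then compares the root with the two bounds in (17). The function names and the sample values ($\varepsilon = 10^{-300}$, $\nu = 0.2$, $M = 1$, $\beta = 1$) are assumptions chosen for illustration; $\varepsilon$ is taken very small so that $R > 1$.

```python
import math
from typing import Tuple


def psi(t: float, big_m: float, beta: float, target: float) -> float:
    """Left-hand side of (16) minus ln(eps^(-nu)), evaluated at R = t > 0."""
    bracket = (2.0 * beta + 1.0) * math.log(t) + math.log(15.0 * math.e ** 3)
    return 2.0 * math.e * big_m * t * bracket - target


def solve_r(eps: float, nu: float, big_m: float, beta: float) -> float:
    """Bisection for the root of (16); requires 0 < eps < 1."""
    target = nu * math.log(1.0 / eps)
    # At `lo` the bracket vanishes, so psi(lo) is about -target < 0.
    lo = math.exp(-math.log(15.0 * math.e ** 3) / (2.0 * beta + 1.0))
    hi = max(1.0, 2.0 * lo)
    while psi(hi, big_m, beta, target) <= 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi(mid, big_m, beta, target) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bounds_17(eps: float, nu: float, big_m: float, beta: float) -> Tuple[float, float]:
    """Lower and upper bounds for R stated in (17)."""
    target = nu * math.log(1.0 / eps)
    lower = math.sqrt(target / (30.0 * (2.0 * beta + 1.0) * big_m * math.e ** 4))
    upper = target / (2.0 * math.e * big_m)
    return lower, upper


# Illustrative values (assumptions for this example).
eps, nu, big_m, beta = 1e-300, 0.2, 1.0, 1.0
r_value = solve_r(eps, nu, big_m, beta)
lower_bound, upper_bound = bounds_17(eps, nu, big_m, beta)
measure_bound = 2.0 * r_value ** (-2.0 * beta)  # right-hand side of (18)
```

Because $R$ grows only like a power of $\ln(1/\varepsilon)$, the bound (18) decays logarithmically in $\varepsilon$, which is the source of the logarithmic rate in the compactly supported case.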

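For $\varepsilon = \frac{1}{n} + \frac{1}{m}$ small enough, we choose $R := R_{n, m}$ to satisfy (16). Substituting $R = R_{n, m}$ into (15), we obtain

$$\mathbb{E}\|\hat f - f\|_2^2 \le \tilde k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + (2C + 4) R_{n, m}^{-2\beta}. \tag{19}$$

On the other hand, Theorem 3 implies

$$R_{n, m} \ge \left( 30 (2\beta + 1) M e^4 \right)^{-\frac{1}{2}} \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-\nu} \right)^{\frac{1}{2}}. \tag{20}$$

From (19) and (20), we obtain

$$\mathbb{E}\|\hat f - f\|_2^2 \le k^* \left[ \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-\nu} \right)^{-\beta} \right],$$

where $k^* = \max\{ \tilde k,\ (2C + 4)(30 (2\beta + 1) M e^4)^{\beta} \}$.

Finally, we need to select the parameter $\alpha > 0$ according to $n$, $m$ so that the term $\left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2}$ converges to zero faster than the term $\left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-\nu} \right)^{-\beta}$. In addition, $\alpha$ must verify the condition $\frac{\varepsilon^{2\nu}}{\alpha} \ge R^b$ as in (14), where $\varepsilon = \frac{1}{n} + \frac{1}{m}$ and $R$ is as in Theorem 3. To satisfy these conditions, we propose to choose $\alpha = \left( \frac{1}{n} + \frac{1}{m} \right)^{\tau}$ with $2\nu < \tau < \frac{1}{2}$, $0 < \nu < \frac{1}{4}$. Then, we obtain the following theorem, which is the main result of our article.

Theorem 4 Let $\frac{1}{2} < b < 1$, $\beta > \frac{1}{2}$, $C > 0$, $0 < \nu < \frac{1}{4}$, $M > 0$ and the error density $g \in \mathcal{G}_{M}$. Let the estimator $\hat f$ be as in (7) with $\alpha = \left( \frac{1}{n} + \frac{1}{m} \right)^{\tau}$ with $2\nu < \tau < \frac{1}{2}$. Then,

$$\sup_{f \in \mathcal{F}_{\beta, C}} \mathbb{E}\|\hat f - f\|_2^2 = O\left( \left[ \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right]^{-\beta} \right).$$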

3 Proofs

To prove Proposition 1, we need the following lemma.

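Lemma 1 Let $\alpha > 0$, $b > 0$. Let $g$ be the density function of the error random variables $Z_1, \ldots, Z_n$ and $\hat g^{\mathrm{ft}}$ be as in (6). Then, for all $t \in \mathbb{R}$ we have

$$\Theta := \left| \frac{\hat g^{\mathrm{ft}}(-t)}{\max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - \frac{g^{\mathrm{ft}}(-t)}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} \right|$$

$$\le \frac{2 |\hat g^{\mathrm{ft}}(t) - g^{\mathrm{ft}}(t)|}{\sqrt{\max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\} \cdot \max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}}} + \frac{|\hat g^{\mathrm{ft}}(t) - g^{\mathrm{ft}}(t)|}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}}.$$

Proof We have, for all $t \in \mathbb{R}$,

$$\Theta = \frac{|C_1|}{\max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\} \cdot \max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}},$$

where

$$C_1 := \hat g^{\mathrm{ft}}(-t) \cdot \max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\} - g^{\mathrm{ft}}(-t) \cdot \max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}.$$

Using the equality $\max\{u; v\} = \frac{1}{2}(u + v + |u - v|)$ for all $u, v \ge 0$, we get

$$C_1 = \frac{1}{2} \hat g^{\mathrm{ft}}(-t)\, g^{\mathrm{ft}}(-t) \left[ g^{\mathrm{ft}}(t) - \hat g^{\mathrm{ft}}(t) \right] + \frac{1}{2} \alpha |t|^b \left[ \hat g^{\mathrm{ft}}(-t) - g^{\mathrm{ft}}(-t) \right]$$

$$+ \frac{1}{2} \left[ \hat g^{\mathrm{ft}}(-t) \cdot \left| |g^{\mathrm{ft}}(t)|^2 - \alpha |t|^b \right| - g^{\mathrm{ft}}(-t) \cdot \left| |\hat g^{\mathrm{ft}}(t)|^2 - \alpha |t|^b \right| \right],$$

which implies

$$|C_1| \le \frac{1}{2} \left( |\hat g^{\mathrm{ft}}(t)|\, |g^{\mathrm{ft}}(t)| + \alpha |t|^b \right) |\hat g^{\mathrm{ft}}(t) - g^{\mathrm{ft}}(t)|$$

$$+ \frac{1}{2} \left| \hat g^{\mathrm{ft}}(-t) \cdot \left| |g^{\mathrm{ft}}(t)|^2 - \alpha |t|^b \right| - g^{\mathrm{ft}}(-t) \cdot \left| |\hat g^{\mathrm{ft}}(t)|^2 - \alpha |t|^b \right| \right|.$$

Hence,

$$\Theta \le \frac{\left( |\hat g^{\mathrm{ft}}(t)|\, |g^{\mathrm{ft}}(t)| + \alpha |t|^b \right) |\hat g^{\mathrm{ft}}(t) - g^{\mathrm{ft}}(t)|}{2 \max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\} \cdot \max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}}$$

$$+ \frac{\left| \hat g^{\mathrm{ft}}(-t) \cdot \left| |g^{\mathrm{ft}}(t)|^2 - \alpha |t|^b \right| - g^{\mathrm{ft}}(-t) \cdot \left| |\hat g^{\mathrm{ft}}(t)|^2 - \alpha |t|^b \right| \right|}{2 \max\{|\hat g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\} \cdot \max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}}.$$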

Now, we estimate $I_2$. Combining the equality $h^{\mathrm{ft}} = f^{\mathrm{ft}} g^{\mathrm{ft}}$ with (22), we get

J - c c " I ' l ma J_^ III"

V J-i. I ' r ^^ J\i\^L ) ma Hence, from these estimates of / [ , I2, we obtain

\nina^ ma J \J_L I ' T ^|;|,.>L I'l ^ J|(|>L / Moreover, since « > ^ -(- i , we have

Hina-^ ma \mna mJ a \n m/a and this implies

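(23)

We next focus on estimating the quantity $\mathbb{E}\|\tilde f - f\|_2^2$. Once again, using the Parseval identity, the Fubini theorem, and the common variance-bias decomposition, we have

$$\mathbb{E}\|\tilde f - f\|_2^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \mathbb{E} \tilde f^{\mathrm{ft}}(t) - f^{\mathrm{ft}}(t) \right|^2 dt + \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Var} \tilde f^{\mathrm{ft}}(t)\, dt.$$

By the estimates

$$\left| \mathbb{E} \tilde f^{\mathrm{ft}}(t) - f^{\mathrm{ft}}(t) \right| = \left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right| |f^{\mathrm{ft}}(t)|,$$

$$\operatorname{Var} \tilde f^{\mathrm{ft}}(t) = \frac{|g^{\mathrm{ft}}(t)|^2}{n \max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2} \operatorname{Var}(e^{itY_1}) \le \frac{|g^{\mathrm{ft}}(t)|^2}{n \max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2},$$

we deduce that

$$\mathbb{E}\|\tilde f - f\|_2^2 \le \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2\, dt + \frac{1}{2\pi n} \int_{-\infty}^{\infty} \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2}\, dt. \tag{24}$$

Finally, combining the estimates (23) and (24) with the inequality (21), we get the conclusion of the proposition. □

Proof of Theorem 1 For all $n, m \in \mathbb{N}$ and $t \in \mathbb{R}$, we have

$$\left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2 \le |f^{\mathrm{ft}}(t)|^2,$$

$$\left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2 \to 0 \quad \text{a.e. } t \in \mathbb{R}.$$

Therefore, by the Lebesgue dominated convergence theorem, we obtain

$$\int_{-\infty}^{\infty} \left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2\, dt \to 0$$

as $n, m \to \infty$. Next, since $|g^{\mathrm{ft}}(t)| \ge \frac{1}{2}$ for all $t \in [-L, L]$, we have

$$\frac{1}{n} \int_{-\infty}^{\infty} \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2}\, dt \le \frac{32 L}{n} + \frac{1}{n \alpha^2} \int_{|t| > L} \frac{dt}{|t|^{2b}} \to 0$$

as $n, m \to \infty$. Combining the above results with the assumptions and Proposition 1, we get the conclusion of the theorem. □

To prove part i) of Theorem 2, we need the following lemma.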

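Lemma 2 Let $g \in \mathcal{G}_{C_1, C_2, \gamma}$. Assume that $\frac{C_1^2}{\alpha} \ge 2^{\gamma}$. Then, for $T = \left( \frac{C_1^2}{2^{\gamma} \alpha} \right)^{\frac{1}{b + 2\gamma}}$, we have

$$\{ t \in \mathbb{R} : |g^{\mathrm{ft}}(t)|^2 \le \alpha |t|^b \} \subset \{ t \in \mathbb{R} : |t| \ge T \}.$$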
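Before the proof, the inclusion in Lemma 2 can be sanity-checked numerically for one member of the class. The sketch below takes the Laplace law with unit scale, whose characteristic function $1/(1 + t^2)$ corresponds to $C_1 = 1$ and $\gamma = 2$ under the envelope $(1 + t^2)^{-\gamma/2}$. The values $\alpha = 0.01$, $b = 0.75$, the grid, and all function names are illustrative assumptions.

```python
import math


def laplace_cf(t: float) -> float:
    """Characteristic function of the Laplace law with unit scale."""
    return 1.0 / (1.0 + t * t)


def lemma2_threshold(c1: float, gamma: float, alpha: float, b: float) -> float:
    """T = (C_1^2 / (2^gamma * alpha))^(1 / (b + 2 gamma)), valid when C_1^2 / alpha >= 2^gamma."""
    ratio = c1 ** 2 / alpha
    if ratio < 2.0 ** gamma:
        raise ValueError("Lemma 2 requires C_1^2 / alpha >= 2^gamma")
    return (ratio / 2.0 ** gamma) ** (1.0 / (b + 2.0 * gamma))


# Illustrative parameters (assumptions for this example).
alpha, b = 0.01, 0.75
threshold = lemma2_threshold(1.0, 2.0, alpha, b)

# Points of a grid on [-30, 30] where |g^ft(t)|^2 <= alpha |t|^b.
grid = [k * 0.01 for k in range(-3000, 3001)]
low_set = [t for t in grid if laplace_cf(t) ** 2 <= alpha * abs(t) ** b]
```

Every grid point in `low_set` lies outside $(-T, T)$, in line with the lemma; the threshold $T$ is not sharp, since the smallest such point is noticeably larger than $T$.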

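Proof For $t \in \mathbb{R}$ such that $|g^{\mathrm{ft}}(t)|^2 \le \alpha |t|^b$, the definition of $\mathcal{G}_{C_1, C_2, \gamma}$ implies

$$C_1^2 (1 + t^2)^{-\gamma} \le \alpha |t|^b.$$

It follows that

$$|t|^b (1 + t^2)^{\gamma} \ge \frac{C_1^2}{\alpha}.$$

Since $\frac{C_1^2}{\alpha} \ge 2^{\gamma}$, we get $|t| \ge 1$. So, $2^{\gamma} |t|^{b + 2\gamma} \ge \frac{C_1^2}{\alpha}$ or $|t| \ge \left( \frac{C_1^2}{2^{\gamma} \alpha} \right)^{\frac{1}{b + 2\gamma}} = T$. □

Proof of Theorem 2 Let $f \in \mathcal{F}_{\beta, C}$.

i) From the estimate

$$\int_{-\infty}^{\infty} \left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2\, dt \le \int_{|g^{\mathrm{ft}}(t)|^2 \le \alpha |t|^b} |f^{\mathrm{ft}}(t)|^2\, dt$$

and Proposition 1, we get

$$\mathbb{E}\|\hat f - f\|_2^2 \le k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha} + 2 \int_{|g^{\mathrm{ft}}(t)|^2 \le \alpha |t|^b} |f^{\mathrm{ft}}(t)|^2\, dt + \frac{2}{n} \int_{-\infty}^{\infty} \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2}\, dt.$$

The latter estimate and Lemma 2 imply

$$\mathbb{E}\|\hat f - f\|_2^2 \le k \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha} + J_1 + J_2 + J_3, \tag{25}$$

where

$$J_1 = 2 \int_{|t| \ge T} |f^{\mathrm{ft}}(t)|^2\, dt, \quad J_2 = \frac{2}{n} \int_{|t| < T} \frac{dt}{|g^{\mathrm{ft}}(t)|^2}, \quad J_3 = \frac{2}{n} \int_{|t| \ge T} \frac{|g^{\mathrm{ft}}(t)|^2}{\alpha^2 |t|^{2b}}\, dt.$$

We have

$$J_1 = 2 \int_{|t| \ge T} |f^{\mathrm{ft}}(t)|^2 (1 + t^2)^{\beta} (1 + t^2)^{-\beta}\, dt \le 2 C T^{-2\beta} = 2 C \left( \frac{2^{\gamma} \alpha}{C_1^2} \right)^{\frac{2\beta}{b + 2\gamma}},$$

$$J_2 \le \frac{2}{n C_1^2} \int_{|t| < T} (1 + t^2)^{\gamma}\, dt \le \frac{4 T (1 + T^2)^{\gamma}}{n C_1^2} \le \frac{2^{\gamma + 2}}{n C_1^2}\, T^{2\gamma + 1},$$

$$J_3 \le \frac{2 C_2^2}{n \alpha^2} \int_{|t| \ge T} |t|^{-2\gamma - 2b}\, dt = \frac{4 C_2^2}{(2\gamma + 2b - 1)\, n \alpha^2}\, T^{-(2\gamma + 2b - 1)},$$

where we used $T \ge 1$. Since $T = \left( \frac{C_1^2}{2^{\gamma} \alpha} \right)^{\frac{1}{b + 2\gamma}}$, both $J_2$ and $J_3$ are bounded by a constant multiple of $\frac{1}{n} \alpha^{-\frac{2\gamma + 1}{b + 2\gamma}}$. Combining the estimates of $J_1$, $J_2$, $J_3$ with the inequality (25), we obtain

$$\mathbb{E}\|\hat f - f\|_2^2 \le C_3 \left[ \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha} + \alpha^{\frac{2\beta}{b + 2\gamma}} + \frac{1}{n}\, \alpha^{-\frac{2\gamma + 1}{b + 2\gamma}} \right],$$

where $C_3 > 0$ is a constant depending only on $k$, $C$, $C_1$, $C_2$, $\gamma$, $\beta$, and $b$. Choosing $\alpha = \left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{2\gamma + b}{2\beta + 2\gamma + 1}}$, each of the three terms in the bracket is at most $\left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{2\beta}{2\beta + 2\gamma + 1}}$ for $n$, $m$ large enough. It follows that

$$\mathbb{E}\|\hat f - f\|_2^2 \le 3 C_3 \left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{2\beta}{2\beta + 2\gamma + 1}}.$$

The latter estimate holds for all $f \in \mathcal{F}_{\beta, C}$, and this implies the conclusion of part i) of the theorem.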

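ii) Let $T > 0$ be a parameter chosen later. We have

$$\int_{-\infty}^{\infty} \left| \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}} - 1 \right|^2 |f^{\mathrm{ft}}(t)|^2\, dt \le \int_{|t| \le T} \frac{\alpha^2 |t|^{2b}}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2} |f^{\mathrm{ft}}(t)|^2\, dt + \int_{|t| > T} |f^{\mathrm{ft}}(t)|^2\, dt$$

$$\le \int_{|t| \le T} \alpha^2 |t|^{2b} |f^{\mathrm{ft}}(t)|^2 |g^{\mathrm{ft}}(t)|^{-4}\, dt + \int_{|t| > T} |f^{\mathrm{ft}}(t)|^2\, dt \le C D^{-4} \alpha^2 T^{2b} e^{4 d T^{\eta}} + C T^{-2\beta}.$$

In addition, arguing as in the proof of Theorem 1, we have

$$\frac{1}{n} \int_{-\infty}^{\infty} \frac{|g^{\mathrm{ft}}(t)|^2}{\max\{|g^{\mathrm{ft}}(t)|^2;\ \alpha |t|^b\}^2}\, dt \le \left( 32 L + \int_{|t| > L} \frac{dt}{|t|^{2b}} \right) \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2}.$$

Hence, from Proposition 1, we obtain

$$\mathbb{E}\|\hat f - f\|_2^2 \le \left( k + 64 L + 2 \int_{|t| > L} \frac{dt}{|t|^{2b}} \right) \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + 2 C D^{-4} \alpha^2 T^{2b} e^{4 d T^{\eta}} + 2 C T^{-2\beta}.$$

Now, we choose $T = \left( \theta \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{\frac{1}{\eta}}$ for $0 < \theta < \frac{1}{4d}$. Then,

$$\mathbb{E}\|\hat f - f\|_2^2 \le C_4 \left[ \left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} + \alpha^2 \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{\frac{2b}{\eta}} \left( \frac{1}{n} + \frac{1}{m} \right)^{-4 d \theta} + \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{-\frac{2\beta}{\eta}} \right]$$

with

$$C_4 = \max \left\{ k + 64 L + 2 \int_{|t| > L} \frac{dt}{|t|^{2b}},\ 2 C D^{-4} \theta^{\frac{2b}{\eta}},\ 2 C \theta^{-\frac{2\beta}{\eta}} \right\}.$$

Letting $\alpha > 0$ satisfy

$$\left( \frac{1}{n} + \frac{1}{m} \right) \frac{1}{\alpha^2} = \alpha^2 \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{\frac{2b}{\eta}} \left( \frac{1}{n} + \frac{1}{m} \right)^{-4 d \theta},$$

that is, $\alpha = \left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{1 + 4 d \theta}{4}} \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{-\frac{b}{2\eta}}$, we have, for $n$, $m$ large enough,

$$\mathbb{E}\|\hat f - f\|_2^2 \le C_4 \left[ 2 \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{\frac{b}{\eta}} \left( \frac{1}{n} + \frac{1}{m} \right)^{\frac{1 - 4 d \theta}{2}} + \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{-\frac{2\beta}{\eta}} \right] \le 3 C_4 \left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{-\frac{2\beta}{\eta}}.$$

If $m \ge n$, we have

$$\ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \ge \ln \left( \frac{n}{2} \right) = \ln n - \ln 2 \ge \frac{1}{2} \ln n,$$

which gives

$$\left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{-\frac{2\beta}{\eta}} \le \left( \frac{1}{2} \right)^{-\frac{2\beta}{\eta}} (\ln n)^{-\frac{2\beta}{\eta}}.$$

If $n \ge m$, by the same arguments, we also have

$$\left( \ln \left( \frac{1}{n} + \frac{1}{m} \right)^{-1} \right)^{-\frac{2\beta}{\eta}} \le \left( \frac{1}{2} \right)^{-\frac{2\beta}{\eta}} (\ln m)^{-\frac{2\beta}{\eta}}.$$

From the above two cases, we deduce

$$\mathbb{E}\|\hat f - f\|_2^2 \le 3 C_4 \left( \frac{1}{2} \right)^{-\frac{2\beta}{\eta}} \left[ (\ln n)^{-\frac{2\beta}{\eta}} + (\ln m)^{-\frac{2\beta}{\eta}} \right].$$

The latter estimate holds for all $f \in \mathcal{F}_{\beta, C}$, and this implies the conclusion of part ii) of the theorem. □

To prove Theorem 3, we will use the following result (see Theorem 4, Section 11.3 in [10]).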

Lemma 3 Let $f(z)$ be an analytic function in the disk $\{z : |z| \le 2eU\}$, $|f(0)| = 1$, and let $\eta$ be an arbitrary small positive number. Then, the estimate

$$\ln |f(z)| > -\ln \left( \frac{15 e^3}{\eta} \right) \cdot \ln M_f(2eU)$$

is valid everywhere in the disk $\{z : |z| \le U\}$ except a set of disks $(B_j)$ with sum of radii $\sum r_j \le \eta U$, where $M_f(r) = \max_{|z| = r} |f(z)|$.

Proof of Theorem 3 We consider the function

$$\psi(t) = 2 e M t \left[ (2\beta + 1) \ln(t) + \ln(15 e^3) \right] - \ln(\varepsilon^{-\nu}), \quad t > 0.$$

For each $\varepsilon > 0$ small enough, $\psi(t) \to \infty$ as $t \to \infty$ and $\psi(t) \to \ln(\varepsilon^{\nu}) < 0$ as $t \to 0^+$. Hence, there exists an $R > 0$ satisfying (16). Now, from (16), we deduce that

$$2 (2\beta + 1) e M R \ln(15 e^3 R) \ge \ln(\varepsilon^{-\nu}).$$

In view of the inequality $\ln x < x$ for all $x > 0$, we get

$$R^2 \ge \left( 30 (2\beta + 1) M e^4 \right)^{-1} \ln(\varepsilon^{-\nu}) \tag{26}$$

and this indicates that $R \to \infty$ as $\varepsilon \to 0^+$. Thus, $R \ge 1$ for $\varepsilon > 0$ small enough. Also from (16), we have $2 e M R \le \ln(\varepsilon^{-\nu})$, and so

$$R \le (2 e M)^{-1} \ln(\varepsilon^{-\nu}). \tag{27}$$

From (26) and (27), we obtain the estimate (17) of the theorem. We next consider the function $G : \mathbb{C} \to \mathbb{C}$,

$$G(z) = \int_{-M}^{M} g(x)\, e^{izx}\, dx.$$

It is clear that $G$ is a non-trivial entire function, $|G(0)| = 1$ and $G(t) = g^{\mathrm{ft}}(t)$ for all $t \in \mathbb{R}$. In addition, for all $z \in \mathbb{C}$, $|z| = 2eR$,

$$|G(z)| \le \int_{-M}^{M} g(x)\, e^{|x| |z|}\, dx \le e^{M |z|} = e^{2 e M R}.$$

This yields $\ln \left( \max_{|z| = 2eR} |G(z)| \right) \le 2 e M R$. Then, using Lemma 3 with $U = R$, $\eta = R^{-2\beta - 1}$, we get

$$|G(z)| > \exp \left\{ -2 e M R \left[ (2\beta + 1) \ln(R) + \ln(15 e^3) \right] \right\} = \varepsilon^{\nu}$$

for all $z \in \mathbb{C}$, $|z| \le R$ except a set of disks $(B(z_j, r_j))_{j \in J}$ whose sum of radii is less than $\eta U = R^{-2\beta}$. This leads to

$$\{ z \in \mathbb{R} : |G(z)| \le \varepsilon^{\nu},\ |z| \le R \} \subset \bigcup_{j \in J} \left( B(z_j, r_j) \cap \mathbb{R} \right),$$

and hence, we have

$$\lambda \left( \{ z \in \mathbb{R} : |G(z)| \le \varepsilon^{\nu},\ |z| \le R \} \right) \le \sum_{j \in J} \lambda \left( B(z_j, r_j) \cap \mathbb{R} \right) \le \sum_{j \in J} 2 r_j \le 2 R^{-2\beta}.$$

Finally, noting that $B_{R, \varepsilon^{\nu}} = \{ z \in \mathbb{R} : |G(z)| \le \varepsilon^{\nu},\ |z| \le R \}$, we obtain the estimate (18) of the theorem. □

Acknowledgments The authors would like to thank the referees for careful reading of the article and for helpful comments and suggestions leading to the improved version of our article.

References

1. Carroll, R.J., Hall, P.: Optimal rates of convergence for deconvolving a density. J. Am. Stat. Assoc. 83(404), 1184-1186 (1988)

2. Comte, F., Lacour, C.: Data-driven density estimation in the presence of additive noise with unknown distribution. J. R. Stat. Soc. Ser. B 73, 601-627 (2011)

3. Devroye, L.: Consistent deconvolution in density estimation. Can. J. Stat. 17, 235-239 (1989)

4. Diggle, P.J., Hall, P.: A Fourier approach to nonparametric deconvolution of a density estimate. J. R. Stat. Soc. Ser. B 55, 523-531 (1993)

5. Fan, J.: On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Stat. 19, 1257-1272 (1991)

6. Fan, J.: Asymptotic normality for deconvolution kernel density estimators. Sankhya: Indian J. Stat. Ser. A 53, 97-110 (1991)

7. Fan, J.: Global behavior of deconvolution kernel estimates. Stat. Sin. 1, 541-551 (1991)

8. Hall, P., Meister, A.: A ridge-parameter approach to deconvolution. Ann. Stat. 35, 1535-1558 (2007)

9. Johannes, J.: Deconvolution with unknown error distribution. Ann. Stat. 37, 2301-2323 (2009)

10. Levin, B.Y.: Lectures on Entire Functions. Transl. Math. Monographs, vol. 150. AMS, Providence, Rhode Island (1996)

11. Lounici, K., Nickl, R.: Global uniform risk bounds for wavelet deconvolution estimators. Ann. Stat. 39, 201-231 (2011)

12. Meister, A.: Non-estimability in spite of identifiability in density deconvolution. Math. Methods Stat. 14, 479-487 (2005)

13. Meister, A.: Deconvolution Problems in Nonparametric Statistics. Springer, Berlin (2009)

14. Meister, A., Neumann, M.H.: Deconvolution from non-standard error densities under replicated measurements. Stat. Sin. 20, 1609-1636 (2010)
