
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

On the Estimation of Integrated Volatility With Jumps and Microstructure Noise

Bing-Yi Jing, Zhi Liu & Xin-Bing Kong

To cite this article: Bing-Yi Jing, Zhi Liu & Xin-Bing Kong (2014) On the Estimation of Integrated Volatility With Jumps and Microstructure Noise, Journal of Business & Economic Statistics, 32:3, 457-467, DOI: 10.1080/07350015.2014.906350

To link to this article: http://dx.doi.org/10.1080/07350015.2014.906350

Accepted author version posted online: 24 Apr 2014.


Bing-Yi JING

Hong Kong University of Science and Technology, Hong Kong ([email protected])

Zhi LIU

University of Macau, Macau, China ([email protected])

Xin-Bing KONG

Soochow University, Suzhou, China ([email protected])

In this article, we propose a nonparametric procedure to estimate the integrated volatility of an Itô semimartingale in the presence of jumps and microstructure noise. The estimator is based on a combination of the preaveraging method and the threshold technique, which serve to remove microstructure noise and jumps, respectively. The estimator is shown to work for both finite and infinite activity jumps. Furthermore, asymptotic properties of the proposed estimator, such as consistency and a central limit theorem, are established. Simulation results are given to evaluate the performance of the proposed method in comparison with alternative methods.

KEY WORDS: Central limit theorem; High frequency data; Quadratic variation; Semimartingale.

1. INTRODUCTION

With the availability of high-frequency data, there has been a rapidly growing interest in the estimation of integrated volatility. For a continuous Itô process, a commonly used estimator is the realized volatility (also called realized quadratic variation in some of the literature); see Andersen et al. (2003). The estimation of integrated volatility becomes more difficult when the underlying price process contains jumps. Two well-behaved estimators are the multiple-power estimator and the realized threshold quadratic variation. The former was proposed by Barndorff-Nielsen and Shephard (2006) and Barndorff-Nielsen and Shephard (2004), while the latter was proposed by Mancini (2009) and further developed in Jacod (2008). An interesting comparison of the two approaches was given in Veraart (2011).

However, it is widely accepted that observed prices are contaminated by microstructure noise, for example, bid-ask spreads. Thus, the discretely observed process $Y_{t_i}$ is

$$Y_{t_i} = X_{t_i} + \epsilon_{t_i}, \qquad i = 0, 1, \ldots, \lfloor t/\Delta_n \rfloor, \tag{1}$$

where $X_t$ represents the latent price process, and $\epsilon_{t_i}$ is the microstructure noise at time $t_i$. For simplicity, assume that the observation times in $[0, t]$ are equally spaced, namely, $t_i := i\Delta_n$ ($0 \le i \le \lfloor t/\Delta_n \rfloor$) with $\Delta_n \to 0$ as $n \to \infty$. Our interest lies in inference on certain characteristics of the latent process $X$, using the contaminated observations $Y_{t_i}$.

In this article, we assume that the latent price process $X$ is an Itô semimartingale of the form

$$X_t = X_t^c + X_t^d, \tag{2}$$

where $X^c$ and $X^d$ are the continuous and discontinuous (or jump) components, respectively,

$$\begin{aligned}
X_t^c &= X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s, \\
X_t^d &= \int_0^t \int_{|x| \le 1} x\,(\mu - \nu)(ds, dx) + \int_0^t \int_{|x| > 1} x\,\mu(ds, dx),
\end{aligned} \tag{3}$$

where $b$ and $\sigma$ are locally bounded optional processes, and $\mu$ is the jump measure, with $\nu$ its predictable compensator. All these are defined on a stochastic basis $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$. An example in the literature of a process with jumps is Wu (2008), who uses a Lévy process to model financial security returns.

Under this setting, the quadratic variation of $X$ becomes

$$[X, X]_t = [X^c, X^c]_t + [X^d, X^d]_t = \int_0^t \sigma_s^2\,ds + \sum_{0 \le s \le t} (\Delta X_s)^2,$$

where $\Delta X_s = X_s - X_{s-}$ is the jump size of $X$ at time $s$. In this article, our main interest lies in the estimation of the integrated volatility of the continuous part, that is, $[X^c, X^c]_t = \int_0^t \sigma_s^2\,ds$, in the presence of jumps and microstructure noise. Incidentally, estimation of $[X^d, X^d]_t$ can be done similarly.

July 2014, Vol. 32, No. 3 DOI:10.1080/07350015.2014.906350

Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/r/jbes.


For a generic semimartingale $X$, define (see Aït-Sahalia and Jacod 2009)

$$I = \Big\{ r \ge 0 : \sum_{0 \le s \le t} |\Delta_s X|^r < \infty \Big\}, \tag{4}$$

where $\Delta_s X = X_s - X_{s-}$ is the jump size at time $s$. This is an interval of the form $[\beta, \infty]$ or $(\beta, \infty]$. We say that $X^d$ is of finite (or infinite) activity in $(0, t]$, that is, $X^d$ has a.s. finitely (or infinitely) many jumps, if $0 \in I$ (or $0 \notin I$); $X^d$ is of finite variation in $(0, t]$ if $1 \in I$.

When the latent price process $X_t$ contains no jumps, that is, $X^d \equiv 0$ in (2), the influence of the microstructure noise on the estimation of the integrated volatility for high-frequency data has been well documented in the literature. For instance, Zhang, Mykland, and Aït-Sahalia (2005) and Bandi and Russell (2006) found that microstructure noise, if left untreated, can result in inconsistent estimators of the integrated volatility. There are several main approaches to overcome this difficulty, including

(I) the subsampling method (Zhang, Mykland, and Aït-Sahalia 2005),

(II) the realized kernel method (Barndorff-Nielsen et al. 2008),

(III) the preaveraging method (Jacod et al. 2009 and Podolskij and Vetter 2009b),

(IV) the quasi-maximum likelihood method (QMLE) (Xiu 2010).
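To make the inconsistency concrete, the following minimal sketch compares realized volatility computed from a clean and from a noisy path. It is an illustration only: the latent process is assumed to be a standard Brownian motion on $[0, 1]$ (so the integrated volatility is 1), the noise is assumed i.i.d. Gaussian with standard deviation 0.01, and the names `realized_variance`, `n`, and `noise_sd` are hypothetical. Under these assumptions the expected realized volatility of the noisy path is $1 + 2n\,E[\epsilon^2]$, which grows with $n$.

```python
import numpy as np

def realized_variance(y):
    """Sum of squared first differences of an observed path."""
    return float(np.sum(np.diff(y) ** 2))

rng = np.random.default_rng(0)
n = 23_400          # number of increments on [0, 1] (assumed for illustration)
noise_sd = 0.01     # standard deviation of the i.i.d. noise (assumed)

# Latent price: standard Brownian motion, so the integrated volatility is 1.
x = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))))
# Observed price: latent price plus i.i.d. Gaussian noise.
y = x + rng.normal(0.0, noise_sd, n + 1)

rv_clean = realized_variance(x)
rv_noisy = realized_variance(y)
# Expected value of the noisy realized variance under i.i.d. noise.
expected_noisy = 1.0 + 2 * n * noise_sd ** 2

print(rv_clean, rv_noisy, expected_noisy)
```

With these settings the noise term $2n\,E[\epsilon^2] = 4.68$ dominates the quantity of interest, which is why each of the approaches (I) to (IV) treats the noise explicitly.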

In this article, we are interested in estimating integrated volatility when the latent process $X_t$ contains jumps, that is, $X^d \not\equiv 0$ in (2). Here, given the observations $Y_{t_i}$, one needs to take care of the noise $\epsilon_{t_i}$ as well as the jumps $X^d_{t_i}$. The estimation of integrated volatility in this setting was considered by Fan and Wang (2007) and Podolskij and Vetter (2009b) using two fundamentally different techniques, which differ in the order of treating jumps and noise. Rosenbaum (2009) also considered integrated volatility estimation under round-off error using a wavelet approach.

To be precise, Fan and Wang (2007) first applied wavelet methods to try to get rid of the jumps, and then dealt with the microstructure noise by the multiscale technique of Zhang (2006), while Podolskij and Vetter (2009a) first applied "preaveraging" to reduce the impact of the microstructure noise, and then handled the jumps by the multiple-power estimator. Both papers also showed that their estimators are consistent for the integrated volatility when the latent price process $X_t$ contains finite activity jumps, which was also supported by their simulation results.

A natural question is: how will the two methods of Fan and Wang (2007) and Podolskij and Vetter (2009a) perform if we increase the jump intensity? To answer this, we conducted some simple simulation studies. In the first simulation, the observations $Y_{t_i}$ are generated from model (1), where $X_t$ and $\epsilon_t$ are taken to be

$$X_t = W_t + \sum_{i=1}^{N_t} J_i, \qquad \epsilon_i \overset{\text{iid}}{\sim} N(0, 0.01^2),$$

where $J_i \sim N(0, 0.5^2)$, $N_t \sim \text{Poisson}(\lambda t)$, and $J_i$ and $N_t$ are independent. We choose the jump intensity $\lambda = 1, 2, \ldots, 10, 20, 50$, and $n = 23{,}400$. Now $X_t$ still contains finite activity jumps, but the jumps occur more frequently as $\lambda$ increases. We then applied the multipower method and the wavelet method to this model, and the results are presented in Table 1 and Figure 1 in Section 4 (one can ignore column 1 for now, which contains the results for a different method proposed later in the article). Note that $[X^c, X^c]_{t=1} = 1$. Clearly,

• Podolskij and Vetter's method works very well when jumps are rare (e.g., $\lambda = 1$), and gradually deteriorates as jumps become more frequent;

• Fan and Wang's method is extremely accurate regardless of the arrival rate $\lambda$. This is remarkable, although perhaps not a complete surprise, since the wavelet step can effectively remove all the big jumps in the above jump-diffusion model, which might explain why it performs so well in this case.

Table 1. Comparisons for Model 1 (jump-diffusion model with noise). Here $n = 23{,}400$. Each entry is (bias, s.e., MSE).

| $\lambda$ | Preaveraging-threshold (PT) | Podolskij and Vetter's bi-power (PV2) | Podolskij and Vetter's triple-power (PV3) | Fan and Wang (FW) |
|---|---|---|---|---|
| 0 | (−0.007, 0.081, 0.007) | (−0.013, 0.091, 0.008) | (−0.020, 0.095, 0.009) | (−0.007, 0.081, 0.007) |
| 1 | (−0.007, 0.081, 0.007) | (0.052, 0.131, 0.020) | (0.024, 0.114, 0.014) | (0.008, 0.084, 0.007) |
| 2 | (−0.003, 0.081, 0.007) | (0.109, 0.164, 0.039) | (0.062, 0.128, 0.020) | (0.009, 0.083, 0.007) |
| 3 | (−0.001, 0.081, 0.007) | (0.164, 0.186, 0.062) | (0.098, 0.140, 0.029) | (0.005, 0.081, 0.007) |
| 4 | (0.004, 0.079, 0.006) | (0.234, 0.223, 0.104) | (0.142, 0.153, 0.044) | (0.006, 0.083, 0.007) |
| 5 | (0.010, 0.081, 0.007) | (0.299, 0.249, 0.151) | (0.186, 0.168, 0.063) | (0.010, 0.083, 0.007) |
| 6 | (0.013, 0.081, 0.007) | (0.361, 0.275, 0.206) | (0.226, 0.185, 0.086) | (0.014, 0.082, 0.007) |
| 7 | (0.019, 0.082, 0.007) | (0.446, 0.311, 0.295) | (0.287, 0.205, 0.122) | (0.020, 0.084, 0.007) |
| 8 | (0.023, 0.082, 0.007) | (0.502, 0.337, 0.369) | (0.316, 0.215, 0.146) | (0.019, 0.084, 0.007) |
| 9 | (0.027, 0.081, 0.007) | (0.573, 0.370, 0.465) | (0.363, 0.234, 0.187) | (0.034, 0.084, 0.008) |
| 10 | (0.027, 0.081, 0.007) | (0.624, 0.392, 0.510) | (0.386, 0.252, 0.212) | (0.034, 0.083, 0.009) |
| 20 | (0.040, 0.082, 0.010) | (1.438, 0.678, 2.521) | (0.925, 0.409, 1.023) | (0.055, 0.082, 0.010) |
| 50 | (0.069, 0.084, 0.012) | (4.579, 1.521, 23.28) | (3.033, 0.978, 10.16) | (0.113, 0.089, 0.021) |

Figure 1. MSE plots for Model 1 (jump-diffusion model with noise; the curve of the preaveraging-threshold estimator hides behind the wavelet estimator curve). Horizontal axis: jump intensity ($\lambda$); vertical axis: MSE; curves: preaveraging-threshold, bipower, triplepower, wavelet.

The remarkable performance of Fan and Wang's method in the jump-diffusion model with noise raises the question: how will it work for other jump models with noise? So, in our second simulation, we generated the observations $Y_{t_i}$ from model (1), where $X_t$ and $\epsilon_t$ are taken to be

$$X_t = W_t + J_t, \qquad \epsilon_i \overset{\text{iid}}{\sim} N(0, 0.01^2),$$

where $J_t$ follows a symmetric $\beta$-stable process, with $\beta$ ranging from 0.3 to 1.5 (all have infinite activity jumps). As $\beta$ gets larger, jumps occur more frequently. Again, $[X^c, X^c]_{t=1} = 1$. The results, containing bias, standard error (s.e.), and mean squared error (MSE), are given in the following table.

Performance of Fan and Wang's method for different $\beta$'s

| $\beta$ | 0.3 | 0.5 | 0.8 | 1.0 | 1.5 |
|---|---|---|---|---|---|
| bias | 0.083 | 0.145 | 0.274 | 0.406 | 0.924 |
| s.e. | 0.125 | 0.148 | 0.181 | 0.207 | 0.259 |
| MSE | 0.027 | 0.043 | 0.108 | 0.208 | 0.918 |

Clearly, Fan and Wang's method works well for small $\beta$'s, since the MSEs are under 21% for $\beta \le 1$. However, when $\beta > 1$, the performance deteriorates; for example, the MSE is 91.8% for $\beta = 1.5$. The reason is that the bigger $\beta$ is, the more small jumps there are, and the more difficult it is for the wavelet method to separate the small jumps from the microstructure noise. (Incidentally, the results for Podolskij and Vetter's method are even worse in this case and hence are not included here.)

The less than ideal numerical performance of Fan and Wang's and Podolskij and Vetter's methods for the case $\beta > 1$ prompts us to search for alternative methods. Recall that the observations are

$$Y_{t_i} = X^c_{t_i} + X^d_{t_i} + \epsilon_{t_i}, \tag{5}$$

with $0 < t_1 < t_2 < \cdots < t_n \le t$. Our procedure goes in two steps:

• Step 1 (Preaveraging). We use preaveraging to reduce the effect of the microstructure noise $\epsilon_{t_i}$.

• Step 2 (Threshold). We truncate the smoothed data at some appropriate threshold to remove the jump contribution from the decomposition of $[X, X]_t$.

We will refer to the method as the preaveraging-threshold (PT) method, and to the resulting estimator as the PT estimator.

Simulation studies given later in the article indicate that the PT method performs better than the two previously mentioned methods, for both finite and infinite activity jumps. In fact, various recent papers have shown that essentially all statistical methods based on increments of an Itô semimartingale in the no-noise case can be translated to the noisy framework by replacing raw increments with preaveraged statistics, which can be interpreted as a kind of generalized increment. Given that intuition, it is perhaps not too surprising that a truncated version of the estimator from Jacod et al. (2009) works in the jump case as well, having the same asymptotic variance as their estimator, since the same holds true for realized volatility (RV) and truncated RV, as shown in Jacod (2008). The present article is further evidence that preaveraged data can be effectively treated as data with no microstructure noise. Recently, Bajgrowicz, Scaillet, and Treccani (2013) showed that, with existing jump detection methods, jumps are often erroneously identified because of the multiple testing issue, and that the jump effect on market friction has been overstated. They proposed a thresholding test statistic; however, the test is based on microstructure-noise-free data. Christensen, Oomen, and Podolskij (2014) consider a microstructure-noise-robust jump test using the bipower estimator. Since the main objective of Bajgrowicz, Scaillet, and Treccani (2013) and Christensen, Oomen, and Podolskij (2014) is the estimation of integrated volatility, the results of the current article offer a possibility of improving the efficiency of their approaches.

The rest of the article is organized as follows. In Section 2, we introduce the proposed PT estimator. Main results are given in Section 3. Section 4 is devoted to simulations. We conclude the article in Section 5. The technical proofs are postponed to the Appendix.

2. METHODOLOGY

As described before, the process $X$ is observed with an error (microstructure noise in practice); namely, at the time point $t_i$, we observe $X_{t_i} + \epsilon_i$ rather than $X_{t_i}$, where the $\epsilon_i$ are "errors" which are, conditionally on the process $X$, independent. It is convenient to define a "canonical" process $\epsilon_t$ in $\mathbb{R}^{[0,\infty)}$ for all $t$, although we only use its values at some discrete points. Mathematically, we first have a transition probability $Q_t(\omega, dx)$ from $(\Omega, \mathcal{F})$ to $\mathbb{R}$. Based on it, we construct a new probability space $(\mathbb{R}^{[0,\infty)}, \mathcal{B}, \sigma(\epsilon_s : s \in [0, t)), Q)$, where $\mathcal{B}$ is the product Borel $\sigma$-field and $Q = \otimes_{t \ge 0} Q_t$. Then, the observation is measurable with respect to the filtered probability space $(\Omega^{(1)}, \mathcal{F}^{(1)}, \mathcal{F}^{(1)}_t, P^{(1)})$, which is given by

$$\begin{cases}
\Omega^{(1)} = \Omega \times \mathbb{R}^{[0,\infty)}, \quad \mathcal{F}^{(1)} = \mathcal{F} \otimes \mathcal{B}, \\
\mathcal{F}^{(1)}_t = \mathcal{F}_t \otimes \sigma(\epsilon_s : s \in [0, t)), \quad P^{(1)}(d\omega, dx) = P(d\omega)\,Q(\omega, dx).
\end{cases}$$

Hence, any variable or process on either $\Omega$ or $\mathbb{R}^{[0,\infty)}$ can be considered in the usual way as a variable or a process on $\Omega^{(1)}$. For a more detailed discussion of the new probability space, see Jacod, Podolskij, and Vetter (2010).

2.1. Preaveraging

We briefly describe the idea of preaveraging. For details, see Jacod et al. (2009). Define the $j$th increment by

$$\Delta^n_j Y := Y_{t_j} - Y_{t_{j-1}}, \qquad \text{for } j = 1, 2, \ldots, n.$$

Together they form a sequence $(\Delta^n_1 Y, \ldots, \Delta^n_n Y)$. Choose an integer $k_n$ such that $1 \le k_n \le \lfloor t/\Delta_n \rfloor$; we then form $\lfloor t/\Delta_n \rfloor - k_n + 1$ overlapping blocks, the $i$th being

$$B_i = \{\Delta^n_i Y, \ldots, \Delta^n_{i+k_n-1} Y\}, \qquad \text{for } 1 \le i \le \lfloor t/\Delta_n \rfloor - k_n + 1.$$

Within this $i$th block, we take a weighted average of the increments:

$$\Delta^n_{i,k_n} Y(g) = \sum_{j=1}^{k_n - 1} g(j/k_n)\,\Delta^n_{i+j} Y, \qquad \text{for } 1 \le i \le \lfloor t/\Delta_n \rfloor - k_n + 1,$$

where the weight function $g$ is chosen such that

• it is continuous, piecewise $C^1$ with a piecewise Lipschitz derivative $g'$,

• $g(s) = 0$ when $s \notin (0, 1)$, and $\int_0^1 g^2(s)\,ds > 0$.

One common choice satisfying the above conditions is $g(x) = x \wedge (1 - x)$.
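The weighted average above is a sliding dot product of the increments with the weights $g(j/k_n)$, so it can be sketched in a few lines. This is an illustrative sketch rather than the authors' code; the function names `g_tent` and `preaveraged_increments` are hypothetical, and the indexing follows the displayed formula (block $i$ uses the increments $i+1, \ldots, i+k_n-1$).

```python
import numpy as np

def g_tent(x):
    """Weight function g(x) = min(x, 1 - x) on [0, 1]."""
    return np.minimum(x, 1.0 - x)

def preaveraged_increments(y, k_n, g=g_tent):
    """Preaveraged increments for blocks i = 1, ..., N - k_n + 1.

    y holds the N + 1 observations Y_{t_0}, ..., Y_{t_N}; block i is
    sum_{j=1}^{k_n - 1} g(j / k_n) * (Y_{t_{i+j}} - Y_{t_{i+j-1}}).
    """
    dy = np.diff(y)                          # dy[m - 1] is the m-th increment
    weights = g(np.arange(1, k_n) / k_n)     # g(1/k_n), ..., g((k_n - 1)/k_n)
    # np.convolve flips its second argument, so reversing the weights
    # turns the convolution into a sliding dot product.
    sliding = np.convolve(dy, weights[::-1], mode="valid")
    # sliding[m] starts at increment m + 1; block i starts at increment i + 1.
    return sliding[1:]

# Small hand-checkable example: increments 1, 2, 3, 4, 5 and k_n = 4.
y_demo = np.array([0.0, 1.0, 3.0, 6.0, 10.0, 15.0])
bars = preaveraged_increments(y_demo, k_n=4)
print(bars)
```

In the example the weights are 0.25, 0.5, 0.25, so the two blocks give 0.25·2 + 0.5·3 + 0.25·4 and 0.25·3 + 0.5·4 + 0.25·5.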

Now let us investigate what effect the preaveraging has. In view of (5), denoting $g^n_j = g(j/k_n)$, we have

$$\Delta^n_{i,k_n} Y(g) = \sum_{j=1}^{k_n-1} g^n_j\,\Delta^n_{i+j} X^c + \sum_{j=1}^{k_n-1} g^n_j\,\Delta^n_{i+j} X^d + \sum_{j=1}^{k_n-1} g^n_j\,\Delta^n_{i+j} \epsilon := A^i_1 + A^i_2 + A^i_3. \tag{6}$$

Some simple variance calculations show that

$$A^i_1 = O_p\big(\sqrt{k_n \Delta_n}\big), \qquad A^i_3 = O_p\big(1/\sqrt{k_n}\big).$$

(We shall treat the jump component $A^i_2$ in the next subsection.) Clearly, by choosing $k_n \to \infty$ appropriately, we can control the effect of the microstructure noise $A^i_3$ relative to $A^i_1$. In particular,

(i) if $k_n \Delta_n^{1/2} \to \infty$, for example, $k_n = \lfloor c\,\Delta_n^{-(1/2+\epsilon)} \rfloor$ for some $\epsilon > 0$, then the effect of the microstructure noise can be ignored;

(ii) if $k_n \Delta_n^{1/2} = c > 0$, for example, $k_n = \lfloor c\,\Delta_n^{-1/2} \rfloor$, then $A^i_1$ and $A^i_3$ will be of comparable size.

In either case, the influence of microstructure noise has been eliminated or substantially reduced.

In the following discussion, we will assume $k_n = \lfloor c\,\Delta_n^{-1/2} \rfloor$ unless otherwise stated (the only exception is Theorem 2, where we take $k_n = \lfloor c\,\Delta_n^{-(1/2+\eta)} \rfloor$).

2.2. Threshold Quadratic Variation

Although the above preaveraging procedure reduces the influence of the microstructure noise, its effect on the jumps is still very limited. The remaining task is to get rid of the effect of the jumps.

After preaveraging, the smoothed increments from the diffusion and the microstructure noise, $A^i_1$ and $A^i_3$, are both of size $\Delta_n^{1/4}$. However, the smoothed increment from the jump, $A^i_2$, may still be larger than $\Delta_n^{1/4}$. Following the idea of Mancini (2009) or Jacod et al. (2009), we propose the following threshold estimator of $[X^c, X^c]_t$:

$$U(Y, g)^n_t = \sum_{i=1}^{\lfloor t/\Delta_n \rfloor - k_n} \big(\Delta^n_{i,k_n} Y(g)\big)^2\, \mathbf{1}_{\{|\Delta^n_{i,k_n} Y(g)| \le u_n\}}, \tag{7}$$

where $u_n$ satisfies

$$u_n \Delta_n^{-\varpi_1} \to 0, \qquad u_n \Delta_n^{-\varpi_2} \to \infty, \qquad \text{for some } 0 \le \varpi_1 < \varpi_2 < \tfrac{1}{4}. \tag{8}$$

Such choices of $u_n$ ensure that (smoothed) increments larger than $u_n$ are gradually excluded as $n \to \infty$, so that essentially only the increments due to the continuous part and the noise are included; hence we can calculate the integrated volatility after removing the effect of noise.
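The truncated sum in (7) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the weight function is taken to be $g(x) = x \wedge (1 - x)$, the function name `pt_statistic` is hypothetical, and the function returns only the raw sum in (7), without the normalization and noise correction that the limit theorems require.

```python
import numpy as np

def pt_statistic(y, k_n, u_n):
    """Raw truncated sum U(Y, g)^n_t of (7) with g(x) = min(x, 1 - x).

    y holds the N + 1 observations; the sum runs over blocks i = 1, ..., N - k_n.
    """
    dy = np.diff(y)
    j = np.arange(1, k_n) / k_n
    weights = np.minimum(j, 1.0 - j)
    # Sliding dot product; entry m of the result starts at increment m + 1.
    sliding = np.convolve(dy, weights[::-1], mode="valid")
    bars = sliding[1:-1]                    # blocks i = 1, ..., N - k_n as in (7)
    keep = np.abs(bars) <= u_n              # drop blocks that exceed the threshold
    return float(np.sum(bars[keep] ** 2))

# Hand-checkable example: increments 1, ..., 6 and k_n = 4 give blocks 3.0 and 4.0.
y_demo = np.array([0.0, 1.0, 3.0, 6.0, 10.0, 15.0, 21.0])
print(pt_statistic(y_demo, k_n=4, u_n=3.5))   # only the block equal to 3.0 is kept
print(pt_statistic(y_demo, k_n=4, u_n=10.0))  # both blocks are kept
```

Any threshold of the form $u_n = \Delta_n^{\varpi}$ with $\varpi_1 < \varpi < \varpi_2$ satisfies (8); the value passed as `u_n` here is chosen only to make the example easy to check by hand.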

3. MAIN RESULTS

In this section, we study the asymptotic behavior of $U(Y, g)^n_t$, such as consistency and a central limit theorem. Before stating our main theorems, we need some assumptions on the microstructure noise.

Assumption 1. $\nu(\omega, dt, dx) = dt\,F_t(dx)$ with $F_t(dx) = F'_t(dx) + F''_t(dx)$, where $F'_t(dx)$ and $F''_t(dx)$ are two mutually singular measures that satisfy, for three constants $0 \le \beta_2 \le \beta_1 \le \beta < 2$, that


• F′

processes and $f = f(\omega, t, x)$ is a predictable function.

• $\int (|x|^{\beta_2} \wedge 1)\,F''_t(dx) \le L_t$, where $L_t$ is a locally bounded process which also bounds the four processes $\alpha^-_t$, $\alpha^+_t$, $z^-_t$, and $z^+_t$ mentioned above.

Assumption 2. There is a sequence of $\mathcal{F}$-stopping times $(T_n)$ increasing to $\infty$ such that, whenever $t < T_n(\omega)$,

κt(p)(ω)=

We start with the consistency of $U(Y, g)^n_t$.

Theorem 1. Under Assumptions 1–3, we have

$\Delta_n^{1/2}\, U(Y, g)^n_t$

Note that the limit in Theorem 1 involves an unknown $\kappa^2$, which can be consistently estimated by

ˆ

This was shown in Zhang, Mykland, and Aït-Sahalia (2005) to be true for the case of a diffusion with microstructure noise; the more general case with jumps can be shown similarly, owing to the finiteness of the quadratic variation of $X$ (details are omitted here). Consequently, a consistent estimator for $[X^c, X^c]$

Similarly to Podolskij and Vetter (2009a), a wider window size $k_n$ makes the microstructure noise negligible, thus eliminating the need to estimate $\kappa^2$; see Theorem 2. The price to pay is a slightly slower convergence rate.

Theorem 2. Under the same conditions as Theorem 1, except that we choose $k_n = \lfloor c\,\Delta_n^{-(1/2+\eta)} \rfloor$

3.1. Central Limit Theorem (CLT)

To establish the CLT, we need a structural assumption on the process $\sigma$.

Assumption 4. The volatility function $\sigma$ satisfies

σt =σ0+

being predictable and locally bounded, and $B'$ is a second Brownian motion independent of $W$.

Assumption 4 is the same as the one used in Jacod, Podolskij, and Vetter (2010), where it was used to directly conclude the CLT for the continuous case. We adopt this assumption to show the CLT for the continuous part of our work; see the proof for details.

The concept of stable convergence will be used below. A sequence of random variables (r.v.'s) $X_n$ converges stably in law with limit $X$ defined on the appropriate extension of the original

probability space, written as Xn

S

Z. So it is slightly stronger than convergence in law.

Theorem 3. Assume that Assumptions 1–4 hold (with $p = 4$ in Assumption 2), and $\varpi_1 > 1/(8 - 4\beta)$ (hence, $\beta < 1$).

where $W'$ is a Brownian motion (defined on an extension of the space) independent of $\mathcal{F}$, and

$$\mu^2(\eta, \xi) = 4\big(\eta^4 \Phi_{22} + 2(\eta\xi)^2 \Phi_{12} + \xi^4 \Phi_{11}\big),$$

different lower bounds are due to the different orders of the raw increments and the preaveraged statistics (2 is replaced by 4). We further note that the variance is the same as in the continuous case, see Jacod et al. (2009).

We need to estimate the asymptotic conditional variance in Theorem 3, which involves the quarticity $\int_0^1 \sigma_s^4\,ds$. Podolskij and Vetter (2009b) proposed a triple-power estimator of quarticity.

We now propose another consistent estimator of the quarticity, based on the truncated power variation, as follows.

Proposition 1. Define U1(Y, g)nt =


Hence, we easily obtain the asymptotic behavior of the studentized statistic.

Corollary 1. Let Ŵn t=

4 ¯

g2₍₂₎(c22Qˆn+212 ˆ

κˆ2

c +11

( ˆκ2₎2

c3 ),

under the same assumptions as Theorem 3, we have

−n1/4

ˆ

n−

t

0σ

2

sds

Ŵtn

S −

→N_(0,_1),

where $N(0, 1)$ is a standard normal r.v. independent of $\mathcal{F}$.

3.2. Asymptotic Relative Efficiency Comparison

We compare the variance of our estimator with that of the bi-power estimator proposed in Podolskij and Vetter (2009b) in the special case when $X$ is continuous. From Theorem 3, the asymptotic variance of our estimator is

$$\frac{c}{\bar{g}^2_{(2)}} \int_0^t E\,|\mu^2(\sigma_s, \kappa/c)|\,ds.$$

On the other hand, by Theorem 3 of Podolskij and Vetter (2009b), the variance of the bi-power type estimator is

$$\big(1 + 2m_1^2 - 3m_1^4\big)\,\frac{c}{\bar{g}^2_{(2)}} \int_0^t E\,|\mu^2(\sigma_s, \kappa/c)|\,ds \approx 1.06\,\frac{c}{\bar{g}^2_{(2)}} \int_0^t E\,|\mu^2(\sigma_s, \kappa/c)|\,ds,$$

where $m_p = E|N(0, 1)|^p$. Thus, our estimator is asymptotically about 6% more efficient than the bi-power estimator in the case of no jumps. In the presence of jumps, our simulations show that the finite sample efficiency gain can be much greater than the asymptotic one.
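The constant 1.06 can be checked directly from $m_1 = E|N(0, 1)| = \sqrt{2/\pi}$. The short sketch below evaluates the factor $1 + 2m_1^2 - 3m_1^4$; the variable names are chosen only for this illustration.

```python
import math

# First absolute moment of a standard normal: E|N(0, 1)| = sqrt(2 / pi).
m1 = math.sqrt(2.0 / math.pi)

# Variance inflation factor of the bi-power type estimator relative to the PT estimator.
ratio = 1 + 2 * m1 ** 2 - 3 * m1 ** 4

print(ratio)
```

The factor equals $1 + 4/\pi - 12/\pi^2$, which rounds to the 1.06 quoted in the text.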

4. SIMULATION STUDY

We now compare the performance of our proposed estimator with that of other methods by Monte Carlo simulations.

4.1. Simulation Design

We generate observations $Y_{t_i}$ from the following model

$$Y_{t_i} = X_{t_i} + \epsilon_{t_i},$$

where the $\epsilon_i$'s are independent of the $X_{t_i}$'s. The latent process $X_t$ is generated from the following commonly used models.

• Model 1 (a jump-diffusion process with noise):

$$X_t = W_t + \sum_{i=1}^{N_t} J_i, \qquad \epsilon_i \overset{\text{iid}}{\sim} N(0, 0.01^2),$$

where $J_i \sim N(0, 0.5^2)$, $N_t \sim \text{Poisson}(\lambda t)$, and $J_i$ and $N_t$ are independent. We choose $\lambda = 0, 1, 2, \ldots, 10, 20, 50$.

• Model 2 (a diffusion with infinite activity jumps and noise):

$$X_t = W_t + J_t, \qquad \epsilon_i \overset{\text{iid}}{\sim} N(0, 0.01^2),$$

where $J_t$ is a trimmed symmetric $\beta$-stable process with $\beta = 0.5$. The algorithm described in Cont and Tankov (2004) is used to simulate a $\beta$-stable process. By "trimmed" it is meant that 2% of the largest (absolute) values are discarded. The reason for trimming will be explained in the next subsection. Another common technique is to use a "tempered" stable process, which is similar in spirit to "trimming."

• Model 3 (the stochastic volatility (SV) model with jumps and noise):

$$dX_t = \mu\,dt + \sigma_t\,dW_t + \delta\,dJ_t, \qquad \sigma_t = \exp(\beta_0 + \beta_1 \tau_t),$$

where $d\tau_t = \alpha \tau_t\,dt + dB_t$ with $\text{Corr}(dW_t, dB_t) = \rho$, and $J_t$ follows a tempered symmetric $\beta$-stable process as in the last model. We let $\mu = 0.03$, $\beta_0 = 0.3125$, $\beta_1 = 0.125$, $\alpha = -0.025$, $\rho = -0.3$, $\delta = 0.1$, and $\epsilon_i \overset{\text{iid}}{\sim} N(0, 0.001^2)$. All the coefficients were selected from Podolskij and Vetter (2009a) to allow a comparison.

It is easy to see that $[X^c, X^c]_t = t$ for Models 1 and 2, while it becomes a random variable for Model 3.

We consider four sample sizes: $n = 1170$, $4680$, $7800$, and $23{,}400$ (which correspond to sampling every 20 sec, 5 sec, 3 sec, and 1 sec in a trading day). The experiments are repeated 5000 times for each case. The bias, standard error (s.e.), and MSE are computed for each sample size.
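One way to generate a single path from Model 1 on an equally spaced grid is sketched below. It is an illustration under assumptions rather than the authors' simulation code: the function name `simulate_model1` and its arguments are hypothetical, the horizon is taken to be $t = 1$, and the jump part is simulated exactly on the grid by drawing a Poisson number of jumps in each sampling interval.

```python
import numpy as np

def simulate_model1(n, lam, rng, t=1.0, jump_sd=0.5, noise_sd=0.01):
    """Simulate latent X and observed Y on n + 1 equally spaced points of [0, t]."""
    dt = t / n
    # Brownian increments over each sampling interval.
    dw = rng.normal(0.0, np.sqrt(dt), n)
    # Number of compound Poisson jumps in each interval.
    counts = rng.poisson(lam * dt, n)
    # The sum of k i.i.d. N(0, jump_sd^2) jumps is N(0, k * jump_sd^2).
    dj = rng.normal(0.0, 1.0, n) * jump_sd * np.sqrt(counts)
    x = np.concatenate(([0.0], np.cumsum(dw + dj)))
    # Observed price: latent price plus i.i.d. Gaussian microstructure noise.
    y = x + rng.normal(0.0, noise_sd, n + 1)
    return x, y, int(counts.sum())

rng = np.random.default_rng(2014)
x_path, y_path, n_jumps = simulate_model1(n=23_400, lam=5.0, rng=rng)
print(len(y_path), n_jumps)
```

Repeating this for each $\lambda$ and each replication, and applying an estimator to `y_path`, gives the bias, s.e., and MSE entries of the kind reported for Model 1.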

We will compare the performance of our PT estimator with those of Podolskij and Vetter (2009a,b) and Fan and Wang (2007). Here are some specifications.

• For Fan and Wang's method, when applying the wavelet method to locate the jumps, we use the Daubechies s8 wavelet to calculate the empirical wavelet coefficients (Wang 1995) and the universal threshold d√2 log n/n, where d is the median absolute deviation of the empirical wavelet coefficients, divided by 0.6745. When applying the two time scale step, we take $K = 50$.

• For Podolskij and Vetter's method, the weight function $g(x) = x \wedge (1 - x)$ is used in the preaveraging step.

• For our PT method, the same weight function $g(x) = x \wedge (1 - x)$ is chosen in the preaveraging step as in Podolskij and Vetter (2009b), and the threshold level is chosen to be $\Delta_n^{0.23}$.

Table 2. Comparisons for Model 2 (diffusion model with infinite activity jumps and noise). Each entry is (bias, s.e., MSE).

| $n$ | PT | PV2 | PV3 | FW |
|---|---|---|---|---|
| 1170 | (0.058, 0.178, 0.035) | (0.274, 0.546, 0.373) | (0.136, 0.357, 0.146) | (0.141, 0.183, 0.053) |
| 4680 | (0.034, 0.126, 0.017) | (0.236, 0.394, 0.210) | (0.127, 0.245, 0.076) | (0.116, 0.152, 0.037) |
| 7800 | (0.024, 0.109, 0.013) | (0.223, 0.398, 0.194) | (0.118, 0.215, 0.060) | (0.108, 0.138, 0.030) |
| 23400 | (0.013, 0.082, 0.007) | (0.173, 0.283, 0.110) | (0.090, 0.148, 0.030) | (0.079, 0.103, 0.017) |

Table 3. Comparisons for Model 3 (SV model with infinite activity jumps and noise). Each entry is (bias, s.e., MSE).

| $n$ | PT | PV2 | PV3 | FW |
|---|---|---|---|---|
| 1170 | (0.070, 0.181, 0.038) | (0.311, 0.541, 0.390) | (0.164, 0.364, 0.159) | (0.167, 0.198, 0.067) |
| 4680 | (0.034, 0.126, 0.017) | (0.247, 0.397, 0.219) | (0.133, 0.240, 0.076) | (0.141, 0.176, 0.051) |
| 7800 | (0.027, 0.111, 0.013) | (0.224, 0.341, 0.166) | (0.121, 0.214, 0.060) | (0.112, 0.152, 0.036) |
| 23400 | (0.016, 0.081, 0.007) | (0.187, 0.292, 0.120) | (0.100, 0.161, 0.036) | (0.087, 0.121, 0.022) |

Figure 2. MSE plots for Model 2 (Lévy model with noise). Horizontal axis: sampling time interval (seconds); vertical axis: MSE.

Figure 3. MSE plots for Model 3 (SV model with noise). Horizontal axis: sampling time interval (seconds); vertical axis: MSE.

Figure 4. The histogram and Q-Q plot of the statistic in Corollary 1 (Q-Q plot of sample data versus standard normal).

4.2. Simulation Results

The results are presented in Tables 1–3 and Figures 1–3. The following shorthands are used:

PT = preaveraging-threshold;

FW = Fan and Wang;

PV2 = Podolskij and Vetter's bi-power;

PV3 = Podolskij and Vetter's triple-power.

We make the following observations.

1. For all models considered, the performance of every method improves as the sample size increases.

2. In all the cases, the PT method performs the best (in terms of smaller biases, s.e.'s, and MSEs), followed by the FW method, which in turn outperforms the PV2 and PV3 methods.

3. In Model 1 (the jump-diffusion model with noise), the PT and FW methods both work well for all $\lambda$'s considered, and both outperform the PV2 and PV3 methods across the range of $\lambda$. Furthermore, the performance of the PT and FW methods is hardly affected by the intensity $\lambda$. However, the PV2 and PV3 methods work well for small $\lambda$'s but deteriorate very rapidly as $\lambda$ gets bigger.

4. In Models 2 and 3 (with infinite activity jumps and noise), the PT method consistently outperforms the others. We have also done simulations when $J_t$ is an untrimmed symmetric $\beta$-stable process (not shown here to save space). It turns out that the PT method still performs the best, followed by the FW method, while PV2 and PV3 perform rather unsatisfactorily. One reason is that the untrimmed $\beta$-stable process can have some rare but huge jumps, which most severely affect the performance of PV2 and PV3. In reality, though, huge jumps are unrealistic for high-frequency tick-by-tick data, so we trimmed the largest 2% of the increments. "Tempering" is often used for this purpose as well.

5. The FW and PT methods perform equally well in Model 1. However, in Models 2–3, the FW method does not perform as well as the PT method. Recall that the FW method tries to remove jumps first and then deals with noise, while the PT method does the reverse. So for infinite activity jumps, it seems better to use the PT method.

6. Figure 4 displays the Q-Q plot of the PT estimator for Model 2, where the variance was estimated as in Proposition 1. The plot is close to linear, indicating asymptotic normality of the estimator.

5. CONCLUSION

In this article, we proposed a PT estimator of integrated volatility in the simultaneous presence of microstructure noise and jumps. The method is based on two steps, namely, the preaveraging step to reduce microstructure noise, and the threshold step to remove jumps. The new estimator can handle very general jump processes of finite or infinite activity. The consistency and asymptotic normality of the proposed estimator have been established. Simulation studies show excellent performance of the proposed estimator in comparison with some alternative methods in the literature. Possible extensions of the current work include the study of covariation (matrix) estimation, an important quantity in econometrics, under the simultaneous presence of noise and infinite activity jumps.

APPENDIX: THE PROOFS OF THE THEOREMS

By a standard localization procedure, described in detail in Jacod (2012), we can replace the local boundedness hypotheses in the assumptions by a boundedness assumption, and also assume that the process $X$ itself, and thus the jump process $\Delta X_t$, are bounded as well. That is, we may assume the following:

Assumption A.1. For some constant $C > 0$, we have $\max\{|b_t|, |\sigma_t|, |X_t|, \kappa_t(r)\} \le C$.

We define the continuous part of $X$ by $X'$ and the discontinuous martingale part by $X''$, that is, $Y = Y' + X''$. Throughout the proof, $K$ denotes a generic constant, while $K_\alpha$ might depend on some parameter $\alpha$. We also define $\Delta^n_{i,k_n} W(g)$ and $\Delta^n_{i,k_n} \epsilon(g)$ similarly to $\Delta^n_{i,k_n} X(g)$.

We prove a general theorem which is valid for any $p > 0$, of which Theorem 1 is a special case. Define

V(Y′, g, p)n

tis continuous, by Theorem 3.3 of Jacod, Podolskij, and Vetter (2010), we have

Therefore, it suffices to show that

1−p/4 Rewrite the left-hand side above as1−p/4

n

The following two elementary inequalities will be used in the proof:

||x₊y_|p_{− |}x_|p_{| ≤ |}y_|p, whenp_≤1, We consider two disjoint cases,_|n

i,knY

′₍_g₎_|_{< un/}_{2, from similar considerations, we deduce}

that

where $r$ is any positive real number. In view of the boundedness assumption on the parameters and repeated use of Hölder's and Burkholder's inequalities, we have

Eni,knX

The last inequality follows from (6.25) of Jacod (2012) applied to the process $X''$, with the sampling interval $k_n \Delta_n$ and $\alpha_n = u_n / \sqrt{k_n \Delta_n}$, which goes to $\infty$ by (8). Letting $l = r = 1$, we deduce from the above inequalities that, when $p \le 2$,

1−p/4

Proof of Theorem 1. Letting $p = 2$, we obtain the required result of Theorem 1 from Lemma A.1.

Proof of Theorem 2. We redefineU(Y, g)n

t involvingknas

U(Y, g, kn)t=

n . Theorem 2 follows if we can show that


(A.6) follows from the same procedure as in the proof of Theorem 1.

Therefore, it suffices to prove the following four convergences

⌊t /n⌋−kn

Observe that (since the drift does not affect the estimator of integrated volatility, so here we assume it is zero for simplicity):

E_|γi_{| ≤} K

by the Cauchy-Schwarz inequality and Itô's isometry, where $g_n(s) = g\big(\lfloor (s - (i-1)\Delta_n)/\Delta_n \rfloor / k_n\big)$. Since $g$ has a piecewise Lipschitz derivative, Itô's isometry and the boundedness of $\sigma_s$ together imply

E_|γi_{| ≤}Knknn₊1/ kn,

thus, (A.7) follows. Similarly, (A.8) follows from

E_|γ′

by Burkholder's inequality. Hence, we have shown (A.9).

Finally, (A.10) follows from a standard Riemann integrability argument (see Barndorff-Nielsen et al. 2006). Hence the proof is finished.

Proof of Theorem 3. When $X$ is a continuous process, by Theorem 4.1 of Jacod, Podolskij, and Vetter (2010),

−n1/4

with the same $W'$ and $\mu$ defined in Theorem 3. So to prove the theorem, it is enough to show that

The left-hand side reduces to $\Delta_n^{3/4-p/4}\sum_{i=1}^{\lfloor t/\Delta_n\rfloor - k_n}\eta_i^n$. Note that the jumps of $X_t$ form a finite variation process when $\beta < 1$, so we have the following decomposition,

X′1(t)=X0+ and the microstructure noise.

Similarly to (A.2)–(A.4), we have the following estimates,

E $r < 1$ and $s \in (\beta, 1)$, using Hölder's inequality and the inequality $(|x| \wedge u_n)^p$


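Now we consider the cases of $p > 1$ and $p \le 1$ separately. Let $s \to \beta$. When $p > 1$,

$c_1 \to l(1/4 - \varpi_2) - 1/4$,

$c_2 \to r(1/2 - \varpi_2) - 1/4$,

$c_3 \to 1/4 - p/4 + \varpi_1(p - \beta)$,

$c_4 \to \varpi_1(1 - \beta)$.

For some large enough $l$ and $r$, we have $c_1 > 0$ and $c_2 > 0$. When $\varpi_1 > \frac{p-1}{4(p-\beta)}$, we have $c_3 > 0$. We also have $c_4 > 0$ as $\varpi_1 > 0$ and $\beta < 1$. Letting $p = 2$, we obtain the result of Theorem 3.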
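The sign conditions on the limiting exponents can be checked with exact arithmetic. The sketch below simply evaluates the four displayed limits and the threshold $(p-1)/(4(p-\beta))$; the function names and the parameter values ($\beta = 1/2$, $\varpi_1 = 1/5$, $\varpi_2 = 9/40$, $l = 20$, $r = 2$) are illustrative assumptions, not values taken from the article. It also assumes $\varpi_2 < 1/4$, which the formula for $c_1$ needs in order to become positive for large $l$.

```python
from fractions import Fraction as F

def limit_exponents(p, beta, varpi1, varpi2, l, r):
    """Limits of c1..c4 as s -> beta (case p > 1), copied from the displayed formulas."""
    c1 = l * (F(1, 4) - varpi2) - F(1, 4)
    c2 = r * (F(1, 2) - varpi2) - F(1, 4)
    c3 = F(1, 4) - F(p) / 4 + varpi1 * (p - beta)
    c4 = varpi1 * (1 - beta)
    return c1, c2, c3, c4

def varpi1_lower_bound(p, beta):
    """Threshold (p - 1) / (4 (p - beta)); c3 > 0 exactly when varpi1 exceeds it."""
    return F(p - 1) / (4 * (p - beta))

# Illustrative (assumed) parameter values with p = 2, as used for Theorem 3.
p, beta = 2, F(1, 2)
bound = varpi1_lower_bound(p, beta)  # equals 1/6 for these values
limits = limit_exponents(p, beta, varpi1=F(1, 5), varpi2=F(9, 40), l=20, r=2)
all_positive = all(c > 0 for c in limits)

# At the threshold itself, c3 is exactly zero, so the inequality must be strict.
c3_at_bound = limit_exponents(p, beta, bound, F(9, 40), 20, 2)[2]
```

For $p = 2$ the threshold reads $\varpi_1 > 1/(4(2-\beta))$, which lies between $1/8$ and $1/4$ for $\beta \in [0, 1)$.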

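When $p \le 1$, using Hölder's inequality, we have

$$\Delta_n^{3/4-p/4}\, E|\eta_i^n| \le K\Delta_n\left(\Delta_n^{l(1/4-\varpi_2)-1/4} + \Delta_n^{r(1/2-\varpi_2)-1/4} + \Delta_n^{p(1/4+\varpi_1(1-s))-1/4}\right) := K\Delta_n\left(\Delta_n^{c_1} + \Delta_n^{c_2} + \Delta_n^{c_3'}\right).$$

Clearly, $c_j > 0$, $j = 1, 2$, and $c_3' > 0$. The proof follows by combining the above results and letting $p = 2$.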

Proof of Proposition 1 and Corollary 1. Proposition 1 is an immediate consequence of Lemma A.1 by simply taking $p = 4$. Because Theorem 3 holds with stable convergence, the proof of Corollary 1 follows from Theorem 3 and Proposition 1.

ACKNOWLEDGMENTS

The authors thank the Editor and the Associate Editor for their very extensive and constructive suggestions, which helped to improve this article considerably. Jing’s research is partially supported by Hong Kong RGC HKUST6019/10P, HKUST6019/12P, and HKUST6022/13P. Kong’s research is supported in part by National NSFC No. 11201080 and in part by Humanity and Social Science Youth Foundation of Chinese Ministry of Education No. 12YJC910003. Liu thanks The Science and Technology Development Fund of Macau (No. 078/2012/A3 and No. 078/2013/A3) for financial support.

[Received February 2012. Revised February 2014.]

REFERENCES

Aït-Sahalia, Y., and Jacod, J. (2009), “Estimating the Degree of Activity of Jumps in High Frequency Financial Data,” The Annals of Statistics, 37, 184–222. [458]

Andersen, T. G., Bollerslev, T., Diebold, F., and Labys, P. (2003), “Modeling and Forecasting Realized Volatility,” Econometrica, 71, 579–625. [457]

Bajgrowicz, P., Scaillet, O., and Treccani, A. (2013), “Jumps in High-Frequency Data: Spurious Detections, Dynamics, and News,” working paper, available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1343900. [459]

Bandi, F. M., and Russell, J. R. (2006), “Separating Microstructure Noise From Volatility,” Journal of Financial Economics, 79, 655–692. [458]

Barndorff-Nielsen, O., Graversen, S., Jacod, J., Podolskij, M., and Shephard, N. (2006), “A Central Limit Theorem for Realised Power and Bipower Variations of Continuous Semimartingales,” in From Stochastic Analysis to Mathematical Finance, Festschrift for Albert Shiryaev, eds. Y. Kabanov and R. Liptser, Berlin: Springer. [466]

Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2008), “Designing Realised Kernels to Measure the Ex-post Variation of Equity Prices in the Presence of Noise,” Econometrica, 76, 1481–1536. [458]

Barndorff-Nielsen, O. E., and Shephard, N. (2004), “Power and Bipower Variation With Stochastic Volatility and Jumps,” Journal of Financial Econometrics, 2, 1–37. [457]

——— (2006), “Econometrics of Testing for Jumps in Financial Economics Using Bipower Variation,” Journal of Financial Econometrics, 4, 1–30. [457]

Christensen, K., Oomen, R., and Podolskij, M. (2014), “Fact or Friction: Jumps at Ultra High Frequency,” Journal of Financial Economics, forthcoming. [459]

Cont, R., and Tankov, P. (2004), Financial Modelling With Jump Processes, London: Chapman & Hall / CRC Press. [462]

Fan, J., and Wang, Y. (2007), “Multi-Scale Jump and Volatility Analysis for High-Frequency Financial Data,” Journal of the American Statistical Association, 102, 1349–1362. [458,462]

Jacod, J. (2008), “Asymptotic Properties of Realized Power Variations and Related Functionals of Semimartingales,” Stochastic Processes and their Applications, 118, 517–559. [457,459]

——— (2012), “Statistics and High Frequency Data,” in Proceedings of the 7th Séminaire Européen de Statistique, La Manga, 2007: Statistical Methods for Stochastic Differential Equations, eds. M. Kessler, A. Lindner, and M. Sorensen, Boca Raton, FL: Chapman & Hall / CRC Press. [465,466]

Jacod, J., Li, Y., Mykland, P. A., Podolskij, M., and Vetter, M. (2009), “Microstructure Noise in the Continuous Case: The Pre-Averaging Approach,” Stochastic Processes and their Applications, 119, 2249–2276. [458,459,460,461]

Jacod, J., Podolskij, M., and Vetter, M. (2010), “Limit Theorems for Moving Averages of Discretized Processes Plus Noise,” The Annals of Statistics, 38, 1478–1545. [460,461,465,466]

Mancini, C. (2009), “Nonparametric Threshold Estimation for Models With Stochastic Diffusion Coefficient and Jumps,” Scandinavian Journal of Statistics, 36, 270–296. [457,460]

Podolskij, M., and Vetter, M. (2009a), “Bipower-Type Estimation in a Noisy Diffusion Setting,” Stochastic Processes and their Applications, 119, 2803–2831. [458,461,462]

——— (2009b), “Estimation of Volatility Functionals in the Simultaneous Presence of Microstructure Noise and Jumps,” Bernoulli, 15, 634–658. [458,461,462,464]

Rosenbaum, M. (2009), “Integrated Volatility and Round-off Error,” Bernoulli, 15, 687–720. [458]

Veraart, A. (2011), “How Precise is the Finite Sample Approximation of the Asymptotic Distribution of Realised Variation Measures in the Presence of Jumps?,” Advances in Statistical Analysis, 95, 253–291. [457]

Wang, Y. (1995), “Jump and Sharp Cusp Detection by Wavelets,” Biometrika, 82, 385–397. [462]

Wu, L. (2008), “Modeling Financial Security Returns Using Lévy Processes,” in Handbooks in Operations Research and Management Science: Financial Engineering, Amsterdam: Elsevier. [457]

Xiu, D. (2010), “Quasi-Maximum Likelihood Estimation of Volatility With High Frequency Data,” Journal of Econometrics, 159, 235–250. [458]

Zhang, L. (2006), “Efficient Estimation of Stochastic Volatility Using Noisy Observations: A Multi-Scale Approach,” Bernoulli, 12, 1019–1043. [458]

Zhang, L., Mykland, P., and Aït-Sahalia, Y. (2005), “A Tale of Two Time Scales: Determining Integrated Volatility With Noisy High-Frequency Data,” Journal of the American Statistical Association, 100, 1394–1411. [458,461]