
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Preaveraging-Based Estimation of Quadratic

Variation in the Presence of Noise and Jumps:

Theory, Implementation, and Empirical Evidence

Nikolaus Hautsch & Mark Podolskij

To cite this article: Nikolaus Hautsch & Mark Podolskij (2013) Preaveraging-Based Estimation of Quadratic Variation in the Presence of Noise and Jumps: Theory, Implementation,

and Empirical Evidence, Journal of Business & Economic Statistics, 31:2, 165-183, DOI: 10.1080/07350015.2012.754313

To link to this article: http://dx.doi.org/10.1080/07350015.2012.754313

Accepted author version posted online: 04 Feb 2013.


Nikolaus HAUTSCH

Institute for Statistics and Econometrics and Center for Applied Statistics and Economics (CASE),

Humboldt-Universität zu Berlin and Center for Financial Studies (CFS), Frankfurt, D-10178 Berlin, Germany ([email protected])

Mark PODOLSKIJ

Institute of Applied Mathematics, University of Heidelberg, D-69120 Heidelberg, Germany ([email protected])

This article contributes to the theory for preaveraging estimators of the daily quadratic variation of asset prices and provides novel empirical evidence. We develop asymptotic theory for preaveraging estimators in the case of autocorrelated microstructure noise and propose an explicit test for serial dependence. Moreover, we extend the theory on preaveraging estimators for processes involving jumps. We discuss several jump-robust measures and derive feasible central limit theorems for the general quadratic variation. Using transaction data of different stocks traded at the New York Stock Exchange, we analyze the estimators’ sensitivity to the choice of the preaveraging bandwidth. Moreover, we investigate the dependence of preaveraging-based inference on the sampling scheme, the sampling frequency, microstructure noise properties, and the occurrence of jumps. As a result of a thorough empirical study, we provide guidance for optimal implementation of preaveraging estimators and discuss potential pitfalls in practice.

KEY WORDS: High frequency data; Jump diffusions; Power variations; Sampling schemes; Volatility estimation.

1. INTRODUCTION

The availability of high-frequency data has significantly influenced research on volatility estimation during the last decade. Inspired by the seminal articles by Andersen et al. (2001) and Barndorff-Nielsen and Shephard (2002), the idea of estimating daily volatility using realized measures relying on intraday data has stimulated a new and active area of volatility modeling and estimation.

This article provides new theory, implementation details, and extensive and detailed empirical results for the class of preaveraging estimators originally introduced by Jacod et al. (2009), henceforth JLMPV09. The article’s aim is two-fold. First, we extend existing theory for preaveraging estimators in the presence of serially dependent noise and jumps. We derive the asymptotic properties of preaveraging estimators if market microstructure noise is autocorrelated and suggest an explicit test for dependence up to $q$ lags (so-called $q$ dependence). To address the occurrence of jumps, we derive a feasible central limit theorem for preaveraged multipower estimators and appropriate local estimators for underlying components of the estimators’ asymptotic variance. Second, we provide a thorough empirical study on the properties of preaveraging estimators in practice. We study their empirical performance relative to alternative state-of-the-art estimators and analyze the impact of implementation details, such as the choice of local interval lengths, the role of sampling frequency, finite-sample adjustments, and the impact of the sampling scheme depending on noise properties and jumps.

The main difficulty in estimating the daily quadratic variation of asset prices using noisy high-frequency data is how to optimally employ a maximum of information without being affected by so-called market microstructure noise arising from market frictions, such as the bid–ask bounce or the discreteness of prices, inducing a deviation of the price process from a continuous semimartingale process. As discussed, for example, by Hansen and Lunde (2006), among others, market microstructure noise can lead to severe biases and inconsistency of the estimators when the sampling frequency tends to infinity. To overcome these effects, sparse sampling, for example, based on 20 min, has been proposed as an ad hoc solution (see, e.g., Andersen et al. 2003). The major disadvantage of sparse sampling, however, is the enormous data loss and the resulting inefficiency of the estimator. As a result, various methods for bias corrections and filtering of noise effects have been proposed in the literature. The most prominent and general approaches are the realized kernel (RK) estimator by Barndorff-Nielsen et al. (2008a) (henceforth BHLS08a), the maximum likelihood estimator by Aït-Sahalia, Mykland, and Zhang (2005), and the preaveraging estimator by JLMPV09. Andersen, Bollerslev, and Diebold (2008) provided an overview of alternative approaches. For sound practical implementations, however, a good understanding of the estimators’ finite-sample behavior, their dependence on plug-in estimators


and data-specific adjustments, and implementation details is crucial. In the case of kernel-type estimators, the empirical performance and properties are analyzed by Hansen and Lunde (2006) and Barndorff-Nielsen et al. (2008b), henceforth BHLS08b. The latter shows that the estimators’ empirical (finite-sample) properties can significantly deviate from theoretical (asymptotic) properties, making finite-sample adjustments necessary and ultimately important in financial practice. The first study analyzing the performance of preaveraging estimators is Andersen, Dobrev, and Schaumburg (2010), who compared these estimators with jump-robust nearest neighbor truncation estimators. Nevertheless, preaveraging estimators’ empirical properties and their sensitivity to implementation details as well as to data features are still widely unknown.

The article’s theoretical contribution is two-fold: First, we develop asymptotic theory for preaveraging estimators in the case of serially dependent market microstructure noise. In this context, we derive asymptotic theory for preaveraging estimators if microstructure noise is $q$ dependent and propose a corresponding autocorrelation test. Second, we construct a feasible central limit theorem for the quadratic variation including a jump part by estimating the variation’s asymptotic variance involving local jump-robust estimates of the noise and diffusive volatility components. This makes the pure probabilistic results of Jacod, Podolskij, and Vetter (2009), who proved the asymptotic mixed normality of the estimator, applicable for determining confidence regions for the general quadratic variation.

The article’s second major contribution is of empirical nature and is motivated by the need for a better understanding of how preaveraging estimators and corresponding inference work in practice, and how they should be optimally implemented. We study the estimators’ dependence on the choice of the underlying local preaveraging interval, the role of the sampling scheme (i.e., calendar time sampling vs. business time sampling, transaction price sampling vs. midquote price sampling), and the impact of the sampling frequency. Of particular interest is how the derived noise-robust and jump-robust inference is reflected by the data and how the estimators perform relative to alternative state-of-the-art estimators.

Employing preaveraging-based inference to transaction and quote data of a cross section of New York Stock Exchange (NYSE) stocks covering different trading frequencies, we analyze the estimators’ sensitivity to the length of the underlying local preaveraging interval. We show that too small choices of the length of the local window induce biases and make estimators quite unstable. Addressing an inherent trade-off between estimators’ bias and variance, we suggest a rule of thumb for choosing the local window. In this context, finite-sample adjustments of the estimator and the underlying sampling scheme are particularly important. Using an optimal preaveraging scheme, we find jump proportions between 5% and 10% on average. Ignoring the possibility of jumps in the price process leads to an overestimation of confidence intervals that can be substantial in the case of less liquid assets. Moreover, our results suggest implementing preaveraging estimators based on maximally high sampling frequencies. A reduction in sampling frequency tends to imply an “oversmoothing” of volatility resulting in negative biases. As a result of this analysis, we derive suggestions for optimal implementation of this class of estimators and tests in practice.

The remainder of the article is organized as follows. In Section 2, we present the basic case with underlying conditionally independent noise without jumps, discuss the underlying theoretical framework, and provide evidence on the optimal choice of the preaveraging interval. Section 3 considers preaveraging estimators under the assumption of continuous semimartingales with dependent noise and presents empirical evidence based on an explicit test. In Section 4, we discuss the discontinuous case including the possibility of jumps. Finally, in Section 5, the time series properties and the impact of sampling frequencies and schemes are analyzed, while Section 6 concludes. All proofs are collected in the Appendix.

2. PREAVERAGING ESTIMATORS FOR THE QUADRATIC VARIATION: THE CONTINUOUS CASE

2.1 The Basic Model

It is well known in finance that, under the no-arbitrage assumption, price processes must follow a semimartingale (see, e.g., Delbaen and Schachermayer 1994). In this section, we consider a continuous semimartingale $(X_t)_{t\geq 0}$ of the form

$$X_t = X_0 + \int_0^t a_s\,ds + \int_0^t \sigma_s\,dW_s. \tag{2.1}$$

Here, $W$ denotes a one-dimensional Brownian motion, $(a_s)_{s\geq 0}$ is a càglàd drift process, and $(\sigma_s)_{s\geq 0}$ is an adapted càdlàg volatility process. Moreover, $t$ denotes (continuous) calendar time.

However, due to various market frictions, such as price discreteness or bid–ask spreads, the efficient price $X$ is contaminated by noise. Thus, we can only observe a noisy version $Z$ of the process $X$. More precisely, we consider a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\geq 0}, P)$ on which we define the process $Z$, observed at $n$ time points, indexed by $i = 0, 1, \ldots, n$, as

$$Z_t = X_t + U_t, \quad t \geq 0, \tag{2.2}$$

where $U_t$ denotes the error term. In the case of transaction time sampling (TRTS), $i$ indexes the (irregular) time points associated with each $\Delta_n$-th trade. Hence, $n = [N/\Delta_n]$ with $N$ denoting the number of trades until $t$. In the case of calendar time sampling (CTS), $i$ indexes equal-spaced time intervals of length $\Delta_n$ with $n = [t/\Delta_n]$. Here, the price of the most recent observation occurring before the end of the sampling interval is used (previous tick sampling). For convenience, we show all theoretical relationships using the CTS notation. However, all relationships also hold for TRTS with $i$, $n$, and $\Delta_n$ accordingly defined.

We are in the framework of high-frequency asymptotics or so-called “in-fill” asymptotics, that is, $\Delta_n \to 0$. We assume that the $U_t$’s are, conditional on the efficient price $X = (X_s)_{s\geq 0}$, centered and independent, that is,

$$E(U_t \mid X) = 0, \qquad U_t \perp\!\!\!\perp U_s, \quad t \neq s, \quad \text{conditional on } X. \tag{2.3}$$

Furthermore, we assume that the conditional variance of the noise process $U$, defined as

$$\alpha_t^2 = E(U_t^2 \mid X), \tag{2.4}$$

is càdlàg, and introduce the process

$$N_t(q) = E(|U_t|^q \mid X), \tag{2.5}$$

which denotes the $q$th conditional absolute moment of the process $U$. Note that the noise specification does not entail a noise level tending to 0 as $n \to \infty$. This property is an attractive feature for empirical work and will also be illustrated in examples given later. Model (2.2) was originally introduced by JLMPV09. Note that the process (2.2) with conditions (2.3) and (2.4) is defined in continuous time (see JLMPV09 for the precise construction of the underlying probability space). In Section 3, we discuss a discrete version of the model to introduce serial dependence in noise.

Model (2.2) allows in particular for time-varying variances of the noise and dependence between the efficient price $X$ and the microstructure noise $U$. Let us give some examples to illustrate the applicability of our model.

(i) The standard additive iid noise model

$$Z_t = X_t + U_t,$$

where the $U_t$’s are iid with $E(U_t) = 0$, $E(U_t^2) = \omega^2$, and independent of the efficient price $X$, is obviously included in our framework.

(ii) The pure rounding model

$$Z_t = \gamma[X_t/\gamma]$$

at fixed level $\gamma > 0$ does not satisfy the first assumption of (2.3). In fact, it is easy to observe that the integrated variance cannot be estimated in the pure rounding model, because only observations on a fixed grid $\gamma\mathbb{Z}$ are available.

(iii) In contrast to the pure rounding process in (ii), iid noise plus rounding models may satisfy our assumptions. Consider the model

$$Z_t = \gamma[(X_t + V_t)/\gamma],$$

where $\gamma > 0$ is a fixed rounding level, and the $V_t$’s are iid $U([0, \gamma])$ distributed and independent of $X$. Then the process $U_t = Z_t - X_t$ satisfies the assumption (2.3), and we have the identity

$$\alpha_t^2 = \gamma^2\left(\{X_t/\gamma\} - \{X_t/\gamma\}^2\right),$$

where $\{x\}$ denotes the fractional part of $x$.
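The conditional variance in the noise-plus-rounding example (iii) can be checked by a short calculation. The following is a sketch only; the symbols $f_t$ and $m$ are introduced here for illustration and do not appear in the article, and $[\cdot]$ is read as the integer part.

```latex
% Write X_t = \gamma (m + f_t) with integer m and fractional part f_t = \{X_t/\gamma\} \in [0,1).
% Since V_t/\gamma is uniform on [0,1], the rounded price takes only two values:
\begin{aligned}
Z_t &= \gamma m       &&\text{with conditional probability } 1 - f_t,\\
Z_t &= \gamma (m + 1) &&\text{with conditional probability } f_t.
\end{aligned}
% Hence U_t = Z_t - X_t equals -\gamma f_t or \gamma(1 - f_t), and
\begin{aligned}
E(U_t \mid X)   &= -\gamma f_t (1 - f_t) + \gamma (1 - f_t) f_t = 0,\\
E(U_t^2 \mid X) &= \gamma^2 f_t^2 (1 - f_t) + \gamma^2 (1 - f_t)^2 f_t
                 = \gamma^2 f_t (1 - f_t)
                 = \gamma^2 \left( f_t - f_t^2 \right).
\end{aligned}
```

The first line confirms that the noise is conditionally centered, as required by (2.3), and the second shows that the noise variance depends on the efficient price through its fractional part.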

2.2 The Preaveraging Method and Asymptotic Results

Our main goal is the estimation of the integrated variance (IV) (or quadratic variation) defined as

$$IV_t = [X]_t = \int_0^t \sigma_u^2\,du. \tag{2.6}$$

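The preaveraging method as originally introduced by Podolskij and Vetter (2009a) is based on certain local moving averages that reduce the influence of the noise process $U$. Below, we briefly review the asymptotic theory for the preaveraging method as presented in JLMPV09.

To construct the estimator, we choose a sequence $k_n$ of integers, which satisfies

$$k_n \Delta_n^{1/2} = \theta + o(\Delta_n^{1/4}) \tag{2.7}$$

for some $\theta > 0$, and a nonzero real-valued function $g: [0,1] \to \mathbb{R}$, which is continuous, piecewise continuously differentiable such that $g'$ is piecewise Lipschitz, and $g(0) = g(1) = 0$. Then, $g$ is associated with the following real-valued numbers:

$$\psi_1 = \int_0^1 (g'(s))^2\,ds, \qquad \psi_2 = \int_0^1 g^2(s)\,ds,$$

$$\phi_1(s) = \int_s^1 g'(u)\,g'(u-s)\,du, \qquad \phi_2(s) = \int_s^1 g(u)\,g(u-s)\,du, \qquad \Phi_{ij} = \int_0^1 \phi_i(s)\,\phi_j(s)\,ds.$$

A canonical example is $g(x) = x \wedge (1-x)$. In this case, the aforementioned constants are

$$\psi_1 = 1, \quad \psi_2 = \frac{1}{12}, \quad \Phi_{11} = \frac{1}{6}, \quad \Phi_{12} = \frac{1}{96}, \quad \Phi_{22} = \frac{151}{80640}.$$

The preaveraged returns are given as

$$\bar{Z}_i^n = \sum_{j=1}^{k_n - 1} g\!\left(\frac{j}{k_n}\right) \Delta_{i+j}^n Z, \qquad \Delta_i^n Z = Z_{i\Delta_n} - Z_{(i-1)\Delta_n}. \tag{2.8}$$

Note that the latter performs a weighted averaging of the increments $\Delta_j^n Z$ in the local window $[i\Delta_n, (i+k_n)\Delta_n]$. It is intuitively clear that such an averaging diminishes the influence of the noise to some extent. The window size $k_n$ is chosen to be of order $\Delta_n^{-1/2}$ to find a balance between the two conflicting convergence rates that are due to the diffusive part and the noise part. This choice will lead later to optimal convergence rates.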
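The weighted local averaging described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the weight function is assumed to be $g(x) = x \wedge (1-x)$, and the input is assumed to be a list of (log) prices on an equidistant grid. The index range follows the summation range $i = 0, \ldots, [t/\Delta_n] - k_n$ used for the preaveraging estimator.

```python
def g(x: float) -> float:
    """Weight function g(x) = min(x, 1 - x); it vanishes at 0 and 1."""
    return min(x, 1.0 - x)


def preaveraged_returns(prices: list[float], k_n: int) -> list[float]:
    """Compute preaveraged returns from a list of prices on a regular grid.

    Each value is sum_{j=1}^{k_n - 1} g(j / k_n) * (Z_{i+j} - Z_{i+j-1}),
    for i = 0, ..., n - k_n, where n is the number of increments.
    """
    if k_n < 2:
        raise ValueError("k_n must be at least 2")
    # increments[m] is the return Z_{m+1} - Z_m
    increments = [b - a for a, b in zip(prices, prices[1:])]
    weights = [g(j / k_n) for j in range(1, k_n)]
    n_windows = len(increments) - k_n + 1
    return [
        sum(w * increments[i + j] for j, w in enumerate(weights))
        for i in range(max(n_windows, 0))
    ]


if __name__ == "__main__":
    # Toy example: a linear price path with unit increments.
    print(preaveraged_returns([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 4))
```

With unit increments, each preaveraged return equals the sum of the weights, which makes the effect of the weighting scheme easy to inspect.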

Our main statistical instruments are the realized variance (RV)

$$RV_t^n = \sum_{i=0}^{[t/\Delta_n]} |\Delta_i^n Z|^2, \tag{2.9}$$

and the preaveraging estimator

$$V(Z,2)_t^n = \sum_{i=0}^{[t/\Delta_n]-k_n} |\bar{Z}_i^n|^2, \tag{2.10}$$

which is a direct analog of RV based on preaveraged returns $\bar{Z}_i^n$. We remark that $RV_t^n$ is not an appropriate estimator of the integrated volatility in the noisy diffusion model (2.2). More precisely, we have that

$$\frac{\Delta_n}{2}\,RV_t^n \xrightarrow{P} \int_0^t \alpha_u^2\,du, \tag{2.11}$$

that is, a normalized version of $RV_t^n$ converges to the integrated conditional variance of the noise process $U$ (see, e.g., JLMPV09).

Denote a mixed normal distribution with expectation 0 and (conditional) variance $F^2$ by $MN(0, F^2)$ and assume that the process $N_t(2)$ is locally bounded. Then, as shown by JLMPV09, we have

$$\frac{\Delta_n^{1/2}}{\theta\psi_2}\,V(Z,2)_t^n \xrightarrow{P} \int_0^t \sigma_u^2\,du + \frac{\psi_1}{\theta^2\psi_2}\int_0^t \alpha_u^2\,du, \tag{2.12}$$

where, by (2.11), a consistent estimator of $IV_t$ is obtained by

$$C_t^n = \frac{\Delta_n^{1/2}}{\theta\psi_2}\,V(Z,2)_t^n - \frac{\psi_1\Delta_n}{2\theta^2\psi_2}\,RV_t^n. \tag{2.13}$$

Hence, the consistency of $C_t^n$ is achieved by subtracting the noise-induced bias term, which is similar to Zhang, Mykland, and Aït-Sahalia (2005). Furthermore, under the assumption that the process $N_t(8)$ is locally bounded, JLMPV09 showed that

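$$\Delta_n^{-1/4}\left(C_t^n - IV_t\right) \xrightarrow{st} MN(0, \Gamma_t), \tag{2.14}$$

where the convergence is stable in distribution [see, e.g., Renyi (1963) for the definition of stable convergence], and $\Gamma_t = \int_0^t \gamma_u^2\,du$, where $\gamma_u^2$ is defined by

$$\gamma_u^2 = \frac{4}{\psi_2^2}\left(\Phi_{22}\,\theta\,\sigma_u^4 + \frac{2\Phi_{12}\,\sigma_u^2\alpha_u^2}{\theta} + \frac{\Phi_{11}\,\alpha_u^4}{\theta^3}\right). \tag{2.15}$$

The rate $\Delta_n^{-1/4}$ is known to be the best attainable (see Gloter and Jacod 2001). To obtain a feasible version of stable convergence in (2.14), JLMPV09 suggested a consistent estimator of the conditional variance $\Gamma_t$ given by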

Ŵn_t ₌ 422

The statistic $\Gamma_t^n$ is obtained by separately estimating the three integrals appearing in the definition of $\Gamma_t$. It has the appealing property of being positive as long as the weight function $g(x) = x \wedge (1-x)$ is used. Then, by the properties of stable convergence, we deduce the convergence in distribution

$$\frac{\Delta_n^{-1/4}\left(C_t^n - IV_t\right)}{\sqrt{\Gamma_t^n}} \xrightarrow{d} N(0,1).$$
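The bias-corrected preaveraging estimator of the integrated variance, that is, the scaled sum of squared preaveraged returns minus the RV-based noise correction, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it takes $t = 1$ (one trading day), sets $\Delta_n = 1/n$, uses $g(x) = x \wedge (1-x)$ with the asymptotic constants $\psi_1 = 1$ and $\psi_2 = 1/12$ (no finite-sample adjustment), and derives $\theta$ from the chosen $k_n$ as $\theta = k_n \Delta_n^{1/2}$. The function names and the simulation helper are hypothetical.

```python
import math
import random

PSI1 = 1.0         # integral of g'(s)^2 over [0, 1] for g(x) = min(x, 1 - x)
PSI2 = 1.0 / 12.0  # integral of g(s)^2 over [0, 1] for the same g


def preaveraging_estimator(prices, k_n):
    """Bias-corrected preaveraging estimate of integrated variance on [0, 1]."""
    n = len(prices) - 1
    if k_n < 2 or n < k_n:
        raise ValueError("need k_n >= 2 and at least k_n returns")
    delta_n = 1.0 / n
    theta = k_n * math.sqrt(delta_n)  # theta implied by the chosen k_n
    increments = [b - a for a, b in zip(prices, prices[1:])]
    weights = [min(j / k_n, 1.0 - j / k_n) for j in range(1, k_n)]

    rv = sum(r * r for r in increments)  # realized variance
    v = 0.0                              # sum of squared preaveraged returns
    for i in range(n - k_n + 1):
        z_bar = sum(w * increments[i + j] for j, w in enumerate(weights))
        v += z_bar * z_bar

    c = (math.sqrt(delta_n) / (theta * PSI2)) * v \
        - (PSI1 * delta_n / (2.0 * theta ** 2 * PSI2)) * rv
    return max(c, 0.0)  # bounded from below by zero


def simulate_noisy_prices(n, sigma, noise_sd, seed):
    """Brownian efficient price with constant volatility plus iid Gaussian noise."""
    rng = random.Random(seed)
    x = 0.0
    prices = [x + rng.gauss(0.0, noise_sd)]
    for _ in range(n):
        x += rng.gauss(0.0, sigma * math.sqrt(1.0 / n))
        prices.append(x + rng.gauss(0.0, noise_sd))
    return prices


if __name__ == "__main__":
    z = simulate_noisy_prices(n=5000, sigma=1.0, noise_sd=0.005, seed=1)
    print("preaveraging estimate:", preaveraging_estimator(z, k_n=35))
    print("realized variance:    ", sum((b - a) ** 2 for a, b in zip(z, z[1:])))
```

In this toy setting the true integrated variance is 1, so the preaveraging estimate should lie in its vicinity, while the plain realized variance is inflated by the noise term.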

Clearly, we are also able to estimate other random quantities of similar form as $\int_0^t \gamma_u^2\,du$. Consider a random variable of the type

$$T = \int_0^t \left(a\,\sigma_u^4 + b\,\sigma_u^2\alpha_u^2 + c\,\alpha_u^4\right) du$$

for given constants $a$, $b$, $c$. Following the arguments of remark 4 in JLMPV09, a consistent estimator of $T$ is obtained by

T_tn₌ a sistent estimator of the integrated quarticity₀tσ4

udu.

To reduce the finite-sample bias in the case of too small sampling frequencies, we need to slightly modify our statistics. First, let us introduce the finite-sample analogs of the quantities ψ1,ψ2,ij,φ1(s),φ2(s) as given by (i, l₌1,2), but the above approximations are the “correct” quantities (i.e., these are the quantities that really appear in the proof). Now, we redefine our main statistics as follows:

C_t,an ₌


and

where the subscript $a$ stands for adjusted. Note that the finite-sample adjustments do not influence the asymptotic results.

To explain the aforementioned modifications, let us consider the statistic $C_{t,a}^n$. First, the factor $[t/\Delta_n]/([t/\Delta_n]-k_n+1)$ in the definition of $C_{t,a}^n$ is an adjustment for the true number of summands in $V(Z,2)_t^n$. On the other hand, we have that

where the error of this approximation is of order $\Delta_n$ and has expectation 0. This means that the original statistic $C_t^n$ actually estimates $\left(1 - \frac{\psi_1^{k_n}\Delta_n}{2\theta^2\psi_2^{k_n}}\right) IV_t$.

2.3 Choosing θ in Practice

To analyze the empirical properties of the preaveraging estimators and their sensitivity to the choice of θ, we employ transaction and quote data of the stocks Exxon (XOM), Home Depot (HD), and Sonoco Products Co. (SON) traded at the NYSE between May and August 2006 corresponding to 5 months or 88 trading days. We see this 5-month period as a good compromise between choosing a sample that, on the one hand, is sufficiently long to guarantee empirically valid results but, on the other hand, prevents aggregation over possibly many different states of volatility. XOM and HD represent highly liquid stocks with approximately 12 (80) and 8 (60) trade (quote) arrivals per minute, respectively. Conversely, SON is significantly less liquid with approximately 2 (14) trade (quote) arrivals per minute. As illustrated below, these stocks represent different empirical features and thus allow us to gain valuable insights into the empirical performance of volatility estimators. However, to provide even more cross-sectional evidence, all empirical findings reported in the article are replicated for three additional stocks,

Table 1. Summary statistics of the underlying data

                                                     XOM      HD       SON
Avg. time between trades (in sec)                    5.28     7.19     36.29
Avg. time between quote arrivals (in sec)            0.74     1.07     4.33
Avg. time between nonzero quote changes (in sec)     3.16     4.60     14.29
Avg. number of trades                                4428     3255     644
Avg. number of quotes                                31,76    21,96    5405
Avg. proportion of nonzero trade returns             0.73     0.60     0.62
Avg. proportion of nonzero MQ returns                0.23     0.23     0.30
Avg. α̂² · 1e7                                        0.10     0.08     0.69
Avg. ξ̂                                               0.31     0.23     0.23

NOTE: The table reports (daily) averages of the time between trades, quote arrivals, and nonzero quote changes, the number of trades and quotes, the proportions of nonzero trade (or midquote) returns, the (long-run) noise variance α̂², the (long-run) noise variance per trade ξ̂, and first-order autocorrelations of underlying sampled returns.

that is, Citigroup (C), Tektronix (TEK), and Zale Corporation (ZLC). These results are reported in the online Appendix.

Table 1 gives descriptive statistics on the number of trades, quote arrivals, and nonzero returns for all stocks. It also yields information on the magnitude of market microstructure noise as represented by its variance $\alpha_t^2 = \int_0^t \alpha_u^2\,du$ and estimated by

ˆ

where $\Delta_i^N Z$ denotes trade-to-trade returns and $N$ is the number of all transaction returns. Equation (2.19) corresponds to the estimator proposed by Oomen (2006b) based on TRTS employing all transactions, that is, $n = N$ and $\Delta_n = 1$. As it provides only positive variance estimates as long as the first-order autocorrelations of trade returns are negative, we replace it by $\sum_{i=1}^{N} (\Delta_i^N Z)^2/(2N)$ otherwise, which is motivated by (2.11) in the case of iid noise. As suggested by Oomen (2006a), $\hat{\xi}_t := \hat{\alpha}_t^2/(IV_t/N)$ gives the noise-to-signal ratio per trade, where $IV_t$ is computed by the maximum likelihood estimator (henceforth MLRV) proposed by Aït-Sahalia, Mykland, and Zhang (2005) (see the Appendix for implementation details). As shown in Table 1, $\hat{\alpha}_t^2$ varies substantially across the different stocks, with $\hat{\xi}_t$ being quite stable, ranging from 0.2 to 0.3. We focus on four sampling schemes:

(i) calendar time sampling (CTS) using transaction prices, (ii) calendar time sampling (CTS) using midquotes,

(iii) transaction time sampling (TRTS) using transaction prices,

(iv) tick change time sampling (TTS) using midquotes.
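The sampling schemes listed above can be illustrated with a short Python sketch. It is a simplified illustration, not the procedure used for the article's data: the function names and the toy tick data are hypothetical, timestamps are assumed to be sorted and measured in seconds, and the previous tick rule is read as "last observation at or before the grid point".

```python
def previous_tick_sample(times, prices, step, start, end):
    """Calendar time sampling: last observed price at or before each grid point.

    Grid points without any earlier observation are skipped.
    """
    sampled = []
    idx = -1  # index of the most recent observation seen so far
    n_steps = int((end - start) // step)
    for m in range(n_steps + 1):
        grid_time = start + m * step
        while idx + 1 < len(times) and times[idx + 1] <= grid_time:
            idx += 1
        if idx >= 0:
            sampled.append(prices[idx])
    return sampled


def transaction_time_sample(prices, every=1):
    """Transaction time sampling: keep every `every`-th observation."""
    return prices[::every]


def tick_change_sample(prices):
    """Tick change time sampling: keep an observation only if the price changed."""
    sampled = []
    for p in prices:
        if not sampled or p != sampled[-1]:
            sampled.append(p)
    return sampled


if __name__ == "__main__":
    times = [1, 2, 5, 9]                 # seconds after the open (toy data)
    quotes = [10.0, 10.0, 10.5, 10.2]    # midquotes (toy data)
    print(previous_tick_sample(times, quotes, step=3, start=0, end=9))
    print(transaction_time_sample(quotes, every=2))
    print(tick_change_sample(quotes))
```

Tick change time sampling removes all zero returns by construction, whereas calendar time sampling on a fine grid repeats the last price whenever no new observation arrives.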

CTS is commonly used in the empirical literature (see, e.g., Andersen et al. 2003) and is a natural choice if the sampling frequency is moderate (e.g., 10 min). However, it faces the problem of sampling mainly zero return intervals if the sampling frequency is very high. In this case, TRTS is a more natural sampling scheme since one samples only whenever a transaction


Figure 1. Autocorrelations (up to three lags) of transaction price and midquote returns for XOM. Upper panel: CTS based on 1 sec and 5 sec. Lower panel: T(R)TS based on 1 tick and 2 ticks.

has occurred, avoiding sampling artificial zero returns. However, even in this case, a high number of zero returns can be sampled since many trades do not necessarily imply a price or quote change. As an alternative, TTS ignores any zero returns and samples only whenever quotes have been changed. As stressed by Andersen, Dobrev, and Schaumburg (2010), TTS is justified if volatility is constant in tick time (and thus observation times of price movements and the quadratic variation of the price process are perfectly correlated), whereas CTS or TTS is optimal when observation times are exogenous to the price process and inference can be conditioned on observation times. However, empirically it is rather unclear which assumption is justifiable and thus which sampling scheme should be preferred. This motivates us to focus on all commonly applied sampling schemes.

Figures 1–3 show empirical autocorrelations for trade and midquote returns sampled based on the different sampling schemes. In the case of TRTS and TTS, we generally observe negative first-order autocorrelations. The negative sign is well confirmed by the literature and is predominantly driven by the bid–ask bounce effect; see Roll (1984). Interestingly, midquote returns tend to induce higher (negative) autocorrelations than transaction returns. The same patterns also emerge based on individual ask or bid quote returns. This is induced by the fact that in the hybrid trading system of NYSE, transaction prices are often located inside of the bid–ask spread and thus are less discrete than (mid-)quote returns. Moreover, the empirical evidence shows that quote processes reveal strong reversals that are naturally most pronounced in the case of TTS. Corresponding evidence is also confirmed by Hautsch and Huang (2012), who analyzed the quote impact of order arrivals based on blue chip stocks. These findings show that using (mid-)quote returns does not necessarily allow reducing the discreteness (and thus noise) in the data. Overall, the serial dependence in CTS returns is lower. This is particularly true in the case of trade price returns

Figure 2. Autocorrelations (up to three lags) of transaction price and midquote returns for HD. Upper panel: CTS based on 1 sec and 5 sec. Lower panel: T(R)TS based on 1 tick and 2 ticks.


Figure 3. Autocorrelations (up to three lags) of transaction price and midquote returns for SON. Upper panel: CTS based on 1 sec and 5 sec. Lower panel: T(R)TS based on 1 tick and 2 ticks.

and if the sampling frequency is higher than the average trade arrival frequency. As a result, consecutive returns are mostly zero and trade-to-trade (or tick-to-tick) dynamics break down. Interestingly, returns based on XOM trading are positively autocorrelated. This is in contrast to common findings and makes this stock an interesting special case to analyze the properties of preaveraging estimators under various conditions. See also Hansen and Lunde (2006), Oomen (2006b), and Gatheral and Oomen (2010) for an extensive discussion of empirical properties of market microstructure noise effects.

To gain insights into the performance of the proposed preaveraging estimators, we benchmark them against the most common competing approaches employed in recent literature. In particular, we implement a subsampled RV estimator computed as

$$RVSS_t = k_n^{-1} \sum_{i=0}^{[t/\Delta_n]-k_n} \left|Z_{(i+k_n)\Delta_n} - Z_{i\Delta_n}\right|^2. \tag{2.20}$$

Hence, RVSS corresponds to the subsampling counterpart to the preaveraging estimator as it runs on the same underlying return grid ($\Delta_i^n$) and samples over intervals of the length $k_n$. In fact, it corresponds to a preaveraging estimator that equally weights all return observations within the local window $[i\Delta_n, (i+k_n)\Delta_n]$ and thus allows us to study the role of the implicit preaveraging weighting scheme. Moreover, we use the maximum likelihood estimator (henceforth MLRV) proposed by Aït-Sahalia, Mykland, and Zhang (2005) and the realized kernel estimator (henceforth RK) introduced by BHLS08a based on the Tukey-Hanning$_2$ kernel with optimal bandwidth. For implementation details, see the Appendix. The preaveraging estimators and their asymptotic variances are computed as described in the previous section. As the estimators are not necessarily positive in all cases, we bound them from below by zero. This happens, however, quite rarely and only in cases where either θ or $\Delta_n$ is chosen to be very large or in the case of an insufficient number of intraday observations (as sometimes occurs in the case of more illiquid stocks).
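The subsampled RV benchmark in (2.20) is simple enough to sketch directly. The following Python function is an illustrative sketch; its name is hypothetical, and the input is assumed to be a list of (log) prices on the sampling grid, so that the list index plays the role of $i$.

```python
def subsampled_rv(prices, k_n):
    """Subsampled realized variance: average over k_n overlapping coarse grids.

    Sums squared k_n-step price differences over all start points
    i = 0, ..., n - k_n and divides by k_n.
    """
    n = len(prices) - 1  # number of returns on the fine grid
    if k_n < 1 or k_n > n:
        raise ValueError("k_n must lie between 1 and the number of returns")
    total = sum((prices[i + k_n] - prices[i]) ** 2 for i in range(n - k_n + 1))
    return total / k_n


if __name__ == "__main__":
    path = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(subsampled_rv(path, 1))  # ordinary realized variance on the fine grid
    print(subsampled_rv(path, 2))  # two-step differences, averaged over 2 grids
```

For $k_n = 1$ the function reduces to the ordinary realized variance on the fine grid, which is a convenient sanity check.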

Figures 4 and 5 show plots of $C_t^n$ and $C_{t,a}^n$ for different choices of θ based on 3-sec CTS as well as TRTS and TTS using transaction prices and midquote changes. Note that in the case of high values for θ, that is, high values of $k_n$, the preaveraging estimators cannot necessarily be computed when there are not sufficient intraday observations. This might occur particularly in the case of less liquid assets. Then, the corresponding figures are based on only those trading days for which all estimators can be computed for all values of θ, avoiding any sample selection effects. The range of analyzed realizations of θ is chosen to guarantee using at least 90% of the overall sample.

Figure 4. Daily averages of $C_t^n$ (multiplied by 1e5) depending on θ. Based on T(R)TS and 3-sec CTS using prices and midquotes. Benchmarks: MLRV and RK.

Figure 5. Daily averages of $C_{t,a}^n$ (multiplied by 1e5) depending on θ. Based on T(R)TS and 3-sec CTS using prices and midquotes. Benchmarks: RVSS, MLRV, and RK.

The estimators’ sensitivity to the choice of θ, and thus the width of the local preaveraging window, is highest if θ is small. This sensitivity is particularly strong in the case of T(R)TS, inducing a strongly negative bias (relative to the other estimators) if θ is small. This bias is obviously induced by the fact that in finite samples, the statistic $C_t^n$ estimates $A_n \cdot IV_t$ with

$$A_n = 1 - \frac{\psi_1^{k_n}\Delta_n}{2\theta^2\psi_2^{k_n}}$$

(see Section 2.2). For $\theta \to 0$, $A_n$ is downward biased. This effect is strongest in the case of T(R)TS. For CTS, the bias is significantly smaller. Since 3-sec CTS and T(R)TS employ essentially very similar underlying price or quote sampling information, the different θ-sensitivity of both sampling schemes is only explained by a higher sampling frequency $\Delta_n^{-1}$ in the case of CTS, resulting in 7800 sampling intervals per day and scaling up $A_n$ toward one. Hence, CTS based on possibly high frequencies is suggested to remove finite-sample biases. Conversely, as shown in Table 1, even for very actively traded assets, T(R)TS does not induce sufficiently many sampling intervals to ensure $A_n \approx 1$. Hence, in the case of T(R)TS, scaling by $A_n^{-1}$ as in $C_{t,a}^n$ is essential to reduce the estimator’s negative bias and ensures stabilizing the estimates for small values of θ. However, even after finite-sample adjustments, we still observe a slightly negative bias of the estimator in the case of T(R)TS. This is obviously induced by a relatively strong impact of the noise-induced component $\psi_1\Delta_n RV_t^n/(2\theta^2\psi_2)$ driving down the estimator if $k_n$ is small. In this case, the width of the local window is not sufficient to diminish the influence of noise. For larger values of θ and thus $k_n$, this effect vanishes and we observe a stabilization of the estimator.
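To get a feeling for the size of the finite-sample factor $A_n$, one can evaluate it for the numbers of daily sampling intervals mentioned in the text. The sketch below is an approximation under stated assumptions: it sets $\Delta_n = 1/n$ for $n$ intervals per day and replaces the finite-sample quantities $\psi_1^{k_n}$ and $\psi_2^{k_n}$ by their limits $\psi_1 = 1$ and $\psi_2 = 1/12$ for $g(x) = x \wedge (1-x)$. The function name is hypothetical.

```python
import math

PSI1 = 1.0         # limit of psi_1^{k_n} for g(x) = min(x, 1 - x)
PSI2 = 1.0 / 12.0  # limit of psi_2^{k_n} for the same g


def adjustment_factor(n_intervals: int, theta: float) -> float:
    """Approximate A_n = 1 - psi_1 * Delta_n / (2 * theta^2 * psi_2), Delta_n = 1/n."""
    delta_n = 1.0 / n_intervals
    return 1.0 - PSI1 * delta_n / (2.0 * theta ** 2 * PSI2)


if __name__ == "__main__":
    # 7800 intervals: 3-sec CTS; 4428 and 644: average daily trades of XOM and SON.
    for n in (7800, 4428, 644):
        for theta in (0.1, 0.4):
            print(n, theta, round(adjustment_factor(n, theta), 4))
```

Under these simplifying assumptions, the factor moves away from one as either θ or the number of sampling intervals decreases, in line with the pattern described above.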

For larger values of θ, the impact of the underlying sampling scheme diminishes and the estimates tend to stabilize. Preaveraging and RK estimators are quite similar (on average) for values of θ around 0.4–0.6. This choice of θ seems to widely resemble the optimal bandwidth choice in the RK according to BHLS08b. If, however, the preaveraging interval becomes too high, we observe a downward trend of the estimates. In these cases, the estimators tend to “oversmooth” and thus yield downward biased estimates. These results suggest choosing θ not too small, but rather conservatively, confirming the results by BHLS08b. Plotting also the subsampled estimators, RVSS, allows us to analyze the impact of the implicit preaveraging weighting schemes. We observe that in the case of longer local intervals, the subsampling estimator is downward biased relative to the preaveraging estimator and thus tends to “oversmooth” even more strongly. If the preaveraging interval shrinks, the subsampling estimator also becomes quite sensitive to the choice of θ and reveals significant biases. However, interestingly, it is less biased than T(R)TS-based preaveraging estimators. In these situations, the implicit preaveraging weighting scheme induces an even higher θ-sensitivity.

To analyze the impact of the underlying sampling frequency, Figure 6 shows the θ-sensitivity of $C_t^n$ and $C_{t,a}^n$ based on TRTS employing different sampling frequencies ($\Delta_n \in \{1, 3, 5, 10, 20\}$) based on CTS and T(R)TS. As expected, the finite-sample adjustment becomes even more important when $\Delta_n$ is large, that is, the sampling frequency is small. In these scenarios, the negative biases for small values of θ become substantial. The lower panel of Figure 6 shows that in these cases, the finite-sample adjustment is still very effective and significantly reduces the bias (note that the figures use different scales on the vertical axis). Moreover, we observe a downward bias of the estimator with increasing $\Delta_n$. Hence, preaveraging based on sparse sampling implies an oversmoothing and thus downward bias of volatility. This is particularly true for the subsampled RV estimator. In this case, an implicit equal weighting of underlying returns is clearly inferior to the preaveraging weighting scheme $g(\cdot)$ (see Section 2), putting less weight on more distant observations. We also observe that these effects become more extreme if the underlying liquidity of the stock declines.

Finally, note that we consider average values of θ across the entire sample. However, recall from (2.15) that an optimal choice of θ depends on the local signal-to-noise ratio and thus should also depend on intraday variations of volatility. With the latter following the typical U-shape intraday pattern, a constant θ tends to oversmooth in the morning and undersmooth around midday. This effect is naturally stronger based on CTS than based on T(R)TS. However, our empirical findings above show that oversmoothing (implied by a too high θ) is less harmful for the estimator’s bias than undersmoothing (implied by a too small θ). Consequently, as practical guidance in situations where the intraday volatility strongly fluctuates, we suggest using θ rather conservatively with the choices of θ recommended above serving as lower bounds. Theoretically, however, the parameter θ should indeed be chosen according to the local signal-to-noise ratio. The precise description of such an approach and its efficiency gain compared with the case of a constant choice of θ are investigated by Jacod and Mykland (2012). However, their method is computationally clearly more involved.


Figure 6. $C^n_t$ and MLRV (top panel) as well as $C^n_{t,a}$ and RVSS$_t$ (bottom panel) for different values of θ and n ∈ {1, 3, 5, 10, 20} using TRTS. All estimators multiplied by 1e5.

Figure 7 depicts the average values of the approximate standard deviations of the preaveraging estimates with finite-sample adjustments, $\Delta_n^{1/4}\sqrt{\Gamma^n_{t,a}}$. For the more liquid assets, we observe that the standard deviation is monotonically increasing in θ regardless of the finite-sample adjustment. Hence, using a larger preaveraging window ultimately increases the estimation error. However, for less liquid stocks (see also the online Appendix), the corresponding plots reveal slight nonmonotonicities, with increasing variances if θ becomes too small. In these cases, the local window is obviously too short to effectively remove market microstructure noise. Hence, choosing too small values of θ in the case of less liquid assets can be harmful not only for the estimator's unbiasedness but also for its efficiency.

The relationships revealed by Figures 4–7 indicate an obvious trade-off between bias and efficiency. Minimizing the estimator's variance generally yields very small values of θ and thus of the length of the local interval. For these realizations, however, the "θ-signature plots" in Figures 4 and 5 reveal the highest sensitivity of the estimators with respect to θ and the underlying sampling schemes. Moreover, as shown in Section 4, choosing θ too small also turns out to be quite harmful for jump-robust estimation and jump detection. Hence, aiming at finding a "global" choice of θ balancing biases (in the case of too short local intervals) against inefficiency (in the case of too long local intervals), θ should be chosen as the smallest possible value at which the signature plots tend to stabilize and the divergence between different sampling schemes tends to vanish. According to Figures 4 and 5, the optimal choice of θ is thus around 0.4 based on liquid assets. In the case of less liquid stocks, we suggest θ ≈ 0.6.
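The selection rule described above is a visual one. As a rough illustration only, it could be mimicked numerically as follows; the function name `smallest_stable_theta`, the relative tolerance `rel_tol`, and the notion of "stable" (all later steps of the signature plot change by less than the tolerance) are assumptions of this sketch and are not taken from the article.

```python
import numpy as np

def smallest_stable_theta(thetas, estimates, rel_tol=0.05):
    """Smallest theta after which the signature plot changes by < rel_tol per grid step.

    Hypothetical heuristic: `estimates[i]` is the (averaged) estimate obtained
    with `thetas[i]`; both are assumed to be given on the same grid.
    """
    thetas = np.asarray(thetas, dtype=float)
    est = np.asarray(estimates, dtype=float)
    order = np.argsort(thetas)
    thetas, est = thetas[order], est[order]
    # relative change between neighbouring grid points
    rel_step = np.abs(np.diff(est)) / np.abs(est[:-1])
    stable = rel_step < rel_tol
    # position i qualifies if every step from i onward is stable
    ok = np.flip(np.logical_and.accumulate(np.flip(stable)))
    idx = np.flatnonzero(ok)
    return thetas[idx[0]] if idx.size else thetas[-1]
```

The same function could be applied to the gap between two sampling schemes instead of a single signature plot; the tolerance would have to be tuned to the data at hand.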

3. CONTINUOUS SEMIMARTINGALES WITH DEPENDENT NOISE

3.1 Preaveraging With Dependent Noise

Model (2.2) implies that the noise process is uncorrelated, which is not realistic at very high frequencies (see, e.g., Hansen and Lunde 2006). In this section, we allow for serial correlation in the noise process and derive the corresponding asymptotic

Figure 7. Averages of $n_t^{-1/4}\sqrt{\Gamma^n_{t,a}}$ (multiplied by 1e5) for different choices of θ. Based on highest frequency T(R)TS and 3-sec CTS using prices and midquotes.


results for the preaveraging method. For an alternative approach to dealing with dependent noise, see Nolte and Voev (2009). We consider a model

$$Z_{i\Delta_n} = X_{i\Delta_n} + U_i, \qquad (3.21)$$

where X and U are independent, and $(U_i)_{i\ge 0}$ is a stationary q-dependent sequence (for some known q > 0), that is, $U_i$ and $U_j$ are independent for |i − j| > q. As we remarked before, model (3.21) only makes sense in discrete time due to the dependence structure of the noise. We define the covariance function by

$$\rho(k) := \operatorname{cov}(U_1, U_{1+k}).$$

Similar to the convergence in (2.12), we obtain the following results.

Similar to (2.12), we obtain a bias that has to be estimated from the data. The estimation procedure is a bit more involved than in the previous section. We introduce a class of estimators given by

$$\gamma_t^n(k) = \Delta_n \sum_{i=0}^{[t/\Delta_n]} \Delta_i^n Z\,\Delta_{i+k}^n Z, \qquad k = 0, \ldots, q+1. \qquad (3.23)$$

In the econometric literature, such estimators are called realized autocovariances (see, e.g., BHLS08a). The following lemma describes the asymptotic behavior of $\gamma_t^n(k)$.
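The realized autocovariances in (3.23) are simple to compute from a vector of observed prices. The following is a minimal sketch, assuming an equidistant grid over [0, t] and truncating the sum to the increment pairs that are actually available in the sample; the function name `realized_autocovariance` is illustrative.

```python
import numpy as np

def realized_autocovariance(z, k, t=1.0):
    """gamma_t^n(k) = Delta_n * sum_i (Delta_i^n Z) * (Delta_{i+k}^n Z).

    z : observed (log) prices on an equidistant grid over [0, t]
    k : non-negative lag
    """
    dz = np.diff(np.asarray(z, dtype=float))   # increments Delta_i^n Z
    delta = t / dz.size                        # mesh Delta_n
    if k == 0:
        return delta * np.sum(dz * dz)
    # only pairs (i, i + k) that lie inside the sample are used
    return delta * np.sum(dz[:-k] * dz[k:])
```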

t. The asymptotic results are presented in the following theorem.

Furthermore, when $E[U_1^8] < \infty$, we deduce the associated central limit theorem

As before, we need to estimate the conditional variance $\Gamma_t(q)$ to obtain a feasible version of (3.25). In fact, the estimation procedure is somewhat easier than the one presented in Section 2, because the quantity $\rho^2$ is not time varying.

Proposition 1. Assume that $E[U_1^8] < \infty$. Then, we obtain

and it holds that

1n/4

The corresponding finite-sample adjustments are obtained similarly as in Section 2.2 and are given by

C_t,an (q) :=


Figure 8. Daily averages of $C^n_{t,a}(1)$ (upper panel) and $C^n_{t,a}(2)$ (lower panel) for different values of θ. Estimators multiplied by 1e5. Based on highest frequency T(R)TS and 3-sec CTS using prices and midquotes. Benchmarks: RVSS, MLRV, and RK implemented based on trade prices.

and

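$$
\begin{aligned}
\Gamma_{t,a}^n(q) = \left(1 - \frac{\psi_1^{k_n}\Delta_n}{2\theta^2\psi_2^{k_n}}\right)^{-2}
\Bigg( & \frac{4\Phi_{22}^{k_n}[t/\Delta_n]}{3\theta\,(\psi_2^{k_n})^4\,([t/\Delta_n]-k_n+1)} \sum_{i=0}^{[t/\Delta_n]-k_n} |\bar Z_i^n|^4 \\
& + \frac{8\rho_n^2\sqrt{\Delta_n}\,[t/\Delta_n]}{\theta^2([t/\Delta_n]-k_n+1)}
\left(\frac{\Phi_{12}^{k_n}}{(\psi_2^{k_n})^3} - \frac{\Phi_{22}^{k_n}\psi_1^{k_n}}{(\psi_2^{k_n})^4}\right)
\sum_{i=0}^{[t/\Delta_n]-2k_n} |\bar Z_i^n|^2 \\
& + \frac{4\rho_n^4\,[t/\Delta_n]}{\theta^3([t/\Delta_n]-k_n+1)}
\left(\frac{\Phi_{11}^{k_n}}{(\psi_2^{k_n})^2} - 2\frac{\Phi_{12}^{k_n}\psi_1^{k_n}}{(\psi_2^{k_n})^3} + \frac{\Phi_{22}^{k_n}(\psi_1^{k_n})^2}{(\psi_2^{k_n})^4}\right)\Bigg). \qquad (3.28)
\end{aligned}
$$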

Figure 8 shows daily averages of $C^n_{t,a}(1)$ and $C^n_{t,a}(2)$ depending on θ. Comparing these signature plots with Figures 4 and 5, we observe that in the case of CTS, the dependent noise-robust version of the preaveraging estimator provides similar shapes as the iid-noise version. Conversely, for T(R)TS, we observe that in most cases $C^n_{t,a}(q) > C^n_{t,a}$. Comparing the expressions for $C^n_{t,a}(q)$ and $C^n_{t,a}$ (and standardizing t to one), this can be induced only by

$$\frac{\Delta_n \mathrm{RV}_t^n}{2} > \rho_n^2.$$

Obviously, if there is negative (positive) serial dependence in the noise process {U}, the estimator $\Delta_n \mathrm{RV}_t^n/2$ is upward (downward) biased. As shown below, we indeed find evidence for significant serial correlations in the noise process inducing an upward bias of the expression $\Delta_n \mathrm{RV}_t^n/2$. These effects are most distinct for TTS, for which also the strongest serial dependence in the return process is found (see Table 1). Similar but weaker effects are observable in the case of TRTS. The figures show that the corresponding finite-sample adjustments seem to imply overcorrections of the estimators' negative biases, inducing even slight upward biases if θ is small. This indicates that $C^n_{t,a}(q)$ tends to be more sensitive to the choice of θ than $C^n_{t,a}$. Overall, the signature plots reveal similar shapes as for $C^n_{t,a}$. Hence, the θ-sensitivity is clearly reduced for values θ > 0.4.

3.2 A Test for Dependence

In the previous section, we assumed that the number of nonvanishing covariances q is known. In practice, however, we need a decision rule on the choice of q based on the discrete observations of the price process. Formally speaking, we require a test procedure to decide whether ρ(k) = 0 for some k ≥ 1. To illustrate the underlying idea, consider a 1-dependent noise model. In this case, we aim at testing whether ρ(1) = 0, that is, whether the noise process is actually iid. Thus, we obtain the following hypotheses:

$$H_0: \rho(1) = 0, \qquad H_1: \rho(1) \neq 0.$$

As we will observe from the proof of Lemma 2, the statistics $\gamma_t^n(k)$ are asymptotically unaffected by the presence of the process X (thus, only the noise process U drives their asymptotic behavior). Hence, by a standard central limit theorem for q-dependent variables, we deduce that

$$\Delta_n^{-1/2}\left(-\frac{\gamma_t^n(2)}{t} - \rho(1)\right) \xrightarrow{d} N\left(0, \frac{\tau^2}{t}\right),$$


Figure 9. Histogram of AR(1) test statistic $S_n$. Based on highest frequency T(R)TS and 3-sec CTS.

where $\gamma_t^n(2)$ is defined by (3.23) and $\tau^2$ is given by

$$\tau^2 = \tau(0) + 2\sum_{k=1}^{2}\tau(k),$$

$$\tau(k) = \operatorname{cov}\big((U_1-U_0)(U_3-U_2),\,(U_{1+k}-U_k)(U_{3+k}-U_{2+k})\big).$$

Obviously, we require a consistent estimator of τ(k), 0 ≤ k ≤ 2, to construct a test statistic. Set $H_i^n = \Delta_i^n Z\,\Delta_{i+2}^n Z$. Then, by the weak law of large numbers for q-dependent variables, we deduce that

$$\tau_n(k) = \frac{\Delta_n}{t}\sum_{i=0}^{[t/\Delta_n]}\left(H_i^n H_{i+k}^n - H_i^n H_{i+3}^n\right) \xrightarrow{P} \tau(k), \qquad 0 \le k \le 2.$$

The latter implies that $\tau_n^2 = \tau_n(0) + 2\sum_{k=1}^{2}\tau_n(k)$ is a consistent estimator of $\tau^2$. Then, the test statistic is given by

$$S_n = -\frac{\Delta_n^{-1/2}\,\gamma_t^n(2)}{\sqrt{t\,\tau_n^2}}.$$

Observe that under $H_0: \rho(1) = 0$, we obtain that

$$S_n \xrightarrow{d} N(0,1).$$

We reject $H_0: \rho(1) = 0$ at level α when $|S_n| > c_{1-\alpha/2}$, where $c_{1-\alpha/2}$ denotes the (1 − α/2) quantile of the N(0,1) distribution. Note that this test is consistent against the alternative $H_1: \rho(1) \neq 0$.
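The test statistic can be assembled directly from the price increments. The sketch below is an illustration under stated assumptions rather than the implementation used for the empirical results: it assumes an equidistant grid over [0, t], truncates all sums to the terms available in the sample, and returns NaN if the variance estimate is not positive. The helper `simulate_prices`, with its parameter values, is purely hypothetical and only serves to generate test data with iid or MA(1) noise.

```python
import numpy as np

def ar1_noise_test(z, t=1.0):
    """Test statistic S_n for H0: rho(1) = 0, built from gamma_t^n(2) and tau_n^2."""
    dz = np.diff(np.asarray(z, dtype=float))   # increments Delta_i^n Z
    delta = t / dz.size                        # mesh Delta_n
    h = dz[:-2] * dz[2:]                       # H_i^n = Delta_i^n Z * Delta_{i+2}^n Z
    gamma2 = delta * h.sum()                   # realized autocovariance at lag 2
    m = h.size - 3                             # terms for which H_{i+3} exists
    base = h[:m]
    centre = np.sum(base * h[3:])              # sum of H_i * H_{i+3}
    tau = [delta / t * (np.sum(base * h[k:k + m]) - centre) for k in range(3)]
    tau2 = tau[0] + 2.0 * (tau[1] + tau[2])
    if tau2 <= 0:                              # estimate is not guaranteed positive
        return float("nan")
    # S_n = -Delta_n^{-1/2} * gamma(2) / sqrt(t * tau_n^2)
    return -gamma2 / np.sqrt(delta * t * tau2)

def simulate_prices(n, ma_coef, seed):
    """Hypothetical test data: random walk plus MA(1) noise u_i = e_i + ma_coef * e_{i-1}."""
    rng = np.random.default_rng(seed)
    x = np.cumsum(rng.normal(0.0, 0.01 / np.sqrt(n), n + 1))
    e = rng.normal(0.0, 0.001, n + 2)
    return x + e[1:] + ma_coef * e[:-1]
```

With iid noise (`ma_coef = 0`), the statistic should behave like a standard normal draw, whereas a negative MA(1) coefficient should push it far into the negative tail, in line with the sign convention of $S_n$.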

Figure 9 shows the (time series) distribution of the test statistic $S_n$ based on highest frequency T(R)TS and 3-sec CTS. We observe for liquid stocks that $S_n$ takes highly negative values, supporting the evidence for significantly negative serial dependence in the noise process and confirming the results by Hansen and Lunde (2006). The strength of the serial dependence decreases as the underlying trading frequency of the stock becomes smaller. Correspondingly, the highest test statistics (in absolute terms) are found for XOM, whereas for the less liquid stocks, the time series distribution of the test statistic is virtually symmetric around zero. Moreover, the highest magnitudes are found for TTS using midquotes, for which we also observe the strongest serial dependence in the underlying return series. Conversely, 3-sec CTS yields significantly lower test statistics (in absolute terms). Hence, we can conclude that the serial dependence in underlying returns is mainly driven by a serial dependence in the corresponding noise process. As reported in Section 2, these autocorrelations and, thus, the test statistic $S_n$ are obviously strongly dependent on the underlying sampling scheme. This is most evident for midquote-based sampling. Conversely, based on CTS, $S_n$ is close to zero on average in most cases. This also explains why $C^n_{t,a}$ and $C^n_{t,a}(q)$, q ≥ 1, perform similarly in the case of CTS but not in the case of T(R)TS.

4. PREAVERAGING ESTIMATORS FOR THE QUADRATIC VARIATION: THE DISCONTINUOUS CASE

In this section, we present the asymptotic theory for the estimator $C_t^n$ defined in (2.13) for the discontinuous case. We consider the process Z = X + U given by (2.2), where $(X_t)_{t\ge 0}$ is a discontinuous semimartingale of the form

$$X_t = X_0 + \int_0^t a_u\,du + \int_0^t \sigma_u\,dW_u + (\delta 1_{\{|\delta|\le 1\}}) * (\mu_t - \nu_t) + (\delta 1_{\{|\delta|>1\}}) * \mu_t, \qquad (4.29)$$

where μ is a Poisson random measure on $\mathbb{R}_+ \times \mathbb{R}$, ν is a predictable compensator of μ with $\nu(ds, dx) = ds \otimes F(dx)$ (where F is a σ-finite measure), $f * \mu_t = \int_0^t\int_{\mathbb{R}} f(s,x)\,\mu(ds,dx)$, and the process δ is predictable and $\sup_{x\in\mathbb{R}} |\delta(s,x) \wedge 1|/\gamma(x)$ is locally bounded for some bounded function γ in $L^2(\mathbb{R}, F)$. The noise process U is assumed to satisfy the conditions (2.3) and (2.4).

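Assume that the process $N_t(2)$ is locally bounded. Then, Jacod, Podolskij, and Vetter (2009) showed that for any t > 0,

$$C_t^n \xrightarrow{P} [X]_t = \int_0^t \sigma_u^2\,du + \sum_{u\le t} |\Delta X_u|^2, \qquad (4.30)$$

where $\Delta X_u = X_u - X_{u-}$. If, moreover, $N_t(8)$ is locally bounded, $\Delta_n^{-1/4}(C_t^n - [X]_t)$ converges in law,

$$\Delta_n^{-1/4}\left(C_t^n - [X]_t\right) \xrightarrow{st} MN\left(0, \Gamma_t + \bar\Gamma_t\right), \qquad (4.31)$$

for any t > 0, where $\Gamma_t$ is given by (2.16) and $\bar\Gamma_t$ is defined as

$$\bar\Gamma_t = \frac{4}{\psi_2^2}\sum_{T_m\le t} |\Delta X_{T_m}|^2 \left(\theta\,\Phi_{22}\left(\sigma_{T_m-}^2 + \sigma_{T_m}^2\right) + \frac{\Phi_{12}}{\theta}\left(\alpha_{T_m-}^2 + \alpha_{T_m}^2\right)\right), \qquad (4.32)$$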


where $(T_m)_{m\ge 1}$ are the jump times of X, $\Delta X_s = X_s - X_{s-}$, and the quantities $\psi_2$, $\Phi_{12}$, $\Phi_{22}$ are given in Section 2.2.

Note that $C_t^n$ remains a $\Delta_n^{-1/4}$-consistent estimator of $[X]_t$ in the discontinuous case. In fact, the consistency result in (4.30) also holds true for nonequidistant observations. In contrast to the continuous case, we obtain an additional term $\bar\Gamma_t$ in the conditional variance of $C_t^n$. To derive a feasible version of the stable convergence in (4.31), we need to construct a consistent estimator of $\bar\Gamma_t$. This turns out to be a complicated problem, because we obviously require jump-robust local estimators of $\sigma_u^2$, $\alpha_u^2$, and the left limits $\sigma_{u-}^2$, $\alpha_{u-}^2$. The jump-robust estimation methods are presented in the next section.

4.1 Jump-Robust Estimation Methods

For various problems in finance, it is extremely important to separate the diffusive part from the jump part. In a noise-free framework, a well-known approach to obtain jump-robust estimators of (a functional of) volatility is the multipower variation (see, e.g., Barndorff-Nielsen, Shephard, and Winkel 2006). The main idea of constructing jump-robust estimators for models with microstructure noise is to combine the multipower variation approach with the preaveraging method. A rigorous mathematical theory for preaveraged multipower variation has been derived by Podolskij and Vetter (2009b). Below we recall the consistency results as derived in the aforementioned article. An alternative approach to jump-robust measurement in diffusion models with microstructure noise, which we will not follow in this article, is the quantile estimation method introduced by Christensen, Oomen, and Podolskij (2010).

We define the preaveraged multipower variation as

$$V(Z, p_1, \ldots, p_l)_t^n = \sum_{i=0}^{[t/\Delta_n]-lk_n+1} |\bar Z_i^n|^{p_1}\,|\bar Z_{i+k_n}^n|^{p_2}\cdots|\bar Z_{i+(l-1)k_n}^n|^{p_l}, \qquad (4.33)$$

and set $p^+ = \sum_{k=1}^{l} p_k$.

Assume that Z = X + U, X satisfies (4.29), and $N_t(2p^+)$ is locally bounded. Then, according to theorem 2(ii) in Podolskij and Vetter (2009b), if $\max(p_1, \ldots, p_l) < 2$, the asymptotic behavior of $V(Z, p_1, \ldots, p_l)_t^n$ is given by

$$\Delta_n^{1-\frac{p^+}{4}}\,V(Z, p_1, \ldots, p_l)_t^n \xrightarrow{P} m_{p_1}\cdots m_{p_l}\int_0^t\left(\theta\psi_2\sigma_u^2 + \frac{1}{\theta}\psi_1\alpha_u^2\right)^{\frac{p^+}{2}} du, \qquad (4.34)$$

where $m_p = E[|N(0,1)|^p]$. Note that $V(Z, p_1, \ldots, p_l)_t^n$ is a class of jump-robust biased measures of $\int_0^t |\sigma_u|^{p^+}du$ when $\max(p_1, \ldots, p_l) < 2$. If $p^+$ is an even number, we are able to bias-correct our statistic to obtain consistent estimators of $\int_0^t |\sigma_u|^{p^+}du$. In particular, we can estimate the quadratic variation of the jump part and the quadratic variation of the continuous part separately. Given the assumptions above, we obtain

$$\mathrm{BT}_t^n := \frac{\sqrt{\Delta_n}}{\theta m_1^2\psi_2}\,V(Z,1,1)_t^n - \frac{\psi_1\Delta_n}{2\theta^2\psi_2}\,\mathrm{RV}_t^n \xrightarrow{P} \mathrm{IV}_t, \qquad (4.35)$$

$$\mathrm{BTV}_t^n := \frac{\sqrt{\Delta_n}}{\theta\psi_2}\left(V(Z,2)_t^n - m_1^{-2}\,V(Z,1,1)_t^n\right) \xrightarrow{P} \sum_{u\le t}|\Delta X_u|^2. \qquad (4.36)$$

In finite samples, it is again better to replace the constants $\psi_1$, $\psi_2$ by their empirical counterparts $\psi_1^{k_n}$, $\psi_2^{k_n}$ and to standardize the statistic $V(Z, p_1, \ldots, p_l)_t^n$ by $[t/\Delta_n]/([t/\Delta_n] - lk_n + 2)$ to account for the true number of summands. This leads to the corresponding estimators $\mathrm{BT}_{t,a}^n$ and $\mathrm{BTV}_{t,a}^n$.
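To make the construction in (4.33), (4.35), and (4.36) concrete, the following sketch computes the finite-sample adjusted bipower estimators for the canonical weight function g(x) = x ∧ (1 − x). Several ingredients are assumptions of this illustration because they are defined outside this section: the preaveraged increments are taken as $\bar Z_i^n = \sum_{j=1}^{k_n-1} g(j/k_n)\Delta_{i+j}^n Z$, the empirical constants as $\psi_1^{k_n} = k_n\sum_j (g((j+1)/k_n) - g(j/k_n))^2$ and $\psi_2^{k_n} = k_n^{-1}\sum_j g^2(j/k_n)$, and θ is replaced by its effective value $k_n\sqrt{\Delta_n}$ after rounding $k_n$. The function names are illustrative.

```python
import numpy as np

M1 = np.sqrt(2.0 / np.pi)  # m_1 = E|N(0,1)|

def multipower(zbar, kn, powers):
    """Sum of |zbar_i|^p1 * |zbar_{i+kn}|^p2 * ... and the number of summands."""
    m = zbar.size - (len(powers) - 1) * kn
    prod = np.ones(m)
    for idx, p in enumerate(powers):
        prod *= np.abs(zbar[idx * kn: idx * kn + m]) ** p
    return prod.sum(), m

def bt_btv(z, theta=0.5, t=1.0):
    """Finite-sample adjusted BT (continuous part) and BTV (jump part), g(x) = min(x, 1 - x)."""
    dz = np.diff(np.asarray(z, dtype=float))
    n_obs = dz.size                                  # [t / Delta_n]
    delta = t / n_obs
    kn = int(np.ceil(theta / np.sqrt(delta)))
    theta = kn * np.sqrt(delta)                      # effective theta after rounding (assumption)
    grid = np.arange(kn + 1) / kn
    g = np.minimum(grid, 1.0 - grid)
    psi1 = kn * np.sum(np.diff(g) ** 2)              # empirical psi_1
    psi2 = np.sum(g[1:kn] ** 2) / kn                 # empirical psi_2
    zbar = np.correlate(dz, g[1:kn], mode="valid")   # preaveraged increments
    v11, m11 = multipower(zbar, kn, (1, 1))
    v2, m2 = multipower(zbar, kn, (2,))
    v11 *= n_obs / m11                               # rescale to the true number of summands
    v2 *= n_obs / m2
    rv = np.sum(dz ** 2)
    bt = (np.sqrt(delta) / (theta * M1 ** 2 * psi2) * v11
          - psi1 * delta / (2 * theta ** 2 * psi2) * rv)
    btv = np.sqrt(delta) / (theta * psi2) * (v2 - v11 / M1 ** 2)
    return bt, btv
```

The ratio of `btv` to the adjusted preaveraging estimator of the total quadratic variation then gives the jump proportion studied below.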

Figure 10 shows the sensitivity of the estimated (averaged) jump proportion $\mathrm{BTV}_{t,a}^n/C_{t,a}^n$ to the choice of θ. We observe that for small values of θ, the jump proportion is unrealistically high and quite sensitive to the choice of θ. The estimator $\mathrm{BTV}_{t,a}^n$ is strongly upward biased if the local preaveraging window is chosen too small. This is particularly true in the case of T(R)TS and less liquid stocks. For values of θ ≈ 0.4, the jump proportion stabilizes and remains largely constant at a level between 5% and 10%. This result is in close correspondence to our findings based on the signature plots in Section 2.2 and shows that values of θ smaller than approximately 0.3 provide biased estimates.

4.2 Estimation of the Conditional Variance and a Feasible Central Limit Theorem

The jump-robust methods presented in the previous section open up a way to estimate the conditional variance in (4.31).

Figure 10. Averaged jump ratio $\mathrm{BTV}_{t,a}^n/C_{t,a}^n$ for different values of θ. Based on highest frequency T(R)TS and 3-sec CTS using prices and midquotes.


Recall that the main difficulty is the estimation of the term concentrate on the processα2_{. Consider a sequence of integers}_r

n

obviously good estimates ofα2

in− andα

2

in, respectively [this

is justified by the local version of the convergence in (2.11)]. The construction of the local estimates for the processσ2_is more complicated and is mainly based on the local versions of (4.35) and (4.36). Consider a sequence of integers ¯rnwith ¯rn→

in+are local analogs of the statistic BT

n t defined in Corollary (4.35). The numbers ¯rnmust be of higher order than −n1/2, because otherwise all summands in (4.38) [resp. (4.39)] are strongly correlated. Our first consistency result is the following theorem.

Theorem 2. Assume that the process Nt(8) is locally

bounded. Then, it holds

n_t ₌

The actual number of summands in the definition ofn_t de-pends on the particular choice of the sequencesrnand ¯rn. Note that n_t itself is not a consistent estimator of the conditional variance in (4.31). However, a particular linear combination of n

t and the three summands in the definition ofŴ n

t [see (2.16)] yield a consistent estimator ofŴt+Ŵt. For the sake of

simplic-ity, we propose such an estimator only for our canonical choice g(x)=x_∧(1−x).

Proposition 2. Assume that g(x) = x ∧ (1 − x) and the process $N_t(8)$ is locally bounded. Then, it holds

˜ Section2.2. In particular, we deduce that

1n/4

The constants $\psi_i$, $\Phi_{ij}$ should be replaced by their empirical analogs $\psi_i^{k_n}$, $\Phi_{ij}^{k_n}$ (see Section 2.2) to achieve better finite-sample performance.

Finally, note that the estimator $\tilde\Gamma_t^n$ is not necessarily positive. The constant B, for instance, is negative for any weight function g. However, we obtain a positive consistent estimator of $\Gamma_t + \bar\Gamma_t$ by setting

$$\hat\Gamma_t^n = \max\left(\tilde\Gamma_t^n, \Gamma_t^n\right) \ge 0,$$

where $\Gamma_t^n$ is a positive consistent estimator of $\Gamma_t$ defined in (2.16). Since $\Gamma_t + \bar\Gamma_t \ge \Gamma_t$, $\hat\Gamma_t^n$ is a consistent estimator of $\Gamma_t + \bar\Gamma_t$.

5. TIME SERIES PROPERTIES AND THE IMPACT OF SAMPLING FREQUENCIES AND SAMPLING SCHEMES

Figures 11 and 12 show the (averaged) preaveraging estimates for θ = 0.3 and θ = 1.0 and different sampling frequencies n using T(R)TS (since corresponding plots based on CTS yield similar findings, they are not shown here). As benchmarks, we report the RV estimator, the RVSS estimator, and the RK estimator. We observe the well-known bias of the RV estimator when the sampling frequency becomes high. Not surprisingly, this is particularly evident in the case of midquote change sampling, for which the serial dependence in underlying returns is comparably high. We can summarize the following main findings: first, confirming the results above, the estimator tends to be downward biased if the sampling frequency is low. Second, the


Figure 11. Daily averages of $Y_t$ (multiplied by 1e5) with $Y_t \in \{C_t^n, C_{t,a}^n, \mathrm{RVSS}\}$ for different values of n using T(R)TS. Upper panel: $C_t^n$ and $C_{t,a}^n$ computed based on θ = 0.3. Lower panel: $C_t^n$ and $C_{t,a}^n$ computed based on θ = 1.0.

finite-sample adjustment is particularly important. Without this adjustment, the estimator becomes strongly negatively biased even for high sampling frequencies. This effect becomes even stronger if θ is chosen too small. Third, the dependent noise-robust version of the preaveraging estimator, $C_{t,a}^n(1)$, is more sensitive than $C_{t,a}^n$ if the sampling frequency becomes small. In such a case, the RK estimator tends to be more stable. Finally, we observe that a more conservative choice of the length of the local interval, that is, θ = 1.0, yields more stable results if n becomes large. This confirms our findings above that θ should be chosen rather conservatively if the sampling frequency declines. Figures 13 and 14 show the average jump proportions $\mathrm{BTV}_{t,a}^n/C_{t,a}^n$ depending on the underlying sampling frequency. The relationship is monotonically increasing leading to

Figure 12. Daily averages of $Y_t$ (multiplied by 1e5) with $Y_t \in \{C_{t,a}^n(1), \mathrm{RV}, \mathrm{RK}\}$ for different values of n using T(R)TS. Upper panel: $C_{t,a}^n(1)$ computed based on θ = 0.3. Lower panel: $C_{t,a}^n(1)$ computed based on θ = 1.0.


Figure 13. Averaged jump ratio $\mathrm{BTV}_{t,a}^n/C_{t,a}^n$ for different values of n using T(R)TS. Upper panel: θ = 0.3. Lower panel: θ = 1.0.

unrealistically high jump proportions if the sampling frequencies become small. This suggests using the highest possible frequency to reliably estimate jump proportions. However, we observe nonmonotonic behavior for small values of n in the case of CTS and rather illiquid assets (see also the online Appendix). Here, very high sampling frequencies yield artificially high jump proportions, which is obviously due to the sampling of a high number of consecutive zero returns. This makes estimators more sensitive, as price changes appear to be more discrete. These findings suggest using T(R)TS as a more appropriate sampling scheme in the case of jump detection. Figures 15 and 16 show the time series variations of the jump proportion $\mathrm{BTV}_{t,a}^n/C_{t,a}^n$ and the test statistic $S_n$ based on different sampling schemes. Confirming the results above, we observe a relatively close correspondence of estimates based on T(R)TS, but a high sensitivity thereof depending on CTS

Figure 14. Averaged jump ratio $\mathrm{BTV}_{t,a}^n/C_{t,a}^n$ for different values of n using CTS. Upper panel: θ = 0.3. Lower panel: θ = 1.0.


Figure 15. Time series plots of jump ratio $\mathrm{BTV}_{t,a}^n/C_{t,a}^n$.

Figure 16. Time series plots of AR(1) test statistic $S_n$. Based on highest frequency T(R)TS and 3-sec CTS.

Figure 17. Time series plots of 95% confidence intervals around $C_{t,a}^n$ (multiplied by 1e5). Confidence intervals computed based on $\hat\Gamma_t^n$ and $\Gamma_{t,a}^n$ using TRTS.

schemes. Particularly for SON, estimated jump proportions strongly increase if the CTS frequency declines.

Figure 17 shows the time series variation of 95% confidence intervals computed based on $\hat\Gamma_{t,a}^n$ and $\Gamma_{t,a}^n$ for θ = 0.3 using TRTS. It is shown that the estimation error is still not negligible in most cases. We also observe differences in confidence intervals due to the inclusion of jumps. This is mostly observable in periods of high volatility, where the presence of jumps is obviously higher and consequently jump-robust confidence intervals become narrower. For XOM and HD, we observe confidence intervals that are approximately 15% narrower if jumps are taken into account. In the case of SON, this reduction even increases up to approximately 85%.

6. CONCLUSIONS

In this article, we discuss the class of preaveraging estimators for quadratic variation in asset prices. We extend existing theory to explicitly account for serial dependence in noise, to provide jump-robust measures, and to derive feasible central limit theorems for the quadratic variation of a jump diffusion model. In an extensive empirical study, we analyze the properties of different preaveraging estimators depending on the choice of the preaveraging interval, the sampling scheme, the underlying sampling frequency, and the impact of noise. We can summarize the following major results.

First, a rule of thumb for an optimal choice of the preaveraging parameter θ is between 0.3 and 0.6. If θ is chosen too small, market microstructure noise is obviously not sufficiently removed and estimators become biased. If, however, θ is chosen too high, the estimators tend to "oversmooth," implying a downward bias. Generally, θ should be chosen higher if (i) the sampling frequency declines, (ii) the trading intensity of the underlying stock is low, (iii) T(R)TS is used, and (iv) jump components are estimated. In the two latter cases, choosing too small values of θ can be quite harmful.


Second, finite-sample adjustments of the estimator are particularly important to reduce significant biases. This is particularly true if θ is chosen too small and the sampling frequency is not sufficiently high. Third, (mid)quote sampling is not necessarily less affected by market microstructure noise. Based on the NYSE data used, midquote returns even yield the highest serial dependence, stemming from quote discreteness. Fourth, ignoring the possibility of jumps in the price process can lead to severe overestimation of confidence intervals. Moreover, CTS-based estimates of jump components are very sensitive to the choice of the sampling frequency. To reliably estimate jump proportions, we suggest T(R)TS rather than CTS. Fifth, we suggest implementing both preaveraging and kernel estimators based on the highest possible sampling frequency. Our empirical findings show that a reduction in sampling frequency tends to imply an "oversmoothing" of volatilities, resulting in negative biases.

APPENDIX

A.1 Proofs

Proof of Lemma 1. A careful inspection of the proof of theorem 3.1 in JLMPV09 shows that the consistency result in (2.12) remains valid for the q-dependent noise. Indeed, the main ingredient of the aforementioned proof is the convergence in law

both follow from the q-dependence of the noise process U.

Proof of Lemma 2. Observe that

γ_tn(k)=n

for all fixed k ≥ 0, because U and X are independent, and $E[|\Delta_i^n X|^2] \le C\Delta_n$ (uniformly in i). The assertion of Lemma 2 follows now by the law of large numbers for q-dependent random variables.

Proof of Theorem 1. The results follow along the lines of the proof of theorem 3.1 in JLMPV09. The justification is exactly the same as in the proof of Lemma 1.

Proof of Proposition 1. Applying theorem 3.3 from Jacod, Podolskij, and Vetter (2009) forp₌2,4 (which is also valid for theq-dependent noise), we deduce that

1_n−p/4 Proposition 1 readily follows. The properties of stable con-vergence together withŴn

t(q) P

−→Ŵt(q) imply the central limit

theorem.

Proof of Theorem 2. Following the ideas of theorem 3.1 in Jacod, Podolskij, and Vetter (2009), we immediately deduce that

n_t ₌

since $\alpha^2$ and $\sigma^2$ are càdlàg processes. The convergence readily follows, because

Proof of Proposition 2. The central limit theorem follows immediately by the properties of stable convergence. To obtain the consistency of the estimator $\tilde\Gamma_t^n$, we note that

n

uniformly ini. Now, the convergence ˜Ŵn t

In line with BHLS08b, we perform the following data cleaning steps:

(i) Delete entries outside the 9:30 am to 4 pm time window.

(ii) Delete entries with a quote or transaction price equal to zero.

(iii) Delete all entries with negative prices or quotes.

(iv) Delete all entries with negative spreads.

(v) Delete entries whenever the price is outside the interval [bid − 2·spread ; ask + 2·spread].

(vi) Delete all entries with the spread greater than or equal to 50 times the median spread of that day.

(vii) Delete all entries with the price greater than or equal to 5 times the median midquote of that day.


(viii) Delete all entries with the midquote greater than or equal to 10 times the mean absolute deviation from the local median midquote.

(ix) Delete all entries with the price greater than or equal to 10 times the mean absolute deviation from the local median midquote.
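Steps (i) to (vii) translate into simple element-wise filters. The following sketch illustrates them for the matched trade and quote records of a single day; the function name `clean_mask`, the input layout (time in seconds after midnight, one price, bid, and ask per record), and the choice to compute the daily medians on the raw, unfiltered records are assumptions of this illustration. Steps (viii) and (ix) are omitted because the width of the local median window is not specified here.

```python
import numpy as np

def clean_mask(time_s, price, bid, ask):
    """Boolean mask of records kept after cleaning steps (i)-(vii) for one trading day."""
    time_s, price, bid, ask = (np.asarray(a, dtype=float)
                               for a in (time_s, price, bid, ask))
    spread = ask - bid
    mid = 0.5 * (ask + bid)
    keep = (time_s >= 9.5 * 3600) & (time_s <= 16 * 3600)              # (i) 9:30 am to 4 pm
    keep &= (price > 0) & (bid > 0) & (ask > 0)                        # (ii), (iii)
    keep &= spread >= 0                                                # (iv)
    keep &= (price >= bid - 2 * spread) & (price <= ask + 2 * spread)  # (v)
    keep &= spread < 50 * np.median(spread)                            # (vi)
    keep &= price < 5 * np.median(mid)                                 # (vii)
    return keep
```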

A.3 Implementation Details for the Different Estimators

The realized kernel estimator (RK) proposed by BHLS08a is defined by

$$K(Z) = \gamma_0(Z) + \sum_{h=1}^{H} k\left(\frac{h-1}{H}\right)\{\gamma_h(Z) + \gamma_{-h}(Z)\},$$

where $\gamma_h(Z)$ denotes the hth realized autocovariance given by

$$\gamma_h(Z) = \sum_{j=1}^{[t/\Delta_n]} \left(Z_{j\Delta_n} - Z_{(j-1)\Delta_n}\right)\left(Z_{(j-h)\Delta_n} - Z_{(j-h-1)\Delta_n}\right),$$

with h = −H, . . . , −1, 0, 1, . . . , H and k(·) denoting the kernel function, chosen as the Tukey-Hanning$_2$ kernel with $k(x) = \sin^2\{\pi/2\,(1-x)^2\}$. Moreover, define

$$\zeta_t^2 = \alpha_t^2 \Big/ \int_0^t \sigma_u^4\,du.$$

Then, the optimal choice of the bandwidth, H, is given by $H = c\,\zeta_t\sqrt{[t/\Delta_n]}$ with c = 5.74 for the Tukey-Hanning$_2$ kernel. To estimate $\int_0^t \sigma_u^4\,du$, we use (2.18) with a = 1, b = c = 0, whereas $\alpha_t^2$ is estimated by (2.19).
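For a given bandwidth H, the realized kernel defined above reduces to a weighted sum of realized autocovariances. The sketch below is illustrative only: the function names are hypothetical, the bandwidth is passed in rather than estimated, and all sums are truncated to increment pairs inside the sample (no treatment of end effects), so that $\gamma_h(Z)$ and $\gamma_{-h}(Z)$ coincide.

```python
import numpy as np

def tukey_hanning2(x):
    """Tukey-Hanning_2 kernel k(x) = sin^2(pi/2 * (1 - x)^2)."""
    return np.sin(np.pi / 2.0 * (1.0 - x) ** 2) ** 2

def realized_kernel(z, H):
    """Realized kernel K(Z) with bandwidth H, using in-sample increment pairs only."""
    dz = np.diff(np.asarray(z, dtype=float))
    rk = np.sum(dz * dz)                       # gamma_0(Z)
    for h in range(1, H + 1):
        gamma_h = np.sum(dz[h:] * dz[:-h])     # equals gamma_{-h} after truncation
        rk += tukey_hanning2((h - 1) / H) * 2.0 * gamma_h
    return rk
```

With H = 0 the function returns the plain realized variance, which makes the effect of the autocovariance correction easy to inspect.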

Note that the RK estimator is computed without accounting for end effects, that is, without replacing the first and last observations by local averages to eliminate the corresponding noise components ("jittering" according to BHLS08a). BHLS08b argued that these effects are theoretically important but practically negligible, particularly for actively traded assets.

The maximum likelihood estimator (MLRV) proposed by Aït-Sahalia, Mykland, and Zhang (2005) is given by

$$\mathrm{MLRV} = N\hat\delta^2(1+\hat\theta)^2,$$

where N denotes the number of trades per day and $(\hat\delta^2, \hat\theta)$ are the maximum likelihood estimates of an MA(1) model for observed trade-to-trade returns, $\Delta_i^n Z = \varepsilon_i + \theta\varepsilon_{i-1}$, with $\varepsilon_i$ being white noise with variance $\delta^2$ and −1 < θ < 0. This model suggests an alternative estimator of the market microstructure noise variance given by $\hat\alpha^2 = -\hat\theta\hat\delta^2$.
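The link between the MA(1) parameters and the two quantities of interest can be checked by matching second moments. The following derivation is a sketch under simplifying assumptions that are not spelled out above: iid noise with variance $\alpha^2$ that is independent of the efficient price, and a constant volatility σ over N equally spaced returns of length Δ. Note that θ denotes the MA(1) coefficient here, not the preaveraging parameter.

$$
\begin{aligned}
\operatorname{var}(\Delta_i^n Z) &= \sigma^2\Delta + 2\alpha^2 = \delta^2(1+\theta^2), \\
\operatorname{cov}(\Delta_i^n Z, \Delta_{i-1}^n Z) &= -\alpha^2 = \theta\delta^2, \\
\Rightarrow\quad \alpha^2 &= -\theta\delta^2, \qquad \sigma^2\Delta = \delta^2(1+\theta^2) + 2\theta\delta^2 = \delta^2(1+\theta)^2, \\
N\sigma^2\Delta &= N\delta^2(1+\theta)^2 .
\end{aligned}
$$

Under these assumptions, the last line is the daily integrated variance, which matches the form of the MLRV expression, and the negative first-order autocovariance explains the restriction −1 < θ < 0.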

Finally, for all estimators, the daily quadratic variation is computed over the interval starting at the first observation after 9:30 am and ending at the last observation before 4:00 pm. The estimates are then correspondingly scaled up to a complete 6.5-hr trading interval (corresponding to one NYSE trading day).

ACKNOWLEDGMENTS

For helpful comments, we thank the editor and an anonymous referee. Nikolaus Hautsch acknowledges support by the Deutsche Forschungsgemeinschaft through the CRC 649 "Economic Risk."

[Received March 2011. Revised November 2012.]

REFERENCES

Aït-Sahalia, Y., Mykland, P., and Zhang, L. (2005), “How Often to Sample a Continuous-Time Process in the Presence of Market Microstructure Noise?” Review of Financial Studies, 18, 351–416. [165,169,171,183]

Andersen, T. G., Bollerslev, T., and Diebold, F. (2008), “Parametric and Nonparametric Measurement of Volatility,” in Handbook of Financial Econometrics, eds. Y. Aït-Sahalia and L. P. Hansen, Amsterdam: North Holland. [165]

Andersen, T. G., Bollerslev, T., Diebold, F., and Labys, P. (2001), “The Distribution of Realized Exchange Rate Volatility,” Journal of the American Statistical Association, 96, 42–55. [165]

Andersen, T. G., Bollerslev, T., Diebold, F., and Labys, P. (2003), “Modeling and Forecasting Realized Volatility,” Econometrica, 71, 579–625. [165,169]

Andersen, T. G., Dobrev, D., and Schaumburg, E. (2010), “Jump-Robust Volatility Estimation Using Nearest Neighbor Truncation,” Federal Reserve Bank of New York Staff Report No. 465. [166,170]

Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2008a), “Designing Realised Kernels to Measure the Ex-post Variation of Equity Prices in the Presence of Noise,” Econometrica, 76, 1481–1536. [165]

Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2008b), “Realised Kernels in Practice: Trades and Quotes,” Econometrics Journal, 4, 1–32. [166]

Barndorff-Nielsen, O. E., and Shephard, N. (2002), “Econometric Analysis of Realized Volatility and Its Use in Estimating Stochastic Volatility Models,” Journal of the Royal Statistical Society, Series B, 64, 253–280. [165]

Barndorff-Nielsen, O. E., Shephard, N., and Winkel, M. (2006), “Limit Theorems for Multipower Variation in the Presence of Jumps,” Stochastic Processes and Their Applications, 116, 796–806. [177]

Christensen, K., Oomen, R., and Podolskij, M. (2010), “Realised Quantile-Based Estimation of the Integrated Variance,” Journal of Econometrics, 159, 74–98. [177]

Delbaen, F., and Schachermayer, W. (1994), “A General Version of the Fundamental Theorem of Asset Pricing,” Mathematische Annalen, 300, 463–520. [166]

Gatheral, J., and Oomen, R. C. A. (2010), “Zero-Intelligence Realized Variance Estimation,” Finance and Stochastics, 14, 249–283. [171]

Gloter, A., and Jacod, J. (2001), “Diffusions With Measurement Errors. II: Optimal Estimators,” ESAIM, 5, 243–260. [168]

Hansen, P. R., and Lunde, A. (2006), “Realized Variance and Market Microstructure Noise,” Journal of Business and Economic Statistics, 24, 127–161. [165,171,173,176]

Hautsch, N., and Huang, R. (2012), “The Market Impact of a Limit Order,” Journal of Economic Dynamics and Control, 36, 501–522. [170]

Jacod, J., Li, Y., Mykland, P., Podolskij, M., and Vetter, M. (2009), “Microstructure Noise in the Continuous Case: The Pre-Averaging Approach,” Stochastic Processes and Their Applications, 119, 2249–2276. [165]

Jacod, J., and Mykland, P. (2012), “Microstructure Noise in the Continuous Case: The Adaptive Pre-Averaging Method,” Working Paper. [172]

Jacod, J., Podolskij, M., and Vetter, M. (2009), “Limit Theorems for Moving Averages of Discretized Processes Plus Noise,” The Annals of Statistics, 38, 1478–1545. [166,176,182]

Nolte, I., and Voev, V. (2009), “Least Squares Inference on Integrated Volatility and the Relationship Between Efficient Prices and Noise,” CREATES Research Paper 2009-16, School of Economics and Management, Aarhus University. [174]

Oomen, R. C. A. (2006a), Comment on Hansen, P. R., and Lunde, A. (2006), “Realized Variance and Market Microstructure Noise,” Journal of Business and Economic Statistics, 24, 195–202. [169]

Oomen, R. C. A. (2006b), “Properties of Realized Variance Under Alternative Sampling Schemes,” Journal of Business and Economic Statistics, 24, 219–237. [169,171]

Podolskij, M., and Vetter, M. (2009a), “Estimation of Volatility Functionals in the Simultaneous Presence of Microstructure Noise and Jumps,” Bernoulli, 15, 634–658. [167]

Podolskij, M., and Vetter, M. (2009b), “Bipower-Type Estimation in the Noisy Diffusion Setting,” Stochastic Processes and Their Applications, 119, 2803–2831. [177]

Renyi, A. (1963), “On Stable Sequences of Events,” Sankhya A, 25, 293–302. [168]

Roll, R. (1984), “A Simple Implicit Measure of the Effective Bid-Ask Spread in an Efficient Market,” Journal of Finance, 39, 1127–1139. [170]

Zhang, L., Mykland, P. A., and Aït-Sahalia, Y. (2005), “A Tale of Two Time Scales: Determining Integrated Volatility With Noisy High-Frequency Data,” Journal of the American Statistical Association, 100, 1394–1411. [168]