
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Automatic Specification Testing for Vector Autoregressions and Multivariate Nonlinear Time Series Models

Juan Carlos Escanciano, Ignacio N. Lobato & Lin Zhu

To cite this article: Juan Carlos Escanciano, Ignacio N. Lobato & Lin Zhu (2013) Automatic Specification Testing for Vector Autoregressions and Multivariate Nonlinear Time Series Models, Journal of Business & Economic Statistics, 31:4, 426-437, DOI: 10.1080/07350015.2013.803973

To link to this article: http://dx.doi.org/10.1080/07350015.2013.803973


Accepted author version posted online: 31 May 2013.


Supplementary materials for this article are available online. Please go to http://tandfonline.com/r/JBES


Juan Carlos ESCANCIANO

Department of Economics, Indiana University, Bloomington, IN 47405 ([email protected])

Ignacio N. LOBATO

Instituto Tecnológico Autónomo de México, Av. Camino Sta Teresa 930, Col. Héroes de Padierna, México D.F. 10700, México ([email protected])

Lin ZHU

School of Economics and Management, Tsinghua University, Beijing 100084, China ([email protected])

This article introduces an automatic test for the correct specification of a vector autoregression (VAR) model. The proposed test statistic is a Portmanteau statistic with an automatic selection of the order of the residual serial correlation tested. The test presents several attractive characteristics: simplicity, robustness, and high power in finite samples. The test is simple to implement since the researcher does not need to specify the order of the autocorrelation tested and the proposed critical values are simple to approximate, without resorting to bootstrap procedures. In addition, the test is robust to the presence of conditional heteroscedasticity of unknown form and accounts for estimation uncertainty without requiring the computation of large-dimensional inverses of near-to-singularity covariance matrices. The basic methodology is extended to general nonlinear multivariate time series models. Simulations show that the proposed test presents higher power than the existing ones for models commonly employed in empirical macroeconomics and empirical finance. Finally, the test is applied to the classical bivariate VAR model for GNP (gross national product) and unemployment of Blanchard and Quah (1989) and Evans (1989). Online supplementary material includes proofs and additional details.

KEY WORDS: Akaike’s AIC; Autocorrelation; Diagnostic test; Model checking; Schwarz’s BIC.

1. INTRODUCTION

The vector autoregression (VAR) model has been one of the most popular tools employed by macroeconomists in recent years for the analysis of multivariate time series. The main reasons for this success are its flexibility and simplicity to implement, which have led to its intensive use in financial and macroeconomic applications, where it has proven to be useful for data description and forecasting. VAR models, as simple extensions of univariate autoregressions, have been known for many years, but only in the last 30 years have they become widely used by macroeconomists and policy-oriented researchers. VAR models have been used in macroeconomics mainly for two purposes: first, as a device to derive “stylized facts” of the effects of some shocks (mainly policy shocks) on relevant economic variables and, second, as a mechanism to evaluate economic theory models; see Christiano, Eichenbaum, and Evans (1999). For the purpose of linking data to behavioral relations, short-term or long-term restrictions are typically included; see, for instance, Blanchard and Quah (1989). Starting with the seminal article by Sims (1980), VAR models have often been used for structural, causal, and policy analysis, so that Granger-causality tests, impulse response functions, and forecast error variance decompositions are nowadays standard macroeconomists’ tools.

For these inference procedures to be reliable, a critical aspect is to test for the correct specification of the VAR model. If the researcher employs a misspecified model, interesting dynamics of the economic variables can be ignored and conclusions from the impulse response functions can be misleading. A natural way of validating the specification of the VAR model, as with any other time series model, is to check if the residuals are white noise, that is, uncorrelated.

Testing for serial correlation in residuals has a distinguished literature in statistics and economics dating back to Quenouille (1947) for univariate autoregressions. For recent proposals in a univariate setting, see, for example, Delgado and Velasco (2011) and Guay, Guerre, and Lazarova (2011). One of the most popular tools has been the Portmanteau test proposed by Box and Pierce (1970, hereafter BP) for univariate autoregressive moving-average (ARMA) models. A multivariate version of BP’s test was proposed by Chitturi (1974) for VAR processes. Hosking (1980, 1981a,b) gave several equivalent forms of this statistic (see also Ahn 1988), and Poskitt and Tremayne (1982) showed that BP’s test in a multivariate context can be interpreted as a Lagrange multiplier (LM) test.

Inference related to the Portmanteau statistic for the dependent residual case varies according to the assumptions made on the errors and the lag order $h$. For the univariate homoscedastic residual independent case, BP suggested comparing the Portmanteau statistic to the upper critical values from a $\chi^2$ distribution where the degrees of freedom are the number of autocorrelations tested minus the number of estimated parameters. This result is motivated by the fact that the autocorrelations obey as many linear restrictions as parameters have to be estimated, so when the number of autocorrelations tested is taken “sufficiently large” (BP, p. 1517), the effect of parameter estimation is annihilated. For a framework similar to BP, Newbold (1980) proposed a modified Portmanteau statistic that employed a consistent estimator of the asymptotic covariance matrix of the sample autocorrelations. For the univariate residual dependent case, Francq, Roy, and Zakoian (2005) kept the Portmanteau statistic and estimated the critical values.

October 2013, Vol. 31, No. 4 DOI:10.1080/07350015.2013.803973

Inference with the Portmanteau statistic in the context of multivariate models has also been investigated for the case where a fixed number of autocorrelations, say $h$, is tested. Similar to the univariate case, the solutions have relied on either modifying the BP test statistic and keeping the chi-squared critical value or keeping the BP test statistic and modifying the critical value. The first approach was followed by Chabot-Hallé and Duchesne (2008), who modified the BP test statistic to take into account the effect of estimated parameters in a setting with heteroscedastic errors. In this case, the limiting distribution of the modified test is a $\chi^2_{hm^2}$, where $m$ is the number of considered series. A practical disadvantage that we have observed with this approach is that the modified BP test statistic is not robust because it requires the estimation of the inverse of a covariance matrix that can be close to singular or even singular in some situations; see Section 4 for an illustrative example. Hence, the problem with this approach is that in finite samples, it cannot control the Type I error of the test even for very large sample sizes.

The second approach was followed by Francq and Raïssi (2007), who noticed that for a fixed $h$, the limiting distribution of the Portmanteau test statistic is a weighted sum of independent $\chi^2_1$ random variables, where the weights are given by the eigenvalues of an appropriate covariance matrix. The critical values are not readily available in this approach, but they can be easily estimated from a suitable consistent estimator of a potentially large-dimensional covariance matrix. As we will see in detail in Section 3, in our experience, this approach works better in finite samples.

Despite the previous generalizations of the classical Portmanteau statistic to the multivariate residual context, an important limitation of the multivariate Portmanteau statistics still remains; namely, that inference can be rather sensitive to $h$, the selected number of autocorrelations. Often, different values of $h$ lead to conflicting conclusions in empirical applications, and there is little guidance about how to deal with this multiple testing problem. The objective of this article is to overcome this important limitation by proposing a Portmanteau statistic where the parameter $h$ is not fixed but selected automatically from the data.

Although our initial motivation focuses on residuals from VAR models and proposes fully automatic model checks with simple implementation, we also extend our approach to general nonlinear multivariate models and, in particular, to generalized autoregressive conditional heteroscedasticity (GARCH) models. For the latter application, our construction leads to automatic asymptotic distribution-free versions of the Portmanteau test proposed by Ling and Li (1997). In this application, the distribution-free property critically rests on the fact that the asymptotic distribution of the automatic test exclusively depends on the asymptotic variance of the first-order sample correlation, a fact that leads to more robust and simpler to implement tests. Finally, it is worth stressing that automatic order selection has been used in the context of VAR models for identification and estimation; see Akaike (1974) and Schwarz (1978) for the Akaike information criterion (AIC) and Bayesian information criterion (BIC) selection criteria, respectively, and Lütkepohl (2005, chap. 4) for a survey of results. However, note that this classical analysis focused on identification and estimation, not on diagnostic testing. As far as we know, this article is the first to address the automatic selection of $h$ in the framework of testing for serial correlation of residuals from VAR, GARCH, and related multivariate models.

The plan of the article is as follows. Section 2 introduces notation and the new automatic test in detail for the VAR case. Then, in Section 3, we extend the proposed procedure to a general multivariate nonlinear framework and consider in detail volatility models. Section 4 studies the finite sample behavior through simulations. Finally, Section 5 presents an empirical application to modeling GNP (gross national product) and unemployment, and Section 6 concludes. Proofs are gathered in the online Appendix D. A word about notation: henceforth, for an $m \times p$ matrix $A := (a_{ij})_{1 \le i \le m,\, 1 \le j \le p}$, $A'$ denotes its transpose, and $|A|$ and $|A|_\infty$ denote the Euclidean and sup norm, respectively, that is, $|A|^2 := \mathrm{tr}(A'A)$ and $|A|_\infty := \sup_{1 \le i \le m,\, 1 \le j \le p} |a_{ij}|$. Let $I_m$ denote the identity matrix of order $m$. In addition, $\otimes$ and $\odot$ denote Kronecker and Hadamard products, respectively.
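The notation above can be made concrete with a small sketch in plain Python. The function names are illustrative only and are not part of the article; matrices are represented as lists of rows.

```python
def euclidean_norm_sq(A):
    """|A|^2 = tr(A'A), which equals the sum of squared entries of A."""
    return sum(a * a for row in A for a in row)


def sup_norm(A):
    """|A|_inf = largest absolute entry of A."""
    return max(abs(a) for row in A for a in row)


def kronecker(A, B):
    """Kronecker product: block matrix whose (i, j) block is a_ij * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def hadamard(A, B):
    """Hadamard product: entrywise product of two equally sized matrices."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]
norm_sq = euclidean_norm_sq(A)   # 1 + 4 + 9 + 16
largest = sup_norm(A)
block_diag = kronecker(I2, A)    # 4 x 4 matrix with A repeated on the diagonal
diag_only = hadamard(A, I2)      # keeps only the diagonal entries of A
```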

2. AUTOMATIC DIAGNOSTIC CHECKING FOR VAR MODELS

Consider an $m$-dimensional VAR($p$) process $Y_t$,

$$Y_t = \mu + \Phi_1 Y_{t-1} + \cdots + \Phi_p Y_{t-p} + \varepsilon_t, \qquad (1)$$

where $\varepsilon_t$ has zero mean and nonsingular covariance matrix $E(\varepsilon_t \varepsilon_t') = \Gamma(0)$, $\mu$ is an $m$-dimensional vector, and the $\Phi$’s are $m \times m$ matrices. Let $\theta := (\mu, \Phi_1, \ldots, \Phi_p, \Gamma(0))$ be the unknown parameters of the model. We assume the process $\{Y_t\}_{t \in \mathbb{Z}}$ is strictly stationary and ergodic, so that the roots of $\det(I_m - \Phi_1 z - \cdots - \Phi_p z^p) = 0$ lie outside the complex unit circle. Define $\Gamma(j) := E(\varepsilon_t \varepsilon_{t-j}')$, $j \in \mathbb{Z}$. In this article, we aim to test the null hypothesis of correct specification of the autocorrelations of the VAR($p$) model, that is,

$$H_0: \Gamma(j) = 0, \quad \text{for all } j \neq 0,$$

against the fixed alternative hypotheses, for $K \ge 1$,

$$H_1^K: \Gamma(K) \neq 0.$$

Notice that our null hypothesis is composite, as $\Gamma(j)$ depends on $\theta$, although we do not make this dependence explicit to simplify the notation.

We follow Lütkepohl (2005) in much of the notation that follows. Define the $m \times (mp+1)$ matrix $B := (\mu, \Phi_1, \ldots, \Phi_p)$ and the $(mp+1)$-dimensional vector $Z_t := (1, Y_{t-1}', \ldots, Y_{t-p}')'$. Define also the $m \times n$ matrix $Y := (Y_1, \ldots, Y_n)$ and the $(mp+1) \times n$ matrix $Z := (Z_0, \ldots, Z_{n-1})$. The parameter $B$ is estimated by the least squares (LS) estimator (we focus on LS since it is the natural estimator, but other estimators could be entertained as well, with obvious changes in the theory)

$$\hat{B} = Y Z'(Z Z')^{-1},$$

and a consistent estimator for $\Gamma(j)$ is given by the sample residual autocovariance matrix of order $j \ge 0$,

This test is often implemented in its Ljung–Box modification

Q(h) :=n2

dimensional vector. The null hypothesis is tested by comparing the value of any version of the multivariate Portmanteau statistic with upper critical values from a $\chi^2_{(h-p)m^2}$ distribution; see, for instance, Hosking (1980, 1981a) or Ahn (1988). Although the multivariate Portmanteau statistic has been employed repeatedly, notice that the test presents two important practical drawbacks. First, the use of the critical values from a $\chi^2_{(h-p)m^2}$ distribution is questionable since the $\chi^2_{(h-p)m^2}$ is a good approximation only when the number of autocorrelations $h$ is taken “sufficiently large” (BP, p. 1517) so that the effect of parameter estimation is annihilated. Second, inference can be sensitive to the selected number $h$.
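The LS fit and a Ljung–Box-type multivariate Portmanteau statistic can be sketched numerically as follows. This is only an illustration under stated assumptions: the statistic is computed in the standard Hosking-type form $n^2 \sum_{j=1}^{h} (n-j)^{-1} \mathrm{tr}\{\hat\Gamma(j)'\hat\Gamma(0)^{-1}\hat\Gamma(j)\hat\Gamma(0)^{-1}\}$, which is assumed here rather than taken from the text; the first $p$ observations are used as presample values; and all function names are hypothetical. NumPy is assumed to be available.

```python
import numpy as np


def fit_var_ls(Y, p):
    """LS estimate B_hat = Y Z'(Z Z')^{-1} for an m x N data matrix Y
    (columns are time points). The first p columns serve as presample
    values (an assumption of this sketch)."""
    m, n_total = Y.shape
    Y_eff = Y[:, p:]                                  # m x n
    n = Y_eff.shape[1]
    blocks = [np.ones((1, n))]                        # intercept row
    for lag in range(1, p + 1):
        blocks.append(Y[:, p - lag:n_total - lag])    # Y_{t-lag}
    Z = np.vstack(blocks)                             # (mp + 1) x n
    B_hat = Y_eff @ Z.T @ np.linalg.inv(Z @ Z.T)
    resid = Y_eff - B_hat @ Z
    return B_hat, resid


def resid_autocov(resid, j):
    """Sample residual autocovariance of order j: n^{-1} sum_t e_t e_{t-j}'."""
    n = resid.shape[1]
    return resid[:, j:] @ resid[:, :n - j].T / n


def ljung_box_q(resid, h):
    """Ljung-Box-type multivariate Portmanteau statistic (assumed
    Hosking-type form, see the lead-in)."""
    n = resid.shape[1]
    g0_inv = np.linalg.inv(resid_autocov(resid, 0))
    total = 0.0
    for j in range(1, h + 1):
        gj = resid_autocov(resid, j)
        total += np.trace(gj.T @ g0_inv @ gj @ g0_inv) / (n - j)
    return n ** 2 * total
```

With $m = 2$, $p = 1$, and $h = 4$, the classical comparison described above would use $(h - p)m^2 = 12$ degrees of freedom.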

The limiting distribution for $Q(h)$ for a fixed $h$ has been established under different dependence restrictions on the innovations $\{\varepsilon_t\}$. For strictly stationary and ergodic martingale difference sequence (mds) innovations $\{\varepsilon_t\}$, this distribution is obtained as follows. A standard Taylor expansion (see, for example, Lemma 4.2 in Lütkepohl 2005) relates sample covariances of residuals to sample covariances of true errors through the equation

under some mild moment conditions, the LS estimator satisfies

√

totic distribution of√nγhwill follow from the asymptotic joint

distribution of Jn:=n−1/2n

is also a stationary and ergodic mds, so that the CLT in Billingsley (1961) implies thatJn_−→d J _∼N(0, ),provided

Thus, under the null hypothesis and regularity conditions, $\sqrt{n}\,\hat{\gamma}_h \xrightarrow{d} N(0, D_h)$, where

Dh₌c˜+G′hBGh−c,B˜ Gh−G′hc,B′˜ . (4)

Francq and Raïssi (2007) used the asymptotic result $J_n \xrightarrow{d} J$ to provide the asymptotic null distribution of $Q(h)$ for a fixed $h \ge 1$ under weak dependence restrictions on $\{\varepsilon_t\}$.

In this article, we propose the automatic test statistic

$$AQ := Q(\tilde{h}),$$

where $\tilde{h}$ is chosen from the data as follows:

$$\tilde{h} := \min\{h : 1 \le h \le d;\ L_h \ge L_z,\ z = 1, 2, \ldots, d\},$$

For the motivation of this selection rule for testing, see Inglot and Ledwina (2006a,b) and Escanciano and Lobato (2009). As explained in these references, the motivation behind this procedure is to combine the advantages of AIC and BIC criteria. On the one hand, tests constructed using the BIC criterion are able to properly control the Type I error and are more powerful when the serial correlation is present in the first-order autocorrelations. On the other hand, tests based on the AIC cannot properly control the Type I error, but they are more powerful when the serial correlation is present in high-order autocorrelations. Our selection for $\tilde{h}$ allows the data to choose the preferable criterion according to the data characteristics.

The theoretical results hold for any fixed $q$. Extensive simulations in Inglot and Ledwina (2006a,b), Escanciano and Lobato (2009), and this article suggest that the choice $q = 2.4$ works well in finite samples. Note that a small value for $q$ would lead to the use of the AIC criterion, while a large $q$ would lead to the choice of the BIC criterion. Moderate values, such as 2.4, provide a “switching effect” in which one combines the advantages of the two selection rules. For further motivation for the choice of $q$, see the aforementioned references and, for other choices of penalty terms and optimality properties of the resulting tests, see Kallenberg (2002).
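The data-driven lag choice can be sketched as follows. The definition of $L_h$ is not reproduced in the text above, so this sketch makes explicit assumptions patterned on Escanciano and Lobato (2009): $L_h = Q(h) - \pi(h, n, q)$, with the BIC-type penalty $hm^2 \log n$ when a summary statistic of the residual autocorrelations does not exceed $\sqrt{q \log n}$, and an AIC-type penalty $2hm^2$ otherwise. The argument `max_abs_stat` is a hypothetical stand-in for that summary statistic, and the function names are illustrative.

```python
import math


def penalty(h, n, m, max_abs_stat, q=2.4):
    """Assumed switching penalty pi(h, n, q): BIC-type when the evidence of
    serial correlation is weak, AIC-type otherwise."""
    if max_abs_stat <= math.sqrt(q * math.log(n)):
        return h * m * m * math.log(n)   # BIC-type penalty
    return 2 * h * m * m                 # AIC-type penalty (assumed form)


def select_lag(q_stats, n, m, max_abs_stat, q=2.4):
    """Return the smallest h in 1..d maximizing L_h = Q(h) - pi(h, n, q),
    where q_stats[h - 1] holds Q(h) and d = len(q_stats)."""
    L = [Qh - penalty(h, n, m, max_abs_stat, q)
         for h, Qh in enumerate(q_stats, start=1)]
    return L.index(max(L)) + 1           # list.index returns the first maximizer


# Hypothetical Portmanteau values Q(1), Q(2), Q(3) for n = 300, m = 2.
q_values = [3.0, 5.0, 30.0]
h_weak = select_lag(q_values, n=300, m=2, max_abs_stat=1.0)    # BIC branch
h_strong = select_lag(q_values, n=300, m=2, max_abs_stat=5.0)  # AIC branch
```

Under these assumptions, the heavier BIC-type penalty keeps the choice at a low order when the summary statistic is small, while the lighter AIC-type penalty lets a large $Q(h)$ at a higher lag drive the choice.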

Our theoretical results in the following are proved under the assumption that $d$ is a fixed large number, but they can be extended to the case where $d \equiv d(n)$ grows slowly to infinity with $n$ under additional regularity conditions that include strong mixing assumptions. A theoretical advantage of considering $d \equiv d(n)$ would be to achieve a consistent test against all alternatives $H_1^K$ for all $K \ge 1$. However, for all practical matters, a theory with $d$ finite suffices, as we can take $d$ as large as desired, as long as, of course, $d \le n - 1$. Notice that the resulting data-driven test will be different from the BP test using $d(n)$ autocorrelations, as proposed in Hong (1996). We interpret $d$ as an upper bound on the number of correlations that the researcher is interested in. That is, we only consider alternative hypotheses for which $K \le d$. This is not a practical limitation, since in applications, the first correlations are often the most significant ones. Inference is far less sensitive to $d$ than to the number of correlations $h$. For empirical evidence, see Table 6 in our simulations.

Before establishing the asymptotic theory for the automatic AQ test, denote by $\mathcal{F}_{t-1}$ the $\sigma$-field generated by

Assumption A1.

(a) The innovations $\{\varepsilon_t\}$ are a strictly stationary and ergodic mds with respect to $\mathcal{F}_{t-1}$ with nonsingular variance $\Gamma(0)$ and such that $E[|u_t|^2] < \infty$.

(b) The roots of $\det(I_m - \Phi_1 z - \cdots - \Phi_p z^p) = 0$ lie outside the complex unit circle.

Our first results establish the null asymptotic distribution and the behavior under the alternative for the automatic test statistic AQ and prove that under the null the probability of $\tilde{h}$ being one tends to one as $n$ tends to infinity.

Theorem 1. Under the null hypothesis and Assumption A1, $AQ \xrightarrow{d} Z$, where $Z$ can be represented as

independent $N(0, 1)$ variables.

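Remark 1. The $p$-value of the weighted $\chi^2$ distribution is calculated with Imhof’s (1961) algorithm using the estimator

$$\hat{\Upsilon}_1 = (\hat{\Gamma}^{-1/2}(0) \otimes \hat{\Gamma}^{-1/2}(0))\, \hat{D}_1\, (\hat{\Gamma}^{-1/2}(0) \otimes \hat{\Gamma}^{-1/2}(0)), \qquad (7)$$

where the expression for $\hat{D}_1$ can be found in Appendix A. Thus, Theorem 1, the consistency of $\hat{\Upsilon}_1$, and the nonsingularity of $\Upsilon_1$ yield an asymptotic $\alpha$-level automatic test.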
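As an illustration of how a $p$-value for a weighted sum of independent $\chi^2_1$ variables can be obtained, the following sketch evaluates Imhof’s (1961) inversion formula for the central case, $P(\sum_j w_j Z_j^2 > x) = \tfrac{1}{2} + \pi^{-1}\int_0^\infty \sin\vartheta(u)\,/\,(u\,\rho(u))\,du$, with $\vartheta(u) = \tfrac{1}{2}\sum_j \arctan(w_j u) - \tfrac{1}{2}xu$ and $\rho(u) = \prod_j (1 + w_j^2u^2)^{1/4}$. The weights would presumably be the eigenvalues of the estimated matrix in (7). The midpoint rule, the truncation point `upper`, and the number of `steps` are ad hoc assumptions of this sketch, not the authors’ implementation.

```python
import math


def imhof_pvalue(x, weights, upper=100.0, steps=100_000):
    """Approximate P(sum_j w_j * Z_j^2 > x) for independent N(0,1) Z_j and
    positive weights w_j, via Imhof's inversion formula (central case).

    The integral is truncated at `upper` and evaluated with a midpoint
    rule; both choices are ad hoc and may need tuning for small weights."""
    def integrand(u):
        angle = -0.5 * x * u
        log_rho = 0.0
        for w in weights:
            angle += 0.5 * math.atan(w * u)
            log_rho += 0.25 * math.log1p((w * u) ** 2)
        return math.sin(angle) / (u * math.exp(log_rho))

    du = upper / steps
    total = sum(integrand((k + 0.5) * du) for k in range(steps))
    return 0.5 + total * du / math.pi


# With four unit weights the weighted sum is a chi-square with 4 degrees of
# freedom, which gives a simple sanity check against known quantiles.
p_check = imhof_pvalue(9.487729036781154, [1.0, 1.0, 1.0, 1.0])
```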

Theorem 2. Under Assumption A1, as $n \to \infty$, the test based on AQ is consistent against $H_1^K$, for $K \le d$.

Remark 2. Note that Assumption A1(a) allows for conditional heteroscedasticity of unknown form and other forms of nonlinear dependence present in financial data. Alternative weak dependence conditions, such as mixing assumptions, can be employed (see Francq and Raïssi 2007, sec. 3). For independent and identically distributed (iid) innovations $\{\varepsilon_t\}$, our conditions and some of the expressions can be simplified. For instance, a sufficient condition for $E[|u_t|^2] < \infty$ is that $E[|\varepsilon_{it}\varepsilon_{jt}\varepsilon_{kt}\varepsilon_{lt}|^4] < \infty$ for all $i, j, k, l = 1, \ldots, m$, where $\varepsilon_{it}$ is the $i$th component of

εt. Moreover, by proposition 4.5 in L¨utkepohl (2005), Dh₌

(Ih_⊗Ŵ(0)−G˜′_h−1Gh˜ )⊗Ŵ(0),which suggests the simple es-timatorDh ₌(Ih_⊗Ŵ(0)−Gˆ′_hˆ−1Ghˆ )⊗Ŵ(0).

Remark 3. An alternative to the classical Portmanteau test statistic is the following modified Portmanteau test statistic:

$$T(h) := \hat{\gamma}_h' \hat{D}_h^{-1} \hat{\gamma}_h,$$

where $\hat{D}_h$ denotes the sample analog of $D_h$, whose expression can be found in the online Appendix A. Several versions of the test statistic $T(h)$ have appeared in the literature; see Poskitt and Tremayne (1982) for the case of homoscedastic vector ARMA (VARMA) models and Chabot-Hallé and Duchesne (2008) for the case of heteroscedastic nonlinear VAR models.

By construction, it is straightforward to show that under the null hypothesis $H_0$, and some further regularity conditions, for a fixed $h$, $T(h)$ converges to a $\chi^2_{hm^2}$. Compared to the Portmanteau $Q(h)$, the Portmanteau $T(h)$ has the advantage that its critical values are readily available. However, we have observed in simulations that the test statistic $T(h)$ and its automatic version are highly unstable because $\hat{D}_h$ is close to singular for some parameter values, especially when $h$ is large. This could be expected in general, since $D_h$ becomes singular as $h \to \infty$.

Remark 4. Note that the singularity of $D_h$, and hence of $\Upsilon_h$, can occur when $\Phi_p = 0$. This implies a discontinuity in the asymptotic distribution of $\hat{\gamma}_h$, and hence in those of $Q(h)$ and AQ. In particular, singularity can even occur when $h = 1$, as in the following example:

Example 1. Consider the bivariate VAR(1) model

$$Y_t = \theta I_2 Y_{t-1} + \varepsilon_t,$$

where $\theta \in \mathbb{R}$, $I_2$ is the identity matrix, and $\{\varepsilon_t\}$ are iid standard Gaussian. This example extends the simple example in Durbin (1970, p. 419), who first noticed the discontinuity in the asymptotic distribution of the residual sample autocovariances for a simple univariate AR(1).

Note that when $\theta = 0$, it can be easily shown that the eigenvalues of $\Upsilon_h$ in (6) are 0 with algebraic multiplicity $m^2$ and 1 with multiplicity $m^2(h-1)$. Hence, the modified Portmanteau test $T(h)$ cannot be applied in this case. However, Theorems 1 and 2 are still valid, but with $Z = 0$ almost surely (a.s.; all eigenvalues are zero for $h = 1$) when $\theta = 0$. The limiting size of our automatic test in situations where $\mathrm{rank}(\Upsilon_1) = 0$ requires a case-by-case analysis, which is beyond the scope of this article. However, notice that most existing specification tests that employ $h$ autocorrelations require $\Upsilon_h$ to be nonsingular; exceptions are the likelihood ratio test of Durbin (1970) and the LM test of Godfrey (1976). In contrast, the proposed AQ test is an asymptotic $\alpha$-level test under the weaker condition $\mathrm{rank}(\Upsilon_1) > 0$. Furthermore, even in situations where $\mathrm{rank}(\Upsilon_1) = 0$ but $\mathrm{rank}(\Upsilon_l) > 0$ for some $l$, a simple modification of our automatic test delivers an asymptotic $\alpha$-level test, simply by changing the automatic lag choice to $\tilde{h} := \min\{h : l \le h \le d;\ L_h \ge L_z,\ z = l, \ldots, d\}$. For instance, we can take $l = 2$ in the current VAR(1) example if $\theta = 0$. Furthermore, in the simulations below, we show that our automatic AQ test presents an empirical size that is accurate for values of $\theta$ close to zero, such as 0.1, unlike alternative tests based on $T(h)$.

3. FURTHER APPLICATIONS OF THE METHODOLOGY

The methodology introduced in the previous section can be extended to constructing automatic specification tests in a variety of frameworks. This section discusses a generalization of the automatic AQ test in a general multivariate nonlinear setting and then examines in detail the leading application to volatility models, which will also be considered in our simulation exercises.

We consider a generalized error process

$$\varepsilon_t = H_t(\theta_0) \equiv H(\mathcal{F}_t, \theta_0), \qquad (8)$$

where $H(\cdot)$ is a known vector function and $\theta_0 \in \Theta \subset \mathbb{R}^l$ is an unknown parameter vector to be estimated. The objective is to test for correct model specification by testing whether the process $\varepsilon_t$ is uncorrelated. Applying the AQ test to the estimated generalized errors is an automatic specification test for two leading cases: general dynamic parametric models and models defined by conditional moment restrictions, as we briefly comment next.

In many dynamic parametric models, the distribution is assumed to belong to a parametric class of distributions, for instance, in Gaussian VAR-GARCH models, and so the distribution of the stochastic process $Y_t$ conditional on $(Y_{t-1}, Y_{t-2}, \ldots)$ is known. Denoting this known distribution by $F_t(y; \theta_0) \equiv F(y; \theta_0, Y_{t-1}, Y_{t-2}, \ldots)$, it is simple to show that under correct specification, the generalized errors defined as $\varepsilon_t = F_t(Y_t; \theta_0)$, for all $t$, are uncorrelated. Hence, the proposed AQ test can be applied to the residuals $\hat{\varepsilon}_t = F_t(Y_t; \hat{\theta})$, for a suitable estimate $\hat{\theta}$ of $\theta_0$, for example, the conditional maximum likelihood estimator. For this case, the asymptotic normality of $\sqrt{n}(\hat{\gamma}_h - \gamma_h)$ can be established under the conditions below. See also Bai and Chen (2008) for an alternative testing approach.

In many other cases, the researcher does not assume that the conditional distribution of the data is known, but the economic model just establishes that there exists some generalized error $\varepsilon_t$ that behaves as an mds. Then, our AQ test can again be applied to test for the correct specification of these models since $\varepsilon_t$ is uncorrelated when the model is correctly specified. For instance, the first-order conditions of many Rational Expectations and Asset Pricing Models are given in the form of conditional moment restrictions, such as

$$E[\varepsilon_t \mid \mathcal{F}_{t-1}] = E[H_t(\theta_0) \mid \mathcal{F}_{t-1}] = 0, \quad \text{a.s.},$$

for a vector of moment functions $H_t(\cdot) = H(\mathcal{F}_t, \cdot)$. For instance, for Asset Pricing models, $H_t(\theta_0) = m(\cdot, \theta_0) R_t^e$, where $m$ is a parametric stochastic discount factor function and $R_t^e$ is a vector of excess returns; see Hansen and Singleton (1982). In this context, the automatic AQ test can be employed since the correct specification of the discount factor implies that the generalized errors $\varepsilon_t = H(\mathcal{F}_t, \theta_0)$ are uncorrelated. In these cases, it is customary to estimate the parameter $\theta_0$ by the generalized method of moments.

To develop a general theory, which covers the previous two cases and many others, we introduce the following assumption on $H_t(\theta)$, $\theta \in \Theta$.

Assumption A2.

(a) For each $\theta \in \Theta$, $H(\cdot, \theta)$ is an $\mathcal{F}_t$-measurable function with values in $\mathbb{R}^s$. For each $t \ge 1$, $H_t(\cdot)$ is a.s. continuously differentiable in a neighborhood of $\theta_0$, say $\Theta_0 \subset \Theta$, with derivative $\dot{H}_t(\theta) := \partial H_t(\theta)/\partial \theta'$. Furthermore,

Since $\theta_0$ is unknown, we require the existence of an asymptotic linear estimator.

(b) The parameter space $\Theta$ is compact.

Under Assumptions A2 and A3, we obtain the analogous results of Section 2 for the automatic Portmanteau test based on generalized residuals $\hat{\varepsilon}_t := H_t(\hat{\theta})$, where $D_h$ is defined as before but with $v_t$ as in Assumption A3(a) and

$$G_h' := -E[\xi_{th} \otimes \dot{H}_t(\theta_0)],$$

where recall $\xi_{th} := (\varepsilon_{t-1}', \ldots, \varepsilon_{t-h}')'$. The following result is proved in the online Appendix D.

Theorem 3. If Assumptions A2 and A3 are satisfied, then the conclusion of Theorem 1 holds for the generalized automatic Portmanteau test. Moreover, if we drop Assumption A2(c), then the conclusion of Theorem 2 also holds.

3.1 Volatility Models

Similar to the VAR models, there is an extensive empirical literature on the estimation of multivariate GARCH models in empirical finance; see Bauwens, Laurent, and Rombouts (2006) and references therein. Since financial returns behave approximately as mds processes, for these data, the conditional mean is not typically the object of interest, and research has shifted to modeling the conditional higher moments of these returns. The data are assumed to follow the model

$$Y_t = \mu(\mathcal{F}_{t-1}, \theta_0) + \Sigma^{1/2}(\mathcal{F}_{t-1}, \theta_0)\,\eta_t,$$

where $\mu_t(\theta_0) \equiv \mu(\mathcal{F}_{t-1}, \theta_0)$ is the parametric conditional mean vector, which can be zero; $\Sigma_t(\theta_0) \equiv \Sigma(\mathcal{F}_{t-1}, \theta_0)$ is a parametric conditional covariance matrix, assumed to be nonsingular; and $\{\eta_t\}$ are standardized iid innovations. The specification of traditional multivariate GARCH-type models for $\Sigma_t(\cdot)$, such as the one employed in Section 4, can be automatically tested by checking whether the standardized residuals are white noise using the AQ test. In particular, Ling and Li (1997) suggested looking at the following transformation $\varepsilon_t \equiv H_t(\theta_0)$, where

$$H_t(\theta) = (Y_t - \mu_t(\theta))' \Sigma_t^{-1}(\theta) (Y_t - \mu_t(\theta)) - m.$$
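The transformation above is a centered quadratic form: under correct specification the standardized vector has identity covariance, so the quadratic form has conditional mean $m$ and the transformed error has conditional mean zero. A minimal sketch of its computation for one time point is given below, using a Cholesky factorization written in plain Python. The function names are hypothetical, and the inputs (the conditional mean and covariance evaluated at some parameter value) are assumed to be supplied by the user's own volatility model.

```python
import math


def cholesky(S):
    """Lower-triangular L with L L' = S for a symmetric positive definite S."""
    m = len(S)
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def ling_li_error(y, mu, sigma):
    """Generalized error (y - mu)' sigma^{-1} (y - mu) - m for one time point.

    y, mu : lists of length m (observation and conditional mean)
    sigma : m x m conditional covariance matrix as a list of rows
    """
    m = len(y)
    d = [yi - mi for yi, mi in zip(y, mu)]
    L = cholesky(sigma)
    # Forward substitution solves L z = d, so that z'z = d' sigma^{-1} d.
    z = []
    for i in range(m):
        z.append((d[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return sum(v * v for v in z) - m


# With identity covariance the quadratic form reduces to the squared length
# of the demeaned observation.
example = ling_li_error([1.0, 2.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```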

In this context, a natural estimator for $\theta_0$ is the Gaussian quasi-maximum likelihood estimator $\hat{\theta}$. Hence, primitive conditions for Assumptions A2 and A3 in this example are straightforward to find and are standard in the literature; see Ling and Li (1997) for details. In particular, these authors showed that the expression for the $D_h$ matrix in (4) in this application is given by

conditional log-likelihood. If the innovations $\{\eta_t\}$ are Gaussian, then $A = B$, but in general, $A \neq B$. Ling and Li (1997) proposed a modified Portmanteau test that corresponds to $T(h)$ in the previous section with residuals $\hat{\varepsilon}_t := H_t(\hat{\theta}) = (Y_t - \mu_t(\hat{\theta}))' \Sigma_t^{-1}(\hat{\theta}) (Y_t - \mu_t(\hat{\theta})) - m$. An alternative approach is to use the automatic Portmanteau test AQ. In this application, the estimation of the critical values is further simplified, since $Z = \Upsilon_1 Z_1^2$ a.s., where $\Upsilon_1$ is a positive number that can be

where we employ the natural sample analog consistent estimators for $c$, $X_1$, $A$, and $B$. Therefore, after an appropriate standardization, the limit distribution of AQ is a simple $\chi^2_1$.

Alternative Portmanteau tests have been proposed in the literature (see, for example, Tse 2002), and our methods could again be used to provide automatic versions of them; details are omitted for the sake of space.

4. MONTE CARLO EXPERIMENTS

In this section, we compare the finite-sample performance of the proposed automatic test AQ with the Portmanteau test, $Q(h)$, for different choices of $h$. For some experiments, we will also report the results for $Q_{BIC}$, which is the automatic Portmanteau test statistic that uses the BIC criterion to select the lag order $h$ of the error autocorrelation tested, that is, the employed penalty term is $\pi(h, n, q) = hm^2 \log n$. We do not report results for the automatic test that employs the Akaike criterion because this test is not able to control the Type I error, as we comment later. In addition, for some cases, we also report the test results for $Q_1$, which is a robustified version of $Q(1)$ that uses the estimated weighted $\chi^2$ critical values, and AT, which is the automatic version of $T(h)$, that is, $AT := T(\tilde{h})$, where $\tilde{h}$ is defined as before but with $Q(h)$ replaced by $T(h)$. Theorem 1 can be easily extended to show that the null limiting distribution of AT is a $\chi^2_{m^2}$, so its critical values are readily available. Unlike AQ, AT requires the matrix $D_h$ to be nonsingular, which, as we will see, may lead to inaccurate empirical sizes in some cases.

4.1 VAR Examples: Size

We first illustrate the problems associated with AT, which have led us to prefer the use of AQ. We examine the sensitivity of the tests to the magnitude of the true parameter values in the bivariate VAR(1) model

$$Y_t = \theta I_2 Y_{t-1} + \varepsilon_t,$$

where $\theta$ ranges from 0.1 to 0.95. We consider two scenarios for the process $\varepsilon_t$: a bivariate iid standard Gaussian sequence and a bivariate Gaussian GARCH(1,1). The GARCH(1,1) innovations can be written as $\varepsilon_t = H_t^{1/2} u_t$, where $u_t$ follows a standard bivariate iid Gaussian sequence, and $H_t^{1/2}$ denotes the matrix such that $H_t = H_t^{1/2} H_t^{1/2}$, where $H_t = C + A \odot (\varepsilon_{t-1} \varepsilon_{t-1}') + G \odot H_{t-1}$, with $C = \mathrm{diag}(1, 1)$, $A = G = \mathrm{diag}(0.3, 0.3)$.
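The GARCH design above can be simulated with a short sketch. Because $C$, $A$, and $G$ are diagonal and enter through Hadamard products, $H_t$ remains diagonal whenever it starts diagonal, so each component of the innovation follows a univariate GARCH(1,1) recursion and the matrix square root is an elementwise square root. The burn-in length, the initialization at the unconditional variance, and the function name are assumptions of this sketch and are not taken from the article.

```python
import math
import random


def simulate_var1_garch(n, theta, seed=0, burn=200):
    """Simulate Y_t = theta * I_2 * Y_{t-1} + eps_t with diagonal Gaussian
    GARCH(1,1) innovations: h_t = 1 + 0.3 * eps_{t-1}^2 + 0.3 * h_{t-1}
    for each of the two components.

    Returns (ys, es): lists of length n holding 2-element lists."""
    rng = random.Random(seed)
    c, a, g = 1.0, 0.3, 0.3
    h = [c / (1.0 - a - g)] * 2      # start at the unconditional variance
    eps_prev = [0.0, 0.0]
    y_prev = [0.0, 0.0]
    ys, es = [], []
    for t in range(n + burn):
        # Conditional variances of the two components (diagonal of H_t).
        h = [c + a * eps_prev[i] ** 2 + g * h[i] for i in range(2)]
        eps = [math.sqrt(h[i]) * rng.gauss(0.0, 1.0) for i in range(2)]
        y = [theta * y_prev[i] + eps[i] for i in range(2)]
        if t >= burn:                # discard the burn-in draws
            ys.append(y)
            es.append(eps)
        eps_prev, y_prev = eps, y
    return ys, es


ys, es = simulate_var1_garch(300, theta=0.5, seed=1)
```

The sample size of 300 in the usage line matches the one used for the figures discussed next; the seed is arbitrary.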

Figures 1 and 2 plot the empirical rejection rates at the 5% nominal level for the AQ, the AT, the $Q(2)$, the $Q(4)$, and the $Q(12)$ tests for the iid normal innovations case and for the GARCH(1,1) innovations case, respectively. The considered sample size is $n = 300$. Figure 1 provides several messages. First, it indicates that the classical Portmanteau tests, especially $Q(2)$, present large size distortions for high values of $\theta$. Second, Figure 1 also shows that the AT test presents extremely high size distortions for low and moderate values of $\theta$; for example, AT rejects about 90% of the time at the 5% nominal level when $\theta$ equals 0.1. This result resembles similar empirical findings by Ljung (1986, sec. 5), who noticed that Newbold’s (1980) modified Portmanteau test could not control the Type I error because the asymptotic covariance matrix of the sample autocorrelations was close to singular. In our case, the asymptotic covariance matrix $D_h$ turns out to be already close to singular for moderate values of $h$ and small $\theta$. Finally, Figure 1 shows that AQ is able to control the Type I error irrespective of $\theta$.

Figure 2 considers the case of conditional heteroscedasticity. The main difference with Figure 1 is that the classical Portmanteau tests do not control the Type I error, as the theory predicts. Similarly to the homoscedastic case, AT presents severe size distortions for moderate and small values of $\theta$. In contrast, the behavior of the data-driven test AQ is not affected by the value of $\theta$ in either the homoscedastic or the heteroscedastic case.

We have performed additional simulations for higher-order VAR processes, and all the results indicate that AT is unable to control the size for a range of parameter values even for large sample sizes. This has been the main reason for focusing on the AQ test rather than on the AT test.

Next, we consider additional size results to compare AQ to QBIC, Q1, Q(2), Q(4), Q(8), Q(12), and Q(24). We consider three VAR models with lag orders 1, 3, and 6. The coefficient matrices used in the data-generating processes (DGPs) are taken from Lütkepohl (2005). The VAR specifications used can be found in the online Appendix B. In these size results, the correct VAR order is fitted.

We consider n = 100, 150, 300, and 1000, to accommodate empirically relevant sample sizes in macroeconomics and finance, and set d = 25. As discussed in Escanciano and Lobato (2009), the choice

Figure 1. Size performance at 5% level: iid innovations (n = 300). [Plot of the empirical rejection rate against θ for the Q(2), Q(4), Q(12), AT, and AQ tests.]

Figure 2. Size performance at 5% level: GARCH innovations (n = 300). [Plot of the empirical rejection rate against θ for the Q(2), Q(4), Q(12), AT, and AQ tests.]

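Table 1. Empirical rejection rates (percentage) of nominal 5% test: iid innovations

| Model | VAR(1) | VAR(1) | VAR(1) | VAR(1) | VAR(3) | VAR(3) | VAR(3) | VAR(3) | VAR(6) | VAR(6) | VAR(6) | VAR(6) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | 100 | 150 | 300 | 1000 | 100 | 150 | 300 | 1000 | 100 | 150 | 300 | 1000 |
| AQ | 5.9 | 5.8 | 5.3 | 5.3 | 7.7 | 7.1 | 5.6 | 4.8 | 8.7 | 7.3 | 6.0 | 5.8 |
| QBIC | 5.6 | 5.6 | 5.2 | 5.3 | 7.7 | 7.1 | 5.6 | 4.8 | 8.7 | 7.3 | 6.0 | 5.8 |
| Q1 | 5.6 | 5.6 | 5.2 | 5.3 | 7.7 | 7.1 | 5.6 | 4.8 | 8.7 | 7.3 | 6.0 | 5.8 |
| Q(2) | 5.8 | 5.8 | 6.3 | 6.3 | NA | NA | NA | NA | NA | NA | NA | NA |
| Q(4) | 4.7 | 5.2 | 4.7 | 5.4 | 9.5 | 8.7 | 7.9 | 7.1 | NA | NA | NA | NA |
| Q(8) | 4.3 | 4.9 | 4.7 | 5.0 | 4.9 | 5.0 | 4.8 | 4.9 | 19.7 | 14.2 | 11.2 | 9.4 |
| Q(12) | 4.8 | 4.9 | 4.5 | 5.8 | 3.9 | 4.1 | 4.8 | 4.8 | 8.7 | 6.9 | 6.6 | 6.3 |
| Q(24) | 5.4 | 5.3 | 5.1 | 5.5 | 3.3 | 4.0 | 4.6 | 4.6 | 4.3 | 3.8 | 3.9 | 4.7 |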

of the upper bound d plays a secondary role in our testing procedure, a result also verified at the end of this section in Table 6. Tables 1 and 2 report the empirical rejection rates for the tests for the 5% nominal level, for the iid Gaussian and the GARCH innovations, respectively. The number of Monte Carlo replications is 10,000 for each sample size. We do not include the results for the automatic test that employs the AIC since it does not control the Type I error. For instance, for the VAR(1) case with iid innovations, the empirical percentage rate is 10.7 when n = 100 and it is 10.6 for n = 1000. For the GARCH innovations, the distortions even increase with the sample size since the empirical rejection percentage rate is 14.0 for n = 100 and 18.2 when n = 1000. Note that the Portmanteau tests are only applicable when h is larger than p. Table 1 shows that the Portmanteau tests are more accurate as the difference h − p increases. However, we observe quite large distortions when h is close to p; for example, in the right panel, p = 6, the Portmanteau test Q(8) rejects 19.7% when n = 100 and 9.4% even for a sample size as large as 1000.
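To put the reported rejection rates in perspective, it can help to compute the Monte Carlo standard error implied by the number of replications. The sketch below uses the usual binomial approximation; the function names and the 1.96 band are illustrative choices, not part of the article's procedure.

```python
import math

def mc_standard_error(rate, replications):
    """Binomial standard error of an empirical rejection rate (as a proportion)."""
    return math.sqrt(rate * (1.0 - rate) / replications)

def within_mc_band(observed_pct, nominal_pct=5.0, replications=10_000, z=1.96):
    """True if an observed rejection percentage lies within z standard errors of nominal."""
    se_pct = 100.0 * mc_standard_error(nominal_pct / 100.0, replications)
    return abs(observed_pct - nominal_pct) <= z * se_pct

# Standard error, in percentage points, of a 5% rate estimated from 10,000 replications.
se_pct = 100.0 * mc_standard_error(0.05, 10_000)
```

Under this approximation the standard error is roughly 0.22 percentage points, so an entry such as 5.3 is compatible with the nominal level, whereas 19.7 is far outside any reasonable Monte Carlo band.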

Table 2 indicates large size distortions of the Portmanteau tests for many choices of n and h. For instance, in the VAR(1) model, when the sample size is 1000, Q(12) rejects 10.7% at the 5% nominal level. These distortions do not decrease as the sample size increases since these Portmanteau tests are not robust to conditional heteroscedasticity. In contrast, our data-driven test AQ exhibits moderate size distortions even for small sample sizes common for macro data, such as n = 300. Tables 1 and 2 show a similar behavior for the AQ, QBIC, and Q1 tests. This fact could be expected since, under the null, the optimal value for h is one because the series are uncorrelated, and the penalty term (5) would choose the BIC criterion.
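The switching between BIC-type and AIC-type penalties can be illustrated with a small sketch. The rule coded here is an assumption: it mirrors the univariate penalty of Escanciano and Lobato (2009), which charges p log n when every scaled autocorrelation is small and 2p otherwise. It is not the multivariate penalty term (5) of this article, and the threshold constant q = 2.4, the function names, and the input statistics are illustrative.

```python
import math

def penalty(p, n, max_scaled_autocorr, q=2.4):
    """BIC-type penalty when autocorrelations are small, AIC-type otherwise (assumed form)."""
    if max_scaled_autocorr <= math.sqrt(q * math.log(n)):
        return p * math.log(n)
    return 2.0 * p

def select_order(q_stats, n, max_scaled_autocorr, q=2.4):
    """Smallest p maximizing Q_p - penalty(p); q_stats[p - 1] is the statistic for p lags."""
    penalized = [stat - penalty(p, n, max_scaled_autocorr, q)
                 for p, stat in enumerate(q_stats, start=1)]
    return 1 + penalized.index(max(penalized))

# Hypothetical statistics: weak autocorrelation everywhere, so the BIC-type penalty applies.
order_under_null = select_order([1.2, 2.0, 2.5, 3.1], n=300, max_scaled_autocorr=1.5)
# Hypothetical statistics: a large autocorrelation at lag 4 triggers the AIC-type penalty.
order_under_alt = select_order([1.0, 1.5, 2.0, 30.0], n=300, max_scaled_autocorr=5.0)
```

With uncorrelated residuals the heavy BIC-type penalty keeps the selected order at one, which is consistent with AQ, QBIC, and Q1 behaving alike under the null.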

4.2 VAR Examples: Power

In this subsection, we consider the finite-sample behavior under the alternative. For the sake of space, we only consider Gaussian homoscedastic innovations and, given that Table 1 shows that the classical Portmanteau tests cannot control the Type I error on many occasions, we report size-corrected power. We first examine the performance of the AQ test in detecting the incorrect specification of the VAR model. In Table 3, we report the results for two experiments. In the first experiment, the data are generated from a VAR(2) model (the moduli of the roots of the characteristic polynomial are 2.98 and 1.12)

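$$\text{VAR(2)}: \quad Y_t = \begin{pmatrix} 0.03 \\ 0.02 \end{pmatrix} + \begin{pmatrix} 0.5 & 0.4 \\ 0.1 & 0.5 \end{pmatrix} Y_{t-1} + \begin{pmatrix} 0 & 0 \\ 0.25 & 0 \end{pmatrix} Y_{t-2} + \varepsilon_t,$$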

where the $\varepsilon_t$ are iid $N(0, I)$, but the fitted model is a VAR(1). In the second experiment, the data are generated from a VAR(6) model (the moduli of the roots of the characteristic polynomial are 1.86, 1.29, 1.16, 1.08, 1.04, and 1.03)

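$$\text{VAR(6)}: \quad Y_t = \begin{pmatrix} 0.03 \\ 0.02 \end{pmatrix} + \begin{pmatrix} 0.5 & 0.4 \\ 0.1 & 0.5 \end{pmatrix} Y_{t-5} + \begin{pmatrix} 0 & 0 \\ 0.25 & 0 \end{pmatrix} Y_{t-6} + \varepsilon_t,$$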

but the fitted model is a VAR(3). Table 3 reports the size-corrected empirical rejection rates for three sample sizes, n = 100, 300, and 1000. In the first experiment, the empirical power of the classical Portmanteau tests, Q(h), decreases

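Table 2. Empirical rejection rates (percentage) of nominal 5% test: GARCH innovations

| Model | VAR(1) | VAR(1) | VAR(1) | VAR(1) | VAR(3) | VAR(3) | VAR(3) | VAR(3) | VAR(6) | VAR(6) | VAR(6) | VAR(6) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | 100 | 150 | 300 | 1000 | 100 | 150 | 300 | 1000 | 100 | 150 | 300 | 1000 |
| AQ | 6.5 | 6.5 | 6.1 | 5.5 | 8.4 | 7.5 | 6.2 | 5.0 | 9.5 | 8.3 | 6.3 | 5.6 |
| QBIC | 5.6 | 5.8 | 5.4 | 5.2 | 8.4 | 7.5 | 6.2 | 5.0 | 9.5 | 8.3 | 6.3 | 5.6 |
| Q1 | 5.6 | 5.7 | 5.4 | 5.2 | 8.4 | 7.5 | 6.2 | 5.0 | 9.5 | 8.3 | 6.3 | 5.6 |
| Q(2) | 11.8 | 12.1 | 14.0 | 15.9 | NA | NA | NA | NA | NA | NA | NA | NA |
| Q(4) | 8.5 | 10.6 | 11.4 | 13.9 | 11.0 | 10.9 | 10.9 | 10.6 | NA | NA | NA | NA |
| Q(8) | 6.6 | 8.5 | 8.5 | 11.0 | 5.3 | 5.8 | 6.4 | 7.4 | 20.3 | 15.5 | 11.9 | 11.1 |
| Q(12) | 6.5 | 7.6 | 7.7 | 10.7 | 4.4 | 5.0 | 5.6 | 7.0 | 8.6 | 7.4 | 7.0 | 7.1 |
| Q(24) | 6.0 | 6.7 | 7.0 | 8.1 | 4.2 | 3.8 | 4.7 | 6.0 | 4.1 | 4.0 | 3.8 | 5.1 |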


Table 3. Empirical size-corrected power (percentage) of nominal 5% test: iid innovations

DGP: VAR(2); fitted model: VAR(1)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 39.0 | 35.0 | 32.3 | 63.9 | 46.8 | 30.2 | 23.3 | 15.9 |
| 300 | 92.5 | 89.9 | 86.9 | 99.2 | 96.6 | 87.6 | 76.8 | 55.6 |
| 1000 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.9 |

DGP: VAR(6); fitted model: VAR(3)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 99.6 | 26.7 | 26.7 | NA | 34.3 | 100.0 | 99.9 | 99.7 |
| 300 | 100.0 | 49.9 | 49.9 | NA | 39.8 | 100.0 | 100.0 | 100.0 |
| 1000 | 100.0 | 100.0 | 80.5 | NA | 41.2 | 100.0 | 100.0 | 100.0 |

with h, whereas in the second experiment, it increases with h. Therefore, a researcher who employs the classical Portmanteau test would prefer a small value for h in the first case, and a large value for h in the second case. Hence, this table illustrates the importance of employing an automatic criterion to select h to render more powerful tests. As we can see, our data-driven test always performs well and is comparable to Q(h) for the best choice of h, which is in general unknown to the researcher.
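Size-corrected power, as reported in Table 3, replaces the asymptotic critical value with an empirical quantile of the statistic simulated under the null. A minimal sketch of that calculation follows; the function name is hypothetical, and the chi-square draws are placeholders standing in for simulated test statistics rather than output of the tests studied here.

```python
import numpy as np

def size_corrected_power(null_stats, alt_stats, level=0.05):
    """Share of alternative-DGP statistics above the empirical (1 - level) null quantile."""
    critical_value = np.quantile(null_stats, 1.0 - level)
    return float(np.mean(np.asarray(alt_stats) > critical_value))

rng = np.random.default_rng(0)
# Placeholder draws: in a real experiment these would be the test statistics computed on
# data simulated under the null DGP and under the alternative DGP, respectively.
null_stats = rng.chisquare(df=4, size=10_000)
alt_stats = rng.noncentral_chisquare(df=4, nonc=6.0, size=10_000)
power = size_corrected_power(null_stats, alt_stats)
```

By construction, feeding the null draws in as both arguments returns a rejection rate equal to the chosen level, which is what makes power comparable across tests with different size distortions.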

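Next, we simulate data from the vector moving average process MA(1) (the moduli of the roots of the characteristic polynomial are 2.22 and 1.82)

$$\text{MA(1)}: \quad Y_t = \varepsilon_t + \Theta \varepsilon_{t-1}, \quad \text{where } \Theta = \begin{pmatrix} -0.5 & 0.05 \\ 0.05 & -0.5 \end{pmatrix},$$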

and fit two VAR(p) models, in particular, p = 1 and p = 3. Since an invertible MA(1) process can be written as a VAR(∞) process, we could expect that, as we increase the lag p, the power of detecting misspecification is smaller. Table 4 confirms this result and shows that, for all tests, the power when fitting a VAR(3) falls considerably relative to the case of fitting a VAR(1). Table 4 shows that, in terms of empirical power, our proposed test AQ is comparable to the Portmanteau test for small values of h and performs better than the Portmanteau test for large values of h. Also note that the AQ, QBIC, and Q1 tests present a similar behavior.
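The VAR(∞) argument can be made concrete. Inverting the MA(1) polynomial gives autoregressive coefficients equal to minus the j-th power of minus the MA coefficient matrix, so they shrink geometrically and a VAR(3) already absorbs most of the dependence. The sketch below computes these coefficients for the matrix used in this experiment; the names THETA and var_infinity_coefficients are assumptions for the example.

```python
import numpy as np

# MA(1) coefficient matrix of the experiment: Y_t = e_t + THETA e_{t-1}.
THETA = np.array([[-0.5, 0.05],
                  [0.05, -0.5]])

def var_infinity_coefficients(theta, max_lag):
    """Coefficients Pi_j = -(-theta)^j of the VAR(infinity) form of Y_t = e_t + theta e_{t-1}."""
    coefs = []
    power = np.eye(theta.shape[0])
    for _ in range(max_lag):
        power = power @ (-theta)   # (-theta)^j
        coefs.append(-power)       # Pi_j multiplies Y_{t-j}
    return coefs

coefs = var_infinity_coefficients(THETA, 6)
# Spectral norms of Pi_1, ..., Pi_6; for this symmetric matrix they equal 0.55**j.
norms = [np.linalg.norm(c, 2) for c in coefs]
```

Since the largest eigenvalue of the coefficient matrix has modulus 0.55, the lag-4 coefficient has norm 0.55^4, about 0.09, which helps explain the low power in the second panel of Table 4.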

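Table 5 reports the results for the case where the DGP is the VAR(12) (the modulus of the roots of the characteristic polynomial is 1.06)

$$\text{VAR(12)}: \quad Y_t = \Phi_{12} Y_{t-12} + \varepsilon_t, \quad \text{where } \Phi_{12} = \begin{pmatrix} 0.5 & -0.05 \\ 0.05 & 0.5 \end{pmatrix},$$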

and the fitted models are VAR(p)'s with p = 1, 3, 5, 7, 9, 11, and 12. Note that the last case, p = 12, represents the empirical level. The last column of Table 5 indicates that, for the two cases where it can be computed, the classical Portmanteau tests Q(h) cannot control the Type I error, whereas the AQ, QBIC, and Q1 tests present a similar behavior under the null hypothesis. In terms of power, Table 5 shows that AQ strongly dominates QBIC, especially when using p = 3, 5, 7, and 9. The intuition for this result is clear: when the higher-order autocorrelations are significant, our test employs the AIC criterion, which tends to choose high values for h. Table 5 also shows the power sensitivity of the classical Portmanteau statistic. In particular, Q(h) rejects more often for larger h because, when the order of the fitted VAR is smaller, the residuals present more autocorrelation at higher lags.
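The root moduli quoted for the simulated VAR designs can be checked numerically. One standard route, sketched here, uses the fact that the roots of det(I − Σ_j A_j z^j) are the reciprocals of the nonzero eigenvalues of the companion matrix. The function name, the dictionary interface, and the variable names for the coefficient matrices are assumptions for the example; the matrices are those of the VAR(2) and VAR(12) designs of this section, read row by row.

```python
import numpy as np

def var_root_moduli(lag_matrices):
    """Moduli of the roots of det(I - sum_j A_j z^j), given a dict {lag j: matrix A_j}."""
    order = max(lag_matrices)
    dim = next(iter(lag_matrices.values())).shape[0]
    companion = np.zeros((dim * order, dim * order))
    for lag, mat in lag_matrices.items():
        companion[:dim, (lag - 1) * dim: lag * dim] = mat
    companion[dim:, :-dim] = np.eye(dim * (order - 1))
    eigvals = np.linalg.eigvals(companion)
    nonzero = np.abs(eigvals)[np.abs(eigvals) > 1e-10]
    return np.sort(1.0 / nonzero)   # roots are reciprocals of the nonzero eigenvalues

A1 = np.array([[0.5, 0.4], [0.1, 0.5]])
A2 = np.array([[0.0, 0.0], [0.25, 0.0]])
PHI12 = np.array([[0.5, -0.05], [0.05, 0.5]])

moduli_var2 = var_root_moduli({1: A1, 2: A2})
moduli_var12 = var_root_moduli({12: PHI12})
```

All moduli exceed one, so both processes are stable; the VAR(12) roots lie close to the unit circle, which goes with the strong autocorrelation at lag 12 that only tests with a large h can pick up.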

Finally, we also examine the sensitivity of our test to the selection of d. Specifically, Table 6 reports the empirical rejection rates for one case under the null, fitting data with the true DGP (VAR(1) with iid innovations), and one case under the alternative, fitting data generated by an MA(1) with a VAR(3). For this experiment, the sample size is set at n = 1000 and seven values for d are used (d = 25, 50, 75, 100, 125, 150, and 300). The results indicate that our test is insensitive to the choice of d.

Table 4. Empirical size-corrected power (percentage) of nominal 5% test: iid innovations

DGP: MA(1); fitted model: VAR(1)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 53.2 | 52.8 | 52.1 | 66.2 | 45.5 | 33.5 | 29.8 | 24.7 |
| 300 | 98.8 | 98.8 | 98.8 | 99.6 | 97.2 | 88.1 | 79.9 | 65.4 |
| 1000 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.9 |

DGP: MA(1); fitted model: VAR(3)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 6.2 | 6.2 | 6.2 | NA | 8.7 | 6.7 | 7.2 | 6.5 |
| 300 | 11.8 | 11.8 | 11.8 | NA | 17.9 | 11.1 | 10.0 | 8.3 |
| 1000 | 38.4 | 38.3 | 38.3 | NA | 56.9 | 28.7 | 22.1 | 15.9 |


Table 5. Empirical size-corrected power (percentage) of nominal 5% test: DGP: bivariate VAR(12) with iid Gaussian innovations. Fitted model is bivariate VAR(p). Sample size n = 300

| p | 1 | 3 | 5 | 7 | 9 | 11 | 12 |
|---|---|---|---|---|---|---|---|
| AQ | 100.0 | 100.0 | 100.0 | 99.6 | 98.4 | 99.6 | 5.6 |
| QBIC | 25.4 | 17.4 | 24.7 | 19.5 | 19.6 | 97.7 | 5.6 |
| Q1 | 20.4 | 17.4 | 24.7 | 19.5 | 19.6 | 97.7 | 5.6 |
| Q(2) | 20.9 | NA | NA | NA | NA | NA | NA |
| Q(4) | 36.2 | 19.2 | NA | NA | NA | NA | NA |
| Q(8) | 60.9 | 53.6 | 19.0 | 30.8 | NA | NA | NA |
| Q(16) | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 54.0 |
| Q(24) | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 13.5 |

4.3 Multivariate GARCH Examples: Size

To check the finite-sample performance of our automatic test applied to a multivariate GARCH model, we simulate a bivariate GARCH(1,1) model following the Baba, Engle, Kraft, and Kroner (BEKK) formulation:

$$Y_t = \Sigma_t^{1/2} \eta_t, \tag{9}$$

$$\Sigma_t = C'C + A' Y_{t-1} Y_{t-1}' A + G' \Sigma_{t-1} G,$$

where C, A, and G are 2×2 matrices, with C upper triangular. The innovations $\{\eta_{t,i} : i = 1, 2;\ t = 1, \ldots, n\}$ are iid skewed t distributed with 5 degrees of freedom. The parameter values are taken from Lütkepohl (2005, p. 573) and reported in the online Appendix B. Stationarity is guaranteed by the fact that the eigenvalues of $A \otimes A + G \otimes G$ are less than 1 in modulus (their moduli are 0.94, 0.90, 0.89, and 0.86). The test statistic is formed by applying our automatic procedure to the residuals

$$\hat{\varepsilon}_t = Y_t' \hat{\Sigma}_t^{-1} Y_t - 2,$$

where the $\hat{\Sigma}_t$ are the parametric estimators of $\Sigma_t$. We compare the finite-sample performance of our automatic test with the standardized tests, T(h), proposed in Ling and Li (1997), where these tests are denoted by Q(h). Table 7 reports the empirical rejection rates with sample sizes n = 100, 200, 300, 500, and 1000 at the 5% nominal level. The number of replications is 1000 for the multivariate GARCH examples. Table 7 shows that our data-driven test can properly control the size even when the sample size is as small as n = 100. It also indicates that the finite-sample size performance of the T(h) tests varies with the choice of h; for example, T(3) exhibits less size distortion than T(9).
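The stationarity condition and the construction of the residuals can be sketched in a few lines. Everything specific in this example is an assumption made for illustration: the parameter matrices are hypothetical (the article's values are in its online Appendix B), Gaussian innovations replace the skewed t innovations, a Cholesky factor is used as the square root of the conditional covariance, and the true conditional covariances stand in for their estimates.

```python
import numpy as np

def bekk_is_stationary(a, g):
    """Check that all eigenvalues of kron(A, A) + kron(G, G) are below 1 in modulus."""
    eigvals = np.linalg.eigvals(np.kron(a, a) + np.kron(g, g))
    return bool(np.max(np.abs(eigvals)) < 1.0)

def simulate_bekk(n, c, a, g, seed=0):
    """Simulate Y_t = L_t eta_t, with Sigma_t = L_t L_t' following the BEKK(1,1) recursion."""
    rng = np.random.default_rng(seed)
    dim = c.shape[0]
    sigma = c.T @ c                 # arbitrary positive definite starting value
    y_prev = np.zeros(dim)
    ys = np.empty((n, dim))
    sigmas = np.empty((n, dim, dim))
    for t in range(n):
        sigma = c.T @ c + a.T @ np.outer(y_prev, y_prev) @ a + g.T @ sigma @ g
        y_prev = np.linalg.cholesky(sigma) @ rng.standard_normal(dim)
        ys[t], sigmas[t] = y_prev, sigma
    return ys, sigmas

def standardized_residuals(ys, sigmas):
    """eps_t = Y_t' Sigma_t^{-1} Y_t - dim, which has mean zero under correct specification."""
    quad = np.array([y @ np.linalg.solve(s, y) for y, s in zip(ys, sigmas)])
    return quad - ys.shape[1]

# Hypothetical parameter values, chosen only so that the stationarity condition holds.
C = np.array([[0.3, 0.1], [0.0, 0.3]])
A = np.array([[0.3, 0.05], [0.0, 0.3]])
G = np.array([[0.9, 0.0], [0.0, 0.9]])

ys, sigmas = simulate_bekk(2000, C, A, G)
resid = standardized_residuals(ys, sigmas)
```

In an actual application, the conditional covariances would come from the fitted model, and the automatic test would then be applied to the autocorrelations of these residuals.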

Table 6. Rejection rates of nominal 5% test: n = 1000. Size: True DGP VAR(1) with iid Gaussian innovation and fitted with VAR(1); Power: True DGP MA(1) but fitted with VAR(3)

| d | 25 | 50 | 75 | 125 | 150 | 300 |
|---|---|---|---|---|---|---|
| Size | 5.00 | 5.01 | 5.02 | 5.03 | 5.03 | 5.04 |
| Power | 38.19 | 38.20 | 38.20 | 38.20 | 38.20 | 38.20 |

Table 7. Empirical rejection rates (percentage) of nominal 5% test: skewed t innovation, GARCH(1,1)

| n | 100 | 200 | 300 | 500 | 1000 |
|---|---|---|---|---|---|
| AQ | 5.1 | 4.6 | 5.1 | 5.0 | 5.3 |
| T(3) | 3.9 | 5.2 | 5.4 | 5.3 | 4.2 |
| T(6) | 3.0 | 3.8 | 4.6 | 5.3 | 5.4 |
| T(9) | 3.9 | 4.3 | 6.0 | 5.5 | 6.8 |

4.4 Multivariate GARCH Examples: Power

We study the finite-sample power performance in GARCH specifications by considering two DGPs: an asymmetric bivariate GARCH(1,1) model and a bivariate ARCH(2) model. The asymmetric GARCH(1,1) model is specified as

$$Y_t = \Sigma_t^{1/2} \eta_t, \tag{10}$$

$$\Sigma_t = C'C + A' Y_{t-1} Y_{t-1}' A + B' \xi_{t-1} \xi_{t-1}' B + G' \Sigma_{t-1} G,$$

where $\xi_{t,i} = Y_{t,i} 1(Y_{t,i} < 0)$, $i = 1, 2$. The ARCH(2) model is as follows:

$$Y_t = \Sigma_t^{1/2} \eta_t, \tag{11}$$

$$\Sigma_t = C'C + A_1' Y_{t-1} Y_{t-1}' A_1 + A_2' Y_{t-2} Y_{t-2}' A_2.$$

The true parameter values are given in the online Appendix B. In the first set of power experiments, we generate the data according to the asymmetric GARCH(1,1) model, but we fit a symmetric GARCH(1,1), that is, setting $B = 0$ in (10). In the second set, the data are simulated from an ARCH(2) model but fitted with an ARCH(1), that is, setting $A_2 = 0$ in (11). We also compare the power performance of our automatic test with the standardized tests, T(h), with h = 1, 3, 6, and 9. Table 8 reports the empirical rejection rates of both experiments with sample sizes n = 100, 200, 300, and 1000. For the asymmetric GARCH example, the power of T(h) decreases as h increases, while for the ARCH example, the power initially increases with h and then decreases with h. Similar to Section 4.2, Table 8 shows that the empirical power of the AQ test

Table 8. Empirical power (percentage) of nominal 5% test: Gaussian innovation

DGP: Asymmetric GARCH(1,1); fitted: GARCH(1,1)

| n | AQ | T(1) | T(3) | T(6) | T(9) |
|---|---|---|---|---|---|
| 100 | 22.3 | 22.2 | 15.9 | 13.1 | 12.0 |
| 200 | 44.3 | 44.3 | 33.5 | 24.3 | 20.8 |
| 300 | 63.2 | 63.2 | 54.7 | 45.0 | 38.3 |
| 1000 | 99.8 | 99.8 | 99.5 | 98.1 | 96.8 |

DGP: ARCH(2); fitted model: ARCH(1)

| n | AQ | T(1) | T(3) | T(6) | T(9) |
|---|---|---|---|---|---|
| 100 | 70.6 | 38.8 | 74.8 | 67.1 | 60.7 |
| 200 | 94.0 | 42.4 | 91.2 | 90.9 | 89.1 |
| 300 | 97.7 | 38.5 | 94.8 | 94.2 | 92.9 |
| 1000 | 99.6 | 33.4 | 99.4 | 99.1 | 99.1 |


is comparable to the highest power achieved by any of the T(h) tests.

5. AN APPLICATION TO GNP AND UNEMPLOYMENT

In this section, we apply the proposed AQ test and the Portmanteau test Q(h) to test for model adequacy of the bivariate VAR models for real GNP growth rate and unemployment rate estimated by Blanchard and Quah (1989, hereafter BQ) and Evans (1989). These models are prototypical examples of the application of the VAR methodology; see the textbooks by Canova (2007) or Favero (2001).

BQ took quarterly data from 1948:2 to 1987:4 and estimated a VAR(8). For the same bivariate model and a dataset ranging from 1951:1 to 1985:4, Evans (1989) estimated VAR(3) and VAR(6) models. Note that BQ employed a VAR(8) model without formally checking for model adequacy, whereas Evans only tested the lack of correlation of univariate residuals in each fitted equation, ignoring the cross autocorrelations between the two sequences of residuals. Evans found that the long-run output shock is significantly positive when using the VAR(3) model, but the conclusion is less significant when using the VAR(6) model. Hence, it is important to check that the chosen VAR model is adequate and avoid overfitting.

In our application, we use the original datasets employed by Evans and BQ, which are in 1982 dollars. Regarding the data, the seasonally adjusted real GNP data are obtained from the Federal Reserve Bank of St. Louis. Evans employed the quarterly GNP growth rate, whereas BQ used the annualized GNP growth rate. For the unemployment rate, we employ two measures: one is the civilian unemployment rate as in Evans, which is also obtained from the Federal Reserve Bank of St. Louis, and the other is the unemployment rate for males over 20 as in BQ, which is obtained from the U.S. Department of Labor, Bureau of Labor Statistics. Note that, apart from using different unemployment and GNP measures, Evans and BQ also differ in additional specification aspects. For instance, for the unemployment rate, Evans used a time dummy, whereas BQ considered a time trend. In addition, BQ considered a structural break for the GNP.

Table 9 assesses the validity of the models proposed by BQ and Evans. In particular, Table 9 reports the p-values for the AQ and the Portmanteau tests Q(h), for various h, for fitting a

Table 9. P-values of bivariate VAR for real GNP growth rate and unemployment rate for models in Evans (1989) and Blanchard and Quah (1989)

| Dataset | Evans | Evans | Evans | BQ | BQ |
|---|---|---|---|---|---|
| Fitted order | 1 | 2 | 3 | 7 | 8 |
| AQ | 0.013 | 0.133 | 0.439 | 0.009 | 0.423 |
| Q(4) | 0.004 | 0.110 | 0.049 | NA | NA |
| Q(8) | 0.012 | 0.061 | 0.088 | 0.006 | NA |
| Q(12) | 0.011 | 0.055 | 0.031 | 0.054 | 0.029 |
| Q(16) | 0.027 | 0.049 | 0.021 | 0.253 | 0.148 |
| Q(20) | 0.053 | 0.059 | 0.033 | 0.715 | 0.603 |
| Q(24) | 0.103 | 0.069 | 0.055 | 0.821 | 0.776 |

VAR(8) for BQ and a VAR(3) for Evans. For the implementation of the AQ test, the chosen upper bound is d = 15, although the results are not sensitive to this choice. In Table 9, the proposed AQ test indicates that the model used by BQ seems to be correctly specified. To check for the possibility of overfitting, Table 9 reports the results if a VAR(7) had been fitted. In that case, the AQ test would reject it, confirming the apparently correct specification of the VAR(8) model of BQ. As a general comment for Table 9, notice the conflicting evidence among the Portmanteau tests Q(h), for the different h. Our automatic test avoids this multivariate testing problem.

Table 9 also assesses the fit of Evans' VAR(3) model. Although all Portmanteau tests would reject the model at the nominal 10% level, the AQ test does not reject that the model chosen by Evans is correct. Again, to rule out the possibility of overfitting, Table 9 also reports the results if a VAR(1) and a VAR(2) had been fitted. Since the AQ test would reject the VAR(1) but would fail to reject the VAR(2), one can conclude that the AQ test would indicate that Evans overfitted his model and that a VAR(2) would have been a more parsimonious specification.
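The reading of Table 9 can be reproduced mechanically. The short sketch below stores the reported p-values and lists, for a given fitted order and significance level, which tests reject; the dictionary layout and the function name are illustrative, and entries reported as NA are simply omitted.

```python
# p-values from Table 9, indexed by test and fitted VAR order
# (orders 1, 2, 3: Evans dataset; orders 7, 8: BQ dataset).
P_VALUES = {
    "AQ":    {1: 0.013, 2: 0.133, 3: 0.439, 7: 0.009, 8: 0.423},
    "Q(4)":  {1: 0.004, 2: 0.110, 3: 0.049},
    "Q(8)":  {1: 0.012, 2: 0.061, 3: 0.088, 7: 0.006},
    "Q(12)": {1: 0.011, 2: 0.055, 3: 0.031, 7: 0.054, 8: 0.029},
    "Q(16)": {1: 0.027, 2: 0.049, 3: 0.021, 7: 0.253, 8: 0.148},
    "Q(20)": {1: 0.053, 2: 0.059, 3: 0.033, 7: 0.715, 8: 0.603},
    "Q(24)": {1: 0.103, 2: 0.069, 3: 0.055, 7: 0.821, 8: 0.776},
}

def rejecting_tests(order, level):
    """Names of the tests whose p-value for the fitted VAR(order) is below `level`."""
    return [name for name, by_order in P_VALUES.items()
            if order in by_order and by_order[order] < level]

evans_var3_at_10pct = rejecting_tests(3, 0.10)   # every Q(h), but not AQ
bq_var8_at_5pct = rejecting_tests(8, 0.05)       # only Q(12)
bq_var7_at_5pct = rejecting_tests(7, 0.05)       # AQ and Q(8)
```

For the BQ VAR(8), only Q(12) rejects at the 5% level while the remaining Q(h) do not, which is the kind of conflicting evidence across h that the automatic test sidesteps.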

This application highlights two of the main ideas stressed in this article. First, the application to Evans' dataset indicates that inference based on the commonly used Portmanteau tests Q(h) can be unreliable. Second, the application to BQ's dataset shows that the Portmanteau tests Q(h) for the different lags h may lead to conflicting empirical evidence. Our automatic AQ test overcomes these problems, and hence, it provides a reliable specification testing procedure that should be useful for practitioners.

6. CONCLUSIONS

This article has introduced an automatic specification test for linear and general nonlinear multivariate time series models. The test is based on checking whether the residuals of the considered model are uncorrelated. The key feature of the proposed test is the automatic selection of the order of the serial correlation tested, which is carefully chosen to combine the advantages of the AIC and BIC lag order criteria. On a more general ground, this article belongs to the small literature dealing with the problem of "bandwidth" choice for testing; see Gao and Gijbels (2008) for an important recent contribution. This literature is rather scarce compared with the extensive literature on bandwidth choice for estimation. In our context, the number of autocorrelations tested can be thought of as a bandwidth parameter. Our construction builds on previous work by Inglot and Ledwina (2006a,b), and we apply these ideas to the multivariate Portmanteau test, which is one of the most widely used tests in applied time series. The result is an automatic Portmanteau test that is simple to implement, powerful, and more robust than existing alternative tests. Finally, it would be of interest to provide further evidence on the behavior of the test for the models described in Section 3, such as conditional moment models.

ACKNOWLEDGMENTS

We are indebted to the Editor, Associate Editor, and two referees for many useful comments that have substantially improved the article. Escanciano acknowledges financial support from the Spanish Plan Nacional de I+D+I, reference number SEJ2007-62908. Lobato acknowledges financial support from the Mexican CONACYT, reference number 151624, and from Asociación Mexicana de Cultura.

[Received March 2012. Revised February 2013.]

REFERENCES

Ahn, S. K. (1988), “Distribution for Residual Autocovariances in Multivariate Autoregressive Models With Structured Parameterization,” Biometrika, 75, 590–593. [426,428]

Akaike, H. (1974), “A New Look at the Statistical Model Identification,” IEEE Transactions on Automatic Control, 19, 716–723. [427]

Bai, J., and Chen, Z. (2008), “Testing Multivariate Distributions in GARCH Models,” Journal of Econometrics, 143, 19–36. [430]

Bauwens, L., Laurent, S., and Rombouts, J. V. K. (2006), “Multivariate GARCH Models: A Survey,” Journal of Applied Econometrics, 21, 79–109. [430]

Billingsley, P. (1961), “The Lindeberg-Levy Theorem for Martingales,” Proceedings of the American Mathematical Society, 12, 788–792. [428]

Blanchard, O., and Quah, D. (1989), “The Dynamic Effects of Aggregate Demand and Supply Disturbances,” The American Economic Review, 79, 655–673. [426,436]

Box, G., and Pierce, D. (1970), “Distribution of Residual Autocorrelations in Autoregressive Integrated Moving Average Time Series Models,” Journal of the American Statistical Association, 65, 1509–1527. [426]

Canova, F. (2007), Methods for Applied Macroeconomic Research, Princeton, NJ: Princeton University Press. [436]

Chabot-Hallé, D., and Duchesne, P. (2008), “Diagnostic Checking of Multivariate Nonlinear Time Series Models With Martingale Difference Errors,” Statistics and Probability Letters, 78, 997–1005. [427,429]

Chitturi, R. V. (1974), “Distribution of Residual Autocorrelations in Multiple Autoregressive Schemes,” Journal of the American Statistical Association, 69, 928–934. [426]

Christiano, L. J., Eichenbaum, M., and Evans, C. L. (1999), “Monetary Policy Shocks: What Have We Learned and to What End,” in Handbook of Macroeconomics, eds. J. Taylor and M. Woodford, Amsterdam: Elsevier. [426]

Delgado, M. A., and Velasco, C. (2011), “An Asymptotically Pivotal Transform of the Residuals Sample Autocorrelations With Application to Model Checking,” Journal of the American Statistical Association, 106, 946–958. [426]

Durbin, J. (1970), “Testing for Serial Correlation in Least Squares Regression When Some of the Regressors Are Lagged Dependent Variables,” Econometrica, 38, 410–421. [429]

Escanciano, J. C., and Lobato, I. N. (2009), “An Automatic Data-Driven Portmanteau Test for Testing for Serial Correlation,” Journal of Econometrics, 151, 140–149. [428,431]

Evans, G. W. (1989), “Output and Unemployment Dynamics in the United States: 1950–1985,” Journal of Applied Econometrics, 4, 213–238. [426,436]

Favero, C. (2001), Applied Macroeconometrics, Oxford: Oxford University Press. [436]

Francq, C., and Raïssi, H. (2007), “Multivariate Portmanteau Test for Autoregressive Models With Uncorrelated but Nonindependent Errors,” Journal of Time Series Analysis, 28, 454–470. [427,428,429]

Francq, C., Roy, R., and Zakoian, J. M. (2005), “Diagnostic Checking in ARMA Models With Uncorrelated Errors,” Journal of the American Statistical Association, 100, 532–544. [427]

Gao, J., and Gijbels, I. (2008), “Bandwidth Selection in Nonparametric Kernel Testing,” Journal of the American Statistical Association, 103, 1584–1594. [436]

Godfrey, L. G. (1976), “Testing for Serial Correlation in Dynamic Simultaneous Equation Models,” Econometrica, 44, 1077–1084. [429]

Guay, A., Guerre, E., and Lazarova, S. (2011), “Robust Adaptive Rate-Optimal Testing for the White Noise Hypothesis,” Journal of Econometrics, unpublished manuscript, available at http://arxiv.org/abs/1106.2014. [426]

Hansen, L. P., and Singleton, K. J. (1982), “Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models,” Econometrica, 50, 1269–1286. [430]

Hong, Y. (1996), “Consistent Testing for Serial Correlation of Unknown Form,” Econometrica, 64, 837–864. [429]

Hosking, J. R. M. (1980), “The Multivariate Portmanteau Statistic,” Journal of the American Statistical Association, 75, 602–608. [426,428]

——— (1981a), “Equivalent Forms of Multivariate Portmanteau Tests,” Journal of the Royal Statistical Society, Series B, 43, 261–262. [426,428]

——— (1981b), “Lagrange-Multiplier Tests of Multivariate Time-Series Models,” Journal of the Royal Statistical Society, Series B, 43, 219–230. [426]

Imhof, J. P. (1961), “Computing the Distribution of Quadratic Forms in Normal Variables,” Biometrika, 48, 419–426. [429]

Inglot, T., and Ledwina, T. (2006a), “Toward Data Driven Selection of a Penalty Function for Data Driven Neyman Tests,” Linear Algebra and Its Applications, 417, 124–133. [428,436]

——— (2006b), “Data Driven Score Tests of Fit for Semiparametric Homoscedastic Linear Regression Model,” Preprint 665, Institute of Mathematics Polish Academy of Sciences. [428,436]

Kallenberg, W. C. M. (2002), “The Penalty in Data Driven Neyman’s Tests,” Mathematical Methods of Statistics, 11, 323–340. [428]

Ling, S., and Li, W. K. (1997), “Diagnostic Checking of Nonlinear Multivariate Time Series With Multivariate ARCH Errors,” Journal of Time Series Analysis, 18, 447–464. [427,431,435]

Ljung, G. M. (1986), “Diagnostic Testing of Univariate Time Series Models,” Biometrika, 73, 725–730. [431]

Lütkepohl, H. (2005), New Introduction to Multiple Time Series Analysis, Berlin: Springer-Verlag. [427,428,429,431,435]

Newbold, P. (1980), “The Equivalence of Two Tests of Time Series Model Adequacy,” Biometrika, 67, 463–465. [427,431]

Poskitt, D. S., and Tremayne, A. R. (1982), “Diagnostic Tests for Multiple Time Series Models,” The Annals of Statistics, 10, 114–120. [426,429]

Quenouille, M. H. (1947), “A Large-Sample Test for the Goodness-of-Fit of Autoregressive Schemes,” Journal of the Royal Statistical Society, Series A, 110, 123–129. [426]

Schwarz, G. (1978), “Estimating the Dimension of a Model,” The Annals of Statistics, 6, 461–464. [427]

Sims, C. (1980), “Macroeconomics and Reality,” Econometrica, 48, 1–48. [426]

Tse, Y. K. (2002), “Residual-Based Diagnostics for Conditional Heteroskedasticity Models,” Econometrics Journal, 5, 358–373. [431]