Journal of Econometrics, Vol. 96, Issue 1, May 2000


*Corresponding author. Tel.: 886-227822791 ext. 296, FAX: 886-27853946.

E-mail:[email protected] (W.-J. Tsay)

The spurious regression of fractionally integrated processes

Wen-Jen Tsay (a,*), Ching-Fan Chung (b)

(a) The Institute of Economics, Academia Sinica, Nankang, Taipei, Taiwan

(b) National Taiwan University, Taiwan

Received 1 September 1995; received in revised form 1 July 1999

Abstract

This paper extends the theoretical analysis of the spurious regression and spurious detrending from the usual $I(1)$ processes to the long memory fractionally integrated processes. It is found that when we regress a long memory fractionally integrated process on another unrelated long memory fractionally integrated process, no matter whether these processes are stationary or not, as long as their orders of integration sum up to a value greater than 0.5, the $t$ ratios become divergent and spurious effects occur. Our finding suggests that it is the long memory, instead of nonstationarity or lack of ergodicity, that causes such spurious effects. As a result, spurious effects might happen more often than we previously believed, as they can arise even between stationary series, while the usual first-differencing procedure may not completely eliminate spurious effects when data possess strong long memory. © 2000 Elsevier Science S.A. All rights reserved.

JEL classification: C22

Keywords: Fractionally integrated processes; Long memory; Spurious regression; Spurious detrending


1. Introduction

The spurious regression was first studied by Granger and Newbold (1974) using simulation. They show that when unrelated data series are close to integrated processes of order 1, or $I(1)$ processes, running a regression with this type of data yields spurious effects. That is, the null hypothesis of no relationship among the unrelated $I(1)$ processes is rejected much too often. Furthermore, the spurious regression tends to yield a high coefficient of determination ($R^2$) as well as highly autocorrelated residuals, indicated by a very low value of the Durbin–Watson ($DW$) statistic. Granger and Newbold's simulation results are later supported by Phillips' (1986) theoretical analysis. Phillips proves that the usual $t$ test statistic in a spurious regression does not have a limiting distribution but diverges as the sample size approaches infinity. He also shows that $R^2$ has a non-degenerate limiting distribution while the $DW$ statistic converges in probability to zero. Phillips' results have been generalized by Marmol (1995) to cases with integrated processes of higher orders.

The history of the research on spurious detrending follows a similar thread. Nelson and Kang (1981, 1984) first employ simulation to demonstrate that the regression of a driftless $I(1)$ process on a time trend produces an incorrect result of a significant trend. Extending Phillips' (1986) approach, Durlauf and Phillips (1988) derive the asymptotic distributions for the least squares estimators in such a regression. In particular, the latter authors show that the $t$ test statistics diverge and there are no correct critical values for the conventional significance tests.

Most studies of the spurious regression concentrate on the nonstationary $I(1)$ processes. This reflects the widely held belief that many data series in economics are $I(1)$ processes, or near $I(1)$ processes, as argued by Nelson and Plosser (1982). Against this backdrop, recent years have also seen fast-growing research on fractionally integrated processes, or $I(d)$ processes with the differencing parameter $d$ being a fractional number. The $I(d)$ processes are a natural generalization of the $I(1)$ processes that exhibit a broader range of long-run characteristics. More specifically, the $I(d)$ processes can be either stationary or nonstationary, depending on the value of the fractional differencing parameter. The major characteristic of a stationary $I(d)$ process is its long memory, which is reflected by the hyperbolic decay in its autocorrelations. A number of economic and financial series have been shown to possess long memory. See Baillie (1996) for an updated survey on the applications of the $I(d)$ processes in economics and finance.


The main finding from our study is that the spurious regression can arise among a wide range of long-memory $I(d)$ processes, even in cases where both the dependent variable and the regressor are stationary. A few conclusions may then be drawn. First, different from what Phillips (1986) and Durlauf and Phillips (1988) have suggested, the cause for spurious effects seems to be neither nonstationarity nor lack of ergodicity but the strong long memory in the data series. As a result, spurious effects might occur more often than we previously believed, as they can arise even among stationary series. Furthermore, the usual first-differencing procedure may not be able to completely eliminate spurious effects if the data series are not only nonstationary but also possess strong long memory (such as in the case where they are $I(d)$ processes with $d > 1$).

2. A general theory of spurious effects

Our analysis of the spurious effects is based on several simple linear regression models in which the dependent variable and the single nonconstant regressor are independent $I(d)$ processes with $d$ lying in different ranges. Before presenting these models, let us first briefly review some basic properties of the $I(d)$ processes. A process $Y_t$ is said to be a fractionally integrated process of order $d$, denoted as $I(d)$, if it is defined by $(1-L)^d Y_t = \varepsilon_t$, where $L$ is the usual lag operator, $d$ is the differencing parameter, which can be a fractional number, and the innovation sequence $\varepsilon_t$ is white noise with a zero mean and finite variance. The fractional differencing operator $(1-L)^d$ is defined as follows: $(1-L)^d = \sum_{j=0}^{\infty} \psi_j L^j$, where $\psi_j = \Gamma(j-d)/[\Gamma(j+1)\Gamma(-d)]$ and $\Gamma(\cdot)$ is the gamma function. This process was first introduced by Granger (1980, 1981), Granger and Joyeux (1980), and Hosking (1981). They show that $Y_t$ is stationary when $d < 0.5$ and is invertible when $d > -0.5$. The main feature of the $I(d)$ process is that its autocovariance function declines at a slower hyperbolic rate (instead of the geometric rate found in the conventional ARMA models):

$$\gamma(j) = O(j^{2d-1}),$$

where $\gamma(j)$ is the autocovariance function at lag $j$. When $d > 0$, the $I(d)$ process is said to have long memory since it exhibits long-range dependence in the sense that $\sum_{j=-\infty}^{\infty} \gamma(j) = \infty$. When $d < 0$, then $\sum_{j=-\infty}^{\infty} |\gamma(j)| < \infty$ and the process is sometimes referred to as an intermediate memory process. See Chung (1994) for other long-memory properties of the $I(d)$ process.
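The coefficients $\psi_j$ of the fractional differencing operator can be computed without evaluating gamma functions at large arguments. The following is a minimal Python sketch, not part of the original analysis; the function names `frac_diff_weights` and `frac_diff_weight_gamma` are hypothetical. It uses the recursion $\psi_0 = 1$, $\psi_j = \psi_{j-1}(j-1-d)/j$, which follows from $\Gamma(z+1) = z\Gamma(z)$ applied to the definition of $\psi_j$.

```python
import math


def frac_diff_weights(d, n):
    """Return psi_0, ..., psi_{n-1}, the coefficients of (1 - L)^d.

    Recursion: psi_0 = 1, psi_j = psi_{j-1} * (j - 1 - d) / j.
    """
    weights = [1.0]
    for j in range(1, n):
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


def frac_diff_weight_gamma(d, j):
    """Direct formula Gamma(j - d) / (Gamma(j + 1) Gamma(-d)).

    Only valid for non-integer d (Gamma has poles at 0, -1, -2, ...),
    and it overflows for large j, which is why the recursion is preferred.
    """
    return math.gamma(j - d) / (math.gamma(j + 1) * math.gamma(-d))


# For d = 1 the operator reduces to the ordinary first difference 1 - L.
first_difference = frac_diff_weights(1, 4)
```

For a fractional $d$ the weights never vanish, which is the source of the hyperbolic (rather than geometric) decay discussed above.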


whether the fractional differencing parameter $d$ is greater than 0.5 or not. The exact specifications of these models can be conveniently expressed with four $I(d)$ processes. Let us first define two stationary ones with different differencing parameters $d_1$ and $d_2$ whose values lie between $-0.5$ and $0.5$:

$$(1-L)^{d_1} v_t = a_t \quad\text{and}\quad (1-L)^{d_2} w_t = b_t,$$

where $a_t$ and $b_t$ are two white noises with zero mean and finite variances $\sigma_a^2$ and $\sigma_b^2$, respectively; that is, $v_t$ and $w_t$ are $I(d_1)$ and $I(d_2)$ processes, respectively, and both of them are stationary and invertible. When these two processes are employed in our later analysis, the values of their differencing parameters are mostly assumed to be in $(0, 0.5)$; i.e., the stationary processes $v_t$ and $w_t$ are often assumed to have long memory. We can also define two nonstationary $I(1+d_1)$ and $I(1+d_2)$ processes by integrating $v_t$ and $w_t$:

$$y_t = y_{t-1} + v_t \quad\text{and}\quad x_t = x_{t-1} + w_t.$$

Obviously, the orders of integration of these two nonstationary fractionally integrated processes lie between 0.5 and 1.5. Given these four fractionally integrated processes, we consider the following six simple linear regression models:

Model 1: $y_t$ is regressed on an intercept and $x_t$,

Model 2: $v_t$ is regressed on an intercept and $w_t$, where $d_1 + d_2 > 0.5$,

Model 3: $y_t$ is regressed on an intercept and $w_t$, where $d_2 > 0$,

Model 4: $v_t$ is regressed on an intercept and $x_t$, where $d_1 > 0$,

Model 5: $y_t$ is regressed on an intercept and the trend $t$,

Model 6: $v_t$ is regressed on an intercept and the trend $t$, where $d_1 > 0$.
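The four processes entering these models can be generated numerically. The sketch below is only an illustration under stated assumptions: the innovations are taken to be Gaussian, the infinite moving-average representation of $(1-L)^{-d}$ is truncated at `burn` lags, and the helper name `simulate_fi` and the parameter values are hypothetical choices, not taken from the paper.

```python
import numpy as np


def simulate_fi(d, T, rng, burn=1000):
    """Simulate T observations of an I(d) process with -0.5 < d < 0.5.

    Assumption: Gaussian white noise filtered by the MA weights of
    (1 - L)^(-d), truncated at `burn` lags:
    theta_0 = 1, theta_j = theta_{j-1} * (j - 1 + d) / j.
    """
    j = np.arange(1, burn + 1)
    theta = np.concatenate(([1.0], np.cumprod((j - 1 + d) / j)))
    eps = rng.standard_normal(T + burn)
    # "valid" keeps only outputs that use all burn + 1 weights; length is T.
    return np.convolve(eps, theta, mode="valid")


rng = np.random.default_rng(0)
T, d1, d2 = 500, 0.3, 0.4      # illustrative values only
v = simulate_fi(d1, T, rng)    # stationary I(d1)
w = simulate_fi(d2, T, rng)    # stationary I(d2), independent of v
y = np.cumsum(v)               # nonstationary I(1 + d1), with y_0 = 0
x = np.cumsum(w)               # nonstationary I(1 + d2), with x_0 = 0
```

Models 1 to 4 then correspond to regressing `y` on `x`, `v` on `w`, `y` on `w`, and `v` on `x`, while Models 5 and 6 regress `y` and `v` on `np.arange(1, T + 1)`.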

In Model 1 the orders of integration of both the dependent variable and the regressor lie between 0.5 and 1.5, and can be equal to 1. So Model 1 may be considered a generalization of Phillips' (1986) spurious regression to the case of fractionally integrated processes. Model 2 presents the most interesting case in our analysis. In it both the dependent variable and the regressor are assumed to be stationary, ergodic, and strongly persistent in the sense that their fractional differencing parameters sum up to a value greater than 0.5. Following Phillips' arguments, we tend to think no spurious effect should occur in such a model where variables are ergodic. But our analysis of Model 2 presents a result to the contrary. The analysis of Model 2 seems to go beyond the previous study of spurious effects and allows us to gain new insight into the problem.


expect the analysis of these two new models to be a mixture of those of Models 1 and 2.

In Models 5 and 6 we consider the effect of detrending the nonstationary and stationary fractionally integrated processes, respectively. Through these two models, we generalize the results of Durlauf and Phillips (1988). Also, Models 5 and 6 can be regarded as variants of Models 1 and 4, respectively, with the nonstationary regressor $x_t$ replaced by the time trend. This similarity in the model specifications will also be reflected in their analytic results. The following assumption on the two white noise processes $a_t$ and $b_t$ is made throughout this paper.

Assumption 1. Each of the two processes $a_t$ and $b_t$ is independently and identically distributed with a zero mean; and their moments satisfy the following conditions: $E|a_t|^p < \infty$, with $p \ge \max\{4, -8d_1/(1+2d_1)\}$; and $E|b_t|^q < \infty$, with $q \ge \max\{4, -8d_2/(1+2d_2)\}$. Moreover, $a_t$ and $b_t$ are independent of each other.
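To see what the moment condition in Assumption 1 demands, the bound $\max\{4, -8d/(1+2d)\}$ can be evaluated for a few values of $d$. The short Python sketch below is illustrative only, and the function name `min_moment_order` is hypothetical.

```python
def min_moment_order(d):
    """Smallest moment order allowed by Assumption 1 for a given d.

    Evaluates max{4, -8d / (1 + 2d)} for -0.5 < d < 0.5.
    """
    if not -0.5 < d < 0.5:
        raise ValueError("d must lie in (-0.5, 0.5)")
    return max(4.0, -8.0 * d / (1.0 + 2.0 * d))


# The bound exceeds 4 only when d < -0.25, and it grows without limit
# as d approaches -0.5.
orders = {d: min_moment_order(d) for d in (-0.45, -0.4, -0.25, 0.0, 0.4)}
```

For the long-memory range $d > 0$ that is the focus of the analysis, the condition therefore reduces to finite fourth moments.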

We also assume, without loss of generality, that the initial values of the fractionally integrated processes $v_0$, $w_0$, $y_0$, and $x_0$ are all zero. Hence, $y_t$ and $x_t$ can be considered as the partial sums of $v_t$ and $w_t$, respectively; i.e., $y_T = \sum_{t=1}^{T} v_t$ and $x_T = \sum_{t=1}^{T} w_t$. The independent and identical distribution assumption is made to simplify our analysis and could be relaxed, say, to the case where $a_t$ and $b_t$ are short-memory processes. See Chung (1995).

Before presenting Lemma 1, which is the cornerstone of our analysis, let us summarize two important asymptotic results on the partial sums $y_t$ and $x_t$. First, given the variances $\sigma_y^2 = \mathrm{Var}(y_T)$ and $\sigma_x^2 = \mathrm{Var}(x_T)$, Sowell (1990, Theorem 1) proves that

$$\sigma_y^2 \sim \sigma_a^2 \, \frac{\Gamma(1-2d_1)}{(1+2d_1)\,\Gamma(1+d_1)\,\Gamma(1-d_1)} \, T^{1+2d_1}$$

and

$$\sigma_x^2 \sim \sigma_b^2 \, \frac{\Gamma(1-2d_2)}{(1+2d_2)\,\Gamma(1+d_2)\,\Gamma(1-d_2)} \, T^{1+2d_2},$$

where $z_T \sim w_T$ means $z_T/w_T \to 1$ as $T \to \infty$. Furthermore, Davydov (1970) shows that as $T \to \infty$,

$$\frac{1}{\sigma_y}\, y_{[Tr]} \Rightarrow B_{d_1}(r) \quad\text{and}\quad \frac{1}{\sigma_x}\, x_{[Tr]} \Rightarrow B_{d_2}(r),$$

for $r \in [0, 1]$, where $[Tr]$ denotes the integer part of $Tr$, the notation $\Rightarrow$ denotes weak convergence, and $B_d(r)$ is the normalized fractional Brownian motion, which is defined by the following stochastic integral: [...] $B_0(t)$ is the standard Brownian motion. See Mandelbrot and Van Ness (1968). Our notation for the standard and the fractional versions of Brownian motions suggests that the former is a special case of the latter with $d = 0$.
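The asymptotic variance of the partial sums quoted above from Sowell (1990) is easy to evaluate numerically. The following Python sketch is only a convenience under the assumption that the asymptotic expression is used in place of the exact finite-sample variance; the function name `partial_sum_variance` is hypothetical.

```python
import math


def partial_sum_variance(T, d, sigma2=1.0):
    """Asymptotic approximation to Var(sum_{t=1}^T v_t) for an I(d) process.

    sigma2 * Gamma(1 - 2d) / [(1 + 2d) Gamma(1 + d) Gamma(1 - d)] * T^(1 + 2d),
    valid for -0.5 < d < 0.5. This is the large-T approximation, not the
    exact finite-sample variance.
    """
    const = math.gamma(1 - 2 * d) / (
        (1 + 2 * d) * math.gamma(1 + d) * math.gamma(1 - d)
    )
    return sigma2 * const * T ** (1 + 2 * d)


# With d = 0 the constant equals one and the variance is sigma2 * T,
# the familiar random-walk case.
random_walk_case = partial_sum_variance(100, 0.0)
```

Quadrupling the sample size multiplies the variance by $4^{1+2d}$, so $\sigma_y = O(T^{0.5+d_1})$ and $\sigma_x = O(T^{0.5+d_2})$, the orders used repeatedly in the theorems that follow.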

The independence between $v_t$ and $w_t$ and between $y_t$ and $x_t$ implies joint weak [...] motion, of which the two elements $B_{d_1}$ and $B_{d_2}$ are independent. This result implies the following lemma:

Lemma 1. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results:


Moreover, joint weak convergence of 1–4, 8, 10, and 11 also applies. Here, $B_{d_1}(t)$ and $B_{d_2}(t)$ are two independent normalized fractional Brownian motions, $\gamma_v(j)$ and $\gamma_w(j)$ are the autocovariance functions of $v_t$ and $w_t$, respectively, at lag $j$, and $\sigma_a^2$ and $\sigma_b^2$ are the variances of the underlying white noises $a_t$ and $b_t$, respectively. The notation $\xrightarrow{p}$ means convergence in probability.

All the theorem proofs are in the appendix. Note that, while the weak convergence of $z_t$ is in the space of functions, the weak convergence established in the above lemma is on the real line, which is equivalent to convergence in distribution. Following the convention in the literature, we use the same notation $\Rightarrow$ for both types of weak convergence.

In the rest of this section the results of Lemma 1 will be used to develop the theory of spurious effects, presented in a series of theorems and corollaries, for the proposed six models. The first two models will be discussed separately in Sections 2.1 and 2.2. These two models provide us with a framework which facilitates the explanations of the other four models in Sections 2.3 and 2.5. One subsection, Section 2.4, will be devoted to the analysis of an important issue about how the orders of fractional integration are directly related to the spurious effects. The results in Lemma 1 have also been used in deriving the limiting distributions of the 'modified Durbin–Watson statistics' by Tsay (1998).

We will adopt the following notation for the various statistics from the Ordinary Least Squares (OLS) estimation. Let $\hat\alpha$ and $\hat\beta$ denote the usual OLS estimators of the intercept and the slope. Their respective variances are estimated by $s_\alpha^2$ and $s_\beta^2$, from which we have the $t$ ratios $t_\beta = \hat\beta/s_\beta$ and $t_\alpha = \hat\alpha/s_\alpha$.

of determination, and $DW$ the Durbin–Watson statistic. Finally, in addition to the autocovariance functions $\gamma_v(j)$ and $\gamma_w(j)$ of $v_t$ and $w_t$, let $\rho_v(j)$ and $\rho_w(j)$ be their respective autocorrelations at lag $j$.
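The OLS statistics named in this notation can be computed in a few lines. The Python sketch below is a plain textbook implementation, given here only for concreteness; the function name `ols_stats` is hypothetical, and the residual variance is divided by $T-2$, which is an assumption (dividing by $T$ instead makes no difference asymptotically).

```python
import numpy as np


def ols_stats(y, x):
    """Simple regression of y on an intercept and x.

    Returns alpha_hat, beta_hat, s2, t_alpha, t_beta, R2 and DW.
    Assumption: s2 uses T - 2 degrees of freedom.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    T = y.size
    xd = x - x.mean()
    yd = y - y.mean()
    sxx = np.sum(xd ** 2)
    beta = np.sum(xd * yd) / sxx
    alpha = y.mean() - beta * x.mean()
    resid = y - alpha - beta * x
    ssr = np.sum(resid ** 2)
    s2 = ssr / (T - 2)
    s_beta = np.sqrt(s2 / sxx)
    s_alpha = np.sqrt(s2 * (1.0 / T + x.mean() ** 2 / sxx))
    return {
        "alpha": alpha,
        "beta": beta,
        "s2": s2,
        "t_alpha": alpha / s_alpha,
        "t_beta": beta / s_beta,
        "R2": 1.0 - ssr / np.sum(yd ** 2),
        "DW": np.sum(np.diff(resid) ** 2) / ssr,
    }


stats = ols_stats([1.0, 3.0, 2.0, 5.0], [0.0, 1.0, 2.0, 3.0])
```

Applying `ols_stats` to simulated series with growing sample sizes is one way to observe the divergence or convergence rates stated in the theorems below.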

2.1. Model 1 of nonstationary fractionally integrated processes

In Model 1 a nonstationary $I(1+d_1)$ process $y_t$ is regressed on another independent and nonstationary $I(1+d_2)$ process $x_t$. Since the permissible range for the values of the fractional differencing parameters $d_1$ and $d_2$ is $(-0.5, 0.5)$, Model 1 generalizes Phillips' (1986) model of integrated processes in which $d_1 = d_2 = 0$. Not surprisingly, all the results we derive for Model 1 are straightforward generalizations of Phillips' theory of the spurious effects. The results for Model 1 are presented in the following theorem:

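Theorem 1. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results:

1. $\dfrac{\sigma_x}{\sigma_y}\,\hat\beta \Rightarrow \dfrac{\int_0^1 B_{d_1}(s)B_{d_2}(s)\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]\left[\int_0^1 B_{d_2}(s)\,ds\right]}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2} \equiv \beta_*$. Note that $\sigma_y/\sigma_x = O(T^{d_1-d_2})$.

2. $\dfrac{1}{\sigma_y}\,\hat\alpha \Rightarrow \int_0^1 B_{d_1}(s)\,ds - \beta_* \int_0^1 B_{d_2}(s)\,ds \equiv \alpha_*$, where $\beta_*$ is defined in 1. Note that $\sigma_y = O(T^{0.5+d_1})$.

3. $\dfrac{1}{\sigma_y^2}\,s^2 \Rightarrow \int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2 - \beta_*^2 \left\{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2\right\} \equiv \sigma_*^2$, where $\beta_*$ is defined in 1. Note that $\sigma_y^2 = O(T^{1+2d_1})$.

4. $\dfrac{T\sigma_x^2}{\sigma_y^2}\,s_\beta^2 \Rightarrow \dfrac{\sigma_*^2}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2} \equiv \sigma_{*\beta}^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/(T\sigma_x^2) = O(T^{2d_1-2d_2-1})$.

5. $\dfrac{T}{\sigma_y^2}\,s_\alpha^2 \Rightarrow \sigma_*^2 \left\{1 + \dfrac{\left[\int_0^1 B_{d_2}(s)\,ds\right]^2}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2}\right\} \equiv \sigma_{*\alpha}^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T = O(T^{2d_1})$.

6. $\dfrac{1}{\sqrt{T}}\,t_\beta \Rightarrow \dfrac{\beta_*}{\sigma_{*\beta}}$, where $\beta_*$ is defined in 1 and $\sigma_{*\beta}^2$ is defined in 4.

7. $\dfrac{1}{\sqrt{T}}\,t_\alpha \Rightarrow \dfrac{\alpha_*}{\sigma_{*\alpha}}$, where $\alpha_*$ is defined in 2 and $\sigma_{*\alpha}^2$ is defined in 5.

8. $R^2 \Rightarrow \dfrac{\beta_*^2 \left\{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2\right\}}{\int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2}$, where $\beta_*$ is defined in 1.

9. $DW \xrightarrow{p} 0$.

Here, $B_{d_1}(t)$ and $B_{d_2}(t)$ are two independent normalized fractional Brownian motions.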

The most important result in Theorem 1 is that, as the sample size $T$ increases, the two $t$ ratios $t_\beta$ and $t_\alpha$ diverge at the same rate of $\sqrt{T}$, which is independent of the magnitudes of the fractional differencing parameters $d_1$ and $d_2$. This result is exactly the same as what Phillips (1986) has obtained for the case where $d_1 = d_2 = 0$. So even when the orders of integration in the dependent variable and the regressor differ from 1 by as much as 0.5, the usual problem in using the $t$ tests remains: the probability of rejecting the null hypothesis of $\beta = 0$ or $\alpha = 0$ based on $t$ tests increases monotonically as the sample size increases. Also note that Marmol (1995) generalizes Phillips' theory to cases where both $y_t$ and $x_t$ are integrated processes of the same integer orders that are higher than one. The limiting distributions in Theorem 1 for the special case where $d_1 = d_2$ are also very similar to Marmol's results.

The limiting distributions of the $t$ ratios, after normalization by $\sqrt{T}$, are direct generalizations of those derived by Phillips (1986). The same conclusion also holds for $R^2$ and the $DW$ statistic. In other words, when we compare our results with Phillips', we observe a common feature in these four statistics; namely, the nonzero values of $d_1$ and $d_2$ do not affect their convergence rates, while the effects on their limiting distributions are quite straightforward: all the standard Brownian motions in Phillips' theory are replaced by fractional Brownian motions. That the fractional differencing parameters $d_1$ and $d_2$ play a relatively minor role here is mainly because the four statistics are all ratios, so that the effects of $d_1$ and $d_2$ cancel out. In contrast, the results on the OLS estimators $\hat\beta$ and $\hat\alpha$ are a different story. In Phillips' theory both $\hat\beta$ and $\hat\alpha/\sqrt{T}$ converge to some non-normal non-degenerate limiting distributions. But for the present model of the fractionally integrated processes, the orders of $\hat\beta$ and $\hat\alpha$ are $T^{d_1-d_2}$ and $T^{d_1+0.5}$, respectively. So while $\hat\alpha$ always diverges (though the rate can be slow if $d_1$ is close to $-0.5$), $\hat\beta$ can be either divergent or convergent, depending on the relative magnitudes of $d_1$ and $d_2$. For example, if the order of integration in the dependent variable $y_t$ is smaller than that of the regressor $x_t$, i.e., $d_1 < d_2$, then $\hat\beta$ converges to zero, just like the conventional case of no spurious effects. Moreover, if $d_1 - d_2 = -0.5$, then, similar to the case of no spurious effects,


2.2. Model 2 of stationary fractionally integrated processes

In this section we consider Model 2, in which a stationary fractionally integrated process $v_t$ is regressed on an independent and stationary fractionally integrated process $w_t$. We show that, although both $v_t$ and $w_t$ are stationary, the spurious effect in terms of the $t$ tests could still exist under an additional condition on the fractional differencing parameters: $d_1 + d_2 > 0.5$. Loosely speaking, this condition implies that the two processes $v_t$ and $w_t$ are both strongly persistent.

Our analysis begins with a special case where we assume a set of more stringent conditions that help derive the limiting distribution of the OLS estimator. This theory is based on an important result of Fox and Taqqu (1987), who show that the product of two highly persistent but stationary Gaussian processes, if adequately normalized, can converge. After examining this special case, we then show how the spurious effects may still exist in a more general framework.

Let us first reproduce Fox and Taqqu's (1987) Theorem 6.1 here as Lemma 2.

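Lemma 2. Let $(X_t, Y_t)$ be a stationary jointly Gaussian sequence with $E(X_t) = E(Y_t) = 0$, $E(X_t^2) = E(Y_t^2) = 1$, and $E(X_t Y_t) = r$. Suppose that $\sigma_1$ and $\sigma_2$ are two arbitrary real numbers and that there exist $0 < \delta_1, \delta_2 < 0.5$, such that as $j \to \infty$

$$E(X_t X_{t+j}) \sim \sigma_1^2\, j^{-\delta_1}, \qquad E(X_t Y_{t+j}) \sim \frac{\rho\,\sigma_1\sigma_2\, b_1}{\sqrt{a_1 a_2}}\, j^{-(\delta_1+\delta_2)/2},$$

$$E(Y_t Y_{t+j}) \sim \sigma_2^2\, j^{-\delta_2}, \qquad E(Y_t X_{t+j}) \sim \frac{\rho\,\sigma_1\sigma_2\, b_2}{\sqrt{a_1 a_2}}\, j^{-(\delta_1+\delta_2)/2},$$

where $\rho$ is a constant between $-1$ and $1$, while $a_1 = A(\delta_1, \delta_1)$, $a_2 = A(\delta_2, \delta_2)$, $b_1 = A(\delta_1, \delta_2)$, and $b_2 = A(\delta_2, \delta_1)$ are four constants with $A(\delta_1, \delta_2)$ being defined by $\int_0^{\infty} x^{-(\delta_1+1)/2}(x+1)^{-(\delta_2+1)/2}\,dx$, then

$$\frac{1}{T^{1-(\delta_1+\delta_2)/2}} \sum_{t=1}^{[Ts]} (X_t Y_t - r) \Rightarrow Z(s),$$

where

$$Z(s) = \frac{\sigma_1\sigma_2}{\sqrt{a_1 a_2}} \int_{\mathbb{R}^2} \int_0^s \left[\prod_{i=1}^{2} (u - x_i)^{-(\delta_i+1)/2}\, I\{x_i < u\}\right] du\, dM_1(x_1)\, dM_2(x_2).$$

Here, $M_1$ and $M_2$ are two Gaussian random measures with respect to Lebesgue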


Note that the two processes $X_t$ and $Y_t$ are not only strongly persistent, as indicated by the hyperbolic convergence rates $\delta_1$ and $\delta_2$ in their autocorrelations, but also highly correlated with each other, as indicated by the hyperbolic convergence rates in their covariances. However, in our application we are only interested in the case where $X_t$ and $Y_t$ are independent, so that $r$ and $\rho$ in the above lemma are both zero. The above lemma offers us the convergence rate of $\sum_{t=1}^{T} X_t Y_t$ and its limiting process $Z(t)$ given the Gaussian assumption and a narrower range for the parameters $d_1$ and $d_2$. In order to apply this lemma, we make the following assumption in addition to Assumption 1 made earlier.

Assumption 2. The two fractionally integrated processes $v_t$ and $w_t$ are both Gaussian and the corresponding fractional differencing parameters $d_1$ and $d_2$ are both in the range of $(0.25, 0.5)$.

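Given the facts that

$$\rho_v(j) \sim \frac{\Gamma(1-d_1)}{\Gamma(d_1)}\, j^{2d_1-1} \quad\text{and}\quad \rho_w(j) \sim \frac{\Gamma(1-d_2)}{\Gamma(d_2)}\, j^{2d_2-1},$$

it is straightforward to prove the following corollary, in which $X_t$ and $Y_t$ in Lemma 2 are replaced by $v_t/\sqrt{\gamma_v(0)}$ and $w_t/\sqrt{\gamma_w(0)}$, respectively.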
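The quality of the hyperbolic approximation to the autocorrelations can be checked numerically. The sketch below compares it with the standard closed form for the autocorrelation of an $I(d)$ process driven by white noise, $\rho(j) = \Gamma(1-d)\Gamma(j+d)/[\Gamma(d)\Gamma(j+1-d)]$; this closed form is not stated in the text above and is brought in here as an outside fact. The function names are hypothetical, and the code is restricted to $0 < d < 0.5$ so that all gamma arguments are positive.

```python
import math


def rho_exact(d, j):
    """Autocorrelation at lag j of an I(d) process, 0 < d < 0.5.

    Uses log-gamma to avoid overflow at large lags.
    """
    log_rho = (
        math.lgamma(1 - d) + math.lgamma(j + d)
        - math.lgamma(d) - math.lgamma(j + 1 - d)
    )
    return math.exp(log_rho)


def rho_asymptotic(d, j):
    """Hyperbolic approximation Gamma(1 - d) / Gamma(d) * j^(2d - 1)."""
    return math.exp(math.lgamma(1 - d) - math.lgamma(d)) * j ** (2 * d - 1)


# Relative error of the approximation at a few lags, for d = 0.3.
rel_error = {
    j: abs(rho_asymptotic(0.3, j) / rho_exact(0.3, j) - 1.0)
    for j in (10, 100, 10000)
}
```

In the notation of Lemma 2, the approximation corresponds to $\delta_i = 1 - 2d_i$ and $\sigma_i^2 = \Gamma(1-d_i)/\Gamma(d_i)$, so the requirement $0 < \delta_i < 0.5$ translates into $0.25 < d_i < 0.5$, the range imposed by Assumption 2.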

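Corollary 1. Let Assumptions 1 and 2 hold. Then, as $T \to \infty$,

$$\frac{1}{T^{d_1+d_2}} \sum_{t=1}^{T} \frac{v_t}{\sqrt{\gamma_v(0)}}\, \frac{w_t}{\sqrt{\gamma_w(0)}} \Rightarrow Z(1),$$

where the limiting random variable $Z(1)$ is defined in Lemma 2 with $\delta_1 = 1 - 2d_1$, $\delta_2 = 1 - 2d_2$, $\sigma_1^2 = \Gamma(1-d_1)/\Gamma(d_1)$, and $\sigma_2^2 = \Gamma(1-d_2)/\Gamma(d_2)$. Consequently, we have

$$T \sum_{t=1}^{T} \frac{v_t}{\sigma_y}\, \frac{w_t}{\sigma_x} \Rightarrow C \sqrt{\gamma_v(0)\,\gamma_w(0)}\; Z(1),$$

where $C^2 = \Gamma(1-2d_1)\Gamma(1-2d_2) / [(1+2d_1)\Gamma(1+d_1)\Gamma(1-d_1)(1+2d_2)\Gamma(1+d_2)\Gamma(1-d_2)]$.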

The result of Corollary 1 supplements that of item 7 in Lemma 1. From these results, we can then establish the following theorem about the spurious effect in Model 2.

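Theorem 2. Let Assumptions 1 and 2 hold. Then, as $T \to \infty$, we have the following results:

1. $\dfrac{T^2}{\sigma_y\sigma_x}\,\hat\beta = O_p(1)$. Note that $\sigma_y\sigma_x/T^2 = O(T^{d_1+d_2-1})$.

2. $\dfrac{T}{\sigma_y}\,\hat\alpha \Rightarrow B_{d_1}(1)$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.

3. $s^2 \xrightarrow{p} \gamma_v(0)$.

4. $T s_\beta^2 \xrightarrow{p} \dfrac{\gamma_v(0)}{\gamma_w(0)}$.

5. $T s_\alpha^2 \xrightarrow{p} \gamma_v(0)$.

6. $\dfrac{T^{3/2}}{\sigma_y\sigma_x}\,t_\beta = O_p(1)$. Note that $\sigma_y\sigma_x/T^{3/2} = O(T^{d_1+d_2-0.5})$.

7. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\alpha \Rightarrow \dfrac{1}{\sqrt{\gamma_v(0)}}\,B_{d_1}(1)$. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.

8. $\dfrac{T^4}{\sigma_y^2\sigma_x^2}\,R^2 = O_p(1)$. Note that $\sigma_y^2\sigma_x^2/T^4 = O(T^{2d_1+2d_2-2})$.

9. $DW \xrightarrow{p} 2 - 2\rho_v(1) = \dfrac{2(1-2d_1)}{1-d_1}$.

Here $B_{d_1}(t)$ is a normalized fractional Brownian motion.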

The most important result from this theorem is the divergence rates of the two $t$ ratios $t_\beta$ and $t_\alpha$, which are $T^{d_1+d_2-0.5}$ and $T^{d_1}$, respectively. Recall that $d_1 + d_2 - 0.5$ is necessarily greater than 0 (and smaller than 0.5) under Assumption 2. This result reflects the spurious effect in the $t$ tests. Since both the dependent variable and the regressor are stationary and ergodic, the spurious effect is not really expected (see Phillips 1986, p. 318). The surprising results we get here suggest that the cause for the spurious effect has more to do with the strong persistence than with the stationarity and ergodicity of the variables involved.

It is interesting to compare the divergence rates of the $t$ ratios here with the $\sqrt{T}$ rate we observe in Model 1. We note that the divergence rates in the present model depend on the magnitudes of the two fractional differencing parameters $d_1$ and $d_2$, while those in Model 1 do not. Furthermore, the $t$ ratios diverge more slowly in the present model than in Model 1. In particular, the divergence rate of $t_\beta$ can become very slow when both $d_1$ and $d_2$ approach their lower bound of 0.25.


contrast to these irregular convergence rates of the OLS estimators, the estimated variances $s_\beta^2$ and $s_\alpha^2$ nevertheless converge at the standard $T^{-1}$ rate. It is such disparity in the convergence rates between the OLS estimators, which converge at rates slower than $T^{-1/2}$, and their standard errors, which converge at the standard $T^{-1/2}$ rates, that causes the resulting $t$ ratios to diverge and hence the spurious effect.

$R^2$ in the present model converges to 0 as in the case of no spurious effects. It is different from what we observe in Model 1, where $R^2$ converges to a random variable. Consequently, as the sample size increases, the declining $R^2$ in the present model will correctly reflect the fact that the regressor does not help explain the variations in the dependent variable.

The $DW$ statistic does not converge in probability to zero, and this result is also different from that of Model 1. Its limit $2 - 2\rho_v(1)$ is similar to the one we find in the conventional AR(1) case. This limit depends on the fractional differencing parameter $d_1$ of the dependent variable $v_t$ and can only take values in the range of $(0, 4/3)$, which is to the left of the value 2.
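The range $(0, 4/3)$ follows directly from the expression $2 - 2\rho_v(1) = 2(1-2d_1)/(1-d_1)$ in Theorem 2 evaluated over $0.25 < d_1 < 0.5$. A minimal Python check, with the hypothetical function name `dw_limit`, uses the lag-one autocorrelation $\rho_v(1) = d_1/(1-d_1)$ implied by that expression.

```python
def dw_limit(d1):
    """Probability limit of the Durbin-Watson statistic in Model 2.

    Equals 2 - 2 * rho_v(1) with rho_v(1) = d1 / (1 - d1), which simplifies
    to 2 * (1 - 2 * d1) / (1 - d1).
    """
    rho1 = d1 / (1.0 - d1)
    return 2.0 - 2.0 * rho1


# End points of the range permitted by Assumption 2, plus the white-noise case.
upper = dw_limit(0.25)   # approaches 4/3 as d1 decreases to 0.25
lower = dw_limit(0.5)    # approaches 0 as d1 increases to 0.5
white_noise = dw_limit(0.0)
```

The limit decreases monotonically in $d_1$, so stronger long memory in the dependent variable pushes the statistic further below 2.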

There is one technical detail that calls for some explanation: unlike in Theorem 1, we do not give an explicit expression for the limiting distribution of $\hat\beta$ in Theorem 2 because such an expression requires the joint weak convergence of the sample averages of $v_t$, $w_t$, and $v_t w_t$, while the proof of the joint weak convergence is beyond the scope of the present paper. However, the lack of an explicit expression for the limiting distribution of $\hat\beta$ does not hinder us from evaluating the convergence rate of $\hat\beta$, which is all we need to show the spurious effects in the $t$ ratios. This same argument applies to some of the later analyses, including the following one, where we consider a less restricted specification of Model 2 that is defined by the following assumption.

Assumption 3. The sum of the two fractional differencing parameters $d_1$ and $d_2$ is greater than 0.5.

Since the Gaussian distribution is not assumed while one of the fractional differencing parameters $d_1$ and $d_2$ can be smaller than 0.25, Assumption 3 is thus less stringent than Assumption 2.

Corollary 2. If Assumption 2 is replaced by Assumption 3 in Theorem 2, then all the conclusions there remain true.


dependent variable and the regressor are both stationary and ergodic (so long as they are sufficiently persistent).

A profound implication from Model 2 is as follows: if we begin with Model 1, where both the dependent variable and the regressor are nonstationary fractionally integrated processes with the orders of integration $1+d_1$ and $1+d_2$, respectively, where $d_1 + d_2 > 0.5$, then first-differencing both variables cannot completely eliminate the spurious effects. While $R^2$ may be reduced and the $DW$ statistic may be increased, the $t$ ratios may still be so large that we cannot avoid making a spurious inference. This is a fairly serious problem with the regression for the fractionally integrated processes. It implies that even the popular first-differencing procedure might not prevent us from finding a spurious relationship among highly persistent data series. One lesson we learn from this discussion is that it is very important to check individual data series for possible long memory before regression can be applied.
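Since the first differences of $y_t$ and $x_t$ are exactly $v_t$ and $w_t$, the point can be illustrated by a small Monte Carlo experiment on Model 2. The Python sketch below is not the simulation design of the paper: it assumes Gaussian innovations, a truncated moving-average filter, a nominal two-sided 5% test with critical value 1.96, and arbitrary illustrative values for $d_1$, $d_2$, $T$, and the number of replications. The names `simulate_fi`, `t_beta`, and `rejection_rate` are hypothetical.

```python
import numpy as np


def simulate_fi(d, T, rng, burn=1000):
    """I(d) series from Gaussian noise and a truncated (1 - L)^(-d) filter."""
    j = np.arange(1, burn + 1)
    theta = np.concatenate(([1.0], np.cumprod((j - 1 + d) / j)))
    return np.convolve(rng.standard_normal(T + burn), theta, mode="valid")


def t_beta(y, x):
    """Conventional OLS t ratio of the slope in a regression with intercept."""
    xd = x - x.mean()
    yd = y - y.mean()
    sxx = xd @ xd
    beta = (xd @ yd) / sxx
    resid = yd - beta * xd
    s2 = (resid @ resid) / (len(y) - 2)
    return beta / np.sqrt(s2 / sxx)


def rejection_rate(d1, d2, T, reps=200, crit=1.96, seed=0):
    """Share of replications in which |t_beta| exceeds the critical value."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        v = simulate_fi(d1, T, rng)
        w = simulate_fi(d2, T, rng)   # generated independently of v
        hits += int(abs(t_beta(v, w)) > crit)
    return hits / reps


rate_long_memory = rejection_rate(0.45, 0.45, T=400)   # d1 + d2 > 0.5
rate_white_noise = rejection_rate(0.0, 0.0, T=400)     # benchmark, d1 = d2 = 0
```

Under these assumptions the benchmark rate should stay near the nominal level, while the long-memory rate should be well above it even though both series are stationary; repeating the experiment with larger `T` gives a rough view of the $T^{d_1+d_2-0.5}$ divergence.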

2.3. Two intermediate cases: Models 3 and 4

Models 3 and 4 can be considered as two intermediate models between Models 1 and 2 in that one of the dependent variable and the regressor is stationary while the other is not. We expect the asymptotic results for these two new models to be a hybrid of those of Models 1 and 2.

In Model 3 a nonstationary $I(1+d_1)$ process $y_t$ is regressed on an independent and stationary $I(d_2)$ process $w_t$. Note that the fractional differencing parameter $d_2$ for the regressor $w_t$ here is assumed to be positive; i.e., $w_t$ has long memory. The asymptotic properties of the OLS estimators for Model 3 are given in the following theorem:

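Theorem 3. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results.

1. $\dfrac{T}{\sigma_y\sigma_x}\,\hat\beta = O_p(1)$. Note that $\sigma_y\sigma_x/T = O(T^{d_1+d_2})$.

2. $\dfrac{1}{\sigma_y}\,\hat\alpha \Rightarrow \int_0^1 B_{d_1}(s)\,ds \equiv \alpha_*$. Note that $\sigma_y = O(T^{0.5+d_1})$.

3. $\dfrac{1}{\sigma_y^2}\,s^2 \Rightarrow \int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2 \equiv \sigma_*^2$. Note that $\sigma_y^2 = O(T^{1+2d_1})$.

4. $\dfrac{T}{\sigma_y^2}\,s_\beta^2 \Rightarrow \dfrac{\sigma_*^2}{\gamma_w(0)}$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T = O(T^{2d_1})$.

5. $\dfrac{T}{\sigma_y^2}\,s_\alpha^2 \Rightarrow \sigma_*^2$, where $\sigma_*^2$ is defined in 3.

6. $\dfrac{\sqrt{T}}{\sigma_x}\,t_\beta = O_p(1)$. Note that $\sigma_x/\sqrt{T} = O(T^{d_2})$.

7. $\dfrac{1}{\sqrt{T}}\,t_\alpha \Rightarrow \dfrac{\alpha_*}{\sigma_*}$, where $\alpha_*$ is defined in 2 and $\sigma_*^2$ is defined in 3.

8. $\dfrac{T^2}{\sigma_x^2}\,R^2 = O_p(1)$. Note that $\sigma_x^2/T^2 = O(T^{2d_2-1})$.

9. $DW \xrightarrow{p} 0$.

Here $B_{d_1}(t)$ is a normalized fractional Brownian motion.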

Since both $t$ ratios diverge, Model 3 also suffers from the spurious effect in terms of the $t$ tests. Moreover, we find that the results that the OLS estimator $\hat\alpha$ diverges and that $DW$ converges in probability to 0 are close to what we get in Model 1, while the result of a converging $R^2$ is the same as that of Model 2. So Model 3 is indeed a mixture of Models 1 and 2.

It should be pointed out that in Theorem 3 the range of the fractional differencing parameter $d_2$ of the regressor $w_t$ is restricted to the positive half of the original range $(-0.5, 0.5)$. For the case of a negative $d_2$, it is quite straightforward to show that the $t$ ratios are convergent and there is no spurious effect.

In Model 4 a stationary $I(d_1)$ process $v_t$ is regressed on an independent and nonstationary $I(1+d_2)$ process $x_t$. Similar to the restriction imposed on Model 3, the fractional differencing parameter $d_1$ of the dependent variable $v_t$ is assumed to be positive so that $v_t$ has long memory. The asymptotic theory for Model 4 is presented in the following theorem.

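Theorem 4. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results.

1. $\dfrac{T\sigma_x}{\sigma_y}\,\hat\beta = O_p(1)$. Note that $\sigma_y/(T\sigma_x) = O(T^{d_1-d_2-1})$.

2. $\dfrac{T}{\sigma_y}\,\hat\alpha = O_p(1)$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.

3. $s^2 \xrightarrow{p} \gamma_v(0)$.

4. $(T\sigma_x^2)\,s_\beta^2 \Rightarrow \dfrac{\gamma_v(0)}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2} \equiv \sigma_{*\beta}^2$. Note that $1/(T\sigma_x^2) = O(T^{-2-2d_2})$.

5. $T s_\alpha^2 \Rightarrow \gamma_v(0)\left\{1 + \dfrac{\left[\int_0^1 B_{d_2}(s)\,ds\right]^2}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2}\right\} \equiv \sigma_{*\alpha}^2$.

6. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\beta = O_p(1)$. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.

7. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\alpha = O_p(1)$. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.

8. $\dfrac{T^2}{\sigma_y^2}\,R^2 = O_p(1)$. Note that $\sigma_y^2/T^2 = O(T^{2d_1-1})$.

9. $DW \xrightarrow{p} 2 - 2\rho_v(1) = \dfrac{2(1-2d_1)}{1-d_1}$.

Here $B_{d_2}(t)$ is a normalized fractional Brownian motion.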

Since both $t$ ratios diverge (at the same rate), the spurious effect in terms of failing $t$ tests again exists in Model 4. But contrary to the results in Model 3, the OLS estimators $\hat\beta$ and $\hat\alpha$, together with $R^2$, all converge in probability to zero, while the $DW$ statistic converges to $2 - 2\rho_v(1)$. These findings obviously bring Model 4 closer to Model 2 than to Model 1.

2.4. The relationship between the orders of integration and the divergence rates

The divergent $t$ ratios in the above four models and the resulting failure of the $t$ tests are referred to as the spurious effects. In this section we compare the divergence rates of the $t$ ratios across the four models and investigate how they are related to the respective model specifications.


Recall that in Models 2–4 restrictions have been imposed on the usual ranges $(-0.5, 0.5)$ of the fractional differencing parameters $d_1$ and $d_2$. In Model 3 the range of $d_2$ is restricted to be $(0, 0.5)$, which is also the range of $d_1$ in Model 4, while the sum of $d_1$ and $d_2$ must be greater than 0.5 in Model 2. From the analysis in the previous paragraph, particularly the fact that the divergence rates are directly related to the magnitudes of $d_1$ and $d_2$, we come to realize that the restricted ranges of $d_1$ and $d_2$ in Models 2–4 ensure that the reduction in the divergence rates of $t_\beta$ from the $T^{0.5}$ level is not too great, so that $t_\beta$ remains divergent (in which case the spurious effects occur). Although we did not explicitly consider the asymptotic theory for cases where the fractional differencing parameters lie outside their prescribed ranges, it is readily seen that the conditions we impose on the ranges are not only sufficient but also necessary for the existence of the spurious effect in terms of divergent $t_\beta$.

From a similar analysis for the divergence rates of the $t$ ratio $t_\alpha$ we also find that reducing the order of integration in the dependent variable causes the divergence rate of $t_\alpha$ to decrease by the order of $T^{d_1-0.5}$, while reducing the order of integration in the regressor does not cause the divergence rate of $t_\alpha$ to change, as we probably should have expected.

It is also interesting to see how the changes in the orders of integration of the dependent variable and the regressor affect the large-sample property of $R^2$. Recall that in Model 1 $R^2$ converges to a random variable and such asymptotic behavior of $R^2$ is considered part of the spurious effect by Phillips (1986). But when we examine Models 2–4, we note that reducing the order of integration in the dependent variable helps to increase its convergence rate by the order of $T^{1-2d_1}$, while reducing the order of integration in the regressor helps to increase the convergence rate by the order of $T^{1-2d_2}$. As a result, in Models 2–4, $R^2$ converges to 0 in every case, correctly reflecting the fact that there is no relationship between the regressor and the dependent variable. This finding implies that the spurious effects in Models 2–4 are confined to the two $t$ ratios, while the asymptotic tendency of $R^2$ toward zero is not affected by the spurious effects (though the convergence rates are).

The sharp difference in the asymptotic behavior between the $t$ ratios and $R^2$ in Models 2-4 actually offers us an opportunity to diagnose the spurious effect in these models. That is, when we find two highly significant $t$ ratios coexisting with a completely contradictory near-zero $R^2$, we are effectively reminded of the possibilities that one of Models 2-4 may be at work and that the dependent variable and the regressor may possess strong long memory, while one of them may even be nonstationary. With the possibility of such an informal diagnosis, it seems that the spurious effects in Models 2-4 are less damaging than those in Model 1 in the sense that in Model 1 there is no internal inconsistency among the OLS estimates to indicate the spurious effects.
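The informal diagnosis described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the series are generated by a truncated (type II) moving-average filter of Gaussian white noise, the values $d_1 = d_2 = 0.4$ are chosen purely for illustration so that $d_1 + d_2 > 0.5$, and the function names `frac_noise` and `ols_summary` are hypothetical helpers that do not come from the paper.

```python
import numpy as np


def frac_noise(n, d, rng):
    """Simulate n observations of (1 - L)^(-d) applied to Gaussian white noise.

    Assumption: a truncated (type II) filter started at zero is used.
    """
    k = np.arange(1, n)
    # MA(infinity) weights: psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    eps = rng.standard_normal(n)
    # x_t = sum_{k=0}^{t} psi_k * eps_{t-k}
    return np.convolve(eps, psi)[:n]


def ols_summary(y, x):
    """OLS of y on a constant and x; returns (t_alpha, t_beta, R^2, DW)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (n - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    t_alpha, t_beta = coef / np.sqrt(np.diag(cov))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    dw = np.sum(np.diff(resid) ** 2) / (resid @ resid)
    return float(t_alpha), float(t_beta), float(r2), float(dw)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d1, d2 = 0.4, 0.4  # illustrative values only, with d1 + d2 > 0.5
    for T in (250, 1000, 2000):
        draws = np.array([
            ols_summary(frac_noise(T, d1, rng), frac_noise(T, d2, rng))
            for _ in range(100)
        ])
        # Average |t_beta| and average R^2 across independent replications
        print(T, np.mean(np.abs(draws[:, 1])), np.mean(draws[:, 2]))
```

In such an experiment one would look for the average $|t_\beta|$ growing with $T$ while the average $R^2$ shrinks, which is the internal inconsistency the diagnosis relies on.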


converge unless the dependent variable is nonstationary and its order of integration is sufficiently large. Secondly, whether $\hat\alpha$ diverges or not and whether the DW statistic converges in probability to zero or not depend entirely on whether the dependent variable is nonstationary or not. Note that, as mentioned earlier, even though the OLS estimators $\hat\beta$ and $\hat\alpha$ can converge in probability to zero in the four proposed models, the corresponding $t$ ratios always diverge and it is these divergent $t$ ratios that are referred to as the spurious effects.

2.5. Models 5 and 6: Detrending fractionally integrated processes

As has been pointed out by Nelson and Kang (1981, 1984) and Durlauf and Phillips (1988), detrending integrated processes results in the spurious effect of finding a significant trend. In this section we extend their analysis by considering the potential problems in detrending fractionally integrated processes. It turns out that the spurious effect of divergent $t$ ratios exists as long as the fractional differencing parameter is larger than zero. The implication is that whenever there is long memory in the process, the routine procedure of detrending can produce misleading results. It appears that the spurious effect in detrending occurs more often than we previously thought.

In our analysis of detrending fractionally integrated processes, we separate the nonstationary case from the stationary case. In Model 5 we examine the regression of a nonstationary $I(1+d_1)$ process $y_t$ on a time trend $t$. The asymptotic theory for the OLS estimation is given in the following theorem.

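1. $\dfrac{T}{\sigma_y}\,\hat\beta \Rightarrow 12\int_0^1 s B_{d_1}(s)\,ds - 6\int_0^1 B_{d_1}(s)\,ds \equiv \beta_*$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.

2. $\dfrac{1}{\sigma_y}\,\hat\alpha \Rightarrow 4\int_0^1 B_{d_1}(s)\,ds - 6\int_0^1 s B_{d_1}(s)\,ds \equiv \alpha_*$.

3. $\dfrac{1}{\sigma_y^2}\,s^2 \Rightarrow \int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2 - 12\left[\int_0^1 s B_{d_1}(s)\,ds - \dfrac{1}{2}\int_0^1 B_{d_1}(s)\,ds\right]^2 \equiv \sigma_*^2$.

4. $\dfrac{T^3}{\sigma_y^2}\,s_\beta^2 \Rightarrow 12\sigma_*^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T^3 = O(T^{2d_1-2})$.

5. $\dfrac{T}{\sigma_y^2}\,s_\alpha^2 \Rightarrow 4\sigma_*^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T = O(T^{2d_1})$.

6. $\dfrac{1}{\sqrt{T}}\,t_\beta \Rightarrow \beta_*/(\sqrt{12}\,\sigma_*)$, where $\beta_*$ is defined in 1 and $\sigma_*^2$ is defined in 3.

7. $\dfrac{1}{\sqrt{T}}\,t_\alpha \Rightarrow \alpha_*/(2\sigma_*)$, where $\alpha_*$ is defined in 2 and $\sigma_*^2$ is defined in 3.

8. $R^2 \Rightarrow \dfrac{\beta_*^2}{12\int_0^1 [B_{d_1}(s)]^2\,ds - 12\left[\int_0^1 B_{d_1}(s)\,ds\right]^2}$, where $\beta_*$ is defined in 1.

9. $DW \xrightarrow{p} 0$ and $\sigma_y^2\,DW \Rightarrow \gamma_v(0)/\sigma_*^2$.

Here, $B_{d_1}(t)$ is a normalized fractional Brownian motion.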

The results on detrending a stationary long memory $I(d_1)$ process $v_t$, which is our Model 6, are presented in the following theorem.

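1. $\dfrac{T^2}{\sigma_y}\,\hat\beta \Rightarrow 6B_{d_1}(1) - 12\int_0^1 B_{d_1}(s)\,ds \equiv \beta_*$. Note that $\sigma_y/T^2 = O(T^{d_1-1.5})$.

2. $\dfrac{T}{\sigma_y}\,\hat\alpha \Rightarrow 6\int_0^1 B_{d_1}(s)\,ds - 2B_{d_1}(1) \equiv \alpha_*$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.

3. $s^2 \xrightarrow{p} \gamma_v(0)$.

4. $T^3 s_\beta^2 \xrightarrow{p} 12\gamma_v(0)$.

5. $T s_\alpha^2 \xrightarrow{p} 4\gamma_v(0)$.

6. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\beta \Rightarrow \beta_*/\sqrt{12\gamma_v(0)}$, where $\beta_*$ is defined in 1. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.

7. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\alpha \Rightarrow \alpha_*/\left(2\sqrt{\gamma_v(0)}\right)$, where $\alpha_*$ is defined in 2. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.

8. $\dfrac{T^2}{\sigma_y^2}\,R^2 \Rightarrow \beta_*^2/[12\gamma_v(0)]$, where $\beta_*$ is defined in 1.

9. $DW \xrightarrow{p} 2 - 2\rho_v(1) = 2(1-2d_1)/(1-d_1)$.

Here, $B_{d_1}(t)$ is a normalized fractional Brownian motion.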
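As a quick numerical reading of item 9 for Model 6, the sketch below evaluates the probability limit of the DW statistic for a few values of $d_1$. It assumes that $v_t$ is a pure fractional noise, for which the lag-one autocorrelation is $\rho_v(1) = d_1/(1-d_1)$ (Hosking, 1981); the function names `rho1` and `dw_limit` are hypothetical.

```python
def rho1(d):
    """Lag-one autocorrelation of a pure fractional noise I(d), |d| < 0.5."""
    return d / (1.0 - d)


def dw_limit(d):
    """Probability limit of DW in item 9: 2 - 2 * rho_v(1)."""
    return 2.0 - 2.0 * rho1(d)


if __name__ == "__main__":
    for d in (0.0, 0.1, 0.25, 0.4, 0.49):
        closed_form = 2.0 * (1.0 - 2.0 * d) / (1.0 - d)
        # The two expressions agree; the limit falls from 2 toward 0 as d -> 0.5
        print(d, dw_limit(d), closed_form)
```

The limit equals 2 when $d_1 = 0$ and approaches 0 as $d_1$ approaches 0.5, so a low DW value in a detrending regression is itself a symptom of long memory in the detrended series.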

In terms of the convergence (or divergence) rates of the various OLS estimators, Models 5 and 6 can be conveniently viewed as 'special cases' of Models 1 and 4, respectively. More specifically, if we replace the term $\sigma_x$ by $T$ in those normalizing factors in Theorems 1 and 4, then we immediately get all the normalizing factors in Theorems 5 and 6. For example, while the normalizing factor for $\hat\beta$ in Theorem 1 is $\sigma_x/\sigma_y$, the one in Theorem 5 is $T/\sigma_y$. Similarly, while the normalizing factor for $\hat\beta$ in Theorem 4 is $T\sigma_x/\sigma_y$, the one in Theorem 6 is $T^2/\sigma_y$. (Also note that the fractional Brownian motion $B_{d_2}(t)$ does not appear in Theorems 5 and 6 since the regressors in Models 5 and 6 are the time trend and are unrelated to the $I(d_2)$ process $w_t$.) Given these observations, we then conclude that all the analyses about Models 1 and 4 can be readily extended to Models 5 and 6. In particular, the divergence rates of the $t$ ratios, which respectively are in the orders of $T^{0.5}$ and $T^{d_1}$ in Models 1 and 4, are also the rates in Models 5 and 6. (Note that in both Models 4 and 6 the same condition $d_1 > 0$ is imposed on the stationary dependent variable $v_t$ so that the resulting $t$ ratios are divergent.) As a result, the type of spurious effects we observe in Models 1 and 4 occur again in Models 5 and 6. That is, detrending a fractionally integrated process with a positive fractional differencing parameter, certainly including the usual case of the $I(1)$ process, will result in the spurious finding of a significant trend. One important inference we draw from Models 5 and 6 is that the cause for the spurious effect in detrending a process is neither nonstationarity nor lack of ergodicity but long memory in the process.

From Models 5 and 6 we also note the following result: If the data series are nonstationary with the order of integration greater than 1, then the spurious effect can happen to the detrending procedure even after the series are first differenced. What first differencing does to the detrending procedure in such a case is simply to reduce $R^2$, increase the value of the DW statistic, and slow down the divergence of the two $t$ ratios from the $T^{0.5}$ rate to the $T^{d_1}$ rate. Based on this observation, it seems that the spurious effects in detrending may occur more often than we previously thought.
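The claim that the trend $t$ ratio of a stationary long memory series diverges at the $T^{d_1}$ rate can be explored by simulation. The following is a minimal sketch, assuming a truncated (type II) fractional noise generator and the illustrative value $d_1 = 0.3$; the names `frac_noise` and `trend_regression` are hypothetical and not taken from the paper.

```python
import numpy as np


def frac_noise(n, d, rng):
    """Truncated (type II) simulation of (1 - L)^(-d) applied to white noise."""
    k = np.arange(1, n)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    return np.convolve(rng.standard_normal(n), psi)[:n]


def trend_regression(y):
    """OLS of y_t on a constant and t = 1, ..., T; returns (beta_hat, t_beta)."""
    T = len(y)
    t = np.arange(1, T + 1, dtype=float)
    tc = t - t.mean()
    beta = tc @ y / (tc @ tc)
    alpha = y.mean() - beta * t.mean()
    resid = y - alpha - beta * t
    s2 = resid @ resid / (T - 2)
    return float(beta), float(beta / np.sqrt(s2 / (tc @ tc)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d1 = 0.3  # illustrative value only; any 0 < d1 < 0.5 is a stationary case
    for T in (250, 1000, 2000):
        abs_t = [abs(trend_regression(frac_noise(T, d1, rng))[1]) for _ in range(200)]
        # If t_beta = O_p(T^{d1}), the rescaled average should be roughly stable in T
        print(T, np.mean(abs_t), np.mean(abs_t) / T ** d1)
```

Applying `trend_regression` to the first difference of a simulated $I(1+d_1)$ series is the same experiment, which is why first differencing slows the divergence rather than removing it.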

3. Conclusion


memory, instead of nonstationarity or lack of ergodicity, that causes the spurious effects in terms of failing $t$ tests. Nonstationarity in one or both of the dependent variable and the regressor only helps to accelerate the divergence rates of the $t$ ratios. We thus learn that spurious effects might occur more often than we previously believed, as they can arise even among stationary series, and the usual first-differencing procedure may not be able to completely eliminate spurious effects when data possess strong long memory. It is interesting to note that Phillips (1995) has recently offered some contrasting thoughts about spurious regressions (he argues that spurious regressions may not be as serious as many researchers have been led to believe).

In Section 2.4 we have carefully examined the exact relationships between the orders of integration in the fractionally integrated processes and the divergence rates in the $t$ ratios. From this analysis we gain many insights into the problem of spurious effects which are not available in Phillips' (1986) classical study of $I(1)$ processes. In short, it is found that the extents of spurious effects are directly related to the degrees of long memory in the data. Our results on detrending fractionally integrated processes also greatly broaden Durlauf and Phillips' (1988) theory of spurious detrending, in which the relationship between the orders of integration and the divergence rates of the $t$ ratios again plays a useful role in the analysis.

A fairly extensive Monte Carlo study has also been conducted to verify the theoretical results we have established in the paper, especially those on convergence rates. We do not report the simulation results here other than pointing out the fact that almost all our theoretical results are well supported by simulation.

A few generalizations of our study are worthy of further consideration. A natural extension is to consider the multiple regression where there is more than one nonconstant regressor. Another one is to allow the fractionally integrated processes to have nonzero means. Based on Phillips' (1986) work, we expect most, if not all, of the asymptotic results we obtain from the simple regression case to hold in the multiple regression of fractionally integrated processes with drifts. These issues have been examined by Chung (1995).

One aspect of our study that is slightly more restricted than Phillips' (1986) and Durlauf and Phillips' (1988) analyses is that the fractionally integrated processes we consider are built on white noises $a_t$ and $b_t$ that are required to satisfy the relatively stringent conditions specified in Assumption 1. These conditions effectively rule out the possibility of allowing short-run dynamics such as the ARMA components in the fractionally integrated processes we have studied. Chung (1995) has studied the case where Assumption 1 is relaxed to incorporate the short-run dynamics and found no substantial changes in the analysis of spurious effects.

Finally, our study of spurious regression can serve as the basis for the analyses of 'fractional cointegration' where the dependent variable and regressors are


has attracted a lot of attention in the literature recently. One of the pioneering works in this area is by Cheung and Lai (1993).

Acknowledgements

We are very grateful to two referees and an associate editor for their valuable suggestions.

Appendix A. Proof of Lemma 1

The proofs of items 1-3 are straightforward applications of the continuous mapping theorem to Davydov's results. They are omitted here. Item 4 follows directly from Davydov's result, while items 5 and 6 are due to ergodicity of the two stationary processes $v_t$ and $w_t$.

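To prove item 7, we note that, since $v_t$ and $w_t$ are assumed to be independent and have zero means, the autocovariance of the product $v_t w_t$ at lag $j$ is the product of their respective autocovariances at lag $j$: $\gamma_v(j)\gamma_w(j)$. Also, it is well known that $\gamma_v(j) = O(j^{2d_1-1})$ and $\gamma_w(j) = O(j^{2d_2-1})$ if $d_1 \neq 0$ and $d_2 \neq 0$. Consequently, we have

$$
\operatorname{Var}\left(\sum_{t=1}^{T} v_t w_t\right)
= T \sum_{j=-(T-1)}^{T-1} \left(1 - \frac{|j|}{T}\right) \gamma_v(j)\gamma_w(j)
= T \sum_{j=-(T-1)}^{T-1} O\left(|j|^{2d_1+2d_2-2}\right)
$$

$$
= \begin{cases}
T \cdot O(T^{2d_1+2d_2-1}) = O(T^{2d_1+2d_2}) & \text{if } d_1 + d_2 > 0.5, \\
T \cdot O(\ln T) = O(T \ln T) & \text{if } d_1 + d_2 = 0.5, \\
O(T) & \text{otherwise.}
\end{cases}
$$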

Using Proposition 6.2.3 from Brockwell and Davis (1991), we thus have

$$
\sum_{t=1}^{T} v_t w_t = \begin{cases}
O_p(T^{d_1+d_2}) & \text{if } d_1 + d_2 > 0.5, \\
O_p[(T \ln T)^{0.5}] & \text{if } d_1 + d_2 = 0.5, \\
O_p(T^{0.5}) & \text{otherwise,}
\end{cases}
$$

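and, by the facts that $\sigma_y^2 = O(T^{1+2d_1})$ and $\sigma_x^2 = O(T^{1+2d_2})$, we also have

$$
T \sum_{t=1}^{T} \frac{v_t}{\sigma_y} \frac{w_t}{\sigma_x} = \begin{cases}
O_p(1) & \text{if } d_1 + d_2 > 0.5, \\
O_p[(\ln T)^{0.5}] & \text{if } d_1 + d_2 = 0.5, \\
O_p(T^{0.5-d_1-d_2}) & \text{otherwise.}
\end{cases}
$$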
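The three regimes in the proof of item 7 come from the growth of the lag sum $\sum_{j=1}^{T-1} j^{2d_1+2d_2-2}$, and this can be checked numerically. The sketch below is only an illustration: the function names `lag_sum` and `benchmark` are hypothetical, the constants in the $O(\cdot)$ terms are ignored, and the exact floating-point test for $d_1 + d_2 = 0.5$ is used purely for convenience.

```python
import numpy as np


def lag_sum(T, d1, d2):
    """Sum of j^(2*d1 + 2*d2 - 2) over j = 1, ..., T - 1."""
    j = np.arange(1, T, dtype=float)
    return float(np.sum(j ** (2.0 * d1 + 2.0 * d2 - 2.0)))


def benchmark(T, d1, d2):
    """Growth rate of the lag sum in each of the three regimes."""
    s = d1 + d2
    if s > 0.5:
        return float(T) ** (2.0 * s - 1.0)
    if s == 0.5:  # exact comparison, acceptable only for this illustration
        return float(np.log(T))
    return 1.0


if __name__ == "__main__":
    for d1, d2 in ((0.4, 0.4), (0.25, 0.25), (0.1, 0.1)):
        for T in (10**3, 10**4, 10**5):
            # A ratio that settles down as T grows indicates the stated order
            print(d1, d2, T, lag_sum(T, d1, d2) / benchmark(T, d1, d2))
```

Multiplying each benchmark by $T$ gives the variance orders $O(T^{2d_1+2d_2})$, $O(T \ln T)$ and $O(T)$ used in the proof.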

To prove item 8, we note

The orders of the four sums of squares are based on the results of items 2 and 5. We also note that $T\sigma$


"_c

Note that the second equality results from the facts that $y_t$ and $w_t$ are indepen-

Note that the fifth equality is ensured by the assumption that $d_2 > 0$. Given this

The result for $\sum_{t=1}^{T} (v_t/\sigma_y)(x_t/\sigma_x)$ can be proved in a similar fashion. To prove item 10, we first note that


by applying the results of items 4 and 1. The weak limit of $\sum_{t=1}^{T} (t/T)(w_t/\sigma_x)$ can be derived using a similar argument. To prove item 11, we note

1

The orders of the last two terms at the end of the first line are based on the results of items 1 and 10. The result for $(1/T)\sum_{t=1}^{T} (t/T)(x_t/\sigma_x)$ can be proved by a similar argument.

The joint weak convergence of items 1-4 and 8, 10, 11 can be established by writing the vector of sample moments as a functional of $z_{[Tr]} = [\sigma_y^{-1} y_{[Tr]}, \sigma_x^{-1} x_{[Tr]}]'$ up to an error of $o_p(1)$. See Park and Phillips (1988, p. 491). For item 9, however, we only have results on convergence rates instead of limiting distributions. The reason is that we cannot apply the usual stochastic calculus technique because $B_{d_1}(s)$ and $B_{d_2}(s)$ are not martingales (see De Jong and Davidson, 1997).

Appendix B. Proof of Theorem 1

Let us first summarize the formulas for all relevant statistics in the simple linear regression model of $y$


To prove item 1, we note

where the weak convergence is due to items 1, 3, 8 of Lemma 1, and the joint weak convergence of the relevant sample moments. To prove item 2, we note

1

where the weak convergence is based on item 1 above and item 1 of Lemma 1. To prove item 3, we have

1

where the weak convergence is based on item 1 above and item 3 of Lemma 1. To prove item 4, we note

¹_p₂

where the weak convergence is based on item 3 above and item 3 of Lemma 1. To prove item 5, we see


where the weak convergence is based on item 1 above and item 3 of Lemma 1. To prove item 9, we note

(u(

y which converges weakly to $\sigma_*^2$ by the result of item 3.

The proofs of Theorems 2-6 are quite similar to that of Theorem 1. Note that Park and Phillips (1988) have shown the general structure of these types of proofs. For reasons of space, the details of the proofs are omitted here.

Appendix C. Proof of Corollary 2

It suffices to show that $T^2\hat\beta/(\sigma_y\sigma_x) = O_p(1)$. But from the proof for item 7 of


References

Baillie, R.T., 1996. Long memory processes and fractional integration in econometrics. Journal of Econometrics 73, 6-59.

Brockwell, P.J., Davis, R.A., 1991. Time Series: Theory and Methods, 2nd Edition. Springer, New York.

Cheung, Y.-W., Lai, K.S., 1993. A fractional cointegration analysis of purchasing power parity. Journal of Business and Economic Statistics 11, 103-112.

Chung, C.-F., 1994. Calculating and analyzing impulse responses and their asymptotic distributions for the ARFIMA and VARMA models. Econometrics and Economic Theory Paper No. 9402, Michigan State University.

Chung, C.-F., 1995. Sample variances, sample covariances, and linear regression of stationary multivariate long memory processes. Preprint, Michigan State University.

De Jong, R., Davidson, J., 1997. The functional central limit theorem and weak convergence to stochastic integrals: results for weakly dependent and fractionally integrated processes. Preprint.

Davydov, Y.A., 1970. The invariance principle for stationary processes. Theory of Probability and Its Applications 15, 487-489.

Durlauf, S.T., Phillips, P.C.B., 1988. Trends versus random walks in time series analysis. Econometrica 56, 1333-1354.

Fox, R., Taqqu, M.S., 1987. Multiple stochastic integrals with dependent integrators. Journal of Multivariate Analysis 21, 105-127.

Granger, C.W.J., 1980. Long memory relationships and the aggregation of dynamic models. Journal of Econometrics 14, 227-238.

Granger, C.W.J., 1981. Some properties of time series data and their use in econometric model specification. Journal of Econometrics 16, 121-130.

Granger, C.W.J., Newbold, P., 1974. Spurious regression in econometrics. Journal of Econometrics 2, 111-120.

Granger, C.W.J., Joyeux, R., 1980. An introduction to long-memory time series models and fractional differencing. Journal of Time Series Analysis 1, 15-29.

Hosking, J.R.M., 1981. Fractional differencing. Biometrika 68, 165-176.

Mandelbrot, B.B., Van Ness, J.W., 1968. Fractional Brownian motions, fractional noise and applications. SIAM Review 10, 422-437.

Marmol, F., 1995. Spurious regressions between I(d) processes. Journal of Time Series Analysis 16, 313-321.

Nelson, C.R., Kang, H., 1981. Spurious periodicity in inappropriately detrended time series. Econometrica 49, 741-751.

Nelson, C.R., Kang, H., 1984. Pitfalls in the use of time as an explanatory variable in regression. Journal of Business and Economic Statistics 2, 73-82.

Nelson, C.R., Plosser, C., 1982. Trends and random walks in macro-economic time series: some evidence and implications. Journal of Monetary Economics 10, 139-162.

Park, J.Y., Phillips, P.C.B., 1988. Statistical inference in regressions with integrated processes: part 1. Econometric Theory 4, 468-497.

Phillips, P.C.B., 1986. Understanding spurious regressions in econometrics. Journal of Econometrics 33, 311-340.

Phillips, P.C.B., 1995. Nonstationary time series and cointegration. Journal of Applied Econometrics 10, 87-94.

Sowell, F., 1990. The fractional unit root distribution. Econometrica 58, 495-505.