*Corresponding author. Tel.: 886-227822791 ext. 296; fax: 886-27853946.
E-mail: wtsay@ieas.econ.sinica.edu.tw (W.-J. Tsay)
The spurious regression of fractionally integrated processes
Wen-Jen Tsay a,*, Ching-Fan Chung b
a The Institute of Economics, Academia Sinica, Nankang, Taipei, Taiwan
b National Taiwan University, Taiwan
Received 1 September 1995; received in revised form 1 July 1999
Abstract
This paper extends the theoretical analysis of the spurious regression and spurious detrending from the usual I(1) processes to the long memory fractionally integrated processes. It is found that when we regress a long memory fractionally integrated process on another unrelated long memory fractionally integrated process, no matter whether these processes are stationary or not, as long as their orders of integration sum up to a value greater than 0.5, the t ratios become divergent and spurious effects occur. Our finding suggests that it is the long memory, instead of nonstationarity or lack of ergodicity, that causes such spurious effects. As a result, spurious effects might happen more often than we previously believed, as they can arise even between stationary series, while the usual first-differencing procedure may not completely eliminate spurious effects when data possess strong long memory. © 2000 Elsevier Science S.A. All rights reserved.
JEL classification: C22
Keywords: Fractionally integrated processes; Long memory; Spurious regression; Spurious detrending
1. Introduction
The spurious regression was first studied by Granger and Newbold (1974) using simulation. They show that when unrelated data series are close to integrated processes of order 1, or I(1) processes, running a regression with this type of data yields spurious effects. That is, the null hypothesis of no relationship among the unrelated I(1) processes is rejected much too often. Furthermore, the spurious regression tends to yield a high coefficient of determination ($R^2$) as well as highly autocorrelated residuals, indicated by a very low value of the Durbin–Watson (DW) statistic. Granger and Newbold's simulation results were later supported by Phillips' (1986) theoretical analysis. Phillips proves that the usual t test statistic in a spurious regression does not have a limiting distribution but diverges as the sample size approaches infinity. He also shows that $R^2$ has a non-degenerate limiting distribution while the DW statistic converges in probability to zero. Phillips' results have been generalized by Marmol (1995) to cases with integrated processes of higher orders.
The history of the research on spurious detrending follows a similar thread. Nelson and Kang (1981, 1984) first employ simulation to demonstrate that the regression of a driftless I(1) process on a time trend produces an incorrect result of a significant trend. Extending Phillips' (1986) approach, Durlauf and Phillips (1988) derive the asymptotic distributions for the least squares estimators in such a regression. In particular, the latter authors show that the t test statistics diverge and that there are no correct critical values for the conventional significance tests.
Most studies of the spurious regression concentrate on the nonstationary I(1) processes. This reflects the widely held belief that many data series in economics are I(1) processes, or near I(1) processes, as argued by Nelson and Plosser (1982). Against this backdrop, recent years have also seen a fast-growing literature on fractionally integrated processes, or I(d) processes, with the differencing parameter $d$ being a fractional number. The I(d) processes are a natural generalization of the I(1) processes and exhibit a broader range of long-run characteristics. More specifically, the I(d) processes can be either stationary or nonstationary, depending on the value of the fractional differencing parameter. The major characteristic of a stationary I(d) process is its long memory, which is reflected by the hyperbolic decay in its autocorrelations. A number of economic and financial series have been shown to possess long memory. See Baillie (1996) for an updated survey on the applications of the I(d) processes in economics and finance.
The main finding from our study is that the spurious regression can arise among a wide range of long-memory I(d) processes, even in cases where both the dependent variable and the regressor are stationary. A few conclusions may then be drawn. First, in contrast to what Phillips (1986) and Durlauf and Phillips (1988) have suggested, the cause of spurious effects seems to be neither nonstationarity nor lack of ergodicity but the strong long memory in the data series. As a result, spurious effects might occur more often than we previously believed, as they can arise even among stationary series. Furthermore, the usual first-differencing procedure may not be able to completely eliminate spurious effects if the data series are not only nonstationary but also possess strong long memory (such as in the case where they are I(d) processes with $d > 1$).
2. A general theory of spurious effects
Our analysis of the spurious effects is based on several simple linear regression models in which the dependent variable and the single nonconstant regressor are independent I(d) processes with $d$ lying in different ranges. Before presenting these models, let us first briefly review some basic properties of the I(d) processes. A process $Y_t$ is said to be a fractionally integrated process of order $d$, denoted as I(d), if it is defined by $(1-L)^d Y_t = e_t$, where $L$ is the usual lag operator, $d$ is the differencing parameter, which can be a fractional number, and the innovation sequence $e_t$ is white noise with a zero mean and finite variance. The fractional differencing operator $(1-L)^d$ is defined as follows:
$$(1-L)^d = \sum_{j=0}^{\infty} \tau_j L^j, \qquad \tau_j = \frac{\Gamma(j-d)}{\Gamma(j+1)\,\Gamma(-d)},$$
where $\Gamma(\cdot)$ is the gamma function. This process was first introduced by Granger (1980, 1981), Granger and Joyeux (1980), and Hosking (1981). They show that $Y_t$ is stationary when $d < 0.5$ and is invertible when $d > -0.5$. The main feature of the I(d) process is that its autocovariance function declines at a slower hyperbolic rate (instead of the geometric rate found in the conventional ARMA models):
$$\gamma(j) = O(j^{2d-1}),$$
where $\gamma(j)$ is the autocovariance function at lag $j$. When $d > 0$, the I(d) process is said to have long memory since it exhibits long-range dependence in the sense that $\sum_{j=-\infty}^{\infty} \gamma(j) = \infty$. When $d < 0$, then $\sum_{j=-\infty}^{\infty} |\gamma(j)| < \infty$ and the process is sometimes referred to as an intermediate memory process. See Chung (1994) for other long-memory properties of the I(d) process.
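As a rough illustration of the definition above, the coefficients $\tau_j$ can be generated by a simple recursion instead of evaluating gamma functions directly. The Python sketch below is only an illustration: the function names `frac_diff_weights` and `frac_diff` are hypothetical, and the filter is truncated at the start of the sample with all pre-sample values set to zero, which is an assumption of the sketch and not part of the definition.

```python
def frac_diff_weights(d, n):
    """Return tau_0, ..., tau_{n-1}, the first n coefficients of (1 - L)^d.

    Uses tau_0 = 1 and tau_j = tau_{j-1} * (j - 1 - d) / j, which follows from
    tau_j = Gamma(j - d) / (Gamma(j + 1) * Gamma(-d)).
    """
    weights = [1.0]
    for j in range(1, n):
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


def frac_diff(series, d):
    """Apply the truncated filter (1 - L)^d to a finite series.

    Assumption of this sketch: observations before the sample start are zero.
    """
    tau = frac_diff_weights(d, len(series))
    return [
        sum(tau[j] * series[t - j] for j in range(t + 1))
        for t in range(len(series))
    ]
```

With $d = 1$ the weights reduce to those of the ordinary first difference, which gives a quick sanity check on the recursion.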
whether the fractional differencing parameter $d$ is greater than 0.5 or not. The exact specifications of these models can be conveniently expressed with four I(d) processes. Let us first define two stationary ones with different differencing parameters $d_1$ and $d_2$ whose values lie between $-0.5$ and $0.5$:
$$(1-L)^{d_1} v_t = a_t \quad \text{and} \quad (1-L)^{d_2} w_t = b_t,$$
where $a_t$ and $b_t$ are two white noises with zero mean and finite variances $\sigma_a^2$ and $\sigma_b^2$, respectively; that is, $v_t$ and $w_t$ are I($d_1$) and I($d_2$) processes, respectively, and both of them are stationary and invertible. When these two processes are employed in our later analysis, the values of their differencing parameters are mostly assumed to be in $(0, 0.5)$; i.e., the stationary processes $v_t$ and $w_t$ are often assumed to have long memory. We can also define two nonstationary I($1+d_1$) and I($1+d_2$) processes by integrating $v_t$ and $w_t$:
$$y_t = y_{t-1} + v_t \quad \text{and} \quad x_t = x_{t-1} + w_t.$$
Obviously, the orders of integration of these two nonstationary fractionally integrated processes lie between 0.5 and 1.5. Given these four fractionally integrated processes, we consider the following six simple linear regression models:
Model 1: $y_t$ is regressed on an intercept and $x_t$,
Model 2: $v_t$ is regressed on an intercept and $w_t$, where $d_1 + d_2 > 0.5$,
Model 3: $y_t$ is regressed on an intercept and $w_t$, where $d_2 > 0$,
Model 4: $v_t$ is regressed on an intercept and $x_t$, where $d_1 > 0$,
Model 5: $y_t$ is regressed on an intercept and the trend $t$,
Model 6: $v_t$ is regressed on an intercept and the trend $t$, where $d_1 > 0$.
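The four processes entering these models can be simulated along the following lines. This is only a sketch under stated assumptions: Gaussian innovations, a moving-average expansion of $(1-L)^{-d}$ truncated at the start of the sample, zero initial values, and hypothetical function names (`ma_weights`, `simulate_fi`, `partial_sums`). The parameter values $d_1 = 0.4$ and $d_2 = 0.3$ are chosen purely for illustration.

```python
import random


def ma_weights(d, n):
    """First n coefficients of (1 - L)^(-d): psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j."""
    psi = [1.0]
    for j in range(1, n):
        psi.append(psi[-1] * (j - 1 + d) / j)
    return psi


def simulate_fi(d, n, rng, sigma=1.0):
    """Simulate n observations of a stationary I(d) process, -0.5 < d < 0.5.

    Assumptions of this sketch: Gaussian white noise and zero pre-sample
    innovations, so the moving-average sum is truncated at t = 0.
    """
    eps = [rng.gauss(0.0, sigma) for _ in range(n)]
    psi = ma_weights(d, n)
    return [sum(psi[j] * eps[t - j] for j in range(t + 1)) for t in range(n)]


def partial_sums(series):
    """Integrate a series once, starting from a zero initial value."""
    total, out = 0.0, []
    for value in series:
        total += value
        out.append(total)
    return out


rng = random.Random(0)          # fixed seed so the run is reproducible
v = simulate_fi(0.4, 200, rng)  # stationary I(d1), d1 = 0.4
w = simulate_fi(0.3, 200, rng)  # stationary I(d2), d2 = 0.3
y = partial_sums(v)             # nonstationary I(1 + d1)
x = partial_sums(w)             # nonstationary I(1 + d2)
```

Regressing `y` on `x` then corresponds to Model 1, `v` on `w` to Model 2, `y` on `w` to Model 3, and `v` on `x` to Model 4.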
In Model 1 the orders of integration of both the dependent variable and the regressor lie between 0.5 and 1.5, and can be equal to 1. So Model 1 may be considered a generalization of Phillips' (1986) spurious regression to the case of fractionally integrated processes. Model 2 presents the most interesting case in our analysis. In it both the dependent variable and the regressor are assumed to be stationary, ergodic, and strongly persistent in the sense that their fractional differencing parameters sum up to a value greater than 0.5. Following Phillips' arguments, we tend to think no spurious effect should occur in such a model where variables are ergodic. But our analysis of Model 2 presents a result to the contrary. The analysis of Model 2 seems to go beyond the previous study of spurious effects and allows us to gain new insight into the problem.
expect the analysis of these two new models to be a mixture of those of Models 1 and 2.
In Models 5 and 6 we consider the effect of detrending the nonstationary and stationary fractionally integrated processes, respectively. Through these two models, we generalize the results of Durlauf and Phillips (1988). Also, Models 5 and 6 can be regarded as variants of Models 1 and 4, respectively, with the nonstationary regressor $x_t$ replaced by the time trend. This similarity in the model specifications will also be reflected in their analytic results. The following assumption on the two white noise processes $a_t$ and $b_t$ is made throughout this paper.
Assumption 1. Each of the two processes $a_t$ and $b_t$ is independently and identically distributed with a zero mean, and their moments satisfy the following conditions: $E|a_t|^p < \infty$ with $p \ge \max\{4, -8d_1/(1+2d_1)\}$, and $E|b_t|^q < \infty$ with $q \ge \max\{4, -8d_2/(1+2d_2)\}$. Moreover, $a_t$ and $b_t$ are independent of each other.
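To see when the moment condition in Assumption 1 actually bites, the bound $\max\{4, -8d/(1+2d)\}$ can be evaluated directly. The helper below is a hypothetical illustration (the name `required_moment_order` is not from the paper); it shows that the second term exceeds 4 only when $d < -0.25$, so for long-memory processes with $d > 0$ finite fourth moments suffice.

```python
def required_moment_order(d):
    """Lower bound max{4, -8d / (1 + 2d)} on the moment order in Assumption 1."""
    if not -0.5 < d < 0.5:
        raise ValueError("d must lie in (-0.5, 0.5)")
    return max(4.0, -8.0 * d / (1.0 + 2.0 * d))


for d in (-0.4, -0.25, 0.0, 0.3):
    print(d, required_moment_order(d))
```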
We also assume, without loss of generality, that the initial values of the fractionally integrated processes $v_0$, $w_0$, $y_0$, and $x_0$ are all zero. Hence, $y_t$ and $x_t$ can be considered as the partial sums of $v_t$ and $w_t$, respectively; i.e., $y_T = \sum_{t=1}^{T} v_t$ and $x_T = \sum_{t=1}^{T} w_t$. The independent and identical distribution assumption is made to simplify our analysis and could be relaxed, say, to the case where $a_t$ and $b_t$ are short-memory processes. See Chung (1995).
Before presenting Lemma 1, which is the cornerstone of our analysis, let us summarize two important asymptotic results on the partial sums $y_t$ and $x_t$. First, given the variances $\sigma_y^2 = \mathrm{Var}(y_T)$ and $\sigma_x^2 = \mathrm{Var}(x_T)$, Sowell (1990, Theorem 1) proves that
$$\sigma_y^2 \sim \sigma_a^2\, \frac{\Gamma(1-2d_1)}{(1+2d_1)\,\Gamma(1+d_1)\,\Gamma(1-d_1)}\, T^{1+2d_1}$$
and
$$\sigma_x^2 \sim \sigma_b^2\, \frac{\Gamma(1-2d_2)}{(1+2d_2)\,\Gamma(1+d_2)\,\Gamma(1-d_2)}\, T^{1+2d_2},$$
where $z_T \sim w_T$ means $z_T/w_T \to 1$ as $T \to \infty$. Furthermore, Davydov (1970) shows that as $T \to \infty$,
$$\frac{1}{\sigma_y}\, y_{[Tr]} \Rightarrow B_{d_1}(r) \quad \text{and} \quad \frac{1}{\sigma_x}\, x_{[Tr]} \Rightarrow B_{d_2}(r),$$
for $r \in [0, 1]$, where $[Tr]$ denotes the integer part of $Tr$, the notation $\Rightarrow$ denotes weak convergence, and $B_d(r)$ is the normalized fractional Brownian motion, which is defined by a stochastic integral with respect to the standard Brownian motion $B_0(t)$. See Mandelbrot and Van Ness (1968). Our notation for the standard and the fractional versions of Brownian motions suggests that the former is a special case of the latter with $d = 0$.
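The variance approximation attributed to Sowell (1990, Theorem 1) above can be evaluated numerically to get a feel for how fast the partial sums grow. The following is a minimal sketch; the function name `partial_sum_variance` and its default unit innovation variance are assumptions of the example. For $d = 0$ it collapses to the familiar random-walk variance $\sigma^2 T$.

```python
import math


def partial_sum_variance(d, T, sigma2=1.0):
    """Asymptotic approximation to Var(sum_{t=1}^T v_t) for an I(d) process v_t.

    Implements sigma2 * Gamma(1 - 2d) / ((1 + 2d) Gamma(1 + d) Gamma(1 - d)) * T^(1 + 2d),
    valid for -0.5 < d < 0.5.
    """
    if not -0.5 < d < 0.5:
        raise ValueError("d must lie in (-0.5, 0.5)")
    const = math.gamma(1.0 - 2.0 * d) / (
        (1.0 + 2.0 * d) * math.gamma(1.0 + d) * math.gamma(1.0 - d)
    )
    return sigma2 * const * T ** (1.0 + 2.0 * d)
```

Doubling $T$ multiplies the approximation by $2^{1+2d}$, so a larger $d$ translates directly into faster-growing partial sums.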
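The independence between $v_t$ and $w_t$ and between $y_t$ and $x_t$ implies joint weak convergence of the two normalized partial-sum processes to a bivariate fractional Brownian motion, of which the two elements $B_{d_1}$ and $B_{d_2}$ are independent. This result implies the following lemma: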
Lemma 1. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results:
6. 1
Moreover, joint weak convergence of 1–4, 8, 10, and 11 also applies. Here, $B_{d_1}(t)$ and $B_{d_2}(t)$ are two independent normalized fractional Brownian motions, $\gamma_v(j)$ and $\gamma_w(j)$ are the autocovariance functions of $v_t$ and $w_t$, respectively, at lag $j$, and $\sigma_a^2$ and $\sigma_b^2$ are the variances of the underlying white noises $a_t$ and $b_t$, respectively. The notation $\xrightarrow{p}$ means convergence in probability.
All the theorem proofs are in the appendix. Note that, while the weak convergence of $z_t$ is in the space of functions, the weak convergence established in the above lemma is on the real line, which is equivalent to convergence in distribution. Following the convention in the literature, we use the same notation $\Rightarrow$ for both types of weak convergence.
In the rest of this section the results of Lemma 1 will be used to develop the theory of spurious effects, presented in a series of theorems and corollaries, for the proposed six models. The first two models will be discussed separately in Sections 2.1 and 2.2. These two models provide us with a framework which facilitates the explanations of the other four models in Sections 2.3 and 2.5. One subsection, Section 2.4, will be devoted to the analysis of an important issue: how the orders of fractional integration are directly related to the spurious effects. The results in Lemma 1 have also been used in deriving the limiting distributions of the 'modified Durbin–Watson statistics' by Tsay (1998).
We will adopt the following notation for the various statistics from the Ordinary Least Squares (OLS) estimation. Let $\hat\alpha$ and $\hat\beta$ denote the usual OLS estimators of the intercept and the slope. Their respective variances are estimated by $s_\alpha^2$ and $s_\beta^2$, from which we have the t ratios $t_\beta = \hat\beta/s_\beta$ and $t_\alpha = \hat\alpha/s_\alpha$. In addition, let $s^2$ denote the estimated variance of the regression error, $R^2$ the coefficient of determination, and DW the Durbin–Watson statistic. Finally, in addition to the autocovariance functions $\gamma_v(j)$ and $\gamma_w(j)$ of $v_t$ and $w_t$, let $\rho_v(j)$ and $\rho_w(j)$ be their respective autocorrelations at lag $j$.
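For concreteness, the statistics just named can be computed for a simple regression as in the sketch below. The function name `ols_stats` and the dictionary layout are choices made for this example; the formulas are the textbook OLS expressions with $n - 2$ degrees of freedom, which is an assumption about the exact finite-sample convention.

```python
import math


def ols_stats(y, x):
    """OLS of y on an intercept and x: estimates, t ratios, R^2 and Durbin-Watson."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx                          # slope estimate
    alpha = ybar - beta * xbar                # intercept estimate
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    rss = sum(e * e for e in resid)
    s2 = rss / (n - 2)                        # estimated error variance
    s_beta = math.sqrt(s2 / sxx)
    s_alpha = math.sqrt(s2 * (1.0 / n + xbar ** 2 / sxx))
    tss = sum((yi - ybar) ** 2 for yi in y)
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / rss
    return {
        "alpha": alpha,
        "beta": beta,
        "s2": s2,
        "t_alpha": alpha / s_alpha,
        "t_beta": beta / s_beta,
        "r2": 1.0 - rss / tss,
        "dw": dw,
    }
```

Applied to two independently simulated long-memory series, the `t_beta` entry is the quantity whose divergence defines the spurious effect studied in this section.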
2.1. Model 1 of nonstationary fractionally integrated processes
In Model 1 a nonstationary I($1+d_1$) process $y_t$ is regressed on another independent and nonstationary I($1+d_2$) process $x_t$. Since the permissible range for the values of the fractional differencing parameters $d_1$ and $d_2$ is $(-0.5, 0.5)$, Model 1 generalizes Phillips' (1986) model of integrated processes in which $d_1 = d_2 = 0$. Not surprisingly, all the results we derive for Model 1 are straightforward generalizations of Phillips' theory of the spurious effects. The results for Model 1 are presented in the following theorem:
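Theorem 1. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results:
1. $\dfrac{\sigma_x}{\sigma_y}\,\hat\beta \Rightarrow \dfrac{\int_0^1 B_{d_1}(s)B_{d_2}(s)\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]\left[\int_0^1 B_{d_2}(s)\,ds\right]}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2} \equiv \beta_*$. Note that $\sigma_y/\sigma_x = O(T^{d_1-d_2})$.
2. $\dfrac{1}{\sigma_y}\,\hat\alpha \Rightarrow \int_0^1 B_{d_1}(s)\,ds - \beta_* \int_0^1 B_{d_2}(s)\,ds \equiv \alpha_*$, where $\beta_*$ is defined in 1. Note that $\sigma_y = O(T^{0.5+d_1})$.
3. $\dfrac{1}{\sigma_y^2}\,s^2 \Rightarrow \int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2 - \beta_*^2\left\{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2\right\} \equiv \sigma_*^2$, where $\beta_*$ is defined in 1. Note that $\sigma_y^2 = O(T^{1+2d_1})$.
4. $\dfrac{T\sigma_x^2}{\sigma_y^2}\,s_\beta^2 \Rightarrow \dfrac{\sigma_*^2}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2} \equiv \sigma_{*\beta}^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/(T\sigma_x^2) = O(T^{2d_1-2d_2-1})$.
5. $\dfrac{T}{\sigma_y^2}\,s_\alpha^2 \Rightarrow \sigma_*^2\left\{1 + \dfrac{\left[\int_0^1 B_{d_2}(s)\,ds\right]^2}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2}\right\} \equiv \sigma_{*\alpha}^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T = O(T^{2d_1})$.
6. $\dfrac{1}{\sqrt{T}}\,t_\beta \Rightarrow \dfrac{\beta_*}{\sigma_{*\beta}}$, where $\beta_*$ is defined in 1 and $\sigma_{*\beta}^2$ is defined in 4.
7. $\dfrac{1}{\sqrt{T}}\,t_\alpha \Rightarrow \dfrac{\alpha_*}{\sigma_{*\alpha}}$, where $\alpha_*$ is defined in 2 and $\sigma_{*\alpha}^2$ is defined in 5.
8. $R^2 \Rightarrow \dfrac{\beta_*^2\left\{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2\right\}}{\int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2}$, where $\beta_*$ is defined in 1.
9. $DW \xrightarrow{p} 0$.
Here, $B_{d_1}(t)$ and $B_{d_2}(t)$ are two independent normalized fractional Brownian motions.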
The most important result in Theorem 1 is that, as the sample size $T$ increases, the two t ratios $t_\beta$ and $t_\alpha$ diverge at the same rate of $\sqrt{T}$, which is independent of the magnitudes of the fractional differencing parameters $d_1$ and $d_2$. This result is exactly the same as what Phillips (1986) obtained for the case where $d_1 = d_2 = 0$. So even when the orders of integration in the dependent variable and the regressor differ from 1 by as much as 0.5, the usual problem in using the t tests remains: the probability of rejecting the null hypothesis of $\beta = 0$ or $\alpha = 0$ based on t tests increases monotonically as the sample size increases. Also note that Marmol (1995) generalizes Phillips' theory to cases where both $y_t$ and $x_t$ are integrated processes of the same integer order higher than one. The limiting distributions in Theorem 1 for the special case where $d_1 = d_2$ are also very similar to Marmol's results.
The limiting distributions of the t ratios, after normalization by $\sqrt{T}$, are direct generalizations of those derived by Phillips (1986). The same conclusion also holds for $R^2$ and the DW statistic. In other words, when we compare our results with Phillips', we observe a common feature in these four statistics; namely, the nonzero values of $d_1$ and $d_2$ do not affect their convergence rates, while the effects on their limiting distributions are quite straightforward: all the standard Brownian motions in Phillips' theory are replaced by fractional Brownian motions. That the fractional differencing parameters $d_1$ and $d_2$ play a relatively minor role here is mainly because the four statistics are all ratios, so that the effects of $d_1$ and $d_2$ cancel out. In contrast, the results on the OLS estimators $\hat\beta$ and $\hat\alpha$ are a different story. In Phillips' theory both $\hat\beta$ and $\hat\alpha/\sqrt{T}$ converge to some non-normal non-degenerate limiting distributions. But for the present model of the fractionally integrated processes, the orders of $\hat\beta$ and $\hat\alpha$ are $T^{d_1-d_2}$ and $T^{d_1+0.5}$, respectively. So while $\hat\alpha$ always diverges (though the rate can be slow if $d_1$ is close to $-0.5$), $\hat\beta$ can be either divergent or convergent, depending on the relative magnitudes of $d_1$ and $d_2$. For example, if the order of integration in the dependent variable $y_t$ is smaller than that of the regressor $x_t$, i.e., $d_1 < d_2$, then $\hat\beta$ converges to zero, just like the conventional case of no spurious effects. Moreover, if $d_1 - d_2 = -0.5$, then, similar to the case of no spurious effects,
2.2. Model 2 of stationary fractionally integrated processes
In this section we consider Model 2, in which a stationary fractionally integrated process $v_t$ is regressed on an independent and stationary fractionally integrated process $w_t$. We show that, although both $v_t$ and $w_t$ are stationary, the spurious effect in terms of the t tests could still exist under an additional condition on the fractional differencing parameters: $d_1 + d_2 > 0.5$. Loosely speaking, this condition implies that the two processes $v_t$ and $w_t$ are both strongly persistent.
Our analysis begins with a special case where we assume a set of more stringent conditions which helps in deriving the limiting distribution of the OLS estimator. This theory is based on an important result of Fox and Taqqu (1987), who show that the product of two highly persistent but stationary Gaussian processes, if adequately normalized, can converge. After examining this special case, we then show how the spurious effects may still exist in a more general framework.
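Let us first reproduce Fox and Taqqu's (1987) Theorem 6.1 here as Lemma 2.
Lemma 2. Let $(X_t, Y_t)$ be a stationary jointly Gaussian sequence with $E(X_t) = E(Y_t) = 0$, $E(X_t^2) = E(Y_t^2) = 1$, and $E(X_tY_t) = r$. Suppose that $\sigma_1$ and $\sigma_2$ are two arbitrary real numbers and that there exist $0 < \delta_1, \delta_2 < 0.5$, such that as $j \to \infty$
$$E(X_tX_{t+j}) \sim \sigma_1^2\, j^{-\delta_1}, \qquad E(X_tY_{t+j}) \sim \frac{\rho\,\sigma_1\sigma_2\, b_1}{\sqrt{a_1a_2}}\, j^{-(\delta_1+\delta_2)/2},$$
$$E(Y_tY_{t+j}) \sim \sigma_2^2\, j^{-\delta_2}, \qquad E(Y_tX_{t+j}) \sim \frac{\rho\,\sigma_1\sigma_2\, b_2}{\sqrt{a_1a_2}}\, j^{-(\delta_1+\delta_2)/2},$$
where $\rho$ is a constant between 1 and $-1$, while $a_1 = A(\delta_1, \delta_1)$, $a_2 = A(\delta_2, \delta_2)$, $b_1 = A(\delta_1, \delta_2)$, and $b_2 = A(\delta_2, \delta_1)$ are four constants with $A(\delta_1, \delta_2)$ being defined by $\int_0^{\infty} x^{-(\delta_1+1)/2}(x+1)^{-(\delta_2+1)/2}\,dx$, then
$$\frac{1}{T^{1-(\delta_1+\delta_2)/2}} \sum_{t=1}^{[Ts]} (X_tY_t - r) \Rightarrow Z(s),$$
where
$$Z(s) = \frac{\sigma_1\sigma_2}{\sqrt{a_1a_2}} \int_{\mathbb{R}^2} \int_0^s \left[\prod_{i=1}^{2} (u - x_i)^{-(\delta_i+1)/2}\, I\{x_i < u\}\right] du\, dM_1(x_1)\, dM_2(x_2).$$
Here, $M_1$ and $M_2$ are two Gaussian random measures with respect to Lebesgue measure.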
Note that the two processes $X_t$ and $Y_t$ are not only strongly persistent, as indicated by the hyperbolic convergence rates $\delta_1$ and $\delta_2$ in their autocorrelations, but also highly correlated with each other, as indicated by the hyperbolic convergence rates in their covariances. However, in our application we are only interested in the case where $X_t$ and $Y_t$ are independent, so that $r$ and $\rho$ in the above lemma are both zero. The above lemma offers us the convergence rate of $\sum_{t=1}^{T} X_tY_t$ and its limiting process $Z(t)$ given the Gaussian assumption and a narrower range for the parameters $\delta_1$ and $\delta_2$. In order to apply this lemma, we make the following assumption in addition to Assumption 1 made earlier.
Assumption 2. The two fractionally integrated processes $v_t$ and $w_t$ are both Gaussian and the corresponding fractional differencing parameters $d_1$ and $d_2$ are both in the range of $(0.25, 0.5)$.
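Given the facts that
$$\rho_v(j) \sim \frac{\Gamma(1-d_1)}{\Gamma(d_1)}\, j^{2d_1-1} \quad \text{and} \quad \rho_w(j) \sim \frac{\Gamma(1-d_2)}{\Gamma(d_2)}\, j^{2d_2-1},$$
it is straightforward to prove the following corollary in which $X_t$ and $Y_t$ in Lemma 2 are replaced by $v_t/\sqrt{\gamma_v(0)}$ and $w_t/\sqrt{\gamma_w(0)}$, respectively.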
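Corollary 1. Let Assumptions 1 and 2 hold. Then, as $T \to \infty$,
$$\frac{1}{T^{d_1+d_2}} \sum_{t=1}^{T} \frac{v_t}{\sqrt{\gamma_v(0)}}\, \frac{w_t}{\sqrt{\gamma_w(0)}} \Rightarrow Z(1),$$
where the limiting random variable $Z(1)$ is defined in Lemma 2 with $\delta_1 = 1 - 2d_1$, $\delta_2 = 1 - 2d_2$, $\sigma_1^2 = \Gamma(1-d_1)/\Gamma(d_1)$, and $\sigma_2^2 = \Gamma(1-d_2)/\Gamma(d_2)$. Consequently, we have
$$T \sum_{t=1}^{T} \frac{v_t}{\sigma_y}\, \frac{w_t}{\sigma_x} \Rightarrow C\sqrt{\gamma_v(0)\,\gamma_w(0)}\; Z(1),$$
where $C^2 = \Gamma(1-2d_1)\Gamma(1-2d_2)/[(1+2d_1)\Gamma(1+d_1)\Gamma(1-d_1)(1+2d_2)\Gamma(1+d_2)\Gamma(1-d_2)]$.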
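The change of parameters used in Corollary 1 is easy to check numerically. The sketch below (the name `fox_taqqu_mapping` is hypothetical) maps $(d_1, d_2)$ to the Lemma 2 quantities $\delta_i = 1 - 2d_i$ and $\sigma_i^2 = \Gamma(1-d_i)/\Gamma(d_i)$, and confirms that the normalization exponent $1 - (\delta_1+\delta_2)/2$ equals $d_1 + d_2$.

```python
import math


def fox_taqqu_mapping(d1, d2):
    """Map (d1, d2), each in (0.25, 0.5), to the Lemma 2 parameters."""
    for d in (d1, d2):
        if not 0.25 < d < 0.5:
            raise ValueError("Assumption 2 requires 0.25 < d < 0.5")
    delta1 = 1.0 - 2.0 * d1
    delta2 = 1.0 - 2.0 * d2
    return {
        "delta1": delta1,
        "delta2": delta2,
        "sigma1_sq": math.gamma(1.0 - d1) / math.gamma(d1),
        "sigma2_sq": math.gamma(1.0 - d2) / math.gamma(d2),
        # exponent of T in the normalization of the sum of X_t * Y_t
        "norm_exponent": 1.0 - (delta1 + delta2) / 2.0,
    }


print(fox_taqqu_mapping(0.3, 0.4))
```

The requirement $0 < \delta_i < 0.5$ in Lemma 2 is exactly what restricts $d_i$ to $(0.25, 0.5)$ in Assumption 2.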
The result of Corollary 1 supplements that of item 7 in Lemma 1. From these results, we can then establish the following theorem about the spurious effect in Model 2.
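Theorem 2. Let Assumptions 1 and 2 hold. Then, as $T \to \infty$, we have the following results:
1. $\dfrac{T^2}{\sigma_y\sigma_x}\,\hat\beta = O_p(1)$.
2. $\dfrac{T}{\sigma_y}\,\hat\alpha \Rightarrow B_{d_1}(1)$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.
3. $s^2 \xrightarrow{p} \gamma_v(0)$.
4. $T s_\beta^2 \xrightarrow{p} \dfrac{\gamma_v(0)}{\gamma_w(0)}$.
5. $T s_\alpha^2 \xrightarrow{p} \gamma_v(0)$.
6. $\dfrac{T^{3/2}}{\sigma_y\sigma_x}\,t_\beta = O_p(1)$. Note that $\sigma_y\sigma_x/T^{3/2} = O(T^{d_1+d_2-0.5})$.
7. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\alpha \Rightarrow \dfrac{1}{\sqrt{\gamma_v(0)}}\,B_{d_1}(1)$. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.
8. $\dfrac{T^4}{\sigma_y^2\sigma_x^2}\,R^2 = O_p(1)$. Note that $\sigma_y^2\sigma_x^2/T^4 = O(T^{2d_1+2d_2-2})$.
9. $DW \xrightarrow{p} 2 - 2\rho_v(1) = \dfrac{2(1-2d_1)}{1-d_1}$.
Here $B_{d_1}(t)$ is a normalized fractional Brownian motion.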
The most important result from this theorem is the divergence rates of the two t ratios $t_\beta$ and $t_\alpha$, which are $T^{d_1+d_2-0.5}$ and $T^{d_1}$, respectively. Recall that $d_1 + d_2 - 0.5$ is necessarily greater than 0 (and smaller than 0.5) under Assumption 2. This result reflects the spurious effect in the t tests. Since both the dependent variable and the regressor are stationary and ergodic, the spurious effect is not really expected (see Phillips 1986, p. 318). The surprising results we get here suggest that the cause of the spurious effect has more to do with the strong persistence of the variables involved than with their stationarity or ergodicity.
It is interesting to compare the divergence rates of the t ratios here with the $\sqrt{T}$ rate we observe in Model 1. We note that the divergence rates in the present model depend on the magnitudes of the two fractional differencing parameters $d_1$ and $d_2$ while those in Model 1 do not. Furthermore, the t ratios diverge more slowly in the present model than in Model 1. In particular, the divergence rate of $t_\beta$ can become very slow when both $d_1$ and $d_2$ approach their lower bound of 0.25.
contrast to these irregular convergence rates of the OLS estimators, the estimated variances $s_\beta^2$ and $s_\alpha^2$ nevertheless converge at the standard $T^{-1}$ rate. It is such disparity in the convergence rates between the OLS estimators, which converge at rates slower than $T^{-1/2}$, and their standard errors, which converge at the standard $T^{-1/2}$ rate, that causes the resulting t ratios to diverge and hence the spurious effect.
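The rate arithmetic behind this statement can be written out in a few lines. The derivation below is a worked sketch for Model 2 that uses only the orders $\sigma_y = O(T^{0.5+d_1})$ and $\sigma_x = O(T^{0.5+d_2})$ together with items 1, 2, 4, and 5 of Theorem 2; it does not add anything beyond those results.

```latex
\begin{align*}
\hat\beta &= O_p\!\left(\frac{\sigma_y \sigma_x}{T^{2}}\right)
           = O_p\!\left(T^{d_1 + d_2 - 1}\right),
& s_\beta &= O_p\!\left(T^{-1/2}\right),
& t_\beta &= \frac{\hat\beta}{s_\beta}
           = O_p\!\left(T^{d_1 + d_2 - 1/2}\right), \\
\hat\alpha &= O_p\!\left(\frac{\sigma_y}{T}\right)
            = O_p\!\left(T^{d_1 - 1/2}\right),
& s_\alpha &= O_p\!\left(T^{-1/2}\right),
& t_\alpha &= \frac{\hat\alpha}{s_\alpha}
            = O_p\!\left(T^{d_1}\right).
\end{align*}
```

Both estimators shrink to zero, but more slowly than their standard errors, so the exponents of the t ratios are positive whenever $d_1 + d_2 > 1/2$ and $d_1 > 0$.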
$R^2$ in the present model converges to 0 as in the case of no spurious effects. This is different from what we observe in Model 1, where $R^2$ converges to a random variable. Consequently, as the sample size increases, the declining $R^2$ in the present model will correctly reflect the fact that the regressor does not help explain the variations in the dependent variable.
The DW statistic does not converge in probability to zero, and this result is also different from that of Model 1. Its limit $2 - 2\rho_v(1)$ is similar to the one we find in the conventional AR(1) case. This limit depends on the fractional differencing parameter $d_1$ of the dependent variable $v_t$ and can only take values in the range of $(0, 4/3)$, which is to the left of the value 2.
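The limit of the DW statistic is a one-line function of $d_1$, since the lag-one autocorrelation of an I($d_1$) process is $\rho_v(1) = d_1/(1-d_1)$. The small sketch below (with the hypothetical name `dw_limit`) evaluates it at a few points and makes the stated range visible.

```python
def dw_limit(d1):
    """Probability limit 2 - 2 * rho_v(1) = 2 (1 - 2 d1) / (1 - d1) of the DW statistic."""
    rho1 = d1 / (1.0 - d1)  # lag-one autocorrelation of an I(d1) process
    return 2.0 - 2.0 * rho1


for d1 in (0.25, 0.3, 0.4, 0.49):
    print(d1, dw_limit(d1))
```

At $d_1 = 0.25$ the limit equals $4/3$, and it falls toward zero as $d_1$ approaches 0.5, matching the interval $(0, 4/3)$ under Assumption 2.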
There is one technical detail that calls for some explanation: unlike in Theorem 1, we do not give an explicit expression for the limiting distribution of $\hat\beta$ in Theorem 2 because such an expression requires the joint weak convergence of the sample averages of $v_t$, $w_t$, and $v_tw_t$, while the proof of the joint weak convergence is beyond the scope of the present paper. However, the lack of an explicit expression for the limiting distribution of $\hat\beta$ does not hinder us from evaluating the convergence rate of $\hat\beta$, which is all we need to show the spurious effects in the t ratios. This same argument applies to some of the later analyses, including the following one where we consider a less restricted specification of Model 2 that is defined by the following assumption.
Assumption 3. The sum of the two fractional differencing parameters $d_1$ and $d_2$ is greater than 0.5.
Since the Gaussian distribution is not assumed while one of the fractional differencing parameters $d_1$ and $d_2$ can be smaller than 0.25, Assumption 3 is thus less stringent than Assumption 2.
Corollary 2. If Assumption 2 is replaced by Assumption 3 in Theorem 2, then all the conclusions there remain true.
dependent variable and the regressor are both stationary and ergodic (so long as they are sufficiently persistent).
A profound implication from Model 2 is as follows: if we begin with Model 1, where both the dependent variable and the regressor are nonstationary fractionally integrated processes with the orders of integration $1+d_1$ and $1+d_2$, respectively, where $d_1 + d_2 > 0.5$, then first-differencing both variables cannot completely eliminate the spurious effects. While $R^2$ may be reduced and the DW statistic may be increased, the t ratios may still be so large that we cannot avoid making a spurious inference. This is a fairly serious problem with the regression for the fractionally integrated processes. It implies that even the popular first-differencing procedure might not prevent us from finding a spurious relationship among highly persistent data series. One lesson we learn from this discussion is that it is very important to check individual data series for possible long memory before regression can be applied.
2.3. Two intermediate cases: Models 3 and 4
Models 3 and 4 can be considered as two intermediate models between Models 1 and 2 in that one of the dependent variable and the regressor is stationary while the other is not. We expect the asymptotic results for these two new models to be a hybrid of those of Models 1 and 2.
In Model 3 a nonstationary I($1+d_1$) process $y_t$ is regressed on an independent and stationary I($d_2$) process $w_t$. Note that the fractional differencing parameter $d_2$ for the regressor $w_t$ here is assumed to be positive; i.e., $w_t$ has long memory. The asymptotic properties of the OLS estimators for Model 3 are given in the following theorem:
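Theorem 3. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results.
1. $\dfrac{T}{\sigma_y\sigma_x}\,\hat\beta = O_p(1)$. Note that $\sigma_y\sigma_x/T = O(T^{d_1+d_2})$.
2. $\dfrac{1}{\sigma_y}\,\hat\alpha \Rightarrow \int_0^1 B_{d_1}(s)\,ds \equiv \alpha_*$. Note that $\sigma_y = O(T^{0.5+d_1})$.
3. $\dfrac{1}{\sigma_y^2}\,s^2 \Rightarrow \int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2 \equiv \sigma_*^2$. Note that $\sigma_y^2 = O(T^{1+2d_1})$.
4. $\dfrac{T}{\sigma_y^2}\,s_\beta^2 \Rightarrow \dfrac{\sigma_*^2}{\gamma_w(0)}$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T = O(T^{2d_1})$.
5. $\dfrac{T}{\sigma_y^2}\,s_\alpha^2 \Rightarrow \sigma_*^2$, where $\sigma_*^2$ is defined in 3.
6. $\dfrac{\sqrt{T}}{\sigma_x}\,t_\beta = O_p(1)$. Note that $\sigma_x/\sqrt{T} = O(T^{d_2})$.
7. $\dfrac{1}{\sqrt{T}}\,t_\alpha \Rightarrow \dfrac{\alpha_*}{\sigma_*}$, where $\alpha_*$ is defined in 2 and $\sigma_*^2$ is defined in 3.
8. $\dfrac{T^2}{\sigma_x^2}\,R^2 = O_p(1)$. Note that $\sigma_x^2/T^2 = O(T^{2d_2-1})$.
9. $DW \xrightarrow{p} 0$.
Here $B_{d_1}(t)$ is a normalized fractional Brownian motion.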
Since both t ratios diverge, Model 3 also suffers from the spurious effect in terms of the t tests. Moreover, the results that the OLS estimator $\hat\alpha$ diverges and that DW converges in probability to 0 are close to what we get in Model 1, while the result that $R^2$ converges to zero is the same as that of Model 2. So Model 3 is indeed a mixture of Models 1 and 2.
It should be pointed out that in Theorem 3 the range of the fractional differencing parameter $d_2$ of the regressor $w_t$ is restricted to the positive half of the original range $(-0.5, 0.5)$. For the case of a negative $d_2$, it is quite straightforward to show that the t ratios are convergent and there is no spurious effect.
In Model 4 a stationary I($d_1$) process $v_t$ is regressed on an independent and nonstationary I($1+d_2$) process $x_t$. Similar to the restriction imposed on Model 3, the fractional differencing parameter $d_1$ of the dependent variable $v_t$ is assumed to be positive so that $v_t$ has long memory. The asymptotic theory for Model 4 is presented in the following theorem.
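Theorem 4. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results.
1. $\dfrac{T\sigma_x}{\sigma_y}\,\hat\beta = O_p(1)$. Note that $\sigma_y/(T\sigma_x) = O(T^{d_1-d_2-1})$.
2. $\dfrac{T}{\sigma_y}\,\hat\alpha = O_p(1)$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.
3. $s^2 \xrightarrow{p} \gamma_v(0)$.
4. $T\sigma_x^2 \cdot s_\beta^2 \Rightarrow \dfrac{\gamma_v(0)}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2} \equiv \sigma_{*\beta}^2$. Note that $1/(T\sigma_x^2) = O(T^{-2-2d_2})$.
5. $T \cdot s_\alpha^2 \Rightarrow \gamma_v(0)\left\{1 + \dfrac{\left[\int_0^1 B_{d_2}(s)\,ds\right]^2}{\int_0^1 [B_{d_2}(s)]^2\,ds - \left[\int_0^1 B_{d_2}(s)\,ds\right]^2}\right\} \equiv \sigma_{*\alpha}^2$.
6. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\beta = O_p(1)$. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.
7. $\dfrac{\sqrt{T}}{\sigma_y}\,t_\alpha = O_p(1)$. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.
8. $\dfrac{T^2}{\sigma_y^2}\,R^2 = O_p(1)$. Note that $\sigma_y^2/T^2 = O(T^{2d_1-1})$.
9. $DW \xrightarrow{p} 2 - 2\rho_v(1) = \dfrac{2(1-2d_1)}{1-d_1}$.
Here $B_{d_2}(t)$ is a normalized fractional Brownian motion.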
Since both t ratios diverge (at the same rate), the spurious effect in terms of failing t tests again exists in Model 4. But contrary to the results in Model 3, the OLS estimators $\hat\beta$ and $\hat\alpha$, together with $R^2$, all converge in probability to zero, while the DW statistic converges to $2 - 2\rho_v(1)$. These findings obviously bring Model 4 closer to Model 2 than to Model 1.
2.4. The relationship between the orders of integration and the divergence rates
The divergent t ratios in the above four models and the resulting failure of the t tests are referred to as the spurious effects. In this section we compare the divergence rates of the t ratios across the four models and investigate how they are related to the respective model specifications.
Recall that in Models 2–4 restrictions have been imposed on the usual ranges $(-0.5, 0.5)$ of the fractional differencing parameters $d_1$ and $d_2$. In Model 3 the range of $d_2$ is restricted to be $(0, 0.5)$, which is also the range of $d_1$ in Model 4, while the sum of $d_1$ and $d_2$ must be greater than 0.5 in Model 2. From the analysis in the previous paragraph, particularly the fact that the divergence rates are directly related to the magnitudes of $d_1$ and $d_2$, we come to realize that the restricted ranges of $d_1$ and $d_2$ in Models 2–4 ensure that the reduction in the divergence rate of $t_\beta$ from the $T^{0.5}$ level is small enough that $t_\beta$ remains divergent (in which case the spurious effects occur). Although we did not explicitly consider the asymptotic theory for cases where the fractional differencing parameters lie outside their prescribed ranges, it is readily seen that the conditions we impose on the ranges are not only sufficient but also necessary for the existence of the spurious effect in terms of divergent $t_\beta$.
From a similar analysis for the divergence rates of the t ratio $t_\alpha$ we also find that reducing the order of integration in the dependent variable causes the divergence rate of $t_\alpha$ to decrease by the order of $T^{d_1-0.5}$, while reducing the order of integration in the regressor does not cause the divergence rate of $t_\alpha$ to change, as we probably should have expected.
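The comparison across models can be summarized by tabulating the exponents of $T$ at which the two t ratios diverge, as read off Theorems 1 to 4. The lookup below is only a bookkeeping sketch with a hypothetical function name; it does not check that $(d_1, d_2)$ satisfy the range restrictions of each model.

```python
def t_ratio_exponents(model, d1, d2):
    """Return (exponent of t_beta, exponent of t_alpha) in powers of T for Models 1-4."""
    table = {
        1: (0.5, 0.5),            # Theorem 1: both diverge at sqrt(T)
        2: (d1 + d2 - 0.5, d1),   # Theorem 2
        3: (d2, 0.5),             # Theorem 3
        4: (d1, d1),              # Theorem 4
    }
    if model not in table:
        raise ValueError("model must be 1, 2, 3, or 4")
    return table[model]


for m in (1, 2, 3, 4):
    print(m, t_ratio_exponents(m, 0.4, 0.3))
```

Moving from Model 1 to Model 4 (a stationary dependent variable) lowers the $t_\alpha$ exponent from 0.5 to $d_1$, whereas moving from Model 1 to Model 3 (a stationary regressor) leaves it at 0.5.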
It is also interesting to see how the changes in the orders of integration of the dependent variable and the regressor affect the large-sample property of $R^2$. Recall that in Model 1 $R^2$ converges to a random variable and such asymptotic behavior of $R^2$ is considered part of the spurious effect by Phillips (1986). But when we examine Models 2–4, we note that reducing the order of integration in the dependent variable helps to increase its convergence rate by the order of $T^{1-2d_1}$ while reducing the order of integration in the regressor helps to increase the convergence rate by the order of $T^{1-2d_2}$. As a result, in Models 2–4, $R^2$ always converges to 0, correctly reflecting the fact that there is no relationship between the regressor and the dependent variable. This finding implies that the spurious effects in Models 2–4 are confined to the two t ratios while the asymptotic tendency of $R^2$ toward zero is not affected by the spurious effects (though the convergence rates are).
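The same bookkeeping can be done for $R^2$, using the orders reported in item 8 of Theorems 1 to 4. In the sketch below (hypothetical function name), an exponent of zero stands for convergence to a non-degenerate random variable, and a negative exponent for convergence to zero at that rate; as before, the range restrictions on $d_1$ and $d_2$ are not checked.

```python
def r2_exponent(model, d1, d2):
    """Exponent of T in the order of R^2 for Models 1-4 (0 means no decay)."""
    table = {
        1: 0.0,                         # R^2 has a random limit
        2: 2.0 * d1 + 2.0 * d2 - 2.0,   # Theorem 2, item 8
        3: 2.0 * d2 - 1.0,              # Theorem 3, item 8
        4: 2.0 * d1 - 1.0,              # Theorem 4, item 8
    }
    if model not in table:
        raise ValueError("model must be 1, 2, 3, or 4")
    return table[model]


for m in (1, 2, 3, 4):
    print(m, r2_exponent(m, 0.4, 0.3))
```

Relative to Model 1, a stationary dependent variable contributes $2d_1 - 1$ and a stationary regressor contributes $2d_2 - 1$ to the exponent, which is the $T^{1-2d_1}$ and $T^{1-2d_2}$ speed-up described above.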
The sharp difference in the asymptotic behavior between the t ratios and $R^2$ in Models 2–4 actually offers us an opportunity to diagnose the spurious effect in these models. That is, when we find two highly significant t ratios coexisting with a completely contradictory near-zero $R^2$, we are effectively reminded of the possibilities that one of the Models 2–4 may be at work and that the dependent variable and the regressor may possess strong long memory, while one of them may even be nonstationary. With the possibility of such an informal diagnosis, it seems that the spurious effects in Models 2–4 are less damaging than those in Model 1 in the sense that in Model 1 there is no internal inconsistency among the OLS estimates to indicate the spurious effects.
converge unless the dependent variable is nonstationary and its order of integration is sufficiently large. Secondly, whether $\hat\alpha$ diverges or not and whether the DW statistic converges in probability to zero or not depend entirely on whether the dependent variable is nonstationary or not. Note that, as mentioned earlier, even though the OLS estimators $\hat\beta$ and $\hat\alpha$ can converge in probability to zero in the four proposed models, the corresponding $t$ ratios always diverge and it is these divergent $t$ ratios that are referred to as the spurious effects.
2.5. Models 5 and 6: Detrending fractionally integrated processes
As has been pointed out by Nelson and Kang (1981, 1984) and Durlauf and Phillips (1988), detrending integrated processes results in the spurious effect of finding a significant trend. In this section we extend their analysis by considering the potential problems in detrending fractionally integrated processes. It turns out that the spurious effect of divergent $t$ ratios exists as long as the fractional differencing parameter is larger than zero. The implication is that whenever there is long memory in the process, the routine procedure of detrending can produce misleading results. It appears that the spurious effect in detrending occurs more often than we previously thought.
In our analysis of detrending fractionally integrated processes, we separate the nonstationary case from the stationary case. In Model 5 we examine the regression of a nonstationary $I(1+d_1)$ process $y_t$ on a time trend $t$. The asymptotic theory for the OLS estimation is given in the following theorem.
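Theorem 5. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results.
1. $\frac{T}{\sigma_y}\hat\beta \Rightarrow 12\int_0^1 s B_{d_1}(s)\,ds - 6\int_0^1 B_{d_1}(s)\,ds \equiv \beta_*$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.
2. $\frac{1}{\sigma_y}\hat\alpha \Rightarrow 4\int_0^1 B_{d_1}(s)\,ds - 6\int_0^1 s B_{d_1}(s)\,ds \equiv \alpha_*$.
3. $\frac{1}{\sigma_y^2} s^2 \Rightarrow \int_0^1 [B_{d_1}(s)]^2\,ds - \left[\int_0^1 B_{d_1}(s)\,ds\right]^2 - 12\left[\int_0^1 s B_{d_1}(s)\,ds - \frac{1}{2}\int_0^1 B_{d_1}(s)\,ds\right]^2 \equiv \sigma_*^2$.
4. $\frac{T^3}{\sigma_y^2} s_\beta^2 \Rightarrow 12\sigma_*^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T^3 = O(T^{2d_1-2})$.
5. $\frac{T}{\sigma_y^2} s_\alpha^2 \Rightarrow 4\sigma_*^2$, where $\sigma_*^2$ is defined in 3. Note that $\sigma_y^2/T = O(T^{2d_1})$.
6. $\frac{1}{\sqrt{T}} t_\beta \Rightarrow \beta_*/(\sqrt{12}\,\sigma_*)$, where $\beta_*$ is defined in 1 and $\sigma_*^2$ is defined in 3.
7. $\frac{1}{\sqrt{T}} t_\alpha \Rightarrow \alpha_*/(2\sigma_*)$, where $\alpha_*$ is defined in 2 and $\sigma_*^2$ is defined in 3.
8. $R^2 \Rightarrow \dfrac{\beta_*^2}{12\int_0^1 [B_{d_1}(s)]^2\,ds - 12\left[\int_0^1 B_{d_1}(s)\,ds\right]^2}$, where $\beta_*$ is defined in 1.
9. $DW \xrightarrow{p} 0$ and $\sigma_y^2\, DW \Rightarrow \gamma_v(0)/\sigma_*^2$.
Here, $B_{d_1}(t)$ is a normalized fractional Brownian motion.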
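The $\sqrt{T}$ divergence rate of $t_\beta$ in item 6 can be read off from items 1 and 4. The short derivation below is a sketch that assumes the joint weak convergence of the two normalized statistics; $\beta_*$ and $\sigma_*^2$ denote the limits in items 1 and 3.

```latex
\begin{align*}
\frac{1}{\sqrt{T}}\, t_\beta
  = \frac{1}{\sqrt{T}}\,\frac{\hat\beta}{s_\beta}
  = \frac{(T/\sigma_y)\,\hat\beta}{\sqrt{(T^3/\sigma_y^2)\, s_\beta^2}}
  \;\Rightarrow\; \frac{\beta_*}{\sqrt{12\,\sigma_*^2}}
  = \frac{\beta_*}{\sqrt{12}\,\sigma_*}.
\end{align*}
```

The unknown scale $\sigma_y$ cancels between numerator and denominator, which is why the rate of $t_\beta$ in Model 5 does not depend on $d_1$.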
The results on detrending a stationary long memory $I(d_1)$ process $v_t$, which is our Model 6, are presented in the following theorem.
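Theorem 6. Let Assumption 1 hold. Then, as $T \to \infty$, we have the following results.
1. $\frac{T^2}{\sigma_y}\hat\beta \Rightarrow 6B_{d_1}(1) - 12\int_0^1 B_{d_1}(s)\,ds \equiv \beta_*$. Note that $\sigma_y/T^2 = O(T^{d_1-1.5})$.
2. $\frac{T}{\sigma_y}\hat\alpha \Rightarrow 6\int_0^1 B_{d_1}(s)\,ds - 2B_{d_1}(1) \equiv \alpha_*$. Note that $\sigma_y/T = O(T^{d_1-0.5})$.
3. $s^2 \xrightarrow{p} \gamma_v(0)$.
4. $T^3 s_\beta^2 \xrightarrow{p} 12\gamma_v(0)$.
5. $T s_\alpha^2 \xrightarrow{p} 4\gamma_v(0)$.
6. $\frac{\sqrt{T}}{\sigma_y} t_\beta \Rightarrow \beta_*/\sqrt{12\gamma_v(0)}$, where $\beta_*$ is defined in 1. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.
7. $\frac{\sqrt{T}}{\sigma_y} t_\alpha \Rightarrow \alpha_*/(2\sqrt{\gamma_v(0)})$, where $\alpha_*$ is defined in 2. Note that $\sigma_y/\sqrt{T} = O(T^{d_1})$.
8. $\frac{T^2}{\sigma_y^2} R^2 \Rightarrow \beta_*^2/(12\gamma_v(0))$, where $\beta_*$ is defined in 1.
9. $DW \xrightarrow{p} 2 - 2\rho_v(1) = 2(1-2d_1)/(1-d_1)$.
Here, $B_{d_1}(t)$ is a normalized fractional Brownian motion.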
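The closed form for the limit of the DW statistic in item 9 rests on the first-order autocorrelation of the stationary $I(d_1)$ process. The sketch below assumes that $v_t$ is a pure fractional white noise with unit innovation variance, whose autocovariances follow the standard gamma-function formula and recursion (Hosking, 1981); the function names `arfima_autocov` and `dw_limit` are hypothetical.

```python
from math import gamma, isclose

def arfima_autocov(d, max_lag):
    """Autocovariances gamma(0..max_lag) of (1-L)^d v_t = a_t, Var(a_t) = 1, |d| < 0.5."""
    acov = [gamma(1 - 2 * d) / gamma(1 - d) ** 2]        # gamma(0)
    for j in range(1, max_lag + 1):
        acov.append(acov[-1] * (j - 1 + d) / (j - d))    # gamma(j) from gamma(j-1)
    return acov

def dw_limit(d):
    """Probability limit 2 - 2*rho(1) of the DW statistic under the stated assumption."""
    acov = arfima_autocov(d, 1)
    return 2.0 - 2.0 * acov[1] / acov[0]

for d in (0.1, 0.25, 0.4):
    closed_form = 2 * (1 - 2 * d) / (1 - d)              # expression in item 9
    print(d, dw_limit(d), closed_form)
```

Under this assumption the limit lies strictly below 2 for any $d_1 > 0$ and approaches 0 as $d_1$ approaches 0.5, so stronger long memory pushes the DW statistic further from its white-noise value.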
In terms of the convergence (or divergence) rates of the various OLS estimators, Models 5 and 6 can be conveniently viewed as 'special cases' of Models 1 and 4, respectively. More specifically, if we replace the term $\sigma_x$ by $T$ in those normalizing factors in Theorems 1 and 4, then we immediately get all the normalizing factors in Theorems 5 and 6. For example, while the normalizing factor for $\hat\beta$ in Theorem 1 is $\sigma_x/\sigma_y$, the one in Theorem 5 is $T/\sigma_y$. Similarly, while the normalizing factor for $\hat\beta$ in Theorem 4 is $T\sigma_x/\sigma_y$, the one in Theorem 6 is $T^2/\sigma_y$. (Also note that the fractional Brownian motion $B_{d_2}(t)$ does not appear in Theorems 5 and 6 since the regressors in Models 5 and 6 are the time trend and are unrelated to the $I(d_2)$ process $w_t$.) Given these observations, we then conclude that all the analyses about Models 1 and 4 can be readily extended to Models 5 and 6. In particular, the divergence rates of the $t$ ratios, which respectively are in the orders of $T^{0.5}$ and $T^{d_1}$ in Models 1 and 4, are also the rates in Models 5 and 6. (Note that in both Models 4 and 6 the same condition $d_1 > 0$ is imposed on the stationary dependent variable $v_t$ so that the resulting $t$ ratios are divergent.) As a result, the type of spurious effects we observe in Models 1 and 4 occur again in Models 5 and 6. That is, detrending a fractionally integrated process with a positive fractional differencing parameter, certainly including the usual case of the $I(1)$ process, will result in the spurious finding of a significant trend. One important inference we draw from Models 5 and 6 is that the cause for the spurious effect in detrending a process is neither nonstationarity nor lack of ergodicity but long memory in the process.
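The divergence of the trend $t$ ratio for a stationary long memory series (Model 6) can be illustrated with a small Monte Carlo experiment. The sketch below is only an approximation under stated assumptions: the $I(d_1)$ series is generated by a truncated MA($\infty$) filter with Gaussian innovations and a burn-in of $T$ observations, and the function names, seed, sample sizes, and number of replications are all arbitrary choices rather than the design used in the paper.

```python
import numpy as np

def frac_noise(T, d, rng):
    """Approximate I(d) series: truncated MA(infinity) filter applied to Gaussian white noise."""
    n = 2 * T                                    # the first T draws serve as burn-in
    k = np.arange(1, n)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))  # MA weights of (1-L)^(-d)
    eps = rng.standard_normal(n)
    size = 2 * n                                 # zero-padding avoids circular wrap-around
    v = np.fft.irfft(np.fft.rfft(psi, size) * np.fft.rfft(eps, size), size)[:n]
    return v[T:]

def trend_t_ratio(v):
    """t ratio of the slope in the OLS regression of v on a constant and a time trend."""
    T = len(v)
    t = np.arange(1, T + 1) - (T + 1) / 2        # demeaned time trend
    vc = v - v.mean()
    b = (t @ vc) / (t @ t)
    resid = vc - b * t
    s2 = (resid @ resid) / (T - 2)
    return b / np.sqrt(s2 / (t @ t))

def mean_abs_t(T, d, reps, rng):
    """Monte Carlo average of |t ratio| over reps independent replications."""
    return np.mean([abs(trend_t_ratio(frac_noise(T, d, rng))) for _ in range(reps)])

rng = np.random.default_rng(12345)               # arbitrary seed
d1 = 0.4
small = mean_abs_t(250, d1, 200, rng)
large = mean_abs_t(4000, d1, 200, rng)
# The last printed value is a crude estimate of the divergence exponent.
print(small, large, np.log(large / small) / np.log(4000 / 250))
```

If the $T^{d_1}$ rate holds, the average absolute $t$ ratio should grow noticeably between the two sample sizes even though no trend is present; the crude exponent estimate is subject to simulation noise and to the truncation of the filter.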
From Models 5 and 6 we also note the following result: if the data series are nonstationary with the order of integration greater than 1, then the spurious effect can happen to the detrending procedure even after the series are first differenced. What first differencing does to the detrending procedure in such a case is simply to reduce $R^2$, increase the value of the DW statistic, and slow down the divergence of the two $t$ ratios from the $T^{0.5}$ rate to the $T^{d_1}$ rate. Based on this observation, it seems that the spurious effects in detrending may occur more often than we previously thought.
3. Conclusion
memory, instead of nonstationarity or lack of ergodicity, that causes the spurious effects in terms of failing $t$ tests. Nonstationarity in one or both of the dependent variable and the regressor only helps to accelerate the divergence rates of the $t$ ratios. We thus learn that spurious effects might occur more often than we previously believed as they can arise even among stationary series and the usual first-differencing procedure may not be able to completely eliminate spurious effects when data possess strong long memory. It is interesting to note that Phillips (1995) has recently offered some contrasting thoughts about spurious regressions, arguing that spurious regressions may not be as serious as many researchers have been led to believe.
In Section 2.4 we have carefully examined the exact relationships between the orders of integration in the fractionally integrated processes and the divergence rates in the $t$ ratios. From this analysis we gain many insights into the problem of spurious effects which are not available in Phillips' (1986) classical study of $I(1)$ processes. In short, it is found that the extents of spurious effects are directly related to the degrees of long memory in the data. Our results on detrending fractionally integrated processes also greatly broaden Durlauf and Phillips' (1988) theory of spurious detrending in which the relationship between the orders of integration and the divergence rates of the $t$ ratios again plays a useful role in the analysis.
A fairly extensive Monte Carlo study has also been conducted to verify the theoretical results, especially those of convergence rates, we have established in the paper. We do not report the simulation results here other than pointing out the fact that almost all our theoretical results are well supported by simulation. A few generalizations of our study are worthy of further consideration. A natural extension is to consider the multiple regression where there is more than one nonconstant regressor. Another one is to allow the fractionally integrated processes to have nonzero means. Based on Phillips' (1986) work, we expect most, if not all, of the asymptotic results we obtain from the simple regression case to hold in the multiple regression of fractionally integrated processes with drifts. These issues have been examined by Chung (1995).
One aspect of our study that is slightly more restricted than Phillips' (1986) and Durlauf and Phillips' (1988) analyses is that the fractionally integrated processes we consider are built on white noises $a_t$ and $b_t$ that are required to satisfy the relatively stringent conditions as specified in Assumption 1. These conditions effectively rule out the possibility of allowing short-run dynamics such as the ARMA components in the fractionally integrated processes we have studied. Chung (1995) has studied the case where Assumption 1 is relaxed to incorporate the short-run dynamics and found no substantial changes in the analysis of spurious effects.
Finally, our study of spurious regression can serve as the basis for the analyses of 'fractional cointegration' where the dependent variable and regressors are
has attracted a lot of attention in the literature recently. One of the pioneering works in this area is by Cheung and Lai (1993).
Acknowledgements
We are very grateful to two referees and an associate editor for their valuable suggestions.
Appendix A. Proof of Lemma 1
The proofs of items 1–3 are straightforward applications of the continuous mapping theorem to Davydov's results. They are omitted here. Item 4 follows directly from Davydov's result, while items 5 and 6 are due to ergodicity of the two stationary processes $v_t$ and $w_t$.
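To prove item 7, we note that, since $v_t$ and $w_t$ are assumed to be independent and have zero means, the autocovariance of the product $v_t w_t$ at lag $j$ is the product of their respective autocovariances at lag $j$: $\gamma_v(j)\gamma_w(j)$. Also, it is well known that $\gamma_v(j) = O(j^{2d_1-1})$ and $\gamma_w(j) = O(j^{2d_2-1})$ if $d_1 \neq 0$ and $d_2 \neq 0$. Consequently, we have
$$\mathrm{Var}\left(\sum_{t=1}^{T} v_t w_t\right) = T \sum_{j=-(T-1)}^{T-1} \left(1 - \frac{|j|}{T}\right) \gamma_v(j)\gamma_w(j) = T \sum_{j=-(T-1)}^{T-1} O\left(|j|^{2d_1+2d_2-2}\right)$$
$$= \begin{cases} T \cdot O(T^{2d_1+2d_2-1}) = O(T^{2d_1+2d_2}) & \text{if } d_1+d_2 > 0.5, \\ T \cdot O(\ln T) = O(T \ln T) & \text{if } d_1+d_2 = 0.5, \\ O(T) & \text{otherwise.} \end{cases}$$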
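Using Proposition 6.2.3 from Brockwell and Davis (1991), we thus have
$$\sum_{t=1}^{T} v_t w_t = \begin{cases} O_p(T^{d_1+d_2}) & \text{if } d_1+d_2 > 0.5, \\ O_p[(T \ln T)^{0.5}] & \text{if } d_1+d_2 = 0.5, \\ O_p(T^{0.5}) & \text{otherwise,} \end{cases}$$
and, by the facts that $\sigma_y^2 = O(T^{1+2d_1})$ and $\sigma_x^2 = O(T^{1+2d_2})$, we also have
$$T \sum_{t=1}^{T} \frac{v_t}{\sigma_y}\frac{w_t}{\sigma_x} = \begin{cases} O_p(1) & \text{if } d_1+d_2 > 0.5, \\ O_p[(\ln T)^{0.5}] & \text{if } d_1+d_2 = 0.5, \\ O_p(T^{0.5-d_1-d_2}) & \text{otherwise.} \end{cases}$$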
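The three growth regimes of the variance of $\sum_t v_t w_t$ can be checked numerically from the exact finite-sample formula $T\sum_{|j|<T}(1-|j|/T)\gamma_v(j)\gamma_w(j)$. The sketch below assumes that both processes are pure fractional white noises with unit innovation variances, so that their autocovariances follow the standard gamma-function recursion; the function names and the choice $T = 2000$ are hypothetical.

```python
import numpy as np
from math import gamma

def arfima_autocov(d, n):
    """Autocovariances gamma(0..n-1) of a fractional white noise, unit innovation variance."""
    j = np.arange(1, n)
    ratios = (j - 1 + d) / (j - d)               # gamma(j) / gamma(j-1)
    g0 = gamma(1 - 2 * d) / gamma(1 - d) ** 2    # gamma(0), requires d < 0.5
    return g0 * np.concatenate(([1.0], np.cumprod(ratios)))

def var_sum_product(T, d1, d2):
    """Var(sum_t v_t w_t) = T * sum_{|j|<T} (1 - |j|/T) gamma_v(j) gamma_w(j)."""
    g = arfima_autocov(d1, T) * arfima_autocov(d2, T)
    w = 1.0 - np.arange(T) / T                   # Bartlett-type weights
    return T * (g[0] + 2.0 * np.sum(w[1:] * g[1:]))

def growth_exponent(d1, d2, T=2000):
    """Log-slope of the variance between sample sizes T and 2T."""
    return np.log(var_sum_product(2 * T, d1, d2) / var_sum_product(T, d1, d2)) / np.log(2.0)

print(growth_exponent(0.4, 0.4))   # to be compared with 2*d1 + 2*d2 = 1.6
print(growth_exponent(0.1, 0.1))   # to be compared with 1 (the "otherwise" case)
```

The log-slope between $T$ and $2T$ is only a finite-sample proxy for the asymptotic order, so small deviations from the theoretical exponents are to be expected.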
To prove item 8, we note
The orders of the four sums of squares are based on the results of items 2 and 5. We also note that $T\sigma$
$= \gamma$
Note that the second equality results from the facts that $y_t$ and $w_t$ are independent
Note that the fifth equality is ensured by the assumption that $d_2 > 0$. Given this
The result for $\sum_{t=1}^{T} (v_t/\sigma_y)(x_t/\sigma_x)$ can be proved in a similar fashion. To prove item 10, we first note that
by applying the results of items 4 and 1. The weak limit of $\sum_{t=1}^{T} (t/T)(w_t/\sigma_x)$ can be derived using a similar argument. To prove item 11, we note
The orders of the last two terms at the end of the first line are based on the results of items 1 and 10. The result for $(1/T)\sum_{t=1}^{T} (t/T)(x_t/\sigma_x)$ can be proved by a similar argument.
The joint weak convergence of 1–4 and 8, 10, 11 can be established by writing the vector of sample moments as a functional of $z_{[Tr]} = [\sigma_y^{-1} y_{[Tr]}, \sigma_x^{-1} x_{[Tr]}]'$ up to an error of $o_p(1)$. See Park and Phillips (1988, p. 491). But regarding the conclusion in item 9 where we only have results on convergence rates instead of limiting distributions, the reason is that we cannot apply the usual stochastic calculus technique due to the fact that $B_{d_1}(s)$ and $B_{d_2}(s)$ are not martingales (see De Jong and Davidson, 1997).
Appendix B. Proof of Theorem 1
Let us first summarize the formulas for all relevant statistics in the simple linear regression model of $y$
To prove item 1, we note
where the weak convergence is due to items 1, 3, 8 of Lemma 1, and the joint weak convergence of the relevant sample moments. To prove item 2, we note
where the weak convergence is based on item 1 above and item 1 of Lemma 1. To prove item 3, we have
where the weak convergence is based on item 1 above and item 3 of Lemma 1. To prove item 4, we note
where the weak convergence is based on item 3 above and item 3 of Lemma 1. To prove item 5, we see
where the weak convergence is based on item 1 above and item 3 of Lemma 1. To prove item 9, we note
which converges weakly to $\sigma_*^2$ by the result of item 3.
The proofs of Theorems 2–6 are quite similar to that of Theorem 1. Note that Park and Phillips (1988) have shown the general structure of these types of proofs. For reasons of space, the details of the proofs are omitted here.
Appendix C. Proof of Corollary 2
It suffices to show that $T^2\hat\beta/(\sigma_y\sigma_x) = O_p(1)$. But from the proof for item 7 of
References
Baillie, R.T., 1996. Long memory processes and fractional integration in econometrics. Journal of Econometrics 73, 5–59.
Brockwell, P.J., Davis, R.A., 1991. Time Series: Theory and Methods, 2nd Edition. Springer, New York.
Cheung, Y.-W., Lai, K.S., 1993. A fractional cointegration analysis of purchasing power parity. Journal of Business and Economic Statistics 11, 103–112.
Chung, C.-F., 1994. Calculating and analyzing impulse responses and their asymptotic distributions for the ARFIMA and VARMA models. Econometrics and Economic Theory Paper No. 9402, Michigan State University.
Chung, C.-F., 1995. Sample variances, sample covariances, and linear regression of stationary multivariate long memory processes. Preprint, Michigan State University.
De Jong, R., Davidson, J., 1997. The functional central limit theorem and weak convergence to stochastic integrals: results for weakly dependent and fractionally integrated processes. Preprint.
Davydov, Y.A., 1970. The invariance principle for stationary processes. Theory of Probability and Its Applications 15, 487–489.
Durlauf, S.N., Phillips, P.C.B., 1988. Trends versus random walks in time series analysis. Econometrica 56, 1333–1354.
Fox, R., Taqqu, M.S., 1987. Multiple stochastic integrals with dependent integrators. Journal of Multivariate Analysis 21, 105–127.
Granger, C.W.J., 1980. Long memory relationships and the aggregation of dynamic models. Journal of Econometrics 14, 227–238.
Granger, C.W.J., 1981. Some properties of time series data and their use in econometric model specification. Journal of Econometrics 16, 121–130.
Granger, C.W.J., Newbold, P., 1974. Spurious regression in econometrics. Journal of Econometrics 2, 111–120.
Granger, C.W.J., Joyeux, R., 1980. An introduction to long-memory time series models and fractional differencing. Journal of Time Series Analysis 1, 15–29.
Hosking, J.R.M., 1981. Fractional differencing. Biometrika 68, 165–176.
Mandelbrot, B.B., Van Ness, J.W., 1968. Fractional Brownian motions, fractional noise and applications. SIAM Review 10, 422–437.
Marmol, F., 1995. Spurious regressions between I(d) processes. Journal of Time Series Analysis 16, 313–321.
Nelson, C.R., Kang, H., 1981. Spurious periodicity in inappropriately detrended time series. Econometrica 49, 741–751.
Nelson, C.R., Kang, H., 1984. Pitfalls in the use of time as an explanatory variable in regression. Journal of Business and Economic Statistics 2, 73–82.
Nelson, C.R., Plosser, C., 1982. Trends and random walks in macro-economic time series: some evidence and implications. Journal of Monetary Economics 10, 139–162.
Park, J.Y., Phillips, P.C.B., 1988. Statistical inference in regressions with integrated processes: part 1. Econometric Theory 4, 468–497.
Phillips, P.C.B., 1986. Understanding spurious regressions in econometrics. Journal of Econometrics 33, 311–340.
Phillips, P.C.B., 1995. Nonstationary time series and cointegration. Journal of Applied Econometrics 10, 87–94.
Sowell, F., 1990. The fractional unit root distribution. Econometrica 58, 495–505.