The strength of evidence for unit
autoregressive roots and structural breaks:
A Bayesian perspective
John Marriott^a, Paul Newbold^b,*
^a Department of Mathematics, Statistics and Operational Research, Nottingham Trent University, Nottingham NG1 4BU, UK
^b Department of Economics, University of Nottingham, Nottingham NG7 2RD, UK
Received 1 January 1998; received in revised form 1 October 1999
Abstract
Economic time series may be generated by a process with a unit autoregressive root, and the generating process may exhibit an abrupt break in trend. It is well known that the outcomes of classical tests for either one of these phenomena can be seriously influenced when the presence of the other is ignored. Therefore, care is required in disentangling evidence in the data supporting the two phenomena, and there is some question as to the extent to which such disentanglement is feasible. We approach this question from a Bayesian perspective, assessing the impact on the strength of evidence for each of the phenomena in the presence of the other. © 2000 Elsevier Science S.A. All rights reserved.
JEL classification: C12; C15
Keywords: Bayesian analysis; Posterior odds; Structural breaks; Unit autoregressive roots
*Corresponding author. Tel.: +44-115-951-5392; fax: +44-115-951-4159. E-mail address: [email protected] (P. Newbold).
1. Introduction
In a seminal paper, Perron (1989) drew attention to the fact that Dickey-Fuller tests of the null hypothesis of a unit autoregressive root in the generating process of a time series could have very low power when the true process was stationary around a broken trend. Moreover, it was demonstrated that incorporating the possibility of a trend break at a given point in time could have a dramatic impact on the outcome of unit root tests. Subsequently, many authors, including Christiano (1992) and Zivot and Andrews (1992), stressed the importance of endogenous rather than exogenous selection of a break date. Ideally, as indicated for example by Perron (1989, 1994), Banerjee et al. (1992) and Nunes et al. (1997), a complete analysis should permit the possibility of a break under the unit root specification as well as under trend stationarity. Then, as noted by Vogelsang and Perron (1998), Dickey-Fuller-type tests, allowing for a break at an unknown point in time, can have unreliable size properties when there is a break under the null. Alternatively, as for example in Andrews (1993), one might want to test the null hypothesis of no trend break against the alternative of a break at an unknown point in time. Chu and White (1992) show that such tests can reject the null hypothesis far too often when the true generating process has a unit autoregressive root and no break. Further evidence on this phenomenon is provided by Nunes et al. (1995) and Bai (1998). The foregoing discussion suggests that there may be considerable difficulty in disentangling from data evidence on the unit root/stationarity dichotomy from evidence on the break/no break dichotomy. Hendry and Neale (1991) provide informal graphical illustration of this point.
root/stationarity, and the effect of the stochastic structure of the model on the strength of evidence for a break.
There is a substantial literature on the Bayesian analysis of the possibility of a unit autoregressive root, though practically all of it neglects the possibility of a structural break. Most of the issues involved are most easily discussed in terms of the first-order autoregressive model for the series $Y_t$,

$$(1-\phi L)(Y_t-\mu)=\varepsilon_t \qquad (1)$$

where $L$ is the lag operator, and $\varepsilon_t$ is a zero-mean white noise, generally assumed to be Gaussian. The process is stationary for $|\phi|<1$, while $\phi=1$ corresponds to a random walk. The great majority of the classical theoretical and applied econometric literature seeks to distinguish between these two alternatives, and that is the problem approached here within the Bayesian paradigm. Interest in the Bayesian approach to this problem dates from Sims (1988) and Sims and Uhlig (1991). While several authors, including Phillips and Ploberger (1996) and Kim (1998), have addressed asymptotic properties of posterior distributions, and associated decision rules, our interest here is in practical implementation where sample sizes need not be large. Bayesian approaches to this problem have been considered by many authors, including De Jong and Whiteman (1991), Phillips (1991), Poirier (1991), Schotman and Van Dijk (1991a,b), Koop (1992), Uhlig (1994a,b), Schotman (1994), Zivot (1994), Lubrano (1995) and Marriott and Newbold (1998).
Ignoring for the present the choice of prior, there are at least two distinct approaches to posterior odds calculations. In one of these, values for the autoregressive parameter $\phi$ of (1) are permitted to be greater than one, thus allowing the possibility of explosive models. Then, given the posterior density for $\phi$, posterior probabilities for $\phi<1$ are taken as measures of the strength of evidence for stationarity, though De Jong and Whiteman (1991) modify this somewhat by computing the probability of $\phi<\phi$ […]
Newbold (1998), we find it inconsistent to simultaneously hold a proper prior for $\mu$ under stationarity while attaching non-zero prior probability to the random walk ($\phi=1$) model, where $\mu$ is undefined. It is therefore tempting under stationarity to adopt an uninformative uniform improper prior for $\mu$. However, as discussed by O'Hagan (1995) in a general framework and by Schotman and Van Dijk (1991b) in the present context, when improper priors are used for parameters occurring in one model but not the other, posterior odds ratios are undefined. Marriott and Newbold (1998) circumvent this difficulty by analysing the series of first differences, showing that this involves no information loss about the autoregressive parameter under stationarity compared with the analysis of levels based on an improper prior on the mean. We shall follow the same strategy in this paper. Of course, in differencing, all information about the mean is lost, though in the context of the present paper we employ a proper prior on the amount of any mean shift, so that a posterior density for that parameter could be derived.
In this paper we adopt two approaches. First, as a benchmark we consider a prior for $\phi$ that is uniform on $(-1, 1)$. This at least has the virtue of ease of interpretation, while as illustrated in Fig. 2 and the associated discussion of Uhlig (1994b), in the region $(-1, 1)$ it does not differ radically from alternatives that have been proposed. That is not the case outside of that region, leading to a critique of the uniform prior by Phillips (1991) when explosive models are permitted. Then Phillips argues strongly for the Jeffreys prior, demonstrating that its adoption can lead to large, and desirable, differences in posterior inference. It is ironic that this outcome is achieved by placing additional prior probability on $\phi>1$, a region in which there is little evidence that economists have genuine prior belief. This is, of course, an inevitable consequence of taking the posterior probability of $\phi<1$ as a measure of the strength of evidence for stationarity. Second, we follow Marriott and Newbold (1998) in arguing that imposition of point prior probability mass at $\phi=1$ would logically be associated with prior belief that the probability of $\phi$ in a region close to one would be much greater than the probability of $\phi$ in a comparable range far from one. These authors propose the beta prior with density

$$p(\phi)=\frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)\,2^{a+b-1}}\,(1+\phi)^{a-1}(1-\phi)^{b-1},\qquad |\phi|<1 \qquad (2)$$

with $b=0.5$, giving, as seems desirable in this application, a singularity at $\phi=1$. The larger is the parameter $a$ in (2), the more tightly concentrated towards $\phi=1$ is the prior density, and the analyst can select this parameter according to belief. For example, as emphasised by Sims (1988) and Geweke (1994), all else equal the higher the frequency of observation, the larger would one expect the autoregressive parameter to be. Marriott and Newbold (1998) obtained simulation results with attractive properties using $a=5$, and for illustration we shall use this value here for comparison with the uniform prior.
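To make the shape of the prior (2) concrete, the following minimal Python sketch evaluates the density; the function names are illustrative and not taken from the paper. It relies only on the fact that (2) is a Beta(a, b) density for (1 + φ)/2 rescaled to (-1, 1), so that a = b = 1 recovers the uniform benchmark with constant density 0.5.

```python
import math


def beta_prior_density(phi, a=5.0, b=0.5):
    """Density (2) on (-1, 1): a Beta(a, b) law for (1 + phi) / 2, rescaled."""
    if not -1.0 < phi < 1.0:
        return 0.0
    # Normalising constant Gamma(a+b) / (Gamma(a) Gamma(b) 2^(a+b-1)).
    const = math.gamma(a + b) / (
        math.gamma(a) * math.gamma(b) * 2.0 ** (a + b - 1.0)
    )
    return const * (1.0 + phi) ** (a - 1.0) * (1.0 - phi) ** (b - 1.0)


def midpoint_integral(f, lo, hi, steps=20000):
    """Plain midpoint rule, used only as a sanity check on smooth cases."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


# With a = 5 and b = 0.5 the density rises steeply as phi approaches one.
for phi in (0.0, 0.5, 0.9, 0.99):
    print(phi, beta_prior_density(phi))
```

The midpoint check is only suitable for choices such as b = 2 where the density is bounded; with b = 0.5 the integrable singularity at φ = 1 would need a more careful quadrature.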
In the remainder of this paper we allow for the possibilities of both a unit autoregressive root and a structural break. Although this seems like a natural problem for Bayesian analysis, generating posterior odds, relatively little work along these lines has been reported. An exception is De Jong (1996), whose analysis is in the same vein as De Jong and Whiteman (1991), taking the posterior probability of $\phi<0.98$ as a measure of the evidence for stationarity. Further, by contrast with our approach, point probability mass is not put on the possibility of no break.
2. Bayesian model selection for an ARMA generating process
Attention will be restricted to the case where, under stationarity with no breaks, the generating model has unknown mean but no trend, though extension to stationarity around a linear trend is quite straightforward. There is uncertainty about the presence of a unit autoregressive root in the model and also about whether there is a break in mean under stationarity. Moreover, if such a break exists, its location is uncertain. Within this framework we seek posterior probabilities for each of four possible structures allowing for stationarity/unit autoregressive root and break/no break.
Let $Y_t$ ($t=0,1,\ldots,n$) denote an observed time series. As one possible generating process, we consider the ARIMA($p$, 1, $q$) model

$$\alpha(L)(1-L)Y_t=\theta(L)\varepsilon_t \qquad (3)$$

where $\varepsilon_t$ is zero-mean white noise, with variance $\sigma^2$,

$$\alpha(L)=1-\alpha_1L-\cdots-\alpha_pL^p,\qquad \theta(L)=1-\theta_1L-\cdots-\theta_qL^q$$

and $L$ is the lag operator. It is assumed that $(p, q)$ are given, that the conditions for stationarity and invertibility are satisfied, and that $\alpha(L)$ and $\theta(L)$ have no common factors. The stationary alternative to (3) is the ARMA($p+1$, $q$) model

$[\,1-(\phi+\alpha_1\cdots$ […]

which reduces to (3) when $\phi=1$. The two structural break alternatives we consider are […]

In what follows we adopt the approach of Marriott and Newbold (1998) and formulate the four models in terms of the first differences, $W_t=(1-L)Y_t$. In the case of the structural break models we will be working with […] that is, $W_t$ has a single 'outlier', of magnitude $\delta=\mu_2-\mu_1$, at time $n_1+1$. The models that are to be analysed are therefore […]
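Comparison of the models proceeds by computing the posterior model probabilities, which are given by Bayes' theorem

$$P(M_i\mid W)=\frac{p(W\mid M_i)\,P(M_i)}{\sum_{j=1}^{4}p(W\mid M_j)\,P(M_j)},\qquad i=1,\ldots,4. \qquad (7)$$

If $\gamma_i$ denotes the vector of ARMA parameters of model $M_i$ then

$$p(W\mid M_i)=\int\!\!\int p(W\mid\gamma_i,\sigma,M_i)\,p(\gamma_i,\sigma\mid M_i)\,d\gamma_i\,d\sigma$$

gives the integrated joint densities of $(\gamma_i,\sigma,W)$ for $M_1$ and $M_2$ and

$$p(W\mid M_i)=\sum_{n_1}\int\!\!\int\!\!\int p(W\mid\gamma_i,\delta,\sigma,n_1,M_i)\,p(\gamma_i,\delta,\sigma,n_1\mid M_i)\,d\gamma_i\,d\delta\,d\sigma$$

gives the integrated joint densities of $(\gamma_i,\delta,\sigma,n_1,W)$ for $M_3$ and $M_4$. Here, $p(\gamma_i,\sigma\mid M_i)$ and $p(\gamma_i,\delta,\sigma,n_1\mid M_i)$ are the joint prior densities for the parameters and $p(W\mid\gamma_i,\sigma,M_i)$ and $p(W\mid\gamma_i,\delta,\sigma,n_1,M_i)$ are the likelihoods. For the approach we are adopting here we assume the four models are equally likely a priori, so that $P(M_i)=0.25$, implying that the marginal prior probability for a unit root model is $P(M_1)+P(M_3)=0.5$.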
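The step from integrated densities to posterior model probabilities can be sketched as follows. This is a minimal illustration, not the authors' code: the log integrated densities are made-up numbers, and the model labelling (M1 unit root without break, M2 stationary without break, M3 unit root with break, M4 stationary with break) is inferred from the sums of probabilities used in the text.

```python
import math


def posterior_model_probs(log_marginals, priors=None):
    """Bayes' theorem over a finite set of models, computed on the log scale."""
    m = len(log_marginals)
    if priors is None:
        priors = [1.0 / m] * m  # equal prior weight, e.g. 0.25 for four models
    log_terms = [lm + math.log(p) for lm, p in zip(log_marginals, priors)]
    top = max(log_terms)  # subtract the maximum to avoid underflow
    weights = [math.exp(t - top) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log integrated densities for M1..M4 (illustrative numbers only).
log_marginals = [-140.2, -141.0, -138.9, -139.5]
post = posterior_model_probs(log_marginals)
p_unit_root = post[0] + post[2]  # P(M1 | W) + P(M3 | W)
p_break = post[2] + post[3]      # P(M3 | W) + P(M4 | W)
print(post, p_unit_root, p_break)
```

Passing a shorter list compares any subset of the models with equal prior weights, which mirrors restricting the calculation to m of the four models.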
The likelihoods can be determined through the Kalman filter or algorithms of Newbold (1974) or Ansley (1979) and, for given $n_1$, take the form

$$p(W\mid\gamma_i,\delta,\sigma,M_i)=(2\pi\sigma^2)^{-(1/2)n}\,|A_i|^{-1/2}\exp\{-\cdots\}$$

where the elements of the matrix $A_i$ […] is taken as the discrete uniform density $p(n_1)$ […]

Integrating this with respect to $\sigma$ gives […]

Using Newbold (1974) it is straightforward to show that for these models

$$S(\gamma_i,n_1)=S(\gamma_i)-2\delta V_i'L_iX+\delta^2X'L_i'L_iX$$

where the matrix $L_i$ only involves the ARMA parameters for model $M_i$, $V_i'V_i=S(\gamma_i)$ and $X$ is a column vector of zeros with 1 in the $(n_1+1)$th position. On completing the square in $\delta$, (10) can be written as […] where we have written $h=X'L_i'\cdots$ […]

Finally we integrate (12) with respect to $\sigma$ and sum with respect to $n_1$ to obtain […]. The posterior model probabilities are then obtained from (7).
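The completing-the-square step can be written out directly from the expression for the sum of squares in terms of the break size. The sketch below is an assumption-laden reconstruction rather than the paper's own display: it takes $h = X'L_i'L_iX > 0$ (the visible definition of $h$ is truncated) and introduces the shorthand $c = V_i'L_iX$, which is a label used here only.

```latex
\begin{aligned}
S(\gamma_i, n_1)
  &= S(\gamma_i) - 2\delta\, c + \delta^2 h,
  \qquad c = V_i' L_i X,\quad h = X' L_i' L_i X \\
  &= h\left(\delta^2 - \frac{2c}{h}\,\delta + \frac{c^2}{h^2}\right)
     + S(\gamma_i) - \frac{c^2}{h} \\
  &= h\left(\delta - \frac{c}{h}\right)^2 + S(\gamma_i) - \frac{c^2}{h}.
\end{aligned}
```

Written this way, the sum of squares is quadratic in $\delta$ around $c/h$, which is what makes the subsequent integration over $\delta$ against a normal prior a standard Gaussian integral.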
3. An AR(1) generating process
To get some insight into the properties of the Bayesian model selection procedure we concentrate now on the simplest special case, where the underlying model is either a random walk or a stationary first-order autoregression. The four models considered for the series $W_t$ of first differences are then […] four models follow from an expression given in Newbold (1974) as […] where […]
If we adopt the priors discussed in Section 2 straightforward but tedious calculus gives the following integrated joint densities for these models: […]
These can then be used to calculate the posterior model probabilities. We note here that we can compare any $m\le 4$ of the models by restricting the calculation of the posterior model probabilities to any subset of the models using (7) with $m$ in place of 4 and $P(M_i)=m^{-1}$.
It is perhaps worth being precise about what generating models are being compared in such Bayesian analyses. In the stationary autoregression versus random walk comparison, the implication for the former is that nature first draws a parameter $\phi$ from the uniform $(-1, 1)$ distribution, and the time series is then generated from that parameter. Of course, other proper prior distributions for $\phi$ such as (2) could be employed. Similarly, in the structural break model, nature is assumed to draw the break date at random, with each possible break equally likely. The time series generating model then incorporates this selected break date. This seems to be a reasonable Bayesian analogue to the situation often assumed of a possible break at an unknown point in time in classical analyses of the breakpoint problem. Of course, it is possible to modify the approach by incorporating genuine prior information about likely break dates.
We conducted a series of simulation experiments in order to assess the power of the Bayesian model selection criterion in discriminating between alternative specifications. In each of these experiments our results are based on 1000 replications of series of 100 observations. The underlying generating model is

$$Y_t=\delta Z_t+u_t,\qquad u_t=\phi u_{t-1}+\varepsilon_t \qquad (22)$$

where $Z_t=0$ for $t\le n_1$, $Z_t=1$ for $t\ge n_1+1$, and $\varepsilon_t$ are i.i.d. N(0, 1). The prior distributions of Section 2 were used, where in $p(\delta\mid\sigma)\sim N(0,k^2\sigma^2)$ it remains to set a value for $k$. Of course, in any practical implementation the analyst is free to select any particular value. Here, we chose a reference value for use in all simulations. This was done through a comparison of the stationary models with and without a break, that is, $M_2$ and $M_4$, setting the prior probabilities for each to 0.5. Data were then generated from (22), with $\phi=\delta=0$, and the posterior probabilities $P(M_4\mid W)$ were calculated, and averaged over 1000 replications for various values of $k$. The outcomes decreased from 0.2 for $k=3$ to 0.05 for $k=15$. We chose to work subsequently with $k=15$ as the value that yields on average a posterior probability of 0.05 for a break when the true generating process is white noise with no break.
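The generating model (22) is simple to simulate. The following minimal Python sketch draws one series and its first differences; the function names are illustrative, the start-up value u_0 = 0 is an assumption (the text does not state the initial condition), and the `sigma` argument is added only so that the deterministic part can be inspected by setting it to zero.

```python
import random


def simulate_break_ar1(n=100, n1=50, delta=3.0, phi=0.5, sigma=1.0, rng=None):
    """Draw Y_1..Y_n from Y_t = delta*Z_t + u_t, u_t = phi*u_{t-1} + e_t."""
    rng = rng or random.Random()
    u = 0.0  # assumed start-up value u_0 = 0 (not stated in the text)
    series = []
    for t in range(1, n + 1):
        u = phi * u + rng.gauss(0.0, sigma)
        z = 1.0 if t >= n1 + 1 else 0.0  # step dummy: 0 up to n1, 1 afterwards
        series.append(delta * z + u)
    return series


def first_differences(series):
    """W_t = Y_t - Y_{t-1}, the series on which the four models are compared."""
    return [b - a for a, b in zip(series, series[1:])]


y = simulate_break_ar1(n=100, n1=50, delta=3.0, phi=0.8, rng=random.Random(12345))
w = first_differences(y)
print(len(y), len(w))
```

With `sigma=0.0` the differenced series is zero everywhere except for a single value equal to the break size, which is the 'outlier' representation used for the break models.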
Retaining the value $k=15$, we continue to assess, by means of the posterior model probabilities, the relative strength of evidence for $M_4$ compared with $M$ […]
Table 1
Mean of $P(M_4 \mid W)$ when $M_2$ and $M_4$ are compared

$\phi$ \ $\delta$   0.0   1.0   1.5   2.0   2.5   3.0   3.5   4.0   5.0   10.0
(a) $n_1 = 50$
0.0    0.05  0.83  0.99  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.1    0.06  0.72  0.97  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.3    0.07  0.49  0.83  0.97  1.00  1.00  1.00  1.00  1.00  1.00
0.5    0.09  0.29  0.55  0.79  0.91  0.96  0.98  0.99  1.00  1.00
0.7    0.13  0.20  0.31  0.44  0.58  0.73  0.85  0.92  0.98  1.00
0.9    0.19  0.20  0.22  0.24  0.31  0.41  0.54  0.67  0.89  1.00
1.0    0.20  0.20  0.21  0.23  0.27  0.33  0.45  0.59  0.83  1.00
(b) $n_1 = 5$
0.0    0.05  0.15  0.34  0.64  0.88  0.97  1.00  1.00  1.00  1.00
0.1    0.06  0.13  0.29  0.52  0.80  0.92  0.99  1.00  1.00  1.00
0.3    0.07  0.12  0.21  0.36  0.60  0.78  0.92  0.97  1.00  1.00
0.5    0.09  0.12  0.18  0.26  0.38  0.58  0.76  0.89  0.98  1.00
0.7    0.13  0.15  0.16  0.21  0.29  0.41  0.56  0.71  0.92  1.00
0.9    0.19  0.19  0.20  0.23  0.26  0.35  0.46  0.59  0.84  1.00
1.0    0.20  0.20  0.20  0.23  0.26  0.35  0.47  0.58  0.84  1.00
a unit autoregressive root in the generating process. We considered separately two possible break dates: half-way through the series and after 5 observations. The results are shown in Table 1. As one would expect, the larger is the break size $\delta$, the greater the posterior odds in favour of a break. Also, all other things equal, there is stronger evidence for a break when that break occurs half-way through the series than when it occurs after 5 observations. The most interesting feature of the table is the increasing difficulty in distinguishing between the break and no break alternatives as the autoregressive parameter $\phi$ increases. Of course, in the case of a fixed break date, simple calculations would imply increasingly imprecise estimators of $\delta$ in (22) as the problem of autocorrelated errors becomes more severe. Notice that this is not a 'unit root phenomenon': the results for $\phi=1.0$ in Table 1 are not dramatically different from those for $\phi=0.9$. Provided, as is done in our calculations, allowance is made for autocorrelated errors, it is not very much more difficult to detect a break in the former case than in the latter.
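One way to see the loss of precision with a fixed break date is to compute the variance of the generalised least squares estimator of the break size for known φ. The sketch below is an illustration under stated assumptions, not a calculation from the paper: it adds an unknown level μ to (22) (since the analysis of first differences discards information about the level), treats φ, σ² and the break date as known, assumes a stationary start, and therefore covers |φ| < 1 only.

```python
import math


def gls_break_variance(n=100, n1=50, phi=0.0, sigma2=1.0):
    """Variance of the GLS estimator of delta in Y_t = mu + delta*Z_t + u_t."""
    if not abs(phi) < 1.0:
        raise ValueError("this sketch covers the stationary case |phi| < 1 only")
    ones = [1.0] * n
    step = [1.0 if t >= n1 + 1 else 0.0 for t in range(1, n + 1)]

    def quasi_difference(x):
        # Prais-Winsten transform: scale the first value, then x_t - phi*x_{t-1}.
        head = [math.sqrt(1.0 - phi * phi) * x[0]]
        return head + [x[t] - phi * x[t - 1] for t in range(1, n)]

    c, z = quasi_difference(ones), quasi_difference(step)
    s_cc = sum(v * v for v in c)
    s_zz = sum(v * v for v in z)
    s_cz = sum(u * v for u, v in zip(c, z))
    # (2, 2) element of sigma2 * (X'X)^{-1} for the transformed regressors.
    return sigma2 * s_cc / (s_cc * s_zz - s_cz * s_cz)


for phi in (0.0, 0.5, 0.9):
    print(phi, gls_break_variance(phi=phi), gls_break_variance(n1=5, phi=phi))
```

For white noise errors the result reduces to the familiar 1/n1 + 1/(n - n1), and the variance grows both as φ increases and as the break moves towards the start of the series, in line with the pattern of Table 1.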
Table 2
Mean of $P(M_1 \mid W)$ when $M_1$ and $M_2$ are compared

$\phi$ \ $\delta$   0.0   1.0   1.5   2.0   2.5   3.0   3.5   4.0   5.0   10.0
(a) $n_1 = 50$
0.0    0.00  0.00  0.00  0.00  0.01  0.03  0.10  0.22  0.54  0.94
0.1    0.00  0.00  0.00  0.00  0.01  0.05  0.15  0.30  0.61  0.95
0.3    0.00  0.00  0.00  0.01  0.05  0.13  0.29  0.46  0.72  0.96
0.5    0.00  0.00  0.01  0.05  0.14  0.29  0.46  0.60  0.80  0.96
0.7    0.04  0.08  0.13  0.24  0.36  0.50  0.63  0.73  0.85  0.97
0.9    0.64  0.65  0.67  0.71  0.75  0.76  0.81  0.83  0.89  0.97
1.0    0.94  0.93  0.94  0.94  0.93  0.93  0.93  0.94  0.93  0.95
(b) $n_1 = 5$
0.0    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.44
0.1    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.01  0.52
0.3    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.01  0.03  0.65
0.5    0.00  0.00  0.00  0.00  0.01  0.01  0.02  0.05  0.11  0.75
0.7    0.04  0.05  0.06  0.07  0.10  0.13  0.17  0.23  0.32  0.80
0.9    0.64  0.64  0.65  0.63  0.66  0.68  0.69  0.71  0.76  0.88
1.0    0.94  0.93  0.94  0.94  0.92  0.94  0.93  0.93  0.93  0.93
a 'converse Perron phenomenon'. When the true generating process is a random walk with an early (but not late) break, standard Dickey-Fuller tests that ignore the possibility of a break can produce spurious rejections of the unit root null hypothesis. In fact, this conclusion is specific to the case where the autoregression is estimated by ordinary least squares. Leybourne and Newbold (1999) have shown that it does not occur when the symmetric weighted estimator of Pantula et al. (1994) is used. Our Bayesian calculations are based on the exact likelihood function (of the series of first differences), which of course is also symmetric in the data, and it is the use of exact likelihood rather than the Bayesian approach that accounts for the absence of the Leybourne et al. spurious rejection phenomenon in the final row of Table 2(b).
We now move to the Bayesian comparison of all four possible generating models. Mean posterior probabilities have been summarised in Tables 3 and 4. In the former we report the posterior probability that there is a structural break, that is $P(M_3\mid W)+P(M_4\mid W)$, while Table 4 gives probabilities for a unit autoregressive root in the generating process, that is $P(M_1\mid W)+P(M_3\mid W)$. The most striking feature of Table 3 is that the results reported are very close indeed to those of Table 1, where no prior probability mass was attached to the random walk generating model. This is perhaps unsurprising as the analysis of Table 1 permits all values of $|\phi|<1$, and as we already noted the difficulty of detecting a break increases steadily with $\phi$, with no impression of a discontinuity at $\phi=1$. The chief differences between Tables 1 and 3 occur in parts (a) of the tables for moderately large breaks where $\phi$ is relatively large but less than one. In that case the mean posterior probabilities for a break are noticeably higher when the possibility of a unit root is ignored (Table 1) than when prior probability mass is attached to a random walk (Table 3). In this sense it might be said that our results reflect some difficulty in disentangling evidence in the data for a structural break from evidence pointing to a unit autoregressive root. It is clear from parts (b) of Tables 1 and 3 that this phenomenon is far less marked when the break occurs very early in the series than when the break is at the series midpoint.
Table 3
Mean of $P(\text{Break} \mid W)$ when all four models are compared

$\phi$ \ $\delta$   0.0   1.0   1.5   2.0   2.5   3.0   3.5   4.0   5.0   10.0
(a) $n_1 = 50$
0.0    0.05  0.82  0.99  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.1    0.06  0.72  0.98  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.3    0.07  0.50  0.85  0.97  1.00  1.00  1.00  1.00  1.00  1.00
0.5    0.10  0.31  0.55  0.77  0.90  0.95  0.98  0.98  0.99  1.00
0.7    0.13  0.20  0.30  0.40  0.54  0.65  0.74  0.83  0.94  1.00
0.9    0.19  0.19  0.19  0.21  0.26  0.34  0.48  0.60  0.84  1.00
1.0    0.18  0.17  0.18  0.20  0.25  0.32  0.46  0.59  0.85  1.00
(b) $n_1 = 5$
0.0    0.05  0.15  0.34  0.64  0.87  0.97  1.00  1.00  1.00  1.00
0.1    0.06  0.13  0.29  0.55  0.80  0.93  0.99  1.00  1.00  1.00
0.3    0.07  0.12  0.20  0.35  0.60  0.79  0.92  0.97  1.00  1.00
0.5    0.10  0.12  0.17  0.24  0.39  0.59  0.75  0.88  0.98  1.00
0.7    0.13  0.14  0.17  0.20  0.26  0.41  0.55  0.68  0.91  1.00
0.9    0.19  0.18  0.19  0.21  0.25  0.33  0.43  0.59  0.82  1.00
1.0    0.18  0.18  0.18  0.20  0.25  0.32  0.43  0.58  0.83  1.00
Table 4
Mean of $P(\text{Unit root} \mid W)$ when all four models are compared

$\phi$ \ $\delta$   0.0   1.0   1.5   2.0   2.5   3.0   3.5   4.0   5.0   10.0
(a) $n_1 = 50$
0.5    0.00  0.00  0.01  0.01  0.02  0.02  0.02  0.02  0.01  0.00
0.7    0.04  0.08  0.12  0.17  0.22  0.23  0.23  0.19  0.10  0.06
0.9    0.62  0.64  0.67  0.68  0.71  0.72  0.72  0.71  0.67  0.62
1.0    0.93  0.93  0.93  0.93  0.93  0.93  0.93  0.93  0.93  0.93
(b) $n_1 = 5$
behaviour. However, if the break is very large, the existence of a break should be quite apparent in any competent analysis and therefore can be readily taken into account before assessing the likelihood of trend stationarity in the generating model. It is interesting in this regard to notice that the first and final columns of Table 4 are virtually identical. Put another way, for sufficiently large structural breaks, there is no difficulty in disentangling evidence in the data for a break from evidence of a unit autoregressive root.
4. The influence of the prior and the sample size
As we have noted, posterior odds can depend strongly on the choice of prior. The results of the previous section were based on a uniform prior for the autoregressive parameter $\phi$ under stationarity. For comparison we now consider also the prior density (2), with $a=5$ and $b=0.5$, termed the beta1 prior in Marriott and Newbold (1998). We again generated data from the process (22), and as in the previous section set to 15 the parameter $k$ in the prior for break size. The analysis of this section allows all four possible models, and we report posterior probabilities for a break and for a unit root. However, to provide a contrast with the results of Section 3, we tabulate here the proportions of times each of these posterior probabilities is bigger than 0.5, as such criteria might be employed in decision rules. Finally, the impact of sample size was investigated by generating samples of both 100 and 200 observations, with breaks occurring a fraction $\tau=0.5$ or 0.1 through the series. Table 5 shows the proportion of times the posterior probability for a break exceeded 0.5, while the corresponding results for the posterior probability of a unit root are given in Table 6.
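The two summaries used in the tables, the mean posterior probability over replications and the proportion of replications in which it exceeds 0.5, can differ noticeably. A minimal sketch with made-up numbers (the function name and the five probabilities are illustrative only):

```python
def summarise_posteriors(probabilities, threshold=0.5):
    """Return (mean, share above threshold) for simulated posterior probabilities."""
    count = len(probabilities)
    mean = sum(probabilities) / count
    share = sum(1 for p in probabilities if p > threshold) / count
    return mean, share


# Hypothetical posterior probabilities of a break from five replications.
draws = [0.92, 0.48, 0.51, 0.10, 0.74]
mean_prob, share_above = summarise_posteriors(draws)
print(mean_prob, share_above)
```

The proportion is the natural summary when the posterior probability feeds a decision rule that selects whichever alternative is more probable a posteriori.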
The results of Table 5 reinforce some of the conclusions drawn from Table 3. For example the closer is $\phi$ to 0 the easier is a break to detect, and a break in the middle of a series is easier to detect than an early break. Unsurprisingly, we also see from Table 5 that the choice of prior for $\phi$ does not have dramatic influence on the quality of inference about a structural break. However, it is noticeable that, when the true value of $\phi$ is not high, a break would be correctly selected somewhat more often when the uniform prior for $\phi$ is used than for the beta1 prior. This is presumably a consequence of the fact that values of $\phi$ far from one are a priori unlikely according to the beta1 prior. It is also clear from Table 5 that, as would be hoped, with increasing sample size a break is more often correctly detected, except in the case where the autoregressive parameter is very large.
Table 5
Proportion of times $P(\text{Break} \mid W) > 0.5$ when all four models are compared, for break fraction $\tau$^a
$\phi$ \ $\delta$   0 1 2 3 4 5 10
(a) $\tau = 0.50$
0.0 0.01 0.87 1.00 1.00 1.00 1.00 1.00
0.00 1.00 1.00 1.00 1.00 1.00 1.00
0.00 0.81 1.00 1.00 1.00 1.00 1.00
0.00 1.00 1.00 1.00 1.00 1.00 1.00
0.1 0.01 0.76 1.00 1.00 1.00 1.00 1.00
0.01 0.99 1.00 1.00 1.00 1.00 1.00
0.00 0.69 1.00 1.00 1.00 1.00 1.00
0.00 0.99 1.00 1.00 1.00 1.00 1.00
0.3 0.01 0.44 0.99 1.00 1.00 1.00 1.00
0.01 0.88 1.00 1.00 1.00 1.00 1.00
0.01 0.38 0.99 1.00 1.00 1.00 1.00
0.01 0.84 1.00 1.00 1.00 1.00 1.00
0.5 0.01 0.23 0.84 0.98 0.99 1.00 1.00
0.01 0.50 1.00 1.00 1.00 1.00 1.00
0.01 0.14 0.77 0.97 0.99 1.00 1.00
0.01 0.46 1.00 1.00 1.00 1.00 1.00
0.7 0.03 0.09 0.31 0.65 0.86 0.95 1.00
0.01 0.19 0.75 0.99 1.00 1.00 1.00
0.02 0.04 0.23 0.58 0.86 0.96 1.00
0.01 0.16 0.69 0.97 1.00 1.00 1.00
0.9 0.03 0.02 0.07 0.20 0.55 0.83 1.00
0.03 0.03 0.07 0.26 0.59 0.85 1.00
0.02 0.02 0.06 0.22 0.57 0.88 1.00
0.03 0.03 0.07 0.25 0.62 0.90 1.00
1.0 0.01 0.02 0.05 0.21 0.58 0.84 1.00
0.03 0.03 0.04 0.18 0.51 0.82 1.00
0.03 0.02 0.04 0.21 0.54 0.86 1.00
0.02 0.02 0.04 0.17 0.49 0.83 1.00
(b) $\tau = 0.10$
0.0 0.01 0.22 0.97 1.00 1.00 1.00 1.00
0.00 0.68 1.00 1.00 1.00 1.00 1.00
0.00 0.18 0.94 1.00 1.00 1.00 1.00
0.00 0..06 1.00 1.00 1.00 1.00 1.00
0.1 0.01 0.19 0.91 1.00 1.00 1.00 1.00
0.01 0.49 1.00 1.00 1.00 1.00 1.00
0.00 0.14 0.88 1.00 1.00 1.00 1.00
0.3 0.01 0.09 0.66 0.98 1.00 1.00 1.00
0.01 0.28 0.97 1.00 1.00 1.00 1.00
0.01 0.07 0.57 0.97 1.00 1.00 1.00
0.01 0.23 0.97 1.00 1.00 1.00 1.00
0.5 0.01 0.05 0.33 0.79 0.98 1.00 1.00
0.01 0.13 0.70 0.99 1.00 1.00 1.00
0.01 0.03 0.27 0.72 0.97 1.00 1.00
0.01 0.09 0.69 0.99 1.00 1.00 1.00
0.7 0.03 0.04 0.12 0.41 0.78 0.96 1.00
0.01 0.04 0.25 0.72 0.97 1.00 1.00
0.02 0.02 0.10 0.38 0.75 0.97 1.00
0.01 0.04 0.20 0.69 0.97 1.00 1.00
0.9 0.03 0.03 0.06 0.21 0.55 0.84 1.00
0.03 0.03 0.05 0.21 0.56 0.84 0.97
0.02 0.03 0.05 0.22 0.57 0.87 1.00
0.03 0.02 0.06 0.18 0.55 0.87 0.97
1.0 0.01 0.02 0.05 0.19 0.55 0.86 1.00
0.03 0.02 0.04 0.18 0.49 0.83 0.97
0.03 0.02 0.04 0.21 0.55 0.87 1.00
0.02 0.03 0.05 0.17 0.49 0.81 0.99
^a In each cell, entries are respectively: uniform prior, $n=100$; uniform prior, $n=200$; beta1 prior, $n=100$; beta1 prior, $n=200$.
Table 6
Proportion of times $P(\text{Unit root} \mid W) > 0.5$ when all four models are compared, for break fraction $\tau$^a
$\phi$ \ $\delta$   0 1 2 3 4 5 10
(a) $\tau = 0.50$
0.7 0.00 0.02 0.08 0.18 0.14 0.06 0.01
0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.02 0.01 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.8 0.13 0.20 0.36 0.51 0.43 0.26 0.20
0.00 0.00 0.01 0.04 0.05 0.02 0.00
0.00 0.00 0.02 0.06 0.07 0.03 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.9 0.73 0.74 0.78 0.83 0.82 0.77 0.72
0.27 0.31 0.43 0.54 0.51 0.40 0.29
0.15 0.18 0.26 0.36 0.32 0.22 0.17
0.00 0.01 0.03 0.04 0.06 0.03 0.01
0.95 0.91 0.93 0.93 0.94 0.91 0.93 0.93
0.82 0.83 0.84 0.86 0.85 0.83 0.83
0.55 0.53 0.56 0.60 0.60 0.57 0.52
0.28 0.27 0.34 0.36 0.38 0.33 0.27
1.0 0.98 0.98 0.98 0.99 0.98 0.98 0.98
0.97 0.97 0.97 0.97 0.98 0.97 0.98
0.86 0.84 0.84 0.86 0.84 0.84 0.86
0.89 0.87 0.89 0.89 0.89 0.86 0.89
(b) $\tau = 0.10$
0.7 0.00 0.01 0.02 0.05 0.07 0.03 0.01
0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.01 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.8 0.13 0.17 0.22 0.30 0.30 0.23 0.14
0.00 0.00 0.01 0.00 0.01 0.01 0.00
0.00 0.00 0.01 0.02 0.03 0.02 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.9 0.73 0.73 0.75 0.76 0.77 0.74 0.72
0.27 0.29 0.35 0.40 0.40 0.33 0.29
0.15 0.16 0.22 0.21 0.23 0.19 0.18
0.00 0.01 0.03 0.02 0.03 0.03 0.02
0.95 0.91 0.91 0.92 0.92 0.92 0.93 0.92
0.82 0.84 0.84 0.83 0.86 0.83 0.81
0.55 0.55 0.55 0.57 0.56 0.53 0.53
1.0 0.98 0.98 0.98 0.98 0.98 0.97 0.97
0.97 0.98 0.98 0.97 0.97 0.97 0.97
0.86 0.86 0.84 0.84 0.85 0.86 0.85
0.89 0.87 0.88 0.90 0.88 0.90 0.89
^a In each cell, entries are respectively: uniform prior, $n=100$; uniform prior, $n=200$; beta1 prior, $n=100$; beta1 prior, $n=200$.
probability falls sharply for values of the autoregressive parameter less than one, yielding the conclusion that, as one might hope, the larger is the sample size the surer is the distinction between an integrated process and a stationary one.
5. Recursive odds
The procedures of this paper can be applied on-line in real time, with posterior odds re-calculated as each new observation becomes available. We investigate the behaviour of such recursive calculations through a single example, interest being in the impact on the posterior probabilities of a structural break. In our example, with data again generated from (22), a break of magnitude $\delta=3$ is taken to occur at observation 50. Using the beta1 prior on $\phi$, posterior probabilities are calculated for series of $n$ observations for values of $n$ from 45 to 100, so that the effect of the break on inference can be assessed. Table 7 shows the proportions of times the posterior probabilities of a break and of a unit root are greater than 0.5 for both a random walk generating process and a stationary first-order autoregression with $\phi=0.8$.
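The recursive scheme amounts to re-running the model comparison on each growing prefix of the series. The sketch below shows only that outer loop; `posterior_break` is a hypothetical callable standing in for the full posterior calculation of Section 2, which is not implemented here, and the stand-in rule in the usage lines is a toy with no statistical meaning.

```python
def recursive_break_probabilities(series, posterior_break, start=45):
    """Re-evaluate a posterior break probability as each observation arrives."""
    path = []
    for n in range(start, len(series) + 1):
        # Only the first n observations are available at time n.
        path.append((n, posterior_break(series[:n])))
    return path


# Toy stand-in (an assumption for illustration): flags a level shift of
# at least 3 between the first and last observation of the prefix.
def toy_rule(prefix):
    return 1.0 if prefix[-1] - prefix[0] >= 3.0 else 0.0


example = [0.0] * 50 + [3.0] * 50
path = recursive_break_probabilities(example, toy_rule)
print(path[:8])
```

In a simulation study this loop would be repeated over replications, and the share of replications with probability above 0.5 recorded for each n.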
From Table 7 it can be seen that, immediately a break occurs, there is a jump in the proportion of times the posterior probability of a break exceeds 0.5, for both the stationary and the random walk generating models. Then, as further observations accrue, this proportion increases for the stationary but remains quite flat for the random walk model. This last conclusion is consistent with the finding in Table 5 of no greater ability to identify a structural break with increasing sample size in the random walk case.
Table 7
Proportion of times $P(\text{Break} \mid W) > 0.5$ and $P(\text{Unit root} \mid W) > 0.5$ for series of $n$ observations with a break of size $\delta=3$ at observation 50, using beta1 prior on $\phi$

       Break                      Unit root
$n$    $\phi=0.8$  $\phi=1.0$     $\phi=0.8$  $\phi=1.0$
45     0.01        0.01           0.12        0.78
46     0.01        0.01           0.11        0.79
47     0.01        0.01           0.09        0.79
48     0.01        0.01           0.08        0.79
49     0.01        0.01           0.08        0.79
50     0.15        0.15           0.15        0.78
51     0.15        0.18           0.14        0.81
52     0.15        0.18           0.15        0.80
53     0.16        0.16           0.17        0.79
54     0.18        0.16           0.17        0.81
55     0.15        0.17           0.18        0.80
70     0.23        0.18           0.16        0.82
80     0.29        0.18           0.11        0.81
90     0.30        0.18           0.08        0.82
100    0.36        0.20           0.04        0.84
sample size becomes sufficiently large. Thus, the immediate impact of a structural break is to lower the chances of correctly identifying a stationary process. Moreover, this impact persists for some time with increasing sample size: a further 20-30 observations are required before the chance of successfully identifying the stationary process begins to rise steadily.
6. The impact of a neglected second break
The methodology of this paper can be extended, at the cost of considerable further computational expense, to the case of two (or more) possible breaks. An alternative if the possibility of several outliers is suspected is to use a robust method, such as basing the likelihood on a Student's t-distribution rather than the normal, as proposed by Hoek et al. (1995).
Table 8
Mean of $P(\text{Unit root} \mid W)$ for series of 100 observations with two breaks, each of size $\delta=3$^a

$n_1$          51    52    53    54    55    65    85    None
$\phi=0.9$    0.51  0.51  0.52  0.51  0.52  0.51  0.49  0.37
$\phi=1.0$    0.69  0.69  0.68  0.68  0.69  0.68  0.67  0.66

^a The first break is at observation 50, and the second at observation $n_1$. Beta1 prior on $\phi$ is used.
invariant to the location of the second break. The final entry in the table is for the case where the assumption of a single break is in fact true. We therefore estimate that, in the case of breaks of this magnitude, neglecting the possibility of a second break leaves the mean posterior probability of a random walk virtually unchanged when the true generating process is a random walk, but increases that mean posterior probability from about 0.37 to approximately 0.5 when data are generated by a first-order autoregression with parameter $\phi=0.9$. Thus, while the methodology retains some power to distinguish between the unit root process and the stationary alternative, that power is diminished when there is a neglected second break.
7. Summary
for very large breaks. Although they relate to a single simple specific model, we view these results as quite encouraging. For representative sample sizes of one hundred and two hundred observations, it appears that the tasks of distinguishing stationarity from unit roots and break from no break simultaneously are far from impossible.
Of course, as must inevitably be the case with posterior odds calculations of the sort we advocate, the specific results depend on the prior. This is simply a consequence of the fact that two generating processes are being compared: one in which the autoregressive parameter is fixed at one, and a second in which that parameter is drawn from the prior before the series is generated. Naturally, the choice of this prior has an impact on posterior comparisons. We do not see this factor as a weakness of the Bayesian approach. On the contrary, the requirement to select a prior provides the analyst with an opportunity to think seriously about the problem. For example, as argued by Marriott and Newbold (1998), if point probability mass is to be put on a value of one for the autoregressive parameter (or, indeed, if that value is to be given the special status of a null hypothesis in a classical analysis), a sensible prior for that parameter in the stationary case should, like the beta distribution used here, attach considerable mass to probabilities of ranges close to one. While such a facility is readily open to the Bayesian, it would be difficult to incorporate its analogue into a classical analysis.
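To make the point about prior mass near one concrete, the following minimal sketch compares how much probability different beta priors place on the range (0.9, 1) for the autoregressive parameter. The shape parameters are hypothetical choices made purely for illustration; they are not the parameters of the Beta1 prior used in this paper, which are not restated in this section. The function names are likewise illustrative.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x in (0, 1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1.0 - x) ** (b - 1)


def prior_mass(lower, upper, a, b, steps=20000):
    """Approximate P(lower < phi < upper) under Beta(a, b) by the midpoint rule."""
    width = (upper - lower) / steps
    total = sum(beta_pdf(lower + (i + 0.5) * width, a, b) for i in range(steps))
    return total * width


if __name__ == "__main__":
    # Hypothetical shape parameters: (1, 1) is the uniform prior on (0, 1);
    # the others pile progressively more density up against one.
    for a, b in [(1, 1), (5, 1), (10, 2)]:
        mass = prior_mass(0.9, 1.0, a, b)
        print(f"Beta({a}, {b}): prior mass on (0.9, 1) is about {mass:.3f}")
```

A uniform prior gives the range (0.9, 1) only one tenth of the prior mass, whereas a beta prior skewed towards one gives it several times as much, so stationary alternatives close to the unit root are taken seriously in the comparison.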
Acknowledgements
The authors are grateful to Arnold Zellner, an associate editor, and two referees whose comments led to substantial improvements in the paper.
References
Andrews, D.W.K., 1993. Tests for parameter instability and structural change with unknown change point. Econometrica 61, 821–856.
Ansley, C.F., 1979. An algorithm for the exact likelihood of a mixed autoregressive-moving average process. Biometrika 66, 59–65.
Bai, J., 1998. A note on spurious break. Econometric Theory 14, 663–669.
Banerjee, A., Lumsdaine, R.L., Stock, J.H., 1992. Recursive and sequential tests of the unit root and trend break hypotheses: theory and international evidence. Journal of Business and Economic Statistics 10, 271–287.
Box, G.E.P., Jenkins, G.M., 1970. Time Series Analysis, Forecasting and Control. Holden Day, San Francisco.
Christiano, L.J., 1992. Searching for a break in GNP. Journal of Business and Economic Statistics 10, 237–250.
De Jong, D.N., 1996. A Bayesian search for structural breaks in U.S. GNP. Advances in Econometrics B 11, 109–146.
De Jong, D.N., Whiteman, C.H., 1991. Trends and random walks in macroeconomic time series: A reconsideration based on the likelihood principle. Journal of Monetary Economics 28, 221–254.
Geweke, J., 1994. Priors for macroeconomic time series and their application. Econometric Theory 10, 609–632.
Hendry, D.F., Neale, A.J., 1991. A Monte Carlo study of the effects of structural breaks on tests for unit roots. In: Hackl, P., Westlund, A.H. (Eds.), Economic Structural Change: Analysis and Forecasting. Springer, New York.
Hoek, H., Lucas, A., Van Dijk, H.K., 1995. Classical and Bayesian aspects of robust unit root inference. Journal of Econometrics 69, 27–59.
Jeffreys, H., 1967. Theory of Probability, 3rd Revised Edition. Oxford University Press, Oxford.
Kim, J.Y., 1998. Large sample properties of posterior densities, Bayesian information criterion and the likelihood principle in nonstationary time series models. Econometrica 66, 359–380.
Koop, G., 1992. 'Objective' Bayesian unit root tests. Journal of Applied Econometrics 7, 65–82.
Leybourne, S.J., Mills, T.C., Newbold, P., 1998. Spurious rejections by Dickey–Fuller tests in the presence of a break under the null. Journal of Econometrics 87, 191–203.
Leybourne, S.J., Newbold, P., 1999. Behaviour of Dickey–Fuller-type tests when there is a break under the null hypothesis. Department of Economics, University of Nottingham.
Lubrano, M., 1995. Testing for unit roots in a Bayesian framework. Journal of Econometrics 69, 81–109.
Marriott, J., Newbold, P., 1998. Bayesian comparison of ARIMA and stationary ARMA models. International Statistical Review 66, 323–336.
Newbold, P., 1974. The exact likelihood function for a mixed autoregressive-moving average process. Biometrika 61, 423–426.
Nunes, L.C., Kuan, C.M., Newbold, P., 1995. Spurious break. Econometric Theory 11, 736–749.
Nunes, L.C., Newbold, P., Kuan, C.M., 1997. Testing for unit roots with breaks: evidence on the Great Crash and the unit root hypothesis reconsidered. Oxford Bulletin of Economics and Statistics 59, 435–448.
O'Hagan, A., 1995. Fractional Bayes factors for model comparison. Journal of the Royal Statistical Society B 57, 99–138.
Pantula, S.G., Gonzales-Farias, G., Fuller, W.A., 1994. A comparison of unit root test criteria. Journal of Business and Economic Statistics 12, 449–459.
Perron, P., 1989. The Great Crash, the oil price shock and the unit root hypothesis. Econometrica 57, 1361–1401.
Perron, P., 1994. Trend, unit root, and structural change in macroeconomic time series. In: Rao, B.B. (Ed.), Cointegration for Applied Economists. MacMillan, New York.
Perron, P., Vogelsang, T.J., 1992. Nonstationarity and level shifts with an application to purchasing power parity. Journal of Business and Economic Statistics 10, 301–320.
Phillips, P.C.B., 1991. To criticise the critics: An objective Bayesian analysis of stochastic trends. Journal of Applied Econometrics 6, 333–364.
Phillips, P.C.B., Ploberger, W., 1996. An asymptotic theory of Bayesian inference for time series. Econometrica 64, 381–412.
Poirier, D.J., 1991. A comment on 'To criticise the critics: An objective Bayesian analysis of stochastic trends'. Journal of Applied Econometrics 6, 381–386.
Schotman, P.C., 1994. Priors for the AR(1) model: Parameterization issues and time series considerations. Econometric Theory 10, 579–595.
Schotman, P.C., Van Dijk, H.K., 1991b. On Bayesian routes to unit roots. Journal of Applied Econometrics 6, 387–401.
Sims, C.A., 1988. Bayesian skepticism on unit root econometrics. Journal of Economic Dynamics and Control 12, 436–474.
Sims, C.A., Uhlig, H., 1991. Understanding unit rooters: A helicopter tour. Econometrica 59, 1591–1600.
Uhlig, H., 1994a. On Jeffreys prior when using the exact likelihood function. Econometric Theory 10, 633–644.
Uhlig, H., 1994b. What macroeconomists should know about unit roots: A Bayesian perspective. Econometric Theory 10, 645–671.
Vogelsang, T.J., Perron, P., 1998. Additional tests for a unit root allowing for a break in the trend function at an unknown time. International Economic Review 39, 1073–1100.
Zellner, A., 1997. Past and recent results on maximal data information priors. In: Bayesian Analysis in Econometrics and Statistics: the Zellner View and Papers. Edward Elgar, pp. 127–148.
Zivot, E., 1994. A Bayesian analysis of the unit root hypothesis within an unobserved components model. Econometric Theory 10, 552–578.