07350015%2E2012%2E727718

(1)

Full Terms & Conditions of access and use can be found at

http://www.tandfonline.com/action/journalInformation?journalCode=ubes20

Download by: [Universitas Maritim Raja Ali Haji] Date: 11 January 2016, At: 21:59

Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Real-Time Inflation Forecasting in a Changing

World

Jan J. J. Groen , Richard Paap & Francesco Ravazzolo

To cite this article: Jan J. J. Groen , Richard Paap & Francesco Ravazzolo (2013) Real-Time Inflation Forecasting in a Changing World, Journal of Business & Economic Statistics, 31:1, 29-44, DOI: 10.1080/07350015.2012.727718

To link to this article: http://dx.doi.org/10.1080/07350015.2012.727718

View supplementary material

Accepted author version posted online: 12 Sep 2012.

Submit your article to this journal

Article views: 849

(2)

Real-Time Inflation Forecasting in a Changing

World

Jan J. J. G

ROEN

Research and Statistics Group, Federal Reserve Bank of New York, New York, NY 10045 ([email protected])

Richard P

AAP

Econometric Institute, Erasmus University Rotterdam, Rotterdam, Netherlands ([email protected])

Francesco R

AVAZZOLO

Research Department, Norges Bank, Oslo, Norway ([email protected])

This article revisits the accuracy of inflation forecasting using activity and expectations variables. We apply Bayesian model averaging across different regression specifications selected from a set of potential predictors that includes lagged values of inflation, a host of real activity data, term structure data, (relative) price data, and surveys. In this model average, we can entertain different channels of structural instability, by either incorporating stochastic breaks in the regression parameters of each individual specification within this average, or allowing for breaks in the error variance of the overall model average, or both. Thus, our framework simultaneously addresses structural change and model uncertainty that would unavoidably affect any inflation forecast model. The different versions of our framework are used to model U.S. personal consumption expenditures (PCE) deflator and gross domestic product (GDP) deflator inflation rates for the 1960–2011 period. A real-time inflation forecast evaluation shows that averaging over many predictors in a model that at least allows for structural breaks in the error variance results in very accurate point and density forecasts, especially for the post-1984 period. Our framework is especially useful when forecasting, in real-time, the likelihood of lower-than-usual inflation rates over the medium term. This article has online supplementary materials.

KEY WORDS: Bayesian model averaging; Density forecasting; Model uncertainty; Real-time data; Structural breaks.

1. INTRODUCTION

Control of inflation is at the core of monetary policymak-ing and, consequently, central bankers have a great interest in reliable inflation forecasts to help them in achieving this aim. For other agents in the economy, accurate inflation forecasts are likewise of importance, either to be able to assess how pol-icymakers will act in the future or to help them in forming their inflation expectations when negotiating about wages, price contracts, and so on. And in the academic literature, inflation predictability is assessed to get a gauge on the characteristics of inflation dynamics in general.

The time series properties of inflation measures, however, have changed substantially over time, as shown by Cogley and Sargent (2002,2005) for the United States, by Benati (2004) for the United Kingdom, and by Levin and Piger (2004) for 12 main OECD (Organisation for Economic Co-operation and De-velopment) economies, all of which document significant time variation in the mean and persistence of inflation. Related to this, Cogley and Sargent (2002) and Haldane and Quah (1999) docu-mented substantial shifts in the traditional U.S. and U.K. Phillips curve correlations between inflation and unemployment over the post-WWII (World War II) period. There is also evidence that macroeconomic time series have experienced variance breaks over the post-WWII period that were unrelated to shifts in the mean. Cogley and Sargent (2005) and Cogley, Primiceri, and Sargent (2010), for example, used a time-varying vector autore-gression (VAR) model for U.S. inflation, unemployment, and

the interest rate with a stochastic volatility specification for the corresponding disturbance covariance matrix. Sensier and van Dijk (2004) found that for 80% of 214 U.S. macroeconomic time series over the 1959–1999 period, most of the observed reduction in volatility is due to breaks in conditional volatility rather than breaks in the conditional mean. In addition, Sims and Zha (2006) claimed using structural Bayesian VAR models with Markov switching time variation that the observed time varia-tion in U.S. macroeconomic dynamics is entirely due to breaks in the variance of shocks and not in regression parameters.

This suggests that adding structural change to time series models may help to improve forecasting inflation. Stock and Watson (2007,2008) showed that U.S. inflation is well described by a univariate unobserved component model with a stochastic volatility specification for the disturbances. The out-of-sample performance of this particular model appears to be hard to beat by a range of alternative models. More generally, Koop and Potter (2007), through change-point models, and Pesaran, Pettenuzzo, and Timmermann (2006), through a hierarchical hidden Markov chain model, showed that forecast models that incorporate structural breaks exhibit good out-of-sample fore-casting performance for a range of macroeconomic series.

January 2013, Vol. 31, No. 1 DOI:10.1080/07350015.2012.727718

29

(3)

Another issue for inflation forecasting is how to choose the predictor variables for future inflation. For example, a reduced-form version of the Phillips curve relationship implies for fore-casting a model where inflation depends on its lags, a measure of real activity (which approximates the degree of “economic slack” or excess demand in the economy) and, possibly, a mea-sure of inflation expectations. Typically, unemployment is used as the “slack measure” in such an inflation forecasting model. There is, however, a lot of uncertainty about the “appropri-ate” measure of real activity that can be used in such a fore-casting model. In fact, Stock and Watson (1999) showed that unemployment-based Phillips curve models are frequently out-performed by models using alternative real activity measures. They considered two approaches. One is based on a forecast combination of the different, possible choices of Phillips curve forecasting models. Next, they also considered a single Phillips curve-based model that uses a principal component extracted from all possible “economic slack” variables as the real activity measure.

Stock and Watson (1999) showed that the out-of-sample per-formance of these approaches is favorable compared with au-toregressive (AR) specifications, in particular in the case of the factor-based approach. Wright (2009) applied a form of Bayesian model averaging (BMA) across 93 potential specifi-cations, each using one alternative activity measure, to forecast different quarterly U.S. inflation measures out-of-sample, and he was able to beat AR inflation forecasts out-of-sample. Atke-son and Ohanian (2001), on the other hand, applied the Stock and Watson (1999) exercise on a longer U.S. sample. They used Phillips curve specifications based on unemployment and the Chicago Fed Economic Activity index, which summarizes in-formation across a range of activity variables, and across a large grid of lag orders of the predictor variables. In the Atkeson and Ohanian (2001) case, none of the inflation forecasting models are able to outperform random walk forecasts for inflation.

Similar to Stock and Watson (1999), we use in this article a forecasting model that is essentially an AR model for inflation with added exogenous regressors (an ARX model). But unlike the articles surveyed above, we use a framework that allows for both instability in the relationship between inflation and predictor variables and uncertainty regarding the inclusion of potential predictors in the inflation forecasting regression. BMA is used to deal with the latter model uncertainty, where we average over the range of regression models that incorporate all the possible combinations of predictor variables for inflation. To deal with instability, we allow for structural breaks of random magnitude in either the regression parameters, the error variance of the resulting model average, or both. Hence, our forecasting procedure simultaneously incorporates the two major sources of uncertainty, which the literature has shown to be relevant for forecasting and modeling inflation.

Our framework, described above, and other more regularly used approaches are used to model different definitions of U.S. inflation on a quarterly sample starting in 1960 and ending in 2011. A range of predictor variables are considered in the mod-eling exercise, including real variables, (relative) price data, sur-veys and financial variables, and lags of inflation. Full-sample results show that our methodology has time-varying proper-ties similar to existing studies and it identifies inflation

predic-tor variable combinations of different composition and size for these U.S. inflation rates.

The different BMA-based specifications are then used to fore-cast the different inflation measures, both in the current quarter and for a one-year ahead forecasting horizon. Where necessary, we use in the out-of-sample forecasting experiments real-time data for inflation and the predictor variables, that is, the original vintage of data that was available at the time of the forecast. The literature traditionally focuses on point forecast evaluation, see, for example, Stock and Watson (2007,2008,2010) and Faust and Wright (2011), but given the interest among practitioners in predicting the likelihood of exceptional inflation movements, such as deflation or accelerating inflation, an assessment of the relative density forecast performance among different inflation forecast models seems in order. Therefore, our real-time infla-tion forecasting analysis will focus on evaluating both point forecasts and density forecasts.

Manzan and Zerom (2009) evaluated distributions of U.S. inflation forecasts by means of quantile regressions for a number of macroeconomic indicators based on rolling window-based estimation, but they did not use real-time data. Jore, Mitchell, and Vahey (2010) and Clark (2011) assessed inflation density forecasts from, respectively, forecast combinations and four-variable VAR models based on real-time data that allow for a limited degree of instability through rolling windows and the use of stochastic volatility specifications. The latter studies are closest to our exercise, but we push the envelope further, as (1) our BMA-based specifications more explicitly take into account model uncertainty and differing degrees of structural instability, and (2) we consider a larger pool of alternative inflation forecast models.

The remainder of this article is organized as follows. In Sec-tion2, we introduce the framework for our BMA-based inflation models, which includes a detailed discussion of the underlying Bayesian estimation methodology. In Section 3, we describe the dataset and we operationalize our models by determining the prior values. Next, in Section4, we first describe some in-sample properties of our BMA-based models and then we eval-uate their real-time forecasting performance by comparing them with other frequently used inflation forecasting models. We find that allowing for model uncertainty in combination with some form of structural instability results in superior point and den-sity forecasts vis-`a-vis other inflation forecasting approaches. Finally, in Section5we conclude.

2. A BMA FRAMEWORK FOR INFLATION

In this article, we will use the following version of the Stock and Watson (1999) generalized Phillips curve specification as the starting point for our inflation forecasting model:

yt+h =β0+

where T is the total number of time series observations in the sample. Variableyt in (1) is the inflation measure, defined

(4)

as the quarterly inflation rateyt =100ln(Pt)=100(ln(Pt)− ln(Pt−1)), wherePt is a particular price index andh >0 is the forecast horizon withyt+h=100ln(Pt+h)=100(ln(Pt+h)− ln(Pt+h−1)). The aj t’s are the k1 real activity, costs indicator,

and inflation expectation proxy variables and the model also containsk2lagged values ofyt. The disturbance termεtin (1) is

Clearly, the number of potential predictor variableskin (1) can be large. It can encompass traditional “cost-push” drivers of inflation such as wage and production costs (the latter among others related to energy and imports), as well as proxies for excess demand in the economy that can push up inflation. A large number forkrenders the model inestimable and we therefore have to make a choice about which combination of predictors to include under what circumstances. Hence, we have to adapt (1) such that it incorporates this model uncertainty.

In (1), we model the quarterly inflation rate in quartert+h and therefore view inflation as anI(0) process, as in, for ex-ample, Ang, Bekaert, and Wei (2007). Other studies, such as Stock and Watson (1999,2003), assume that inflation isI(1) and needs to be transformed accordingly. We assume that any apparent nonstationarity in inflation rates is due to mean and/or variance breaks, which we want to incorporate in models when forecasting inflation. It is not realistic to assume that the re-lationship between inflation and its potential predictors in (1) has remained stable over time. Cogley and Sbordone (2008) and Groen and Mumtaz (2008) showed that an empirical New Keynesian Phillips curve model that allows for shifts in the equilibrium inflation rates yields a time-varying reduced-form inflation–activity relationship, given unchanged “deep parame-ters,” for a number of G7 economies. Stock and Watson (2008,

2010), on the other hand, suggested that there are nonlinear-ities in the relationship between inflation and the amount of “slack” in the economy, which could result in time variation in an inflation forecasting relationship such as in (1).

2.1 Our Framework

The previous discussion implies that for forecasting infla-tion we need to adapt the basic regression model (1) such that it incorporatesmodel uncertaintyandstructural breaksas both in-flation itself and the relationship between inin-flation and predictor variables likely have changed over time.

In our context, model uncertainty implies the uncertainty about which combination of predictor variables most accurately summarizes the impact of real activity, real costs, and expecta-tions on inflation dynamics. To allow for model uncertainty, we introduce in (1)kvariablesδj ∈ {0,1}, which describe the in-clusion of variablexj tin the regression model forj =1, . . . , k. Structural breaks in the regression parameters and the variance are incorporated by introducing time-varying regression param-etersβj tandσt in (1). This results in

scribes which regressors are included in the regression model. It can take 2k_{different values, resulting in 2}k_{different regression} models. Model selection is therefore defined in terms of vari-able selection; see George and McCulloch (1993) and Kuo and Mallick (1998). Note that the intercept parameterβ0t is always included in the model.

The time-varying parametersβj tandσtare described by

βj t=βj,t−1+κj tηj t, j =0, . . . , k,

k+1). The size of the structural breaks is described

by an independent random shockηj twith mean zero and vari-anceq2

jforj =0, . . . , k+1; see, for example, Koop and Potter (2007) and Giordani, Kohn, and van Dijk (2007) for a similar approach. To allow for occasional structural breaks, we assume that thek+2 variablesκj tare binary random variables, which equal 1 in the case of a structural break in thejth parameter at timetand 0 otherwise forj =0, . . . , k+1 andt =1, . . . , T. We assume that the vectorκt=(κ0t, . . . , κkt, κk+1,t)′ is a se-quence of uncorrelated 0/1 processes with

Pr[κj t =1]=πj, j =0, . . . , k+1. (4)

This specification implies that a regression parameterβj tin (3) remains the same as its previous valueβj,t−1 unlessκj t =1 in which case it changes withηj t; see, for example, Koop and Potter (2007) and Giordani, Kohn, and van Dijk (2007) for a similar approach. Stochastic structural breaks in the variance parameter lnσ_t2comply with a similar structure as theβj tparameters. The flexibility of the specification in (3) stems from the fact that the parameters βj t and σt2 are allowed to change every time period, but they are not imposed to change at every point in time. Another attractive property of (3) is that the changes in the individual parameters are not restricted to coincide but are allowed to occur at different points in time.

The model presented in (2)–(4) is the most general specifica-tion we consider in this article. We will call this the BMA-SBB-SBV specification indicating that we consider Bayesian model averaging (BMA), structural breaks in the regression parame-ters (SBB), and structural breaks in the error variance (SBV). We will also consider specifications whereκj t=1 for alljand

tin (4) in which case the parameters follow a random walk and structural breaks occur at every point in time. We will denote these specifications by RWB and RWV for the regression param-eters and volatility, respectively. The random walk specification is useful to describe many small breaks where the discrete speci-fication can be used to describe occasional large breaks; see also Giordani and Villani (2010) for a discussion.Table 1provides an overview of the alternative specifications we entertain in our framework, where different aspects of parameter and variance instability in the model are switched off or altered.

2.2 Prior Specification

For parameter inference in (2)–(4), we opt for a Bayesian approach. Such an approach allows us to incorporate parameter

(5)

Table 1. BMA inflation model specifications for (2)

βj t lnσt2 κj t κk+1,t

BMA-SBB-SBV As in (3) As in (3) As in (4) As in (4)

BMA-SBV βj t =βj∀t As in (3) – As in (4)

BMA-SBB As in (3) lnσ2

t =lnσ2∀t As in (4) – BMA βj t =βj∀t lnσt2=lnσ

2_∀_t _– _–

BMA-RWB-RWV As in (3) As in (3) κj t =1∀j, t κk+1,t =1∀t

uncertainty when forecasting inflation in a natural way. Also, Bayesian inference onD=(δ1, . . . , δk) leads to posterior

prob-abilities for the 2k possible model specifications. We will use these posterior probabilities for BMA to incorporate model un-certainty into a single inflation forecast. Finally, the approach provides us with the posterior distribution of the unobserved κtprocesses fort=1, . . . , T−h, which can be used to incor-porate uncertainty regarding the timing of structural breaks. By definition,κtin (2) does not depend onD, which implies that the value ofκtcan be different across different values ofD. Hence, structural breaks can occur in different parameters at different time periods across different models, and we average over the latter to obtain our final inflation forecasting relationship.

The parameters in the model (2)–(4) are the inclusion vec-tor D=(δ1, . . . , δk)′, the structural break probabilities π=

(π0, . . . , πk+1)′, and the vector of variances of the size of the

breaksq2=(q₀2, . . . , q_k2₊₁)′. We collect the model parameters in a (3k+4)-dimensional vectorθ =(D′, π′, q2′)′_.

For our Bayesian approach, we need to specify the prior distributions for the model parameters. For the variable inclusion parameters, we take a Bernoulli distribution with

Pr[δj =1]=λj, forj =1, . . . , k. (5)

Hence, the parameterλj reflects our prior belief about the inclu-sion of thejth explanatory variable; see George and McCulloch (1993) and Kuo and Mallick (1998). For the structural break probability parameters, we take Beta distributions

πj ∼Beta(aj, bj), forj =0, . . . , k+1. (6)

The parameters aj and bj can be set according to our prior belief about the occurrence of structural breaks. The expected prior probability of a break is aj/(aj +bj). Finally, for the variance parameters that reflect our prior beliefs about the size of the structural breaks, we take an inverted Gamma-2 prior that depends on scale parameter ¯ωj and degrees of freedom parameterνj, that is,

q_j2∼IG-2( ¯ωj, νj), forj =0, . . . , k+1, (7) with ¯ωj =ωjνj. The expected prior break size equals therefore the square root of (ωjνj)/(νj −2) forνj >2, and our choice of ωj thus reflects our prior beliefs regarding the expected size of a break.

The density of the joint priorp(θ) is given by the product of the densities of the prior specifications in (5)–(7).

2.3 Posterior Simulator

Posterior results are obtained using the Gibbs sampler of Geman and Geman (1984) combined with the technique of data

augmentation of Tanner and Wong (1987). The latent variables B= {βt}Tt=−1h, withβt =(β0t, β1t, . . . , βkt)′,S= {σt2}

T−h t=1 , and

K= {κ0t, . . . , κk+1,t}Tt=−1h are simulated alongside the model

parametersθ. To apply the Gibbs sampler, we need the complete data likelihood function, that is, the joint density of the data and the latent variables

p(y, B, S, K|x, θ)

=

T−h

t=1

⎛

⎝p

yt+h|D, xt, βt, σt2

p(βt|βt−1, κt, Q)

p

lnσ_t2|lnσ_t2₋₁, κk+1,t, qk2+1

k+1

j=0

πκj t

j (1−πj)1−κj t ⎞

⎠, (8)

wherey =(y1+h, . . . , yT) andx =(x1′, . . . , x

′ T−h)

′_{. The}

densi-ties that make up (8) correspond to those of normal distributions with conditional moments, and we specify these in more detail in Appendix A, in particular (A.1).

If we combine (8) together with the prior densityp(θ), we obtain the posterior density

p(θ, B, S, K|y, x)∝p(θ)p(y, D, S, K|x, θ). (9)

Our Gibbs sampler is a combination of the Kuo and Mallick (1998) algorithm for variable selection and the efficient sam-pling algorithm of Gerlach, Carter, and Kohn (2000) to han-dle the structural breaks. If we define K=(KβKσ) with Kβ = {κ0t, . . . , κkt}Tt=−1h and Kσ = {κk+1,t}Tt=−1h, then in each

iteration of the sampler, we sequentially cycle through the fol-lowing steps:

1. DrawDconditional onB,S,K,π,q2,y, andx. 2. DrawKβconditional onS,Kσ,θ,y, andx. 3. DrawBconditional onS,K,θ,y, andx. 4. DrawKσconditional onB,Kβ,θ,y, andx. 5. DrawSconditional onB,K,θ,y, andx. 6. Drawπconditional onD,B,S,K,q2_,_y_{, and}_x_.

7. Drawq2_{conditional on}_D_,_B_,_S_,_K_,_π_,_y_{, and}_x_.

Note that we sampleKβandBfrom their joint full conditional posterior distribution. The same holds forKσandS. A more de-tailed description of this Gibbs sampling algorithm is provided in Appendix A.

2.4 Posterior Predictive Density

To forecast inflation, we use the posterior predictive density of model (2)–(4). This density accounts for uncertainty in the inclusion of the predictorsxj t, uncertainty about the presence and size of structural breaks, and parameter uncertainty. The

(6)

predictive density ofyT+1+hmade at timeT +1 conditional on

The predictive density (10) consists of a weighted average over all possible model specifications in (2) with weights equal to the posterior model probabilities. Uncertainty regarding the timing of structural breaks is reflected in (10) by the posterior distribu-tion of the in-sample breaksK. Computation of such a predictive distribution is straightforward using the aforementioned Gibbs draws. We simulate in each Gibbs stepyT+h+1using (2)–(4) as

data-generating process (DGP), where we replace the param-eters and the in-sample latent variables by the draw from the posterior distribution. As the point forecast we can use the pos-terior mean or median of the predictive distribution, depending on the chosen loss function (see Gneiting2011).

3. DATA AND PRIOR CHOICES

In this section, we outline how we operationalize the different variants of our BMA framework (2)–(4) on our dataset for our inflation measures. In Section3.1, we discuss this data, whereas in Section3.2, we report on the prior values we chose to be able to take our framework to the data in a satisfactory manner.

3.1 Data

We will consider two measures of U.S. inflation observed at a quarterly frequency ranging from 1960Q1 to 2011Q2; these are the quarterly log changes in the personal consumption expen-ditures (PCE) deflator and the gross domestic product (GDP) deflator. Although both these inflation series tend to move to-gether, there are substantial differences in their respective com-positions. It is therefore likely that the two inflation measures will exhibit different short-to-medium run behavior and we thus may expect the optimal predictors for the two measures to dif-fer as well. Potentially there is a wide array of predictors for inflation that can be useful in the different variants from our model (2). Stock and Watson (1999), for example, considered up to 132 potential indicator variables using dynamic factors and forecast combination techniques. However, it is our aim to predict inflation inreal time. As both our inflation measures of interest and many potential predictor variables are revised over time, we need to restrict our pool of possible predictor variables to those for which we have the original vintages of real-time data. This limits the number of predictors to about 15 series.

The inflation measures and the set of potential predic-tors either come directly or are constructed from five data sources. These are the Real-Time Data Set for Macroeconomists (RTDSM) at the Federal Reserve Bank of Philadelphia, the Haver Analytics database, the ALFRED_{real-time database at} the Federal Reserve Bank of St. Louis, the Center for Research in Security Prices (CRSP) database from Wharton Research Data Services, and the Reuters/University of Michigan Survey of Consumers. We refer the reader to Appendix B for more details on the data sources and data construction.

Our range of predictor variables can be divided into three groups. For real activity, price, and cost measures, we have real GDP in volume terms (ROUTP), real durable PCE in volume terms (RCONS), real residential investment in volume terms (RINVR), the import deflator (PIMP), the unemployment ra-tio (UNEMPL), nonfarm payrolls data on employment (NFPR), housing starts (HSTS), the real spot price of oil (OIL), the real food commodities price index (FOOD), and the real raw mate-rial commodities price index (RAW). These variables provide information about either the degree of excess demand in the economy or about the real costs that firms face.

We also include a number of variables that are informa-tive about the current and future state of the economy. One of these variables is the M2 monetary aggregate, which can reflect information on the current stance of monetary policy and liquidity in the economy as well as spending in households and firms (increased M2 growth might reflect increased spend-ing by households and firms). In addition, we also use data on the term structure of interest rates, as this contains market-based forward-looking information about the business cycle, the stance of monetary policy, and inflation expectations. It is well known that the cross section of the term structure can be approx-imated very well by means of 2–3 factors; see, for example, Ang, Piazzesi, and Wei (2006) and Diebold, Rudebusch, and Aruoba (2006). Following the literature, we therefore approximate the term structure through three factors constructed from data on Treasury Bill rates and zero-coupon bond yields: the level factor (YL), the slope factor (TS), and the curvature factor (CS).

Finally, we proxy inflation expectations through the one-year ahead inflation expectations that come from the Reuters/Michigan Survey of Consumers (MS). Surveys can give potentially a very good steer about agents’ expectations and indeed Ang, Bekaert, and Wei (2007) claimed that in an out-of-sample context, inflation expectation surveys are the most accurate predictors for future U.S. inflation.

For most of these variables, we use the percentage change of the original series to remove possible stochastic and deter-ministic trends from the series; see also Table B.1 in Appendix B. An alternative way of removing trends from our predictor variables would be to express them as “gap measures,” that is, log deviations from some proxy of a variable’s trend (like a quadratic deterministic trend, exponential smoothing, statistical filters, and so on). Orphanides and van Norden (2005), however, found that, in real time, activity growth rates are more useful predictors for U.S. inflation than activity gap measures. We stan-dardize each predictor variable, in real time, by dividing it by its standard deviation in each vintage. This allows us to set similar priors across these variables because they now can be treated a

(7)

Table 2. Prior choices for the BMA inflation models fromTable 1

NOTES: In this table,λjrepresents our choices for the prior variable inclusion parameter

distribution (5) forj=1, . . . , k. Next,aj(j=0, . . . , k),ak+1,b0,bj(j=1, . . . , k), and

bk+1reflect how we parameterize prior break probability distribution (6) for, respectively,

the intercept, thekpredictor variables, and lnσ2

t. Finally,ω0,ωj(j=1, . . . , k),ωk+1,ν0,

νj(j=1, . . . , k), andνk+1give parameterizations of the prior break size distribution (7),

which, again, differs for the intercept, the predictor variables, and lnσ2

t.

priori similarly in terms of breaks and break sizes in Equations (2) and (3).

3.2 Prior Choices and Posterior Convergence

For the empirical analysis in this article, we use the 15 pre-dictors as discussed in Section 3.1and we also include four inflation lags (yt, . . . , yt−3) as potential explanatory variables

in the model. To be able to use the BMA-based framework from Section2, we need to take a stand on a proper calibration of the prior distributions outlined in Section2.2andTable 2provides an overview of these calibrations.

The parameter λj in (5) is set to 50%, which means that every variable (including the four inflation lags) has a 50% prior probability of being selected. The expected prior model size, including the intercept, is therefore 10.5.

We seta0=0.50 andb0=5 in (6), andω0=1 andν0=10

in (7) for the intercept parameter, which suggests that we can expect, on average, an intercept break to occur every 11 quar-ters of about size 1. For the regression paramequar-ters, we impose the view that they are subject to less and smaller breaks than the intercept, and we therefore chooseaj =0.5 andbj =30 (less breaks) andωj =0.01 and νj =20 (smaller magnitude) forj =1, . . . , k. This difference in prior settings reflects our in-terpretation of U.S. post-WWII history where large shifts in the level of inflation coincided with changes in the monetary policy regimes, such as the shift in emphasis by the Federal Reserve System from monetary accommodation in the 1970s to a strong preference for disinflation by the late 1970s and early 1980s. It is our view that the intercept in (2) should reflect these partic-ular type of shifts in average inflation, where time variation in the remaining regression parameters then can pick up changing correlations between inflation and its potential predictors across different phases of the business cycle. To account for the larger value ofω0thanωj, we choose a smaller value ofν0thanνjfor j =1, . . . , k.

The prior beliefs on the parameters concerning the variance equation reflect a larger number of breaks than for the intercept, but with a smaller magnitude (ak+1=0.8, bk+1=2, ωk+1=

0.5, and νk+1=10). These parameters correspond to beliefs

in other time-varying inflation model specifications such as in Cogley and Sargent (2005) and Primiceri (2005). Finally, the priors for the initial values of βj t and lnσt are normal (see Appendix A for details).

The posterior simulator described in Section2.3in combina-tion with the prior values fromTable 2can easily be adapted to estimate variants of our BMA-based framework other than the BMA-SBB-SBV specification (seeTable 1) by simply switch-ing off one or all of the channels of structural instability. This is less trivial when we consider the BMA-RWB-RWV version of our framework, whereκj t andκk+1,t are equal to 1 for allj andt. In this case,πjdoes not play any role and the information inq_j2using the prior values fromTable 2is possibly very large since it is always valid independently ofπj. Therefore, when estimating the BMA-RWB-RWV specification, we decrease the prior value ofνj, j =0, . . . , k+1 to 3, which is comparable with the prior choices made in other studies that use a stan-dard time-varying parameter specification, such as Cogley and Sargent (2005). The priors for the initial values for the time-varying regression parameters in the BMA-RWB-RWV speci-fication are similar to those of the other BMA-based variants.

Posterior densities for the different variants of our BMA ap-proach (seeTable 1) are computed using the prior calibrations fromTable 2and are based on 24,000 iterations of the Markov chain Monte Carlo (MCMC) sampler outlined in Section2.3. Of these, we take 4000 iterations as the burn-in period for the sampler and then select every second draw of the remaining posterior draws. However, in the case of the BMA-RWB-RWV specification, we use 44,000 posterior draws, with a burn-in sample of 4000 and retaining every fourth draw of the remain-ing iterations. These choices for the number of retained draws to be used for posterior inference are based on the outcome of a range of convergence tests; see Appendix C for more details.

The posterior results for our BMA-based framework are not very sensitive to the prior settings for theλparameter that gov-erns the prior distribution on the variable inclusion parameters δj in (2). A different prior setting for the probability and size of the breaks, however, could potentially impact the different posterior results substantially, as is illustrated by a simple prior sensitivity analysis in Appendix D on a simulated dataset. The analysis in Appendix D first determines a proper prior setting for the simulated data and then quantifies the impact on the posterior distributions from our BMA-SBB-SBV model when the priors on break probabilities and sizes diverge from their appropriate values. The general purpose of this analysis is to show the impact when choosing relatively too large or too small values for these prior parameters. The results suggest that the posterior inclusion probabilities are hardly affected by the prior setting for the probability and size of breaks unless the prior on the expected break size on the regression parameters is too large. Furthermore, the prior settings for the probability and expected size of a break influence the posterior results of the timing and size of breaks. Prior choices that do not accurately approximate the true breaking process in either the intercept, regression pa-rameters, or the variance influence posterior inference on the breaks both in the mean equation and in the variance equation.

Proper parameterizations of the break probability and break size prior distributions are indeed of importance for our empiri-cal application. Unreported posterior results show that if we take a smaller prior value forω0(i.e., the break size in the intercept),

the posterior estimates of the intercept exhibit more frequent but smaller breaks than for the prior setting inTable 2. As a conse-quence, this alternative prior setting results in a posterior mean

(8)

of the intercept parameterβ0t that follows the inflation series more closely and this results in a sharp decrease in the marginal posterior variable inclusion probabilitiesδj, as this pattern in the intercept erroneously reduces the contribution of the predictor variables in explaining inflation. If we set smaller prior values forbjand higher values forωj(j =1, . . . , k) for the regression parameters of the predictor variables than those outlined in

Table 2, implying a higher prior probability of exhibiting breaks of a higher prior expected size, then the corresponding posterior parameter estimates will exhibit many small breaks and the marginal posterior variable inclusion probabilities are low.

Posterior inference on the regression parameters and latent breaks for the 19 predictor variables within our BMA-based specifications would be based on their marginal posterior dis-tributions, which are averaged over all possible model specifi-cations. Each of these may contain different predictor variables and breaks, and this complicates the interpretation of the pos-terior results. Nonetheless, there is some merit in analyzing the posterior means of the individual break probabilitiesπj, which are averaged across all entertained model specifications. It al-lows us to assess how informative the data have been in driving the posterior time variation in our BMA-based specifications. In Section E.1 of Appendix E, we discuss in detail the posterior marginal break probabilitiesπj’s for the relevant variants of our BMA-based framework over the full 1960–2011 sample. From this analysis, we observe that the posterior mean probability of a break in the interceptπ0is about nine times smaller than the

corresponding prior mean, for both inflation rates ath=1 and h=5. For the posterior means ofπj of the predictor variables, the reverse pattern occurs: in the case of BMA-SBB-SBV (Ap-pendix E.1, Table E.1), for example, the posterior marginal pre-dictor variable break probabilities are, on average, about 6 and 10 times larger than the prior mean break probability ath=1 andh=5, respectively. The prior setting in Table 2therefore seems to be robust to different types of instability in the data and the data remain informative about the model parameters.

4. REAL-TIME PREDICTION OF U.S. INFLATION

RATES

In this section, we focus on our empirical exercise: the out-of-sample forecasting performance of our BMA family of in-flation models relative to other, often very parsimonious, mod-els that are frequently used for inflation forecasting. Section

4.1provides an overview of these alternative models, whereas the evaluation methodology is outlined in Section4.2. After a brief description of the in-sample properties of our BMA-based framework in Section4.3, we discuss the out-of-sample fore-casting results in Section4.4.

4.1 Forecasting Models

The starting point of our forecasting exercise will be the differ-ent variants of the BMA framework outlined in Section2.1, and we refer toTable 1for an overview of these variants. To assess how our different BMA models perform in real time, we need to compare them with viable alternative inflation forecasting models. First among these models is the Atkeson and Ohanian (2001) random walk model (RW), which is traditionally seen as

a hard-to-beat model when it comes to out-of-sample inflation prediction. More specifically, this specification assumes thath -quarter ahead inflation can be predicted by the average inflation rate over the last four quarters in a given vintage of inflation data, that is,

Also, time-invariant AR specifications for inflation, using lag orders between 1 and 4, are considered as parsimonious alter-natives:

withp∗is the optimal lag order as determined by the Bayesian-Schwarz information criterion (BIC) across lag orders up till 4 and thus we will refer to (12) as BIC-AR. Finally, we take as a simple representation of the traditional Phillips curve model a version of (12) augmented with the current unemployment rate UNEMPLt(BIC-AR-UR) in a given data vintage, which gives

yt+h =β0+

where againp∗is the optimal lag order according to BIC. The aforementioned approaches are naive in the sense that they do not allow explicitly for structural instability in the fore-casting relationship, and also do not incorporate information from many, additional, predictors. A simple way to achieve the latter, which in practice has been found to have good forecasting properties, is the ridge regression approach. It encompasses a simple linear regression of the quarter-to-quarter inflation rate

hquarters ahead on all 15 predictor variables described in the previous section plus four inflation lags, that is,

yt+h=β0+Xt′βj +σ εt+h, (14) where the predictor variables are assembled in thek×1 vector Xt=(x1t, . . . , xkt)′andεt+h∼NID(0,1). Thek×1 parameter vectorβ_jin (14) could now be estimated using a ridge regression (shrinkage) estimator:

where ˜yt+h,X˜t are demeaned inflation and predictor variables, respectively, to account for the effect of the intercept term, and λis the scalar shrinkage parameter.

De Mol, Giannone, and Reichlin (2008) showed that a ridge regression like (14)–(15) boils down to a regression of, in our case,h-quarter ahead quarterly inflation on a weighted average of all the principal components from Xt, which can be used to model the impact of the unobserved common factors inXt onyt+h. We chooseλ=10 in (15), as De Mol, Giannone, and Reichlin (2008) showed that the degree of shrinkage should be proportional to the number of regressors to achieve the best

(9)

Table 3. Alternative models for forecasting inflation

Name Description Specification

RW Random walk (11)

BIC-AR AR model with BIC lag order selection (12)

BIC-AR-UR Simple Phillips curve model with AR and BIC lag order selection (13)

Ridge Ridge regression withλ=10 (14)–(15)

AR-BMA-SBB-SBV BMA-SBB-SBV specification with only four inflation lags inxt (16) AR-BVS-SBB-SBV Specification of AR-BMA-SBB-SBV with highest posterior probability (16)

BVS-SBB-SBV Specification of BMA-SBB-SBV with highest posterior probability (2)

UCSV Unobserved component model with SV (17)

forecasting performance in a data-rich context. A Gibbs sam-pler based on an independent Normal-Gamma prior is used to estimate (14) and (15), with noninformative priors on the in-tercept andσ2, and a shrinkage normal prior on the remaining regression parameters N(0, λ−1Ik).

We also want to assess our range of BMA inflation forecasting models to alternatives that explicitly model structural instabil-ity in the data. One such alternative is simply a variation on our BMA-SBB-SBV model where we average solely over the inflation lags and hencext =(yt, yt−1, yt−2, yt−3):

yt+h=β0t+

4

j=1

δjβj tyt−j+1+σtεt+hwithεt+1∼NID(0,1),

(16)

together with (3) and (4). Hence, in this AR-BMA-SBB-SBV specification, we use our stochastic variable selection algorithm to determine the posterior probabilities of all possible inflation lag combinations and use these posterior probabilities for BMA. Alternatively, we can use the resulting posterior model prob-abilities that correspond with (16) to select the “best” lag order for a given vintage of inflation data at forecast horizonhand use the resulting model to forecastyt+h; we denote this as the AR-BVS-SBB-SBV model (with BVS standing for Bayesian vari-able selection). We can also apply this approach to our full set of 15 predictor variables and 4 quarterly inflation lags; that is, us-ing the model posterior probabilities that result from the BMA-SBB-SBV MCMC algorithm to select the best combination of predictor variables and inflation lags for the h-quarter ahead quarter-to-quarter inflation rate in a given vintage of data. Only this model is then used to predict with stochastic breaksyt+h. This approach will be labeled as the BVS-SBB-SBV model.

Another option is an inflation forecast model that has been successfully used by Stock and Watson (2007,2008) to predict inflation. They proposed an unobserved components model with stochastic volatility (UCSV) specification for the unobserved component of inflation and the temporary deviation from it, that is,

yt=βt+σtεt, lnσt2=lnσ

2

t−1+u1t, βt=βt−1+ωtηt, lnωt2=lnω2t−1+u2t,

(17)

where εt∼NID(0,1), ηt ∼NID(0,1), and ut=(u1tu2t)′∼

NID(0, ρI2), withρa scalar parameter controlling the

smooth-ness of the stochastic volatility processes, and where εt, ηt, andut are independent. We use the Stock and Watson (2008)

Gibbs sampler to estimate (17), withρ =0.04; Stock and Wat-son (2007,2008) motivated their choice forρ based on the fit of (17) for U.S. inflation rates over the 1955–2004 sample. The resulting forecast foryt+his set equal to the filtered estimate of βtfrom (17) for a given data vintage of quarterly inflation rates.

Table 3summarizes the above described range of models that serve as alternatives to our BMA family of inflation forecasting models.

4.2 Forecast Evaluation Methodology

We use the models described in Section 4.1 to obtain and evaluate one-quarter and one-year ahead forecasts for the quarter-on-quarter inflation rate of both the PCE deflator and the GDP deflator in the United States. Except for the RW, BIC-AR, and BIC-AR-UR models, all other models are estimated using MCMC simulation procedures. For computational reasons, we obtain the one-year ahead forecasts through direct forecasting, as indirect one-year ahead forecasting would entail iterating forward the posterior predictive distribution multiple times, including predicting thex-variables, and thus rerunning MCMC samplers multiple times.

Each forecast is based on a reestimation of the underlying model using an expanding window of historical data. For exam-ple, suppose the first one-quarter and one-year ahead forecasts are produced in quartert0. As we want to evaluate the forecasts in

real time, we use the original vintage of data available att0,

con-taining data up tot0−1, to standardize the predictor variables

by dividing them by their standard deviations and reestimate the range of models on the samplet =1, . . . , t0−h, with the

forecast horizonh=1 or 5. This results in a quarterly inflation nowcast for quartert0 (horizonh=1) and a direct quarterly

inflation forecast for quartert0+4 (horizonh=5) using data

onxj t fort =1, . . . , t0−1 and the parameter estimates or the

posterior draws from the estimation up tot0−h. These forecasts

are evaluated against the vintage of inflation data that is avail-ableh quarters ahead, that is, the vintage att0+h. We repeat

this process of predictor variable standardization, reestimation, and forecast generation fort0+1, . . . , T −h. In the end, we

will have time series of forecasts (and in most cases their distri-bution) and forecast errors fort=t0, . . . , T −h, which we use

to compute several forecast evaluation criteria.

We follow the usual practice in the literature, and evaluate the point forecasts from the different models using the square root of the mean squared forecast errors (RMSE). Gneiting (2011) showed that the RMSE is only a consistent evaluation measure

(10)

when the point forecast equals the mean of the predictive distri-bution, and we therefore base the RMSE of a model on the mean of the corresponding distribution of inflation predictions. Given the results by Stock and Watson (2007,2008), it appears that the UCSV model produces forecasts for U.S. inflation rates that are hard to beat by Phillips curve and naive time series models (including the RW model). Hence, we assess real-time inflation point forecasts with the RMSE ratio for a model relative to that of UCSV, and a ratio smaller than 1 suggests that the mean fore-cast of the UCSV model is beaten by that of a competing model. Point forecasts of inflation, however, cannot provide insight into the uncertainty that is associated with producing these fore-casts, and inflation density forecasts are more useful for this pur-pose. Density forecasts of inflation can also be used to assess the ability of a model to predict unusual developments, such as the likelihood of deflation or accelerating inflation given current information. We therefore will also evaluate inflation density forecasts from our range of models (see Tables1and3), based on the continuous ranked probability score (CRPS). This CRPS circumvents some of the drawbacks of the usually employed log score (the logarithm of the predictive density evaluated inyt+h), as the latter does not reward values from the predictive density that are close but not equal to the realization (see, e.g., Gneiting and Raftery2007) and it is very sensitive to outliers. Gneiting and Ranjan (2011) showed that it is invalid to use weighted log scores to emphasize certain areas of the distribution (such as the tails) in density forecast evaluation.

The CRPS for a modellmeasures the average absolute dis-tance between the empirical cumulative distribution function (CDF) of yt+h, which is simply a step function in yt+h, and the empirical CDF that is associated with modell’s predictive density:

where Fis the CDF from the predictive densityf of modell

estimated at timet,I(.) takes a value 1 ifyt+h≦zand equals 0 otherwise,Ef is the expectation operator, andYt+h,l andYt′+h,l are independent random variables with common sampling den-sity equal to the posterior predictive denden-sity of modellforyt+h at timet. It can easily be computed from the posterior draws of an MCMC sampler (draws from Y′

t+h can be obtained by random resampling from these draws) or can be computed an-alytically for a Gaussian approximation. We use the historical average of (18) across the forecasts, denoted as avCRPS, where a ratio of the avCRPS of a model relative to that of the UCSV model smaller than 1 suggests that density forecasts from model

lare better than those from the UCSV model.

Finally, we want to assess how our range of models fares when different areas of their predictive densities are emphasized in the forecast evaluation, such as the tails. Gneiting and Ranjan (2011) proposed to integrate weighted versions of Gneiting and Raftery (2007) quantile scores (QS), with weights chosen such to emphasize a certain area of the underlying forecast density.

We will use a discrete approximation to this integration, that is,

avQS-Cl= 1

(0,1) results in the CRPS measure (18) (see Gneiting and Ranjan

2011). In (19), avQS-C emphasizes the center, avQS-R the right tail, and avQS-L the left tail of the predictive density relative to the realizationhperiods ahead. We will report ratios of avQS-C, avQS-R, and avQS-L measures of a certain model relative to those of the UCSV. A ratio smaller than 1 indicates that a model generates more precise predictions for certain parts of the (unknown) distribution of future inflation rates.

A number of our models are nested and we employ an expand-ing data window for parameter updatexpand-ing, which will impact on the distribution of a statistic as in Diebold and Mariano (1995), for either point or density forecasts; see, for example, Clark and McCracken (2012), Amisano and Giacomini (2007), and Gneiting and Ranjan (2011). In addition, Clark and McCracken (2009) showed that data revisions in the forecast variable can potentially change the limiting distribution and results of equal predictive ability tests. They suggested a corrected test for linear direct multiperiod ahead forecasting models under a quadratic loss function. We cannot apply this test, however, as we use a number of time-varying specifications and adapting this test to our setup is beyond the scope of this article.

Instead, similar to Faust and Wright (2011), we build on Monte Carlo results by Clark and McCracken (2012). They found that in the case of RMSE-based evaluation, comparing the Harvey, Leybourne, and Newbold (1997) small sample correc-tion of the Diebold and Mariano (1995) statistic with standard normal critical values results in a good sized test of the null hypothesis of equal finite-sample forecast precision for both nested and nonnested models, including cases with expanded window-based model updating. We therefore follow a similar procedure, where we compare the value of the aforementioned statistic with one-sided critical values from the standard nor-mal distribution, in the case of both point and density forecasts.

(11)

Thus, we test the null of equal finite-sample forecast accuracy, based on either an RMSE, avCRPS, C, R, or avQS-L measure,versusthe alternative that a model outperformed the UCSV benchmark.1

4.3 In-Sample Properties

Although the main focus of our analysis is on assessing the out-of-sample real-time forecasting performance of our BMA family of models for U.S. inflation, we first briefly discuss in this section the in-sample properties of our BMA-based fore-casting models when we estimate them on the full 1960Q1– 2011Q2 sample. This will give us insight into how consistent their implied inflation behavior is compared with the stylized facts uncovered by the existing literature, and how the combina-tion of model averaging and structural break modeling impacts on predictor variable selection in our framework.

The existing literature has focused a lot on documenting the time-varying properties of inflation. Based on this literature, which we surveyed in Section1, we can summarize these prop-erties as follows:

• The persistence of U.S. inflation increased during the 1970s, reaching peaks around 1974–1975 and around 1980, and subsided after 1982–1983 (see, e.g., Cogley and Sar-gent2005; Cogley, Primiceri, and Sargent2010).

• A majority of studies suggest that the downward shift in U.S. inflation variability from the late 1980s and early 1990s onward has been due to an exogenous variance break unrelated to breaks in the mean and/or persistence; some studies claim that this also happened during the 1970s (see, e.g., Sensier and van Dijk2004; Sims and Zha2006).

To reassure us that our framework does not provide counter-intuitive results, which would influence the interpretation of our real-time forecasting exercise in Section 4.4, we can ex-tract time-varying measures of inflation persistence and inflation shock volatility from full-sample estimates of the relevant ver-sions of our BMA-based framework. We then can compare the behavior of these measures with those that are representative of the aforementioned findings of the literature. As a proxy for the latter, we can use time-varying measures of inflation persistence and inflation innovation volatility from a full-sample estimate of the AR-BMA-SBB-SBV model (16), as one usually allows, in this literature, for structural change but does not condition on a

1_{In an unreported exercise, we ran a Monte Carlo experiment similar to the one} in Amisano and Giacomini (2007, sec. 5). Amisano and Giacomini (2007) used a DGP in which a random walk model and a Phillips curve model, calibrated on data, have equal finite-sample predictive power for the conditional mean of inflation, while imposing identical conditional variance across all models and inflation. We follow their setup, where we use (11) and (13), recursively estimated in real time on our data using an expanding window, and asses the size of the standard Diebold and Mariano (1995) statistic and the one with the Harvey, Leybourne, and Newbold (1997) correction for the relative avCRPS measure using standard normal critical values. We ran these experiments for both

h=1 andh=5 (we impose for the latter horizon anMA(4) in the innovation variance of inflation) at different sample sizes, using a nominal size of 5% and 10,000 iterations. The standard Diebold and Mariano (1995) statistic became quite oversized ath=5 at times but the corrected version kept the finite-sample size fairly close to the nominal size.

large set (of combinations) of additional explanatory variables. Appendix E.2 contains a detailed discussion of such a compar-ison for BMA-SBB-SBV, BMA-SBB, and BMA-SBVversus

AR-BMA-SBB-SBV. From this comparison, we find that these four specifications exhibit similar stylized facts for inflation: relatively low persistence in the 1960s, a substantial increase of this persistence in the 1970s, which reduces dramatically in the second part of the 1980s (see Appendix E.2, Figure E.1), as well as a decrease in inflation shock volatility for all specifications from the late 1980s and early 1990s onward (see Appendix E.2, Figure E.2). The implied inflation characteristics from the time-varying variants of our BMA-based specifications are therefore consistent with what the prior literature has documented.

Next, we want to assess how the interaction between model averaging across all potential regression specifications and al-lowing for breaks affect the variable selection in our BMA-based specifications. We can do this by looking in detail at the in-sample posterior variable inclusion probabilities of variants of our BMA family of models that either ignore structural breaks (BMA), shut off variance breaks (BMA-SBB), do not allow for regression parameter breaks (BMA-SBV), or allow for both types of breaks (BMA-SBB-SBV). Section E.3 of Appendix E reports on this comparison, which results in some interesting findings. When structural breaks are not allowed for, these in-clusion probabilities are generally much higher than the prior value of 50%, and suggest that, on average, individual speci-fications in the BMA model contain about 17 variables for all inflation rates and horizons (see Figure E.3 of this appendix). Variable inclusion probabilities drop significantly when the dif-ferent sources of breaks are included in the model, as the max-imum average posterior inclusion probability is about 42% for the BMA-SBV model at the one-year horizon for both infla-tion rates. Hence, allowing for structural breaks results in more parsimonious models.

The comparison in Figure E.3 of Appendix E.3 also suggests that for the general BMA-SBB-SBV specification, the model size seems to adapt more to the forecast horizon, with more parsimony at longer horizons. Ath=1 its average model size is about three times smaller than for the BMA case, whereas at h=5 BMA-SBB-SBV uses, on average, six times less predic-tors than BMA in the individual regressions. Finally, a closer look at the selected models suggests that there is substantial uncertainty in the specification of the individual regressions in our BMA-based approach, with, for example, BMA-SBB-SBV averaging over at least 250 specifications for both inflation rates and horizons.

The previous discussion might give some a priori insight in the potential success of our BMA-based specifications in fore-casting inflation out-of-sample, despite the fact that they, in principle, are heavily parameterized models (see also the pre-dictive density (10)). In general, it is well known that Bayesian model selection prevents overfitting of the data and leads to parsimonious models. For example, the BIC contains a penalty function for the number of parameters and the BIC values for different models can be used to approximate their respective Bayes’ factors. It is also well known that marginal likelihoods and one-step ahead predictive likelihoods are related (see, e.g., Geweke2005, p. 67), and hence, Bayesian model selection also takes into account the forecasting performance of a model.

(12)

Table 4. Forecast evaluation for PCE deflator inflation

Horizon:h=1 Horizon:h=5 Horizon:h=1 Horizon:h=5

RMSE avCRPS RMSE avCRPS RMSE avCRPS RMSE avCRPS

Forecast evaluation sample: 1980Q1–2011Q2 Forecast evaluation sample: 1985Q1–2011Q2

BMA-SBB-SBV 0.91 0.87∗∗ _1.00 ₀_.₉₂ ₀_.₈₈ ₀_.₈₈∗ ₀_.₈₈∗ ₀_.₈₃∗∗

(−0.85) (−1.90) (0.02) (−0.81) (−0.94) (−1.55) (−1.45) (−1.89)

BMA-SBV 1.01 0.81∗∗∗ _1.06 ₀_.₈₈ _1.03 ₀_.₈₅∗∗∗ _1.03 ₀_.₈₅

(0.14) (−3.44) (0.43) (−0.95) (0.52) (−2.49) (0.19) (−1.22)

BMA-SBB 1.03 0.83∗∗∗ _1.13 ₀_.₉₉ _1.02 ₀_.₈₆∗∗∗ _1.02 ₀_.₉₂

(0.79) (−3.21) (1.04) (−0.11) (0.80) (−2.63) (0.17) (−0.87)

BMA 1.09 0.89∗∗∗ _1.37 _1.10 _1.08 ₀_.₉₀∗∗ _1.13 ₀_.₉₃

(2.30) (−2.40) (1.56) (0.58) (2.19) (−1.89) (0.74) (−0.41)

BMA-RWB-RWV 0.99 0.82∗∗∗ ₀_.₉₉ ₀_.₈₃∗∗ ₀_.₉₆∗∗ ₀_.₈₃∗∗∗ ₀_.₉₃∗ ₀_.₇₈∗∗∗

(−0.31) (−4.39) (−0.12) (−2.28) (−1.68) (−3.62) (−1.53) (−3.47)

RW 1.00 1.00 0.99 1.25 1.04 1.08 0.98 1.31

(0.12) (0.02) (−0.03) (3.05) (0.91) (1.10) (−0.56) (3.56)

BIC-AR 1.04 1.05 1.03 1.00 1.05 1.12 1.08 1.06

(3.47) (0.92) (0.41) (0.06) (4.01) (1.83) (1.22) (0.80)

BIC-AR-UR 1.06 1.06 1.05 1.02 1.06 1.12 1.11 1.07

(2.75) (1.04) (0.78) (0.22) (2.59) (1.86) (1.53) (1.02)

Ridge 0.99 1.05 1.02 1.22 1.02 1.12 1.04 1.25

(−0.30) (0.83) (0.27) (2.75) (0.94) (1.80) (0.50) (3.11)

BVS-SBB-SBV 1.17 0.97 1.18 0.91 1.13 0.96 1.00 0.84∗∗

(1.66) (−0.43) (1.23) (−0.85) (1.13) (−0.55) (0.06) (−1.93)

AR-BMA-SBB-SBV 0.98 0.83∗∗∗ ₀_.₉₆ ₀_.₈₄∗∗∗ _1.01 ₀_.₈₇∗∗ ₀_.₉₆ ₀_.₈₂∗∗∗

(−0.84) (−3.36) (−0.95) (−2.39) (0.30) (−2.30) (−0.68) (−2.42)

AR-BVS-SBB-SBV 0.98 0.83∗∗∗ ₀_.₉₄ ₀_.₇₉∗∗∗ _1.01 ₀_.₈₈∗∗ ₀_.₉₂∗ ₀_.₇₆∗∗∗

(−0.90) (−3.32) (−1.13) (−3.06) (0.51) (−2.15) (−1.44) (−3.56)

NOTES: The numbers are ratios of the RMSE and avCRPS measures relative to the UCSV model;boldnumbers indicate a better performance than UCSV. In parentheses, we report the Diebold and Mariano (1995) statistic with the Harvey, Leybourne, and Newbold (1997) correction for the null hypothesis of equal finite-sample prediction accuracyversusthe alternative hypothesis that a model outperforms UCSV for either of these measures, where∗_,∗∗_{, and}∗∗∗_{indicate rejection of this null at the 10%, 5%, and 1% levels, respectively, based on one-sided}

standard normal critical values. For model mnemonics, see Tables1and3.

Therefore, our finding that BMA-based approaches combined with structural breaks will average over quite parsimonious re-gression specifications, especially in the case of BMA-SBB-SBV ath=5, might lead us to suspect that they could do well in an out-of-sample context. We do find such a pattern for the in-sample fit across the different BMA-based specifications when we compare their posterior marginal BIC values; see Appendix E.3 (Table E.3). There is, however, a widely documented discon-nect between in-sample and out-of-sample predictive contents of models, as in, for example, Stock and Watson (2003), and in the end this is an empirical question. We will address this in the next section.

4.4 Out-of-Sample Results

The models discussed in Section4.1 are used to generate quarter-on-quarter PCE deflator and GDP deflator inflation fore-casts for the current quarter (h=1) and one-year ahead (h=5). These are evaluated by computing the forecast evaluation mea-sures discussed in Section 4.2 across two periods: 1980Q1– 2011Q2 and 1985Q1–2011Q2. These evaluation samples span a number of large events that potentially could have caused time variation in the dynamics of inflation rates, for example, the “monetarist experiment” by the Federal Reserve under Volcker, the “Great Moderation” during the late 1980s, the 1990s, and early 2000s, the 9/11 catastrophe in 2001, and, of course, the “Great Recession” of 2007–2009.

The longer period of 1980Q1–2011Q2 was chosen based on computational considerations: 1979Q4 (i.e., the data from the 1980Q1 vintage) was the furthest we could have gone back without compromising the proper estimation of models with stochastic structural breaks and due to the availability of proper real-time data. However, this evaluation sample does encompass the early 1980s, which represents the end of the Great Inflation era but does not convey any information about the start of the Great Inflation. It is for this reason that often in the literature, such as in Atkeson and Ohanian (2001) and Faust and Wright (2011), more attention is paid to inflation forecasting in the post-1984 period, also because inflation was deemed to be hard to forecast during the Great Moderation period. Therefore, we also analyze the predictive performance of our inflation models over the 1985Q1–2011Q2 sample. The forecasts of our models are based on estimates of the model parameters and, if relevant, the latent variables computed using an expanding window of data, starting with 1960Q1–1979Q4 using the original 1980Q1 data vintages for the 1980Q1–2011Q2 sample and commencing with 1960Q1–1984Q4 using the original 1985Q1 data vintages for the 1985Q1–2011Q2 sample. The range of forecasts is used to assess the real-time out-of-sample performance of our models relative to the Stock and Watson (2007,2008) UCSV model (17).

Table 4reports on PCE deflator inflation ratios of the RMSE and avCRPS measures for our BMA family of inflation models and those summarized inTable 3relative to these measures pro-duced by the UCSV model. When we focus first on the point

(13)

Table 5. Forecast evaluation for GDP deflator inflation

Horizon: h=1 Horizon: h=5 Horizon: h=1 Horizon: h=5

RMSE avCRPS RMSE avCRPS RMSE avCRPS RMSE avCRPS

Forecast evaluation sample: 1980Q1–2011Q2 Forecast evaluation sample: 1985Q1–2011Q2

BMA-SBB-SBV 0.91∗∗∗ ₀_.₉₇ ₀_.₉₉ ₀_.₉₃ ₀_.₈₅∗∗∗ ₀_.₉₅ ₀_.₈₅∗ ₀_.₈₄∗∗

(−2.59) (−0.71) (−0.08) (−0.85) (−3.71) (−0.90) (−1.63) (−1.82)

BMA-SBV 0.95 0.83∗∗∗ _1.04 ₀_.₈₆ ₀_.₉₃∗ ₀_.₈₃∗∗∗ ₀_.₉₉ ₀_.₈₃

(−1.18) (−3.54) (0.29) (−1.09) (−1.51) (−2.80) (−0.03) (−1.07)

BMA-SBB 0.97 0.90∗∗ _1.40 _1.12 ₀_.₉₂∗ ₀_.₉₂∗ ₀_.₉₆ ₀_.₉₉

(−0.62) (−1.80) (1.25) (1.07) (−1.58) (−1.35) (−0.34) (−0.09)

BMA 1.15 0.94 1.63 1.24 1.03 0.88∗ _1.19 ₀_.₉₆

(1.37) (−0.74) (1.65) (0.96) (0.36) (−1.60) (0.78) (−0.20)

BMA-RWB-RWV 0.96 1.08 0.96 0.96 0.93 1.12 0.94 0.99

(−0.81) (1.22) (−0.36) (−0.30) (−0.99) (1.62) (−0.71) (−0.10)

RW 0.91∗ _1.21 ₀_.₉₆ _1.43 ₀_.₈₈∗∗ _1.30 ₀_.₈₈∗∗ _1.56

(−1.48) (3.02) (−0.57) (4.43) (−1.66) (3.76) (−1.82) (5.03)

BIC-AR 0.97 1.32 0.99 1.13 0.99 1.43 1.05 1.22

(−0.52) (5.14) (−0.07) (1.14) (−0.18) (5.94) (0.67) (2.06)

BIC-AR-UR 0.96 1.32 0.99 1.12 0.99 1.42 1.08 1.23

(−0.73) (5.10) (−0.15) (1.14) (−0.18) (5.70) (0.86) (2.20)

Ridge 0.96 1.38 1.04 1.39 0.96 1.48 1.05 1.47

(−0.95) (6.26) (0.61) (4.02) (−0.70) (6.60) (0.54) (4.24)

BVS-SBB-SBV 2.18 1.41 1.42 1.18 2.39 1.34 1.27 1.11

(13.42) (4.37) (2.21) (1.76) (2.39) (4.10) (2.02) (1.10)

AR-BMA-SBB-SBV 0.96 0.89∗∗ ₀_.₉₈ ₀_.₈₉∗ ₀_.₉₆ ₀_.₉₁ ₀_.₉₈ ₀_.₈₇∗

(−0.88) (−1.87) (−0.28) (−1.39) (−0.69) (−1.23) (−0.21) (−1.30)

AR-BVS-SBB-SBV 2.09 1.03 0.94 0.89∗ _2.25 _1.06 _1.00 ₀_.₈₉

(14.28) (0.54) (−0.56) (−1.38) (12.88) (0.90) (0.04) (−1.09)

NOTES: See the notes forTable 4.

forecast evaluation over the 1980Q1–2011Q2 evaluation sample in the left-hand side panel of this table, it appears that forh=1 the BMA-SBB-SBV specification compares favorably with the UCSV model (although not statistically significant) and the cor-responding RMSE ratio is lower than those of the other BMA-based models and the AR-BMA-SBB-SBV model. For one-year ahead forecasts, the picture is more bleak for our approach as the RMSE ratio for the BMA-RWB-RWV model is just barely lower than 1. In terms of density forecasts, however, there is for both horizons evidence based on the avCRPS measure that a number of models are producing more precise density forecasts than the UCSV model. All our BMA-based inflation models perform better when it comes to current quarter inflation density predic-tions, and also the two AR specifications with stochastic struc-tural breaks (AR-BMA-SBB-SBV and AR-BVS-SBB-SBV) perform well in that case. Ath=5, the SBB-SBV, BMA-SBV, BMA-SBB, BMA-RWB-RWV, AR-BMA-SBB-BMA-SBV, and AR-BVS-SBB-SBV models exhibit better density forecasts, and this is statistically significant for the last three.

The right-hand side panel of Table 4 reports on the fore-cast evaluation for PCE deflator inflation over the 1985Q1– 2011Q2 period. Over this period, both the BMA-SBB-SBV and the BMA-RWB-RWV model outperform the UCSV model in terms of both point forecast accuracy and density forecast accu-racy at the two forecast horizons. The BMA-SBB-SBV member of our BMA range of inflation models performs better than the BMA-RWB-RWV one, especially when it comes to one-year ahead prediction. The remaining BMA-based inflation models

outperform the UCSV model solely in terms of inflation density predictions. Of the alternatives to our BMA-based approaches, only the AR-BMA-SBB-SBV and AR-BVS-SBB-SBV models seem to do perform satisfactorily and only relative to density forecasts from the UCSV model.

The results of a similar evaluation for GDP deflator inflation is summarized inTable 5. For the 1980Q1–2011Q2 period, our BMA-based specifications, except for the one without structural breaks, outperformed the UCSV model in terms ofh=1 point forecasts; see the left-hand side panel ofTable 5. Interestingly, in terms of current quarter GDP deflator inflation density forecasts, the BMA-based specifications that shut down either of the two channels for structural instability, BMA-SBV and BMA-SBB, are the most successful ones according to the avCRPS ratios. Of the alternative forecasting models, the RW model performs well for point forecasting and AR-BMA-SBB-SBV does so for den-sity forecasting. As for PCE deflator inflation, one-year ahead GDP deflator inflation forecasting for the 1980Q1–2011Q2 sam-ple is more challenging for the models under consideration, as the improvement relative to point predictions from the UCSV model is small or even absent. The BMA-SBB-SBV, BMA-SBV, AR-BMA-SBB-SBV, and AR-BVS-SBB-SBV models outper-form the UCSV-based density forecasts at this horizon. Our BMA-based approach dominates when we conduct the fore-cast evaluation for the 1985Q1–2011Q2 sample, on which we report in the right-hand side panel of Table 5. When trading off point forecasts (as summarized by the RMSE ratios) ver-susdensity forecasts (summarized by the avCRPS ratios), our