Chapter 3. Previous Literature and Analytical Methodology
2. Econometric Analysis Methodology
Figure 3-1. City Gas Consumption Trend in Energy-Intensive Industries
Source: Monthly Energy Statistics
Series plotted: Petrochemical, Steel, Fabricated metal
Third, this study analyzed the city gas demand function of each energy-intensive industry among consumers of industrial city gas. As mentioned above, it is not advisable to aggregate the city gas demand of all industries into a single total and analyze it as one series, because the characteristics of the demand function vary by the usage type of city gas. As examined in Chapter 2, each industry consumes industrial city gas differently. In addition, the statistical characteristics of the time series data on city gas consumption differ by industry, as Figure 3-1 (above) confirms. Because each industry reacts differently to relative energy prices and temperature, grouping all of these industries together and analyzing them with a single industrial city gas demand function is problematic. Therefore, unlike previous studies, this study estimates a separate city gas demand function for each of the three industries with high industrial city gas consumption.
2.1. Unit Root Test with Structural Changes³³
Time series data with unit roots (referred to as integrated or non-stationary time series) and stationary time series data with one or more structural changes (or breaks) exhibit similar trends. As a result, a time series that has a structural change but no unit root tends to be mistaken for one with a unit root when a standard unit root test is performed.
To better understand this, consider unit roots and the characteristics of a time series with a unit root through a simple AR(1) model, as shown below.
$$ y_t = \alpha y_{t-1} + \varepsilon_t \tag{1} $$
If the absolute value of α is less than 1, the time series yt is a stationary time series. A stationary time series is one in which the distributional properties of the random variables do not depend on the time at which the series is observed. In general, a stationary time series refers to a weakly stationary time series, whose random variables have a constant mean and variance when shifted in time.³⁴
Figure 3-2. Trends of Stationary Time Series and Non-Stationary Time Series
Note: The data was generated using the GAUSS program, and the data generating process was: $y_t = \alpha y_{t-1} + \varepsilon_t$, $\varepsilon_t \sim \text{iid } N(0, 1)$, $y_0 = 0$
Series plotted: α=0.3, α=0.5, α=0.7, α=1 (1), α=1 (2)
A stationary time series has the same mean and variance over any sub-interval, so its plot looks relatively regular. A stationary time series with a deterministic (non-stochastic) trend fluctuates with a certain variability as it rises or falls along that trend. In the figure above, time series data were generated using Equation (1) with α values of 0.3, 0.5, 0.7, and 1. Time series with an α value of less than 1 are stationary time series. As the graph shows, all three stationary time series appear to have a constant mean and variance.
33 This section draws upon research from Perron (2006).
34 Weak stationarity requires only that the mean and variance be constant and that the autocovariance depend only on the lag. It is therefore a weaker requirement than strict stationarity, which requires the entire joint distribution to be invariant over time.
On the other hand, when the α value is 1, yt is called a non-stationary time series, or an integrated process, and it is said to have a unit root. A non-stationary time series is one that does not satisfy the definition of stationarity given above: its distributional properties differ depending on the sub-interval. It is also called an integrated process because, when α equals 1, Equation (1) can be written as follows:
$$ y_t = y_{t-1} + \varepsilon_t = y_0 + \sum_{\tau=1}^{t} \varepsilon_\tau $$
This shows that yt is the sum of the initial value and all of the accumulated (integrated) shocks (ε) from the past. A non-stationary time series of this kind is also said to have a unit root, because the root of the characteristic equation of Equation (1) equals 1 when α equals 1.
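As a short worked illustration, the moments of the two cases can be compared directly. This assumes, as in the note to Figure 3-2, that the shocks are iid with mean zero and a constant variance (written here as σ², a symbol introduced only for this illustration) and that y0 is fixed.

$$ |\alpha| < 1: \quad \operatorname{Var}(y_t) = \sigma^2 \sum_{j=0}^{t-1} \alpha^{2j} = \sigma^2 \, \frac{1 - \alpha^{2t}}{1 - \alpha^2} \;\longrightarrow\; \frac{\sigma^2}{1 - \alpha^2} \quad (t \to \infty) $$

$$ \alpha = 1: \quad E[y_t] = y_0, \qquad \operatorname{Var}(y_t) = \operatorname{Var}\Big( \sum_{\tau=1}^{t} \varepsilon_\tau \Big) = t \, \sigma^2 $$

In the first case the variance settles at a constant, whereas in the unit root case it grows linearly with t, which is why the distributional properties depend on the sub-interval.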
A non-stationary time series in which α equals 1 follows a stochastic trend. A stochastic trend means that the time series has a trend that changes stochastically, as shown in Figure 3-2 (above). Since the stochastic trend of a non-stationary time series is not constant, its data plots can result in completely different shapes even if the data is generated using the same data generating process (DGP), as shown in Figure 3-2 (compare α=1 (1) and α =1 (2)).
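The data generating process in the note to Figure 3-2 can be sketched as follows. This is a minimal illustration in Python with NumPy rather than the GAUSS program used for the figure; the function name `simulate_ar1`, the seed, and the sample length of 200 are assumptions made for the example, so the paths will not reproduce the figure exactly.

```python
import numpy as np

def simulate_ar1(alpha, shocks, y0=0.0):
    """Return [y_0, y_1, ..., y_T] for y_t = alpha * y_{t-1} + eps_t."""
    y = np.empty(len(shocks) + 1)
    y[0] = y0
    for t, eps in enumerate(shocks, start=1):
        y[t] = alpha * y[t - 1] + eps
    return y

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
T = 200                          # assumed sample length

# Same labels as the legend of Figure 3-2; the two alpha=1 paths use
# the same DGP but different shock draws.
settings = [("0.3", 0.3), ("0.5", 0.5), ("0.7", 0.7), ("1 (1)", 1.0), ("1 (2)", 1.0)]
paths = {label: simulate_ar1(alpha, rng.standard_normal(T)) for label, alpha in settings}

for label, y in paths.items():
    half = len(y) // 2
    # Compare the two halves: stationary series should look alike across
    # sub-intervals, random walks generally do not.
    print(f"alpha={label}: mean 1st half={y[:half].mean():6.2f}, "
          f"mean 2nd half={y[half:].mean():6.2f}")
```

Comparing the sub-interval means printed for each series is a rough numerical counterpart to the visual comparison in Figure 3-2.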
The ADF test (Augmented Dickey-Fuller test) is the most widely used unit root test, and it is based on the following equation:
$$ \Delta y_t = \alpha y_{t-1} + \beta_0 + \beta_1 t + \sum_{k=1}^{p} \theta_k \Delta y_{t-k} + \nu_t \tag{2} $$
The null hypothesis is “yt has a unit root (α=0),” and the alternative hypothesis is “yt has no unit root (α<0).”³⁵ These hypotheses are tested using the t-statistic for the hypothesis α=0.³⁶
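The regression behind Equation (2) can be sketched with ordinary least squares. The following is an illustrative NumPy implementation under stated assumptions: the function names `adf_tstat` and `ar1`, the fixed lag order p, and the simulated inputs are all hypothetical choices for this example and not the procedure or software used in this study. The returned value is only the t-statistic; it has to be compared with Dickey-Fuller critical values, not with the standard t-distribution.

```python
import numpy as np

def adf_tstat(y, p=1):
    """t-statistic on y_{t-1} in Equation (2) (constant, trend, p lagged differences)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[i] = y[i+1] - y[i]
    idx = np.arange(p, len(dy))           # observations left after losing p lags
    cols = [y[idx],                       # lagged level y_{t-1}
            np.ones(len(idx)),            # constant (beta_0)
            idx.astype(float)]            # linear trend (beta_1 * t)
    cols += [dy[idx - k] for k in range(1, p + 1)]   # lagged differences
    X = np.column_stack(cols)
    target = dy[idx]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return float(beta[0] / se[0])

def ar1(alpha, shocks):
    """Helper for the demo: y_t = alpha * y_{t-1} + eps_t with y_0 = 0."""
    y = np.zeros(len(shocks) + 1)
    for t, eps in enumerate(shocks, start=1):
        y[t] = alpha * y[t - 1] + eps
    return y

rng = np.random.default_rng(0)            # arbitrary seed (assumption)
stationary = ar1(0.5, rng.standard_normal(300))
random_walk = ar1(1.0, rng.standard_normal(300))
print("t-stat, alpha=0.5:", round(adf_tstat(stationary), 2))
print("t-stat, alpha=1.0:", round(adf_tstat(random_walk), 2))
```

The stationary series should typically produce a much more negative statistic than the random walk, which is the pattern the test exploits.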
However, as can be seen in the graphs of α=1 (1) and α=1 (2) in Figure 3-2, when time series that follow a stochastic trend are divided at breakpoints, these subdivided data show similarities to a stationary time series with a deterministic trend. For this reason, when a stationary time series with breaks in its mean or trend is tested by a general unit root testing method that does not take such breaks into account, it is often erroneously considered to have unit roots.
Therefore, Perron (1989, 1990, 1994, and 1997) and Perron and Vogelsang (1992a, 1992b, 1993a, and 1993b) proposed a unit root testing method that accounts for such breaks (or structural changes),³⁷ which is based on Equation (3) below:³⁸
$$ \Delta y_t = \alpha y_{t-1} + \beta_{01} + \beta_{11} t + \beta_{02} DU_t + \beta_{12} DT_t^* + \sum_{k=1}^{p} \theta_k \Delta y_{t-k} + \nu_t \tag{3} $$
In this equation, DUt and DTt* are defined as follows:
35 Equation (2) is obtained from Equation (1) by subtracting yt−1 from both sides and including additional terms, so that α in Equation (2) corresponds to α−1 in Equation (1). Therefore, when α has a value of 1 in Equation (1) because yt has a unit root, α has a value of 0 in Equation (2).
36 Here, the t-statistic does not follow a standard t-distribution, meaning that the critical value for the test differs from that of a standard t-test.
37 Strictly speaking, a break and a structural change are conceptually different. A break refers to the occurrence of a break in the time series from a statistical point of view, while a structural change refers to the occurrence of a structural change in the data generating process or model. However, the two concepts are used interchangeably in statistics and econometrics.
38 The unit root test used in this study allowed for one break in the trend. This section also describes the test method that allows a single break.
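$$ DU_t = \begin{cases} 1 & \text{if } t > T_1, \\ 0 & \text{otherwise,} \end{cases} $$
$$ DT_t^* = \begin{cases} t - T_1 & \text{if } t > T_1, \\ 0 & \text{otherwise.} \end{cases} $$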
DUt accounts for a break in the constant term, and DTt* accounts for a break in the trend.
T1 denotes the breakpoint. Since its timing is unknown, the test statistic is calculated as follows:
$$ t_\alpha^* = \inf_{\lambda_1 \in [\varepsilon,\, 1-\varepsilon]} t_\alpha(\lambda_1) $$
In this equation, λ1 is the value that determines the position of the breakpoint, and tα(λ1) is the t-statistic calculated in Equation (3) when the breakpoint is determined by λ1. The breakpoint is calculated as T1 = [Tλ1].³⁹ ε is generally set to 0.15, meaning that the first and last 15 percent of the sample are trimmed off, tα(λ1) is calculated with every remaining time point treated as a candidate breakpoint, and the smallest of these values becomes the test statistic. Because the test statistic of the ADF test is a negative number, the smallest value is the one with the largest absolute value. Therefore, the time point at which the null hypothesis (yt has a unit root) is most likely to be rejected is selected as the breakpoint, and the statistic at this time point is used for the unit root test.⁴⁰
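The search over candidate breakpoints can be sketched as a loop around the regression in Equation (3). This is a minimal NumPy illustration, not the implementation used in this study: the names `break_adf` and `_tstat_first`, the fixed lag order p, and the simulated series are assumptions. As footnote 40 notes, the resulting statistic needs its own critical values, which are not reproduced here.

```python
import numpy as np

def _tstat_first(X, target):
    """OLS t-statistic of the first regressor in X."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return float(beta[0] / se[0])

def break_adf(y, p=1, trim=0.15):
    """Return (t_alpha_star, T1): the infimum t-statistic over candidate breaks."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    dy = np.diff(y)                        # dy[i] = y[i+1] - y[i]
    idx = np.arange(p, len(dy))
    t = (idx + 1).astype(float)            # time index of each dependent observation
    base = [y[idx], np.ones(len(idx)), t]  # y_{t-1}, constant, trend
    lags = [dy[idx - k] for k in range(1, p + 1)]
    lo = int(trim * T)                     # trim the first and last 15 percent
    best_stat, best_T1 = np.inf, None
    for T1 in range(lo, T - lo + 1):
        DU = (t > T1).astype(float)                  # break in the constant
        DT = np.where(t > T1, t - T1, 0.0)           # break in the trend
        X = np.column_stack(base + [DU, DT] + lags)
        stat = _tstat_first(X, dy[idx])
        if stat < best_stat:                         # infimum over breakpoints
            best_stat, best_T1 = stat, T1
    return best_stat, best_T1

# Demo: a trend-stationary series with a level and slope break at t = 100.
rng = np.random.default_rng(0)             # arbitrary seed (assumption)
time = np.arange(200)
y = 0.05 * time + np.where(time > 100, 3.0 + 0.05 * (time - 100), 0.0)
y = y + rng.standard_normal(200)
stat, T1 = break_adf(y)
print("inf t-statistic:", round(stat, 2), "selected breakpoint:", T1)
```

Note that the selected breakpoint is the one that minimizes the t-statistic, so it need not coincide exactly with the true break date.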
2.2. Bai-Perron Test
Testing for structural changes in regressions is a classic problem that has long been addressed in econometrics. When the time of a structural change is known because of an important event such as a financial crisis or oil shock, it is relatively simple to test whether a structural change occurred at that time. In such cases, a Chow test is conducted: a dummy variable marking the structural breakpoint is inserted into the regression, and the significance of its coefficients is tested. Consider a simple regression equation:
$$ y_t = \alpha + \beta x_t + \varepsilon_t, \quad t = 1, \ldots, T $$
Here, if the time of structural change is known to be t=τ*, the above regression equation can be expanded as follows:
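$$ y_t = \alpha + \beta x_t + \gamma D_t(\tau^*) + \delta x_t D_t(\tau^*) + \varepsilon_t, \quad t = 1, \ldots, T \tag{4} $$
Here,
$$ D_t(\tau^*) = \begin{cases} 1 & \text{if } t \geq \tau^*, \\ 0 & \text{otherwise,} \end{cases} $$
and the Chow test is an F-test for H0: γ=δ=0.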
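The F-test for H0: γ=δ=0 can be computed by comparing the residual sums of squares of the restricted regression (no dummy terms) and of Equation (4). The sketch below is a minimal NumPy illustration; the function names `ssr` and `chow_f` and the simulated data are assumptions for the example.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def chow_f(x, y, tau):
    """F-statistic for H0: gamma = delta = 0 in Equation (4), with D_t = 1 for t >= tau."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    T = len(y)
    t = np.arange(1, T + 1)                    # t = 1, ..., T as in the text
    D = (t >= tau).astype(float)
    ones = np.ones(T)
    ssr_r = ssr(np.column_stack([ones, x]), y)              # restricted model
    ssr_u = ssr(np.column_stack([ones, x, D, x * D]), y)    # Equation (4)
    q, k = 2, 4                                # restrictions, unrestricted parameters
    return float(((ssr_r - ssr_u) / q) / (ssr_u / (T - k)))

# Demo: intercept and slope both change at tau = 51 (assumed values).
rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
T = 100
x = rng.standard_normal(T)
t = np.arange(1, T + 1)
y = np.where(t >= 51, 2.0 + 1.5 * x, 1.0 + 0.5 * x) + 0.5 * rng.standard_normal(T)
print("Chow F at the known break:", round(chow_f(x, y, 51), 2))
```

With a known break date, this statistic can be compared with the standard F distribution with 2 and T−4 degrees of freedom.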
However, if the time of a structural change is not known, the problem becomes more complicated.
The tests that have been widely used in such cases are so-called sup-tests (supremum tests). A sup-test was first proposed by Quandt (1960), and the resulting statistic is called the Quandt Likelihood Ratio (QLR) statistic, or Sup-Wald statistic. This test is similar to the Chow test described above, but it requires additional steps, because the time points of structural change are unknown, and the distribution of the resulting statistic becomes quite complicated.
39 [·] denotes the integer part of its argument.
40 Since the statistic calculated in this way has a different distribution from the standard ADF test statistic, the critical value used for the test differs from that used in the ADF test.
First, when the time of structural change in Equation (4) is τ∗ , the F-statistic calculated to test H0: γ=δ= 0 is F(τ∗). Then, the QLR statistic is as follows:
$$ \mathrm{QLR} = \max_{\tau^* \in [\tau_0,\, \tau_1]} F(\tau^*) $$
Generally, τ0 and τ1 are set as follows: τ0 = [0.15T], τ1 = [0.85T].⁴¹ That is, after the first and last 15 percent of the sample are trimmed, the Chow test is performed and the F-statistic is calculated at every remaining time point, and the largest of these values is the QLR statistic. Since the statistic obtained in this way is the maximum of many F-statistics, it has a completely different distribution from a simple F-statistic.⁴²
This type of test is called a sup-test because it computes all the statistic values at each time point for a sample of a specific interval and uses the maximum (or supremum) as the test statistic.⁴³
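The QLR statistic can be sketched as a loop over Chow F-statistics for the simple regression of Equation (4). The code below is an illustrative NumPy version under the trimming rule described above; the function names `ssr`, `chow_f`, and `qlr` and the simulated data are assumptions, and the critical values (which are non-standard, see footnote 42) are deliberately not included.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def chow_f(x, y, tau):
    """Chow F-statistic for a break at tau (D_t = 1 for t >= tau, t = 1..T)."""
    T = len(y)
    D = (np.arange(1, T + 1) >= tau).astype(float)
    ones = np.ones(T)
    ssr_r = ssr(np.column_stack([ones, x]), y)
    ssr_u = ssr(np.column_stack([ones, x, D, x * D]), y)
    return float(((ssr_r - ssr_u) / 2) / (ssr_u / (T - 4)))

def qlr(x, y, trim=0.15):
    """Return (QLR statistic, maximizing tau) over tau in [[trim*T], [(1-trim)*T]]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    T = len(y)
    tau0, tau1 = int(trim * T), int((1 - trim) * T)
    stats = {tau: chow_f(x, y, tau) for tau in range(tau0, tau1 + 1)}
    tau_hat = max(stats, key=stats.get)        # time point with the largest F
    return stats[tau_hat], tau_hat

# Demo: one break at tau = 61 whose date is treated as unknown.
rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
T = 120
x = rng.standard_normal(T)
t = np.arange(1, T + 1)
y = np.where(t >= 61, 2.0 + 1.5 * x, 1.0 + 0.5 * x) + 0.5 * rng.standard_normal(T)
stat, tau_hat = qlr(x, y)
print("QLR:", round(stat, 2), "at tau =", tau_hat)
```

The maximizing time point doubles as an estimate of the break date, which is the idea the sequential procedure described next builds on.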
The Bai-Perron test used in this study is a generalized version of a sup-test, and it was proposed by Bai (1997) and Bai and Perron (1998 and 2003). This test can be applied when the number and time points of structural changes in a regression equation are completely unknown. The operational principles of the Bai-Perron test are as follows:⁴⁴
1) First, the maximum value of the statistic is obtained by applying a sup-test such as the QLR test described above to the whole sample. The obtained statistic is then used to decide the presence of a structural change.
2) If the structural change is statistically significant, the sample is divided into two at the breakpoint. The same test method as above is applied to each subsample to test for structural change and estimate the time point.
3) The above test is repeated until no statistically significant structural breakpoint is found.
When structural breakpoints in regression have been estimated, the sample can be divided into intervals at each breakpoint, and the structural changes in regression can be estimated using dummy variables for each interval.⁴⁵
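The three steps above can be sketched as a recursive sample-splitting routine. This is a simplified illustration of the sequential idea for the simple regression of Equation (4), not the full Bai-Perron procedure: the function names, the minimum segment length `min_obs`, and above all the threshold `crit` are assumptions. In practice the threshold must be taken from the appropriate critical value tables (see Andrews, 1993, and Bai and Perron, 1998); the value in the demo is a placeholder only.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def sup_f(x, y, h):
    """Largest Chow F over splits leaving at least h observations on each side."""
    n = len(y)
    ones = np.ones(n)
    ssr_r = ssr(np.column_stack([ones, x]), y)
    best_f, best_s = -np.inf, None
    for s in range(h, n - h + 1):              # s = 0-based start of the new regime
        D = (np.arange(n) >= s).astype(float)
        ssr_u = ssr(np.column_stack([ones, x, D, x * D]), y)
        f = ((ssr_r - ssr_u) / 2) / (ssr_u / (n - 4))
        if f > best_f:
            best_f, best_s = f, s
    return best_f, best_s

def sequential_breaks(x, y, crit, trim=0.15, min_obs=5, offset=0):
    """Return sorted break dates (1-based first period of each new regime)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    h = max(int(trim * n), min_obs)
    if n < 2 * h:                              # subsample too short to split
        return []
    f, s = sup_f(x, y, h)                      # step 1: sup-test on this sample
    if f <= crit:                              # step 3: stop when not significant
        return []
    # Step 2: split at the estimated break and repeat on each subsample.
    left = sequential_breaks(x[:s], y[:s], crit, trim, min_obs, offset)
    right = sequential_breaks(x[s:], y[s:], crit, trim, min_obs, offset + s)
    return left + [offset + s + 1] + right

# Demo: two intercept shifts, at t = 41 and t = 81 (assumed values).
rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
n = 120
x = rng.standard_normal(n)
t = np.arange(1, n + 1)
y = 1.0 + 0.5 * x + 3.0 * (t >= 41) - 4.0 * (t >= 81) + 0.5 * rng.standard_normal(n)
PLACEHOLDER_CRIT = 20.0                        # NOT a tabulated critical value
print("estimated break dates:", sequential_breaks(x, y, crit=PLACEHOLDER_CRIT))
```

The recursion stops on its own once no subsample yields a statistic above the threshold, which mirrors the remark in footnote 45 that the number of breaks does not have to be fixed in advance.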
41 [·] denotes the integer part of its argument.
42 The limiting distribution of the QLR statistic is derived by Andrews (1993) and Andrews and Ploberger (1994).
43 A sup-test uses a variety of statistics, such as the likelihood ratio statistic, Wald statistic, and LM statistic.
44 In general, the Bai-Perron test includes the global maximizer test, sequential testing procedure, global plus sequential test, and others. The test method described in this paper is the sequential testing procedure, which is the most intuitive of all the tests.
45 This test method has the advantage of not having to estimate the number of structural changes in advance by using information criteria, such as BIC or LWZ, since it sequentially tests all cases, from those with no structural changes to those with significant structural changes. See Section 5.2.2. of Bai and Perron (1998).