Methodology - DATA AND METHODOLOGY - Cryptocurrencies Price Forecast and Hedging Capabilities

CHAPTER 3: DATA AND METHODOLOGY

3.4 Methodology

d_OilClose 0.011004 0 -27.62 19.36

Variable        Std. Dev.    C.V.      Skewness    Ex. kurtosis
d_BTCClose      1028.1       308.55    -0.19783    9.3102
d_LTCClose      7.5708       74.701    -1.7007     33.781
d_SP500Close    36.583       69.515    -0.93823    10.513
d_FTSEClose     62.117       135.58    -1.1255     13.13
d_GoldClose     13.454       65.475    -0.5564     9.5597
d_OilClose      1.7111       155.5     -2.3896     55.917

Variable        5% Perc.    95% Perc.    IQ range    Missing obs.
d_BTCClose      -1545.9     1594.2       453.81      1
d_LTCClose      -10.698     9.8298       4.037       1
d_SP500Close    -58.209     54.134       17.785      1
d_FTSEClose     -99.884     87.861       38.135      1
d_GoldClose     -21.53      21.1         7.1         1
d_OilClose      -2.206      2.105        0.75        1

We will also test the short- and long-term relationships between the cryptocurrencies (Bitcoin, Litecoin), the financial markets (S&P500, FTSE100) and the commodities (Gold, Oil). A Granger causality test, run within a Vector Autoregression (VAR) model, will be used to detect whether a short-term relationship exists. Finally, we will conduct an Engle-Granger cointegration test to detect any long-term relationship and, if one exists, estimate it using a Vector Error Correction Model (VECM). Forecasts from both the causality and the cointegration models will be evaluated using the ME, MAE, MPE and MAPE error metrics. A brief description of the chosen models follows.
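The four error metrics named above can be sketched as follows. This is a minimal illustration, assuming the error is defined as actual minus forecast and that percentage errors are taken relative to the actual value; the function name `forecast_errors` and the sample numbers are hypothetical and are not data from this study.

```python
import numpy as np

def forecast_errors(actual, forecast):
    """Return ME, MAE, MPE and MAPE for paired actual/forecast values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    err = actual - forecast        # assumed convention: actual minus forecast
    pct = 100.0 * err / actual     # percentage error; requires actual != 0
    return {
        "ME": err.mean(),            # mean error (signed, shows bias)
        "MAE": np.abs(err).mean(),   # mean absolute error
        "MPE": pct.mean(),           # mean percentage error (signed)
        "MAPE": np.abs(pct).mean(),  # mean absolute percentage error
    }

# Illustrative values only
metrics = forecast_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

The signed metrics (ME, MPE) can cancel out when over- and under-forecasts offset each other, which is why they are usually read together with MAE and MAPE.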

3.4.1 (ARIMA) Model

Autoregressive Integrated Moving Average (ARIMA) is a statistical time series model used for forecasting. It uses past values and trends to forecast future values, and it is widely used in many areas, especially stock price forecasting.

The name ARIMA refers to the three components of the model. (AR), autoregressive, means that the model uses past values of the variable itself to predict its future values. (I), integrated, refers to the level of differencing required to transform the raw data into stationary data; if the raw data are already stationary at level, the model reduces to ARMA instead of ARIMA. (MA), moving average, refers to the error terms of previous time points, which are used to forecast future observations (medium.com).

To forecast future values with an ARIMA model we need to determine its three parameters p, d and q, where p is the number of autoregressive (AR) terms, d is the level of differencing required (I), and q is the number of moving average (MA) terms.
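The role of the differencing parameter d can be illustrated with a short sketch. The helper name `difference` and the numbers are hypothetical; the series is chosen only so that the result is easy to check by hand.

```python
import numpy as np

def difference(series, d=1):
    """Apply d rounds of first differencing (the "I" step of ARIMA)."""
    return np.diff(np.asarray(series, dtype=float), n=d)

# Illustrative values with a curved trend, not data from this study
prices = [1.0, 4.0, 9.0, 16.0, 25.0]
first = difference(prices, d=1)    # 3, 5, 7, 9: still trending
second = difference(prices, d=2)   # 2, 2, 2: trend removed
```

Each round of differencing shortens the series by one observation, which is consistent with the single missing observation reported for each differenced variable (prefix `d_`) in the summary statistics.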

The advantages of the ARIMA model are that it is simple to use, provides good short-term predictions, depends only on historical data, and can deal with non-stationary data through the differencing transformation (I). Its disadvantages are that it is less accurate for long-term forecasts, its parameters are chosen subjectively, and it is not accurate if the data include seasonality or trend changes (investopedia.com).

The ARIMA model equation can be written as follows:

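$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$$

Source of the equation: quantinsti.com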

Where $\phi_1$ is the coefficient of the first autoregressive (AR) term, $p$ is the order of the AR term, $\epsilon_t$ is the error term (random error), $\theta_1$ is the coefficient of the first moving average (MA) term, and $q$ is the order of the MA term.
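To make the equation concrete, the sketch below computes a one-step-ahead forecast from given coefficients. It assumes the coefficients and past errors are already known (in practice they are estimated by software); the function name `arma_one_step` and all numbers are hypothetical.

```python
import numpy as np

def arma_one_step(y, eps, phi, theta):
    """One-step-ahead ARMA(p, q) forecast.

    y     : past observations, oldest first (at least p values)
    eps   : past errors, oldest first (at least q values)
    phi   : AR coefficients phi_1..phi_p
    theta : MA coefficients theta_1..theta_q
    The current error eps_t has mean zero, so it drops out of the forecast.
    """
    y, eps = np.asarray(y, dtype=float), np.asarray(eps, dtype=float)
    phi, theta = np.asarray(phi, dtype=float), np.asarray(theta, dtype=float)
    p, q = len(phi), len(theta)
    ar_part = np.dot(phi, y[::-1][:p])      # phi_1*Y_{t-1} + ... + phi_p*Y_{t-p}
    ma_part = np.dot(theta, eps[::-1][:q])  # theta_1*eps_{t-1} + ... + theta_q*eps_{t-q}
    return ar_part + ma_part

# ARMA(2, 1) with illustrative coefficients: 0.5*2.0 + 0.2*1.0 + 0.4*(-1.0)
forecast = arma_one_step(y=[1.0, 2.0], eps=[0.5, -1.0], phi=[0.5, 0.2], theta=[0.4])
```

For an ARIMA model with d > 0, the same calculation would be applied to the differenced series and the result added back to the last observed level.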

3.4.2 Granger Causality

Granger causality is an important tool for analyzing time series data. It was first proposed in 1969 by Prof. Clive W.J. Granger, recipient of the Nobel Prize in Economics (2003). Granger defended the name of the statistical test as describing true causality; however, many others have argued that it measures a kind of correlation rather than true causality. For this reason, much of the literature uses the term "Granger-causes" for the test result instead of claiming true causality.

We can say that X Granger-causes Y if X provides a statistically significant improvement in forecasting future values of Y (Wikipedia). The Granger causality test is used in many fields, especially economics, finance, genomics and neuroscience (Shojaie and Fox, 2021).

The test is built on a regression of Y on its own lagged values and on the lagged values of X (source of the equation: https://real-statistics.com).

Here, the αj and βj are the regression coefficients and εi is the error term. The test is based on the null hypothesis that no Granger causality exists (H0: β1 = β2 = … = βm = 0); once the null hypothesis is rejected, Granger causality exists, that is, X Granger-causes Y.
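One common way to write such a regression, consistent with the coefficient names used above, is given below together with the usual F statistic for the joint restriction. The intercept $\alpha_0$ and the use of the same lag length $m$ for both series are assumptions made here for illustration; $RSS_r$ and $RSS_u$ denote the residual sums of squares of the regressions without and with the lags of X, and $n$ is the number of usable observations.

```latex
y_i = \alpha_0 + \sum_{j=1}^{m} \alpha_j \, y_{i-j} + \sum_{j=1}^{m} \beta_j \, x_{i-j} + \varepsilon_i

F = \frac{(RSS_r - RSS_u) / m}{RSS_u / (n - 2m - 1)}
```

A large value of F relative to the F(m, n - 2m - 1) distribution leads to rejecting H0, meaning that the lags of X improve the forecast of Y.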

The limitations of the Granger causality test are mainly that it gives no details about the nature of the relationship between the variables, because it is not actual causality, and that it cannot be performed on non-stationary data.

Since the Granger causality test assesses causality between two series, and Vector Autoregression (VAR) can estimate multiple series in a more comprehensive way, we will use VAR to determine Granger causality for the variables chosen in this study, using Gretl software.
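The logic of the test can be sketched in Python on synthetic data. This is only an illustration of the F-test idea, not the Gretl procedure used in this study; the helper names (`lagged`, `rss`, `granger_f`) and the simulated series are hypothetical, and the sketch reports the F statistic without a p-value.

```python
import numpy as np

def lagged(series, m):
    """Columns series[t-1], ..., series[t-m] for t = m .. n-1."""
    n = len(series)
    return np.column_stack([series[m - j:n - j] for j in range(1, m + 1)])

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(x, y, m=1):
    """F statistic for H0: the m lags of x add nothing to y's own lags."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    target = y[m:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lagged(y, m)])         # own lags only
    unrestricted = np.hstack([restricted, lagged(x, m)])  # plus lags of x
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / m) / (rss_u / dof)

# Synthetic stationary data in which x drives y with a one-period delay
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(scale=0.5)

f_x_to_y = granger_f(x, y, m=1)   # expected to be large
f_y_to_x = granger_f(y, x, m=1)   # expected to be small
```

Running the test in both directions, as here, is what allows a one-way relationship to be distinguished from a two-way relationship.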

3.4.3 Vector Autoregression (VAR)

Vector Autoregression (VAR) is a multivariate time series model used especially when the variables affect each other and are also affected by their own past observations. VAR models are widely used in economics and finance and, more recently, in medicine (captech.com).

Compared with the ARIMA model discussed earlier, which is univariate and depends only on the variable's own past observations, a VAR model depends on the variable's own past observations as well as the past observations of the other variables.

A two-dimensional VAR equation is as follows:

$$y_{1,t} = c_1 + \phi_{11,1} y_{1,t-1} + \phi_{12,1} y_{2,t-1} + e_{1,t}$$

$$y_{2,t} = c_2 + \phi_{21,1} y_{1,t-1} + \phi_{22,1} y_{2,t-1} + e_{2,t}$$

Source of the equation: Hyndman and Athanasopoulos, 2nd edition
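The two equations above can be estimated jointly by ordinary least squares. The sketch below simulates a two-variable system with one lag and recovers its coefficients; the function name `fit_var1` and the coefficient values are hypothetical, and the study itself estimates the VAR in Gretl.

```python
import numpy as np

def fit_var1(data):
    """OLS estimate of y_t = c + A y_{t-1} + e_t for a (T, k) array."""
    data = np.asarray(data, dtype=float)
    X = np.hstack([np.ones((len(data) - 1, 1)), data[:-1]])  # constant, lagged values
    Y = data[1:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)  # shape (k + 1, k)
    c = coef[0]
    A = coef[1:].T  # A[i, j] multiplies variable j at t-1 in the equation for variable i
    return c, A

# Simulate a stable two-variable system with hypothetical coefficients
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.2],
                   [0.0, 0.3]])
c_true = np.array([1.0, -0.5])
series = np.zeros((2000, 2))
for t in range(1, 2000):
    series[t] = c_true + A_true @ series[t - 1] + rng.normal(size=2)

c_hat, A_hat = fit_var1(series)
next_step = c_hat + A_hat @ series[-1]  # one-step-ahead forecast for both variables
```

In this simulated system the first variable depends on the lag of the second, but the second does not depend on the lag of the first, which is the kind of one-directional link a Granger causality test is designed to detect.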

Critics of VAR argue that it is not built on structural economic theory: because every variable affects every other variable in its equations, the coefficients are difficult to interpret. On the other hand, VAR has several advantages: the model can forecast many related variables, it can test Granger causality between two variables, and it supports impulse response analysis of the variables (Hyndman and Athanasopoulos, 2nd edition).

The data used in a VAR model must be stationary, which is why we transformed the variables earlier.

3.4.4 Cointegration & Vector Error Correction Model (VECM)

Cointegration refers to a long-term relationship between time series variables in which the same random (stochastic) trend drives them jointly upward and downward. Each series is non-stationary at level, but a combination of them, such as their difference, is stationary in the long run. This common trend is what is called cointegration (Kilian and Lütkepohl, 2017, Cambridge University Press).

The Granger causality test used in this study does not detect whether the variables move together in the long term, so a cointegration test is needed to investigate whether a long-run relationship exists. We therefore use the Engle-Granger cointegration test, introduced by Robert Engle and Clive Granger in 1987, to detect whether our time series variables are cointegrated. This is also done using Gretl software.

The Engle-Granger test uses the Augmented Dickey-Fuller (ADF) test to check that the variables are non-stationary at level and that the residuals of the regression between them are stationary; together, these results confirm that cointegration exists between the variables. If cointegration exists, we will deploy a Vector Error Correction Model (VECM) to investigate the relationship. The VECM is a multivariate time series model that extends the Vector Autoregression (VAR): it captures the long-run relationship (the error correction term) as well as the short-run dynamics. The VECM is widely used in finance and macroeconomics.
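The two steps of the Engle-Granger procedure can be sketched on synthetic data. This is a simplified illustration, not the Gretl routine used in this study: it assumes a Dickey-Fuller regression without lagged differences, and the helper names (`ols`, `df_tstat`, `engle_granger`) and simulated series are hypothetical. The resulting statistic must be compared with Engle-Granger critical values rather than standard t tables (approximately -3.34 at the 5% level for two variables with a constant).

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def df_tstat(u):
    """Dickey-Fuller t statistic: regress diff(u_t) on u_{t-1}, no constant, no lags."""
    du, lag = np.diff(u), u[:-1]
    rho = (lag @ du) / (lag @ lag)
    resid = du - rho * lag
    se = np.sqrt((resid @ resid) / (len(du) - 1) / (lag @ lag))
    return rho / se

def engle_granger(y, x):
    """Step 1: regress y on a constant and x. Step 2: unit-root test on the residuals."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x)), x])
    beta, resid = ols(X, y)
    return beta, df_tstat(resid)

# Synthetic pair that shares one stochastic trend
rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(size=500))        # random walk, non-stationary at level
y = 2.0 + 1.5 * x + rng.normal(size=500)   # follows x plus stationary noise
beta, t_stat = engle_granger(y, x)
```

A strongly negative statistic indicates stationary residuals and therefore cointegration; the residuals from step 1 are the error correction term that a VECM carries forward.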
