Chapter 6: Results of the HWES and ARIMA forecasting models
6.3 Auto-Regressive Integrated Moving Average (ARIMA)
6.3.1 Results and discussion of the application of the ARIMA model
6.3.1.3 Predictions based on the superior ARIMA Models
The two ARIMA models tested had similar performance statistics (MAE, MAPE, and RMSE) in all three periods examined. Both ARIMA models are therefore fitted to each sample period (period one: 2009 to 2017; period two: 2007 to 2017; period three: 2007 to 2020), and the prediction results are presented in the appendices in the following order. Table B.7 and Table B.10 in Appendix B show the forecast results of ARIMA Model 1 and ARIMA Model 2, respectively, when the models are applied to the NZX 50 Index from 2009 to 2017 (sample period one). Table C.7 and Table C.10 in Appendix C show the corresponding prediction results when the models are applied to the NZX 50 Index from 2007 to 2017 (sample period two). Likewise, Table D.7 and Table D.10 in Appendix D provide the prediction results of ARIMA Model 1 and ARIMA Model 2 applied to the NZX 50 Index from 2007 to 2020 (sample period three). Examination of the actual and predicted series in Tables B.7, B.10, C.7, C.10, D.7, and D.10 reveals that both ARIMA models effectively predict the NZX 50 Index in all the sample periods investigated. However, closer scrutiny of the actual and predicted series reveals that ARIMA Model 1 is more effective than ARIMA Model 2 in all sample periods tested, because the residuals from ARIMA Model 1 are marginally smaller than those from ARIMA Model 2. These findings confirm my conclusions based on the MAE, MAPE, and RMSE comparisons presented in Table 6.5.
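To make the comparison criteria concrete, the three error measures named above (MAE, MAPE, and RMSE) can be computed from an actual and a predicted series as in the following minimal sketch. It is an illustration under stated assumptions rather than the code used in this thesis: the function name `forecast_errors` is hypothetical, the residual is taken as actual minus predicted, and the input values are invented toy numbers, not NZX 50 data.

```python
import math


def forecast_errors(actual, predicted):
    """Return (MAE, MAPE in percent, RMSE) for two equal-length series.

    Hypothetical helper for illustration only.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    # Residual = observed value minus predicted value.
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mae = sum(abs(e) for e in residuals) / n
    # MAPE is undefined when an actual value is zero; index levels are positive.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(residuals, actual)) / n
    rmse = math.sqrt(sum(e * e for e in residuals) / n)
    return mae, mape, rmse


# Invented toy values (NOT taken from the NZX 50 Index or the appendix tables).
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [99.0, 103.0, 101.5, 104.0]

mae, mape, rmse = forecast_errors(actual, predicted)
print(f"MAE={mae:.4f}  MAPE={mape:.4f}%  RMSE={rmse:.4f}")
```

Applied to the forecasts of two competing models over the same sample period, the model with the smaller values on these measures would be regarded as the more accurate one.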
Figures 6.15 to 6.20 present comparative time plots of the actual and the predicted series. More precisely, Figure 6.15 and Figure 6.16 plot the observed values against the forecast values obtained from ARIMA Models 1 and 2, respectively, when the models are applied to sample period one (2009 to 2017). Similarly, Figure 6.17 and Figure 6.18 plot the actual series against the predicted series derived from ARIMA Models 1 and 2, respectively, when the models are applied to period two (2007 to 2017). Likewise, Figure 6.19 and Figure 6.20 plot the actual values against the predicted values obtained from ARIMA Models 1 and 2, respectively, when the models are applied to period three (2007 to 2020). Closer scrutiny of these figures confirms the predictive precision of the devised ARIMA models.
Figure 6.15: ARIMA (Model 1) Actual versus Prediction – Period one (2009 – 2017)
Figure 6.16: ARIMA (Model 2) Actual versus Prediction – Period one (2009 – 2017)
Figure 6.17: ARIMA (Model 1) Actual versus Prediction – Period two (2007 – 2017)
Figure 6.18: ARIMA (Model 2) Actual versus Prediction – Period two (2007 – 2017)
Figure 6.19: ARIMA (Model 1) Actual versus Prediction – Period three (2007 – 2020)
Figure 6.20: ARIMA (Model 2) Actual versus Prediction – Period three (2007 – 2020)
Next, the residuals, which are the deviations of the observed values from the predicted values, are derived for each ARIMA model in each of the three sample periods and analysed further. The descriptive statistics of the residuals are shown in Tables B.8 and B.11 in Appendix B, Tables C.8 and C.11 in Appendix C, and Tables D.8 and D.11 in Appendix D. More precisely, Tables B.8 (B.11), C.8 (C.11), and D.8 (D.11) show the residuals of ARIMA Model 1 (ARIMA Model 2) applied to sample periods one, two, and three, respectively. During sample period one (2009 to 2017), the mean of the residuals for ARIMA Model 1 is 0.6214, which is slightly lower than the mean of the residuals for ARIMA Model 2 (0.7163). The measures of central tendency of the residuals are clustered around 0, supporting the prediction accuracy of the tested ARIMA models.
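The residual summaries described above can be reproduced along the following lines. This is a minimal sketch, assuming residuals are defined as actual minus predicted; the helper name `residual_summary` and the set of statistics it reports are assumptions for illustration, and the input values are invented toy numbers rather than the series behind the appendix tables.

```python
import statistics


def residual_summary(actual, predicted):
    """Descriptive statistics of the residuals (actual minus predicted).

    Hypothetical helper for illustration only.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    residuals = [a - p for a, p in zip(actual, predicted)]
    return {
        "n": len(residuals),
        "mean": statistics.fmean(residuals),
        "median": statistics.median(residuals),
        "std": statistics.stdev(residuals),  # sample standard deviation
        "min": min(residuals),
        "max": max(residuals),
    }


# Invented toy values (NOT taken from the NZX 50 Index or the appendix tables).
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [99.0, 103.0, 101.5, 104.0]

summary = residual_summary(actual, predicted)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A mean and median close to zero, relative to the level of the index, indicate that the forecasts are not systematically above or below the observed values.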
OLS regression, which measures the linear relationship between a dependent variable and one or more independent variables, is used to model the “prediction results from each ARIMA model” as a simple linear function of the “actual observations”. R² is used to ascertain each regression's goodness of fit. Tables B.9, B.12, C.9, C.12, D.9, and D.12 provide the OLS regressions of the forecasts from the ARIMA models, applied to the different sample periods, on the actual observations. In each regression, the forecasts from the ARIMA model are the dependent variable, and the observed values are the independent variable. For example, Table B.9 provides the simple OLS regression of the predictions from ARIMA Model 1, applied to sample period one, on the observed values. Table B.9 shows that 99.55% of the total variation in the predicted series from ARIMA Model 1 is explained by the actual observations, confirming the explanatory power of the regression based on ARIMA Model 1 and the observed values.
The slope coefficient suggests that a one-unit increase in the variable “actual observation” is associated with an increase of 0.9965 units in the prediction series from ARIMA Model 1. This evidence confirms the closeness of the predictions from ARIMA Model 1 to the actual observations and validates the prediction efficacy of ARIMA Model 1.
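The regression of the predicted series on the actual series can be sketched as follows. This is an illustrative implementation of simple OLS using the textbook closed-form estimates, not the software used to produce the appendix tables; the function name `simple_ols` is hypothetical and the data are invented toy values.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared). Hypothetical helper for illustration.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("independent variable has no variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Invented toy values (NOT taken from the NZX 50 Index or the appendix tables).
actual = [100.0, 102.0, 101.0, 105.0]     # independent variable
predicted = [99.0, 103.0, 101.5, 104.0]   # dependent variable

slope, intercept, r_squared = simple_ols(actual, predicted)
print(f"slope={slope:.4f}  intercept={intercept:.4f}  R^2={r_squared:.4f}")
```

With forecasts that track the observed series closely, the slope and R² would both be expected to lie near 1, which is the pattern reported in Table B.9.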
Finally, three widely used checks are performed to determine the effectiveness of the OLS regressions presented in Tables B.9, B.12, C.9, C.12, D.9, and D.12. These investigations further validate the prediction efficacy of the ARIMA models evaluated in my thesis. For example, Table B.9 shows the simple regression results of the predictions from ARIMA Model 1 as a function of the actual observations. Firstly, I assess the overall joint significance of each of the estimated regressions. A joint significance test is performed to ascertain whether the regression of the predictions from a tested ARIMA model on the actual observations has any explanatory power. Joint explanatory power supports the view that the two variables, for example, the forecasts from ARIMA Model 1 and the actual observations, are close to each other. For this investigation, using the F-test, the null hypothesis that the estimated regression has no collective explanatory power (all slope parameters are zero) is tested at the 5 percent significance level. For example, the P-value of the F-statistic for the estimated OLS regression presented in Table B.9 is reported as 0.0000. It is statistically significant at the 5 percent level, leading to the conclusion that the estimated OLS regression has joint explanatory power. This conclusion is consistent with all other OLS regressions presented in Tables B.12, C.9, C.12, D.9, and D.12. These findings indicate that the actual observations and the predictions are in proximity to each other and support the predictive efficacy of the ARIMA models. Secondly, a hypothesis test for the estimated slope is conducted to determine whether the independent variable is a significant (linear) predictor of the dependent variable. For example, the P-value of the slope coefficient in Table B.9 is 0.0000, which is less than the 5% significance level.
Thus, the null hypothesis that the true slope coefficient is zero is rejected. This finding confirms that the independent variable “actual observations” is a significant predictor of the dependent variable “predictions from ARIMA Model 1”. This conclusion is in harmony with the rest of the OLS regressions presented in Tables B.12, C.9, C.12, D.9, and D.12, for which the null hypothesis that the true slope coefficient is zero is also rejected. The variable named “actual” therefore has a statistically significant effect on the “predicted” series from the ARIMA models in the OLS regressions presented in Tables B.9, B.12, C.9, C.12, D.9, and D.12.
Finally, R² [304]–[306] is used to assess the goodness of fit of the estimated OLS models related to the ARIMA models. R² has limiting values of 1 (the regression fits perfectly) and 0 (the regression fits no better than the mean of the dependent variable). For example, the R² in Table B.9 is 0.9955, which confirms the high explanatory power of the OLS regression for ARIMA Model 1 in period one (2009 to 2017). This finding is in harmony with all other OLS regressions presented in Tables B.12, C.9, C.12, D.9, and D.12.
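The test statistics behind the joint significance test and the slope test can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to generate the appendix tables: the function name `regression_tests` is hypothetical, the data are invented toy values, and only the statistics are computed. P-values would additionally require the F and t distributions, for example from a statistics library such as SciPy. In a simple regression with one independent variable, the F-statistic equals the square of the slope t-statistic, so the two tests lead to the same decision.

```python
import math


def regression_tests(x, y):
    """F-statistic and slope t-statistic for the simple OLS fit of y on x.

    Returns (f_stat, t_stat, df_resid). Hypothetical helper for illustration.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_res == 0:
        raise ValueError("perfect fit: test statistics are unbounded")
    df_resid = n - 2                       # residual degrees of freedom
    mse = ss_res / df_resid                # residual mean square
    f_stat = (ss_tot - ss_res) / mse       # 1 numerator degree of freedom
    t_stat = slope / math.sqrt(mse / sxx)  # slope divided by its standard error
    return f_stat, t_stat, df_resid


# Invented toy values (NOT taken from the NZX 50 Index or the appendix tables).
actual = [100.0, 102.0, 101.0, 105.0, 107.0]     # independent variable
predicted = [99.0, 103.0, 101.5, 104.0, 107.5]   # dependent variable

f_stat, t_stat, df_resid = regression_tests(actual, predicted)
print(f"F(1, {df_resid}) = {f_stat:.4f}   t({df_resid}) = {t_stat:.4f}")
```

The null hypothesis of a zero slope is rejected at the 5 percent level when the P-value associated with these statistics falls below 0.05.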
The findings from the residual summaries (presented in Tables B.8, B.11, C.8, C.11, D.8, and D.11) and the OLS regressions (given in Tables B.9, B.12, C.9, C.12, D.9, and D.12) validate the predictive precision of the ARIMA models when they are applied to the NZX 50 Index in the tested periods.