Wavelet-based time series model to improve the forecast accuracy of PM10 concentrations in Peninsular Malaysia

Ng Kar Yong & Norhashidah Awang

Received: 10 August 2018 / Accepted: 4 January 2019

© Springer Nature Switzerland AG 2019

Abstract This study presents the use of a wavelet-based time series model to forecast the daily average concentration of particulate matter with an aerodynamic diameter of less than 10 μm (PM10) in Peninsular Malaysia. The highlight of this study is the use of a discrete wavelet transform (DWT) to improve the forecast accuracy. The DWT was applied to convert the highly variable PM10 series into more stable approximation and detail sub-series, and ARIMA-GARCH time series models were developed for each sub-series. Two forecast periods, one during normal days and the other during haze episodes, were designed to assess the usefulness of the DWT. The models’ performance was evaluated by four indices, namely root mean square error, mean absolute percentage error, probability of detection and false alarm rate. The results showed that the model incorporating the DWT yielded more accurate forecasts than the conventional method without the DWT for both forecast periods, and the improvement was more prominent for the period during the haze episodes.

Keywords ARIMA-GARCH . Discrete wavelet transform . Forecast . Particulate matter . Time series

Introduction

Transboundary haze from Indonesia remains an intractable annual environmental issue in the Southeast Asia region, including Malaysia (Sahani et al. 2014). Studies have shown that PM10 is the major component of haze (Afroz et al. 2003). Furthermore, among the five primary air pollutants (PM10, ozone, nitrogen dioxide, sulphur dioxide, carbon monoxide), PM10 is the most significant pollutant in Malaysia even during normal days (DOE 2015). Apart from forest fires and peatland burning during haze events, industries and automobiles are the key sources of PM10 pollution (Afroz et al. 2003). The combustion processes from open burning and the release of nitrogen oxides (NOx) and sulphur dioxide (SO2) from factories and vehicles are the precursors of PM10 (Bhattacharjee et al. 1999; Afroz et al. 2003). In addition, weather conditions largely determine the formation of PM10. Usually, high concentrations are observed during the Southwest monsoon, which is a dry season, while low concentrations are recorded during the rainy season. Additionally, temperature, wind speed, wind direction and pressure critically affect the PM10 concentration (Liew et al. 2009, 2011).

According to the Malaysian Ambient Air Quality Guidelines, the stipulated limit for a 24-h average PM10 concentration is 150 μg/m³ (DOE 2015). Concentrations exceeding this limit lead to an unhealthy status in the Air Pollution Index (API) (DOE 2000). PM10 pollution not only reduces visibility (Shaharuddin et al. 2008) and causes economic losses (Afroz et al. 2003), but most importantly, it also leads to serious health effects on the respiratory and cardiopulmonary systems (Peng et al. 2008; Anderson et al. 2012; WHO 2013). Therefore, advance warning of high PM10 days should be provided to the public, and this relies on an accurate forecasting system.

https://doi.org/10.1007/s10661-019-7209-6

N. K. Yong · N. Awang (*)

School of Mathematical Sciences, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia

Various statistical forecasting methods have been developed and employed in pollution research. The most commonly used methods are linear models such as linear regression models (van der Wal and Janssen 2000; Diaz-Robles et al. 2008; Vlachogianni et al. 2011) and autoregressive integrated moving average (ARIMA) models (Chelani and Devotta 2006; Diaz-Robles et al. 2008; Joo and Kim 2015). For example, van der Wal and Janssen (2000) conducted an analysis of PM10 concentrations in the Netherlands using Kalman filtering and multiple linear regression (MLR). Vlachogianni et al. (2011) constructed an MLR model to predict PM10 and NOx concentrations in Athens and Helsinki. Although these linear models performed well in general, they failed to capture the sudden elevation of the pollutants.

Numerous attempts have been made to improve the forecast accuracy during pollution peak episodes by applying nonlinear or hybrid models. One representative example is the study by Diaz-Robles et al. (2008). They combined an artificial neural network (ANN) and ARIMA models to forecast the PM10 concentrations in Temuco, Chile, and this method was shown to be superior to ARIMAX (ARIMA model with exogenous variables), ANN or MLR methods in detecting pre-emergency events. Liu (2009) proposed various types of regression with time series error (RTSE) models to forecast the daily average PM10 concentrations in Ta-Liao, Taiwan. The results showed that the RTSE model incorporating principal component analysis (PCA) enhanced the probability of detecting PM10 concentrations above 150 μg/m³ and reduced the false alarm rate. Reisen et al. (2014) applied a joint approach of seasonal autoregressive fractionally integrated moving average (SARFIMA) and generalised autoregressive conditional heteroscedasticity (GARCH) models to forecast the PM10 concentrations in Cariacica, Brazil, and this combined model successfully described the volatile time series.

Besides the nonlinear models, wavelet transformation combined with time series models has been shown to be useful in analysing pollutant behaviour (Shaharuddin et al. 2008) and improving forecast performance (Siwek and Osowski 2012; Joo and Kim 2015). Siwek and Osowski (2012) used wavelet decomposition and an ensemble of neural networks to forecast the PM10 concentrations in Warsaw, Poland. They suggested that the variability of the decomposed series decreased through wavelet decomposition, thus smoothing the forecasting process and enhancing the forecast performance. They concluded that wavelet decomposition was the main factor behind the great improvement in the forecast results. Joo and Kim (2015) also demonstrated the advantage of wavelet filtering in enhancing the forecast accuracy, particularly for seasonal time series with substantial noise.

To our knowledge, no study conducted in Malaysia has used a wavelet-based time series model in air pollutant forecasting. Since our dataset exhibits large variances and is non-stationary, we propose this technique with the aim of improving the forecast performance, especially during PM10 peak seasons. Using this technique, the PM10 series is decomposed into approximation and detail sub-series by the discrete wavelet transform (DWT). These sub-series are then modelled using the integrated ARIMA-GARCH modelling technique. The final forecast is aggregated from the forecasts of the sub-series. To assess the performance of the proposed method, its final forecasts are compared with the forecasts from time series modelling without the DWT. The following section explains the data used and the methods involved in this study. The subsequent section presents the results and discussion. Finally, a conclusion is provided in the last section.

Experimental

Data description

Secondary data on PM10 concentrations from 2013 to 2014 were obtained from the Malaysian Department of Environment (DOE). The PM10 readings were recorded at 52 continuous air quality monitoring stations located across Malaysia (DOE 2015), using a β-ray attenuation mass monitor (BAM-1020). This network of monitoring stations was established and is operated by Alam Sekitar Malaysia Sdn Bhd (ASMA), a company empowered by the Malaysian DOE (Liew et al. 2011). The daily average PM10 concentrations aggregated from the hourly readings were used in this study. Five monitoring stations in Peninsular Malaysia were selected, namely the stations at Seberang Jaya (CA09), Nilai (CA10), Klang (CA11), Petaling Jaya (CA16) and Batu Muda (CA58). These monitoring stations were chosen for their different location backgrounds and the completeness of their data. Furthermore, these monitoring stations experienced high PM10 concentrations, particularly during haze periods. Table 1 shows the categorisation of the stations. There was only one missing value in total. The mean of the nearby points with a span of two was used to estimate the missing value.
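The imputation of the single missing value can be sketched as follows. This is a minimal illustration only: it reads "a span of two" as up to two valid neighbours on each side of the gap, which is an assumption, and the function name `impute_with_neighbours` and the sample values are hypothetical.

```python
def impute_with_neighbours(series, span=2):
    """Replace each None by the mean of the valid values within `span` positions."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            lo = max(0, i - span)
            hi = min(len(series), i + span + 1)
            # Only observed values count as neighbours; the gap itself is skipped.
            neighbours = [v for v in series[lo:hi] if v is not None]
            if neighbours:
                filled[i] = sum(neighbours) / len(neighbours)
    return filled


# Hypothetical daily averages in ug/m3 with one missing day.
daily_pm10 = [48.0, 52.0, None, 60.0, 56.0]
print(impute_with_neighbours(daily_pm10))
```

Other readings of the span (for example, one value before and one after) would only change the `span` argument.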

The procedure for this study works as follows. First, the DWT is applied to the original PM10 series to obtain a set of subsidiary series of approximation and detail components. Then, independent ARIMA models are fitted to the approximation and detail series. If a diagnostic test of the residuals shows time-varying variance, the GARCH model is applied together with the ARIMA model. The overall forecasts are finally obtained by summing the forecasts from the sub-series. The subsequent sub-sections describe the methodologies used, which are the DWT and ARIMA-GARCH modelling.

Discrete wavelet transform

The DWT is a multi-scale approach. It is a dyadic subsampling of the continuous wavelet transform (CWT) that avoids excessive information for the time series reconstruction (Percival and Walden 2000). It breaks down the series into subsidiary series with the same scales and frequencies and, at the same time, has the benefit of time localization (Joo and Kim 2015). With decomposition, the variability of the sub-series is reduced, thus facilitating the forecasting task (Siwek and Osowski 2012). The procedures used to conduct the DWT in this study are written in the R language.

The DWT process filters the original time series to separate the trend and variation components. This process is accomplished by Mallat’s pyramid algorithm as illustrated in Fig. 1.

Initially, the original series Y is passed through the high-pass filter H and the low-pass filter G, followed by downsampling by a factor of 2 (↓2) to obtain the first-level wavelet coefficients (W1) and scaling coefficients (V1). Downsampling by two means retaining every other value of the filter output. Next, V1 is filtered and downsampled in the same way to obtain the second-level W2 and V2. This operation is continued up to level J, and the final outputs are W1, …, WJ and VJ, with the number of coefficients halved at each successive level. Mathematically, the DWT procedure is given (Percival and Walden 2000) as

$$W = [W_1 \; \cdots \; W_J \; V_J]^T = [\omega_1 \; \cdots \; \omega_J \; \nu_J]^T Y = \mathrm{H}^T Y, \tag{1}$$

where $W$ is the $N$-dimensional vector consisting of the $(N/2^j)$-dimensional sub-vectors of DWT coefficients ($W_j$ and $V_J$), $\mathrm{H}$ is the $N \times N$ DWT matrix containing the orthogonal wavelet matrices $\omega_j$ and the scaling matrix $\nu_J$ (of size $N/2^j \times N$) at decomposition level $j$, $j = 1, 2, \ldots, J$, and $Y$ is the $N$-dimensional series vector. The matrices $\omega_j$ and $\nu_J$ comprise rows of circularly shifted $N$-periodized filters. An $N$-periodized filter is a sequence consisting of $l$ non-zero numbers, while the remaining $N - l$ entries are zeroes. Let $L$ be the length of the filter; the number of non-zero entries in the rows of $\omega_j$ and $\nu_J$ is then equal to $2^j L$.

In fact, through multiresolution analysis (MRA), the series is decomposed into a detail series $D_j$ at each level $j = 1, 2, \ldots, J$ and an approximation series $A_J$ at the last level, where $J$ is the level of decomposition. As such, the original series $Y$ can be recovered by simply adding up the detail and approximation series as

$$Y = \sum_{j=1}^{J} D_j + A_J, \tag{2}$$

where $D_j \equiv \omega_j^T W_j$ and $A_J \equiv \nu_J^T V_J$ (Percival and Walden 2000).
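The pyramid algorithm and the additive property of Eq. 2 can be illustrated with a short sketch. It is not the R procedure used in the study: for brevity it assumes the two-tap Haar filter instead of the Daubechies d8 filter adopted later, it requires the series length to be divisible by 2^J, and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(v):
    """One pyramid level: filter, then keep every other output (downsample by 2)."""
    half = len(v) // 2
    w = [(v[2 * k + 1] - v[2 * k]) / SQRT2 for k in range(half)]  # wavelet (high-pass)
    s = [(v[2 * k + 1] + v[2 * k]) / SQRT2 for k in range(half)]  # scaling (low-pass)
    return w, s


def haar_inverse_step(w, s):
    """Undo one pyramid level."""
    v = []
    for wk, sk in zip(w, s):
        v.append((sk - wk) / SQRT2)
        v.append((sk + wk) / SQRT2)
    return v


def dwt(y, levels):
    """Return ([W_1, ..., W_J], V_J); each level halves the number of coefficients."""
    coeffs, v = [], list(y)
    for _ in range(levels):
        w, v = haar_step(v)
        coeffs.append(w)
    return coeffs, v


def reconstruct(coeffs, v):
    """Invert the pyramid from the coarsest level back to the original length."""
    for w in reversed(coeffs):
        v = haar_inverse_step(w, v)
    return v


def mra(y, levels):
    """Return ([D_1, ..., D_J], A_J), each with the same length as y."""
    coeffs, v = dwt(y, levels)
    zeros = [[0.0] * len(w) for w in coeffs]
    approximation = reconstruct(zeros, v)
    details = []
    for j in range(levels):
        # Keep only the level-j wavelet coefficients and zero everything else.
        only_j = [coeffs[i] if i == j else zeros[i] for i in range(levels)]
        details.append(reconstruct(only_j, [0.0] * len(v)))
    return details, approximation


y = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # hypothetical series, N = 8
D, A = mra(y, levels=2)
recovered = [sum(parts) for parts in zip(*D, A)]
print(recovered)  # matches y up to floating-point rounding
```

Because the transform is linear and invertible, summing the detail and approximation series reproduces the input, which is what allows the sub-series forecasts to be added back together.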

Time series modelling

Table 1 Categorisation of the five monitoring stations

| Station ID | Location | Background category |
|---|---|---|
| CA09 | Seberang Jaya | Suburban |
| CA10 | Nilai | Industrial |
| CA11 | Klang | Urban |
| CA16 | Petaling Jaya | Industrial |
| CA58 | Batu Muda | Urban |

After decomposing the PM10 series by the DWT, the resulting $D_j$ and $A_J$ series become the inputs to the ARIMA model. The ARIMA($p$, $d$, $q$) model, which describes an observation at time $t$, $Y_t$, in terms of its $p$ past values, the present error $\varepsilon_t$ and $q$ past errors, is expressed as

$$\nabla^d Y_t = \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}, \tag{3}$$

where $\{\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q\}$ are $p + q$ parameters, $p$, $d$ and $q$ are the orders of the autoregressive, differencing and moving average components, respectively, and $\nabla$ is the differencing operator such that $\nabla^d Y_t = (1 - B)^d Y_t$, where $B$ is the backshift operator defined as $B^k Y_t = Y_{t-k}$. This model assumes a second-order stationary time series. Hence, the augmented Dickey-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test are used to examine the stationarity of the series. If the series is non-stationary, differencing is applied repeatedly, $d$ times in total, until the series becomes stationary. If there are multiple candidate models, the model with the lowest Bayesian Information Criterion (BIC) is chosen (Brockwell and Davis 2002).
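The differencing operator can be made concrete with a small sketch. The function name `difference` and the sample series are hypothetical; the stationarity tests themselves (ADF, KPSS) are not reproduced here.

```python
def difference(series, d=1):
    """Apply (1 - B)^d by repeating first differencing d times."""
    out = list(series)
    for _ in range(d):
        # (1 - B) Y_t = Y_t - Y_{t-1}; each pass shortens the series by one value.
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


y = [2.0, 5.0, 10.0, 17.0, 26.0]  # hypothetical series with a quadratic trend
print(difference(y, d=1))  # [3.0, 5.0, 7.0, 9.0]
print(difference(y, d=2))  # [2.0, 2.0, 2.0]
```

In this toy case the trend disappears after two passes, which corresponds to d = 2 in the ARIMA(p, d, q) notation.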

After ARIMA modelling, the residuals are checked for autocorrelation by the autocorrelation function (ACF), for heteroscedasticity by the ARCH Lagrange multiplier (LM) test, and for normality by a histogram and the Jarque-Bera (JB) test. In the current analysis, the residuals are considered non-autocorrelated if their ACF values are within the confidence interval at the first six lags. Homoscedasticity is indicated by an insignificant p value of the LM test statistic at lag six; otherwise, heteroscedasticity is indicated. A bell-shaped histogram and an insignificant p value of the JB test suggest a normal distribution of the residuals. Here, the significance level used is 5%.

When the constant variance assumption of the ARIMA model is violated, GARCH is applied to model the heteroscedastic ARIMA errors. A GARCH model states that the variance of the errors from Eq. 3 can be estimated from its lagged errors and lagged variances. The GARCH($r$, $s$) model is defined as

$$h_t = \alpha_0 + \sum_{i=1}^{r} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{s} \beta_j h_{t-j}, \tag{4}$$

$$\varepsilon_t = z_t \sqrt{h_t}, \quad z_t \sim D(0, 1), \tag{5}$$

where $h_t$ is the conditional variance, $\varepsilon_t$ are the ARIMA errors, $\{\alpha_0, \alpha_i, \beta_j\}$ are the parameters and $D$ is the normal or Student-t probability density function of the residuals, with the conditions $\alpha_0 > 0$, $\alpha_i, \beta_j \geq 0$ and $\sum_{i=1}^{r} \alpha_i + \sum_{j=1}^{s} \beta_j < 1$ so that the GARCH process is weakly stationary (Brockwell and Davis 2002). If $s = 0$, the model reduces to an ARCH($r$) model. The ARIMA and ARIMA-GARCH modelling is carried out using EViews 8.
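The recursion in Eq. 4 can be sketched as below. This is an illustration, not the EViews estimation used in the study: the parameters are taken as given rather than estimated, pre-sample squared errors and variances are initialised at the unconditional variance α0 / (1 − Σα − Σβ), which is an assumption, and the function name `garch_variance` is hypothetical.

```python
def garch_variance(errors, alpha0, alphas, betas):
    """Conditional variances h_t of Eq. 4 for a given sequence of ARIMA errors."""
    persistence = sum(alphas) + sum(betas)
    if alpha0 <= 0 or min(alphas + betas) < 0 or persistence >= 1:
        raise ValueError("parameters violate the weak-stationarity conditions")
    # Assumed starting value for terms that fall before the first observation.
    unconditional = alpha0 / (1.0 - persistence)
    h = []
    for t in range(len(errors)):
        value = alpha0
        for i, a in enumerate(alphas, start=1):
            value += a * (errors[t - i] ** 2 if t - i >= 0 else unconditional)
        for j, b in enumerate(betas, start=1):
            value += b * (h[t - j] if t - j >= 0 else unconditional)
        h.append(value)
    return h


# Hypothetical errors and GARCH(1,1) parameters.
h = garch_variance([1.0, -2.0, 0.5], alpha0=0.2, alphas=[0.3], betas=[0.5])
print(h)
```

In this toy run the large error at the second time point (−2.0) raises the conditional variance at the third, which is the volatility-clustering behaviour the model is meant to capture.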

Results and discussion

The datasets were separated into modelling and validation sets with periods of 365 days and 14 days, respectively. Two experiments were carried out to examine the usefulness of the wavelet transformation in forecasting a sudden rise in PM10 concentrations, using two datasets whose validation periods covered normal days (Dataset1) and high PM10 days (Dataset2). Dataset1 consists of data from 18 December 2013 to 31 December 2014, whereas Dataset2 consists of data from 17 July 2013 to 30 July 2014. Figures 2 and 3 are the plots of the PM10 time series at the five stations for Dataset1 and Dataset2, respectively. Tables 2 and 3 display the descriptive statistics of Dataset1 and Dataset2, respectively, for all stations.

As depicted in Figs. 2 and 3, all stations recorded high PM10 concentration days in some of the months of March, July, September and October. The series showed bumps at different scales, so it is believed that the DWT can help in the modelling process by segregating them according to scale. Figure 2 shows that the air quality was back to normal from November to December, whereas Fig. 3 shows an unexpected increase in the concentration at the end of the series for all stations. This contrast between the two datasets reflects their different forecasting difficulty. This is confirmed by the statistics shown in Table 3; the means of the validation sets for all stations were much higher than those of the corresponding modelling sets. This indicated a rapid increase in the PM10 levels, which consequently increased the difficulty of forecasting.

Fig. 1 Flow chart of DWT executed by Mallat’s pyramid algorithm

The analysis started with the wavelet decomposition of the PM10 series. There are several classes of DWT filters, such as Daubechies, Coiflet, Symmlet and best localised. Here, the most popular one, the Daubechies wavelet filter, was adopted. After some trials, the Daubechies d8 wavelet (a wavelet filter of length 8) was selected because this filter length yielded a rather smooth MRA. Since the DWT was sensitive to different input series, it was performed on whole-year data to retain the overall patterns of PM10 in a year, and the decomposed series for the desired period were attached together (Siwek and Osowski 2012). Theoretically, the series must have a length that is a power of two (Percival and Walden 2000), but the whole-year data clearly did not satisfy this. Some techniques have been introduced by Percival and Walden (2000), including series extension or truncation. A simple way was to pad the series with the sample mean $\bar{Y}$. By doing this, the mean of the padded series remained unchanged and its variance was easily connected to the variance of the original series. Hence, prior to decomposition, the 1-year series was padded with the sample mean $\bar{Y}$ to bring the series up to a length of 512 (= 2⁹). The padded series was defined (Percival and Walden 2000) as

$$Y'_t = \begin{cases} Y_t, & t = 1, \ldots, 365 \\ \bar{Y}, & t = 366, \ldots, 512. \end{cases} \tag{6}$$

In addition, it is believed that too many levels of decomposition will lose the advantage of decomposition (Siwek and Osowski 2012). To address this, the decomposition level J was varied (Joo and Kim 2015) from two to five. The level with the smallest root mean square error (RMSE) in the validation datasets was chosen.

Fig. 2 Time series of Dataset1 for all stations (panels: (a) CA09, (b) CA10, (c) CA11, (d) CA16, (e) CA58; x-axis: Month/Year; y-axis: PM10 concentration, μg/m³)
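The mean-padding step of Eq. 6 can be sketched as follows. The helper names and the toy input are hypothetical; the check at the end illustrates the stated property that padding with the sample mean leaves the mean unchanged and rescales the (population) variance by 365/512.

```python
def pad_with_mean(series, target_length=512):
    """Extend a series to `target_length` by appending its own sample mean."""
    if target_length < len(series):
        raise ValueError("target_length is shorter than the series")
    mean = sum(series) / len(series)
    return list(series) + [mean] * (target_length - len(series))


def mean_and_variance(values):
    """Sample mean and population variance (divisor n)."""
    m = sum(values) / len(values)
    return m, sum((x - m) ** 2 for x in values) / len(values)


# Hypothetical 365 daily values; the real PM10 series would be used instead.
year = [50.0 + (t % 7) for t in range(365)]
padded = pad_with_mean(year)

m_year, var_year = mean_and_variance(year)
m_pad, var_pad = mean_and_variance(padded)
print(len(padded))                 # 512
print(abs(m_pad - m_year) < 1e-9)  # True: the mean is unchanged
print(abs(var_pad - var_year * 365 / 512) < 1e-9)  # True: variance scales by 365/512
```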

Next, the natural logarithm was applied to the approximation series AJ resulting from the DWT before modelling because they were not Gaussian distributed. Subsequently, the ARIMA models were fitted to AJ and Dj (j = 1, 2, …, J). The order of the ARIMA models was restricted to p + q ≤ 10, as too high an order makes the model unstable and does not guarantee a better model. If the residual diagnostics exhibited heteroscedasticity, the ARIMA-GARCH models were fitted. The final forecasts for every level of decomposition were obtained by using Eq. 2, and the level of decomposition which produced the smallest RMSE was selected. The time series modelling was also conducted on the original time series after taking the natural logarithm. Four evaluation indices were used to assess the model performance: RMSE, mean absolute percentage error (MAPE), probability of detection (POD) and false alarm rate (FAR). Good forecast models are indicated by low values of RMSE, MAPE and FAR and a high POD. Denoting Fi as the forecast value, Oi as the observed value and n as the number of data points, the formulas of the RMSE and MAPE are shown below.

Fig. 3 Time series of Dataset2 for all stations (panels: (a) CA09, (b) CA10, (c) CA11, (d) CA16, (e) CA58; x-axis: Month/Year, from 7/13 to 7/14; y-axis: PM10 concentration, μg/m³)

Table 2 Descriptive statistics of Dataset1 for the five stations

| Set | Statistic | CA09 | CA10 | CA11 | CA16 | CA58 |
|---|---|---|---|---|---|---|
| Modelling | Mean (μg/m³) | 49.32 | 63.80 | 71.69 | 54.78 | 54.92 |
| Modelling | SD | 23.11 | 30.95 | 47.34 | 31.52 | 31.62 |
| Validation | Mean (μg/m³) | 43.33 | 40.50 | 57.01 | 36.32 | 38.39 |
| Validation | SD | 6.76 | 11.41 | 9.95 | 9.48 | 11.92 |

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (F_i - O_i)^2}{n}} \tag{7}$$

and

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{O_i - F_i}{O_i} \right| \times 100\%. \tag{8}$$

To evaluate the overall performance of the models, the overall RMSE and MAPE were computed from the errors of each monitoring station. The computations of the overall RMSE, MAPE, POD and FAR are as follows:

$$\mathrm{RMSE}_{\mathrm{overall}} = \sqrt{\frac{\sum_{j=1}^{N} \mathrm{RMSE}_j^2}{N}}, \tag{9}$$

$$\mathrm{MAPE}_{\mathrm{overall}} = \frac{1}{N} \sum_{j=1}^{N} \mathrm{MAPE}_j, \tag{10}$$

$$\mathrm{POD} = \frac{A}{A + B} \tag{11}$$

and

$$\mathrm{FAR} = \frac{C}{A + C}, \tag{12}$$

where $N$ is the number of monitoring stations, and $A$, $B$, $C$ and $D$ are defined as in Table 4. Here, the threshold limit used for POD and FAR is a PM10 concentration of 150 μg/m³.
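Equations 7 to 12 translate directly into code. The sketch below uses hypothetical function names; the usage lines plug in the per-station values without DWT from Table 5 and the counts quoted for Table 8, so they can be compared against the reported overall figures.

```python
import math


def rmse(forecast, observed):
    """Eq. 7: root mean square error."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def mape(forecast, observed):
    """Eq. 8: mean absolute percentage error, in percent."""
    n = len(observed)
    return 100.0 * sum(abs((o - f) / o) for f, o in zip(forecast, observed)) / n


def overall_rmse(station_rmses):
    """Eq. 9: root of the mean squared station RMSE."""
    return math.sqrt(sum(r ** 2 for r in station_rmses) / len(station_rmses))


def overall_mape(station_mapes):
    """Eq. 10: plain average of the station MAPEs."""
    return sum(station_mapes) / len(station_mapes)


def pod(a, b):
    """Eq. 11: hits divided by all observed exceedances."""
    return a / (a + b)


def far(a, c):
    """Eq. 12: false alarms divided by all forecast exceedances (None if there are none)."""
    return c / (a + c) if (a + c) else None


# Station values without DWT for Dataset1, taken from Table 5.
print(round(overall_rmse([6.401, 14.222, 9.709, 13.271, 13.497]), 3))
print(round(overall_mape([11.190, 33.298, 13.445, 38.008, 32.803]), 3))
# Counts behind the 150 ug/m3 column of Table 8 for the proposed method.
print(round(pod(2, 9), 3), round(far(2, 1), 3))
print(far(0, 0))  # None: undefined when no exceedance is ever forecast
```

Returning `None` for an empty denominator mirrors the dash reported for the conventional method, whose forecasts never cross the threshold.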

The comparative forecast results between the conventional method (ARIMA-GARCH without DWT) and the proposed method (ARIMA-GARCH with DWT) for Dataset1 and Dataset2 are plotted in Figs. 4 and 5, respectively.

From the figures, it can be seen that the conventional method produces rather flat forecasts as the forecast horizon lengthens, for both datasets and all monitoring stations. In addition, its forecasts are even worse for Dataset2, as they consistently underestimate the true values. This is to be expected, since the long-term expectation of ARIMA forecasts is simply the sample mean. On the other hand, the forecasts generated by the proposed approach follow the general patterns of the observed values more closely than those of the conventional method. In Dataset1, the forecasts by the proposed method do not differ much from the forecasts by the conventional method; the two methods remain comparable when the series do not have large variances. However, for Dataset2, the proposed method performs much better than the conventional method, as it is able to capture the overall patterns, although it still underestimates the peaks.

Table 3 Descriptive statistics of Dataset2 for the five stations

| Set | Statistic | CA09 | CA10 | CA11 | CA16 | CA58 |
|---|---|---|---|---|---|---|
| Modelling | Mean (μg/m³) | 48.51 | 59.18 | 67.48 | 52.58 | 48.95 |
| Modelling | SD | 20.75 | 28.15 | 43.33 | 28.74 | 27.57 |
| Validation | Mean (μg/m³) | 94.27 | 81.45 | 150.98 | 97.78 | 105.71 |
| Validation | SD | 44.49 | 20.24 | 56.66 | 32.51 | 35.37 |

Table 4 Contingency table of real observed values versus forecast values, where ‘yes’ refers to PM10 ≥ 150 μg/m³ and ‘no’ otherwise

| | Forecast yes | Forecast no |
|---|---|---|
| Observed yes | A | B |
| Observed no | C | D |
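The flattening of long-horizon forecasts towards the mean can be seen in the simplest stationary case. The sketch below assumes an AR(1) model with known mean and coefficient, which is a deliberate simplification of the fitted ARIMA-GARCH models; the function name and all numbers are hypothetical.

```python
def ar1_forecasts(last_value, mean, phi, horizon):
    """h-step-ahead forecasts of a stationary AR(1): mean + phi**h * (last - mean)."""
    return [mean + phi ** h * (last_value - mean) for h in range(1, horizon + 1)]


# Hypothetical: the last observation is far above the long-run mean of 50 ug/m3.
path = ar1_forecasts(last_value=120.0, mean=50.0, phi=0.6, horizon=14)
print([round(v, 2) for v in path])
```

With |phi| < 1 the gap to the mean shrinks geometrically, so a 14-day dynamic forecast is nearly flat by the end of the horizon regardless of where it starts.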

Tables 5 and 6 present the comparison of dynamic forecast results in terms of RMSE and MAPE between the conventional method and the proposed method for Dataset1 and Dataset2, respectively, on the validation datasets at the five stations.

As shown in the tables, the proposed method outperforms the conventional method for both datasets at all stations, with a smaller RMSE and MAPE. The differences in the RMSE and the MAPE between the two methods are larger for Dataset2 than for Dataset1. This means that the DWT method improves the forecast accuracy more effectively for data exhibiting large variances. This is further confirmed by the overall forecast performance of the methods as shown in Table 7. Moreover, the overall results of the POD and the FAR for Dataset2 are displayed in Table 8, with the specified threshold limits. The POD measures the model’s ability to detect peak concentrations exceeding a certain threshold, whereas the FAR measures the rate of false alarms issued while the concentration is actually under the threshold.

Fig. 4 Forecast plots of PM10 concentrations at the five stations for Dataset1 (panels: (a) CA09, (b) CA10, (c) CA11, (d) CA16, (e) CA58; x-axis: Day/Month). The solid lines are the observed concentrations, long dashed lines are the forecasts by the conventional method and dotted lines are forecasts by the proposed method

Fig. 5 Forecast plots of PM10 concentrations at the five stations for Dataset2 (panels: (a) CA09, (b) CA10, (c) CA11, (d) CA16, (e) CA58; x-axis: Day/Month). The solid lines are the observed concentrations, long dashed lines are the forecasts by the conventional method and dotted lines are forecasts by the proposed method

Table 5 Comparison of forecast results between the conventional and the proposed methods for Dataset1 at the five stations

| Method | Index | CA09 | CA10 | CA11 | CA16 | CA58 |
|---|---|---|---|---|---|---|
| Without DWT | RMSE | 6.401 | 14.222 | 9.709 | 13.271 | 13.497 |
| Without DWT | MAPE | 11.190 | 33.298 | 13.445 | 38.008 | 32.803 |
| With DWT | RMSE | 6.092 | 11.600 | 8.062 | 9.755 | 10.642 |
| With DWT | MAPE | 10.765 | 28.722 | 12.634 | 27.923 | 27.217 |
| Best decomposition level for DWT method | | 3 | 4 | 5 | 5 | 3 |

The italics indicate the smaller values of RMSE/MAPE

From Table 7, it can be seen that the proposed method improves the overall RMSE and MAPE for both datasets, and the percentages of improvement in Dataset2 are more than double those in Dataset1. This supports the usefulness of the DWT when there are abrupt changes in the PM10 concentrations. The argument that the DWT enhances the forecast accuracy by breaking down the original series into smaller series with lower variability is thus validated.
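As a quick arithmetic check, the RMSE improvement percentages in Table 7 can be reproduced under the assumption that improvement is measured relative to the conventional method; the function name is hypothetical.

```python
def percent_improvement(conventional, proposed):
    """Relative reduction of an error index, in percent of the conventional value."""
    return 100.0 * (conventional - proposed) / conventional


# Overall RMSE values from Table 7.
print(round(percent_improvement(11.797, 9.435), 2))   # Dataset1: 20.02
print(round(percent_improvement(66.717, 32.278), 2))  # Dataset2: 51.62
```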

Furthermore, two additional measures, namely the POD and FAR with a threshold limit of 150 μg/m³, were used to evaluate the forecast results of Dataset2 only, because no observed values exceeded this limit in the validation sets of Dataset1. As shown in Table 8, the proposed method detected 18.2% (2/11 × 100%) of the high PM10 (≥ 150 μg/m³) cases, whereas none was detected by the conventional method. Of the nine cases in which the forecasts failed, five nevertheless had forecast values of more than 120 μg/m³, which are also high enough to warrant attention. On the other hand, although the false alarm rate was 33.3% (1/3 × 100%), the actual PM10 concentration on that day was also quite high (> 120 μg/m³) and posed a real threat to human health, so this false alarm may still be beneficial to society. The false alarm rate for the conventional method is undefined, as its forecast values never exceed the threshold limits. The zero detection rate and undefined false alarm rate of the conventional approach are due to the flat forecasts around the mean level produced by the ARIMA models in the long term. As a result, the conventional method can neither detect the peaks nor generate any false alarms.

The detection rate of 18.2% implies that the DWT method still faces the problem of underestimation: when the real concentrations are beyond the threshold limit, the corresponding forecast levels are in general lower. Therefore, the POD can be improved by reducing the threshold limit to a lower, but still reasonably high, concentration to trigger an alarm. Since many of the forecasts reached 120 μg/m³, still a concentration much higher than the normal readings, when the corresponding actual levels were beyond 150 μg/m³, it was suitable to decrease the threshold limit to 120 μg/m³. As shown in Table 8, the POD increased as anticipated from 18.2 to 44.4% (12/27 × 100%) when the threshold was set to 120 μg/m³. Meanwhile, the FAR decreased from 33.3 to 14.3% (2/14 × 100%). Further investigation indicated that, of these 12 successful detections, eight corresponded to true concentrations higher than 150 μg/m³. Lowering the threshold limit therefore raised the number of successful detections of such cases from two to eight. Hence, carefully lowering the threshold limit was an effective way of improving the warning system when the forecasting models faced the shortcoming of underestimation.

Table 6 Comparison of forecast results between the conventional and the proposed methods for Dataset2 at the five stations

| Method | Index | CA09 | CA10 | CA11 | CA16 | CA58 |
|---|---|---|---|---|---|---|
| Without DWT | RMSE | 65.691 | 28.237 | 98.940 | 56.726 | 64.314 |
| Without DWT | MAPE | 46.770 | 26.813 | 47.723 | 41.145 | 44.512 |
| With DWT | RMSE | 28.372 | 17.164 | 45.154 | 28.700 | 35.316 |
| With DWT | MAPE | 28.159 | 14.722 | 23.019 | 19.212 | 22.824 |
| Best decomposition level for DWT method | | 5 | 4 | 5 | 5 | 5 |

The italics indicate the smaller values of RMSE/MAPE

Table 7 Overall forecast results of the RMSE and the MAPE from all stations for Dataset1 and Dataset2

| | Dataset1 RMSE | Dataset1 MAPE | Dataset2 RMSE | Dataset2 MAPE |
|---|---|---|---|---|
| Conventional method | 11.797 | 25.749 | 66.717 | 41.393 |
| Proposed method | 9.435 | 21.452 | 32.278 | 21.587 |
| % of improvement | 20.02 | 16.69 | 51.62 | 47.91 |

The italics indicate the smaller values of RMSE/MAPE

Table 8 Overall forecast results of the POD and the FAR from all stations for Dataset2

| | POD (150 μg/m³) | FAR (150 μg/m³) | POD (120 μg/m³) | FAR (120 μg/m³) |
|---|---|---|---|---|
| Conventional method | 0 | – | 0 | – |
| Proposed method | 0.182 | 0.333 | 0.444 | 0.143 |
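The effect of the threshold on the contingency counts of Table 4 can be sketched as follows. The observed and forecast values are invented for illustration and are not the study's data, so the resulting rates do not correspond to Table 8; the function names are hypothetical.

```python
def contingency(observed, forecast, threshold):
    """Counts (A, B, C, D) of Table 4 for a given exceedance threshold."""
    a = b = c = d = 0
    for o, f in zip(observed, forecast):
        if o >= threshold and f >= threshold:
            a += 1  # hit
        elif o >= threshold:
            b += 1  # missed exceedance
        elif f >= threshold:
            c += 1  # false alarm
        else:
            d += 1  # correct non-event
    return a, b, c, d


def pod_far(observed, forecast, threshold):
    """POD and FAR; None where the denominator is zero."""
    a, b, c, _ = contingency(observed, forecast, threshold)
    pod = a / (a + b) if (a + b) else None
    far = c / (a + c) if (a + c) else None
    return pod, far


# Hypothetical values in ug/m3 where the forecasts underestimate the peaks.
observed = [160.0, 175.0, 155.0, 130.0, 115.0, 125.0]
forecast = [152.0, 135.0, 125.0, 122.0, 121.0, 100.0]

for limit in (150.0, 120.0):
    print(limit, contingency(observed, forecast, limit), pod_far(observed, forecast, limit))
```

In this toy example the lower limit turns two missed peaks into hits, at the price of one false alarm; whether the FAR rises or falls in practice depends on the data, as the results in Table 8 show.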

Conclusion

The central idea of this paper is the application of the DWT to enhance the accuracy of PM10 concentration forecasts by partitioning the original series into a few sub-series with smaller variation. A combined ARIMA-GARCH forecasting model incorporating the DWT was used. The proposed method resulted in more accurate forecasts for both datasets (normal and high-anomaly periods in the validation sets). Furthermore, decreasing the threshold limit that activates the warning system was shown to improve the detection rate of the pollution peaks when there was an underestimation problem.

Another point of concern for the applicability of the proposed method is the computational time. Though adding the DWT step may seem complicated, the decomposition of the 1-year data is in practice completed within seconds by the software. In conclusion, the primary objective of this study, which is to investigate the value of the DWT in improving the forecast accuracy, has been achieved and is supported by the results. Therefore, the DWT is recommended to enhance the forecasting of PM10 concentrations in Peninsular Malaysia, particularly in cases with unexpected changes such as during haze periods.

Acknowledgements The authors would like to acknowledge the Malaysia Department of Environment for providing the data for this study. The first author also thanks Universiti Sains Malaysia and the Ministry of Higher Education for their financial support through the Fellowship Scheme and MyMaster.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

Afroz, R., Hassan, M. N., Ibrahim, N. A., et al. (2003). Review of air pollution and health impacts in Malaysia. Environmental Research, 92, 71–77.

Anderson, J. O., Thundiyil, J. G., Stolbach, A., et al. (2012). Clearing the air: a review of the effects of particulate matter air pollution on human health. Journal of Medical Toxicology, 8, 166–175.

Bhattacharjee, H., Drescher, M., Good, T., Hartley, Z., Leza, J.-D., Lin, B., Moss, J., Massey, R., Nishino, T., Ryder, S., Sachs, N., Tozan, Y., Taylor, C., & Wu, D. (1999). Section 1: Introduction. In Particulate Matter in New Jersey. Princeton: Princeton University.

Brockwell, P. J., & Davis, R. A. (2002). Introduction to time series and forecasting (Second ed.). New York: Springer.

Chelani, A. B., & Devotta, S. (2006). Nonlinear analysis and prediction of coarse particulate matter concentration in ambient air. Journal of the Air & Waste Management Association, 56(1), 78–84.

van der Wal, J. T., & Janssen, L. H. J. M. (2000). Analysis of spatial and temporal variations of PM10 concentrations in the Netherlands using Kalman filtering. Atmospheric Environment, 34, 3675–3687.

Diaz-Robles, L. A., Ortega, J. C., Fu, J. S., Reed, G. D., Chow, J. C., Watson, J. G., Moncada-Herrera, J. A., et al. (2008). A hybrid ARIMA and artificial neural networks model to forecast particulate matter in urban areas: the case of Temuco, Chile. Atmospheric Environment, 42, 8331–8340.

DOE Department of Environment. (2000). A guide to Air Pollution Index (API) in Malaysia. Kuala Lumpur: Ministry of Science, Technology and the Environment Malaysia.

DOE Department of Environment. (2015). Malaysia Environmental Quality Report 2014. Kuala Lumpur: Ministry of Natural Resources and Environment Malaysia.

Joo, T. W., & Kim, S. B. (2015). Time series forecasting based on wavelet filtering. Expert Systems with Applications, 42, 3868–3874.

Liew, J., Latif, M. T., Tangang, F. T., Mansor, H., et al. (2009). Spatio-temporal characteristics of PM10 concentration across Malaysia. Atmospheric Environment, 43, 4584–4594.

Liew, J., Latif, M. T., Tangang, F. T., et al. (2011). Factors influencing the variations of PM10 aerosol dust in Klang Valley, Malaysia during the summer. Atmospheric Environment, 45, 4370–4378.

Liu, P.-W. G. (2009). Simulation of the daily average PM10 concentrations at Ta-Liao with Box-Jenkins time series models and multivariate analysis. Atmospheric Environment, 43, 2104–2113.

Peng, R. D., Chang, H. H., Bell, M. L., McDermott, A., Zeger, S. L., Samet, J. M., Dominici, F., et al. (2008). Coarse particulate matter air pollution and hospital admissions for cardiovascular and respiratory diseases among Medicare patients. American Medical Association, 299(18), 2172–2179.

Percival, D. B., & Walden, A. T. (2000). Wavelet methods for time series. Cambridge: Cambridge University Press.

Reisen, V. A., Sarnaglia, A. J. Q., Reis, N. C., Jr., Levy-Leduc, C., Santos, J. M., et al. (2014). Modeling and forecasting daily average PM10 concentrations by a seasonal long-memory model with volatility. Environmental Modelling & Software, 51, 286–295.

Sahani, M., Zainon, N. A., Wan Mahiyuddin, W. R., Latif, M. T., Hod, R., Khan, M. F., Mohd Tahir, N., Chan, C. C., et al. (2014). A case-crossover analysis of forest fire haze events and mortality in Malaysia. Atmospheric Environment, 96, 257–265.

Shaharuddin, M., Zaharim, A., Mohd. Nor, M. J., Karim, O. A., Sopian, K., et al. (2008). Application of wavelet transform on airborne suspended particulate matter and meteorological temporal variations. WSEAS Transactions on Environment and Development, 4(2), 89–98.

Siwek, K., & Osowski, S. (2012). Improving the accuracy of prediction of PM10 pollution by the wavelet transformation and an ensemble of neural predictors. Engineering Applications of Artificial Intelligence, 25, 1246–1258.

Vlachogianni, A., Kassomenos, P., Karppinen, A., Karakitsios, S., Kukkonen, J., et al. (2011). Evaluation of a multiple regression model for the forecasting of the concentrations of NOx and PM10 in Athens and Helsinki. Science of the Total Environment, 409, 1559–1571.

WHO World Health Organization. (2013). Health effects of particulate matter: policy implications for countries in Eastern Europe, Caucasus and Central Asia. Copenhagen: WHO Regional Office for Europe.