Jurnal Sains Kesihatan Malaysia 20 (2) 2022: 23-36
DOI : http://dx.doi.org/10.17576/JSKM-2022-2001-03
Kertas Asli/Original Articles
Projection of Infant Mortality Rate in Malaysia using R
(Unjuran kadar kematian bayi di Malaysia menggunakan R)
NURHASNIZA IDHAM ABU HASAN*, AZLAN ABDUL AZIZ, MOGANA DARSHINI GANGGAYAH, NUR FAEZAH JAMAL, NOR MARIYAH ABDUL GHAFAR
ABSTRACT
Projecting future infant mortality rate (IMR) is an important subject in ensuring the stability of health in one nation or a specific region in general. Secondary data of IMR from December 1950 until December 2020 from United Nations- World Population Prospects were used to project the trend of IMR in Malaysia up to 2023. In this study, five different forecasting models were adopted including Mean model, Naïve model, Autoregressive Integrated Moving Average (ARIMA) model, Exponential State Space model and Neural Network model. The results were analyzed using R programing and RStudio. The out-sample forecasts of mortality rates were evaluated using six error measures namely, Mean Error (ME), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Percentage Error (MPE), Mean Absolute Percentage Error (MAPE) and Mean Absolute Scaled Error (MASE). Consequently, the keen analysis was focused on the trend and projection of infant mortality rate in the future using the most accurate model. The results showed that the “win” model for this study is ARIMA (0,2,0) model. The model provided a consistent estimate of IMR in relation to a similar decreasing pattern as shown by the original data and hence a reliable projection of IMR. The three ahead forecast values showed that IMR is likely to keep on continuously decreasing in the future. This study could become a guideline for human resource management and health care allocation planning. A forecast of IMR can help the implementation of interventions to reduce the burden of infant mortality within the target range.
Keywords: Forecast; infant mortality rate; Malaysia; trend; ARIMA
INTRODUCTION
MOTIVATION OF THE STUDY
Infant mortality trends in most developing countries including Malaysia experienced a gradual decrease for few decades followed by a gradual increase recently. It is said to be even worse when the COVID-19 has hit the population all over the world. Infant Mortality Rate (IMR) is the primary component that closely resembles in determining the monetary position of a nation's economic growth (Mishra et al. 2019; Khan et al. 2019). It has also been identified as a barometer to measure the well-being of local and state health facilities and health in population from any country (Khan et al. 2019; Ewere & Eke 2020).
IMR is defined as the ratio of deaths per 1,000 live births of children under the age of one year to the overall quantity of live births within the same year (Reidpath &
Allotey 2003; Siah & Lee 2015). According to the Statistics Department's "Children's Statistics Malaysia 2019", the IMR has increased to 7.2 deaths per 1,000 live births in
2018, from 6.9 deaths in 2017 and 6.7 deaths in 2016, respectively. This shows that for every 1,000 babies born in 2018, at least seven of them, on average, died past the age of one.
Infant mortality fluctuates according to the health status of the country in a dynamic process (Mishra et al.
2019). The importance lies inside the fact that it can reflect the health of children and the development of the economy and culture of a country or region. A key feature of the reduction of IMR is the association with population growth (Cervellati & Sunde 2011). The most debated question between demographers and economists is whether the IMR rise is primarily caused by increasing mortality; or whether the rise of IMR is due to the decrease in income and technology advancements (Awad & Yussof 2017).
Theoretically, declination in IMR is associated with declining pattern in fertility rates (Awad & Yussof 2017).
However, from the economic perspective, a decline in IMR might also be associated with other factors that result in economic expansion and technological change.
Nevertheless, the debate on whether the consequence of 1
the observed decline in IMR on population growth is beneficial or detrimental to economic growth (Mehwish et al. 2019). Socio-economic disadvantages are indirectly at higher risk for increased infant mortality via health
resources (Chan & Van 2010; Chan 2015). Most infant deaths in growing countries are avoidable, but if it occurs, that is because of the necessity of household resources, populace services and the lack of information.
FIGURE 1. Trends in Infant Mortality Rate in Malaysia from 1950 to 2021 (Adapted from https://www.macrotrends.net/countries/MYS/malaysia/infant-mortality-rate) Figure 1 shows the declining trend in Malaysia's Infant
Mortality trend from 1950 to 2021. This declining pattern might be due to the effectiveness of government policies in providing a good health care system to the people.
In this regard, the pattern change in the IMR has attracted researchers in governments and corporate sectors across borderless worldwide to investigate further consequences on this situation. A few studies have reported that planning the right programs related to infant mortality based on the current situation produces a good change in reducing infant mortality rate. For example, infant mortality data can indicate maternity health and wellbeing (Zulkifle et al. 2019). Clearly, IMR ties with the government plan and social security where the improvements in the monitoring mortality play a significant role for the authorities to make sure that the rate will not increase.
Furthermore, policy-making bodies like World Health Organization (WHO), United Nations (UN) have been using IMR as an important indicator to measure the socio- economic development of a country. These bodies concern with the movement of worldwide mortality. It has been often used as an indicator to understand the equity of distribution of resources and investment in health and social services (WHO 2001). The UN currently uses IMR as a measure of social well-being and considers the trends of IMR as one of the indicators to achieve the Millennium Development Goals (MDG). One of the targets in sustainable developmental goals is to fight against IMR in reducing the mortality to 25/1000 live births by 2030 which may help to achieve the MDG (De et al. 2016). The study of IMR trends and the possible explanations for such trends are of prime importance to understand socio-economic progress and investment in public health (De 2011). Hence,
an accurate and reliable forecasting tool plays an important role in predicting future mortality for any health organization in ensuring the stability of health in one nation or a specific region in general.
The area of mortality forecasting is continuously evolving. The study on forecasting infant mortality has been in the headline of mortality research when the mortality rate declines dramatically in all developed countries in the twentieth century (Kamaruddin & Ismail 2018). Recently, IMR is a significant public health problem in various countries globally, including Malaysia. The burden of IMR in Malaysia requires a reliable forecast that will guide policymakers in the health sector.
Comprehensively, the IMR is considered a more sensitive indicator of population health because it focuses on the attention of health policy on a small part of the population (Feng et al. 2012). Moreover, infant mortality is correlated with a range of different factors like environment, socio-economic conditions, geographic location, maternal health, and specific demographics (MacDorman et al. 2010; Mathews & Thoma 2015; Khan et al. 2019). On the other hand, factors like basic health service system, female education, safe drinking water, and nutrition have limited effects on an infant or under-five mortality in developed countries (Rajaratnam et al. 2010).
The realization of the importance of infant mortality and the need to reduce or possibly eradicate infant death both at the national and international level will likely result in several policy implications.
The study on infant mortality has been a powerful device to explain changes and variations in a nation's population distribution. The level and trend of IMR indicate the social development of a region and it also shows the
physical, psychological, and economic state of a female.
It has also been increasingly realized, for several reasons that child mortality, i.e. mortality under age five, needs to be examined in advance (Black et al. 2016). Obviously, a decrease in IMR indicates the increase in the probability of survival of a child. It has been seen that the mothers aged less than 20 years and more than 35 are most likely to experience more infant deaths as compared to the mothers aged between 20 to 35 years.
In Malaysia, the studies on IMR are still very limited, especially with regards to time series. Observing infant mortality pattern and trend is an important subject to maintain a good social economy in the next projection years. Furthermore, inaccurate mortality estimates have a big impact on the policies and interventions related to health, life insurance and pensions (Manan et al. 2019).
This study is the first attempt to identify the best forecast model of IMR in the context of Malaysia. Given the importance of infant and its implications for population growth and economic development, it is imperative to project the IMR in Malaysia over time.
For this purpose, several models were analyzed and compared. Then, the best model that suits the current IMR data has been determined using goodness of fit test and forecast performance. In addition, the present study serves as a literature for other researchers who wish to embark studies on IMR in Malaysia. In light of this, the aim of this paper is to identify the best forecast model to project the IMR in Malaysia for the period between December 1950 and December 2020.
The motivation for this study is drawn from the realization that knowledge on reliable and accurate forecasts of IMR is necessary for the planning of suitable intervention programs and preventive measures for the reduction of infant mortality in Malaysia. However, early childhood mortality is still high and turns into a huge problem in some developing countries. In line with this, this paper aims to analyze the trend pattern, to identify a model that best describes, and forecast future trends of IMR in Malaysia. Furthermore, if the forecast value of IMR is available, the intervention could be planned and implemented effectively in a timely manner. Exceptions are noted in the work of Petrevska (2012 & 2013) who argues on the need to applying forecasting methods in
preparing the country for coping with unexpected challenges. This study will be beneficial to both women’s health and children’s survival in Malaysia.
DATA AND METHODOLOGY
The goal of this study is to develop a precise statistical model that helps in forecasting the destiny values of the observations primarily based on the characteristics of the records. The main contribution of this paper is the comparison on extant empirical studies on determining the
“win” model by examining the IMR in the context of Malaysia using R. In this study, five different forecasting techniques were applied to analyze the IMR in Malaysia, which are Mean model, Naïve model, Autoregressive Integrated Moving Average (ARIMA) model, Exponential State Space model, and Neural Network model.
Six error measures, namely, Mean Error (ME), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Percentage Error (MPE), Mean Absolute Percentage Error (MAPE), and Mean Absolute Scaled Error (MASE) were calculated to identify the best model. All the analyses were done using R programming and RStudio. The detailed description on the model fitting used in this study is illustrated in Figure 2.
In Phase I: The model development started with data cleaning where the data were transformed into time series format and checked for missing values. Then, the time series data were split into two parts which are estimation and evaluation. The common practice is to use 75 percent of the data as estimation part and the remaining 25 percent as evaluation part. However, to increase the accuracy of the forecast values, 5 different sets were used as in Table 1.
TABLE 1. Data Partitioning
Data Partitioning Estimation Part Evaluation Part
Set 1 95% 5%
Set 2 90% 10%
Set 3 85% 15%
Set 4 80% 20%
Set 5 75% 25%
FIGURE 2. Flowchart explaining model fitting to analyse the infant mortality rate Phase II: For each data set, the parameter for every
model (Mean model, Naïve model, Autoregressive Integrated Moving Average (ARIMA) model, Exponential State Space model, and Neural Network model were estimated. Error measures for both estimation and evaluation parts were calculated.
Phase III: Error measures for estimation part give an idea of how well the model fits the data. However, the best model which fits the estimation part will not necessarily forecast well (Hyndman et al. 2018). Therefore, it is very important to look at the performance of error measures in the evaluation part. The “win” model was selected based on the model that produced the lowest error measures values in the evaluation part (Lazim 2011).
Phase IV: The 3 step-ahead forecast values were generated using the “win” model.
DATA COLLECTION
Data on the IMR in Malaysia were obtained from United Nations - World Population Prospects. The set of data
consists of 71 observations from December 1950 until December 2020.
Forecasting Methods Mean model
Mean model also known as constant model is a very simple forecasting model. It can be denoted as,
Ft+k = ȳ (1) where Ft+k is the forecast for k-step-ahead at time, t and ȳ is the arithmetic mean of the actual historical time series.
Naïve model
According to Goodwin (2014) and Dhakal (2018). Naïve model of forecasting was often used as a benchmark when assessing the accuracy of a set of forecasts. The previous period is used to forecast the next period in this model. It can be denoted as below:
Ft+k = yt (2) where yt is the actual value at time, t.
Use the best model to forecast
ARIMA model
The forecast of IMR was made based on Box-Jenkins approach to model time series. The idea of ARIMA model is to capture the autocorrelation in the series which measures the relationship between a variable's current value and its past values. This is supported by Hyndman and Athanasopoulos (2018), where ARIMA model aims to describe the autocorrelations in the data. This model is also based on the idea that the information in the past values of the time series alone can be used to forecast future values.
In other words, ARIMA models use data of past forecast errors and past observations of the variable of interest to forecast its future trend.
According to Scott (2019), the Box-Jenkins Model was created by two mathematicians George Box and Gwilym Jenkins. Box-Jenkins methodology refers to a systematic method of identifying, fitting, checking, and using ARIMA time series models. The general Box-Jenkins ARIMA(p,d,q) model for w is written as (Box et al. 2015):
(3)
where:
Ø and θ : unknown parameters.
ɛt : error terms assumed to be independent
and identically distributed normal errors with zero mean.
p : the number of lagged values of wt (represents the order of autoregressive (AR) dimensions.
d : the number of times w is differed.
q : the number of lagged values of the error terms (represents the order of moving average (MA) dimension of the model.
ARIMA methodology consists of three phases: Phase I (Identification), Phase II (Estimation and testing), and Phase III (Forecasting).
Exponential State Space model
Exponential smoothing state space model consists of a broad family of approaches to perform univariate time series forecasting. Well-known models such as single exponential smoothing, double exponential smoothing, Holt’s linear trend, and the Holt-Winters seasonal method can be shown to be special cases of a single family.
A simple must-read in this space is the taxonomy of exponential smoothing methods. Usually, a three-character string identifying method is used in the framework terminology of Hyndman et al. (2008). The taxonomy is based on characterizing each model against three dimensions: error, trend, and seasonality (hence the function that is used in these models is ets() in the forecast package). Each of those three can be characterized as
“additive”, “multiplicative” or “none”.
Neural Network model
According to Braga et al. (2007), Artificial Neural Networks (ANN) are distributed parallel systems consisting of simple processing units (artificial neurons) which calculate certain mathematical functions (usually nonlinear) and are arranged in one or more layers connected by many connections, usually one-way and with two processing phases, including learning and usage.
There are three distinct interconnected layers in a typical ANN architecture namely, an input layer, hidden layer(s), and an output layer. They are connected via neurons and the strength of each connection is represented by a numeric weight value. For prediction tasks, basically these weight values corresponding to decision boundary, are estimated using various optimization algorithms on the training data set. Once the estimated values are stabilized after validation, trained ANN is tested against a test data set to assess its predictive power.
The functional relationship estimated by the ANN can be written as,
y = f ( x1, x2,...,xp ) (4) where x1, x2,...,xp are p independent variables and y is a dependent variable.
The ANN performs the following function mapping:
y t+1 = f ( y1, yt+1,...,yt+p ) (5) where yt is the observation at time t.
Neural network as in Figure 3 is a network of
"neurons" which are organized in layers with predictor or inputs as bottom layer and forecast or outputs as top layer.
Forecast is obtained from linear combination of inputs.
FIGURE 3. Neural network model
In this study, lagged values of the time series data were used as inputs in a linear auto-regression model and this model is known as Neural Network Autoregression or NNAR model. This study only considered feed-forward networks with hidden layer NNAR (p, k) to indicate there are p lagged inputs and k nodes in the hidden layer.
Error Measures
In this study, six error measures were used namely Mean Error (ME), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Percentage Error (MPE), Mean Absolute Percentage Error (MAPE), and Mean Absolute Scaled Error (MASE). The formulas used to calculate the errors are:
et = yt - ŷt
(6)
(7)
(8)
(9)
(10)
(11) Where
yt : The actual value in time t ŷt : The fitted value in time t n : Sample size
RESULTS AND DISCUSSION
The R codes for all the analyses are attached in Appendix A.
Data Cleaning
The first step was to import the data into RStudio and convert it into time series format by using the ts() function within the fpp package. Then, any(is.na()) function was applied to check the presence of missing values. There were no missing values detected. Further analyses were performed using the complete data.
Descriptive Analysis
The descriptive analysis using summary R function shows that the maximum value is 114.445, while the minimum value is 5.600. It indicates that the highest infant mortality rate recorded in Malaysia (up to date data have been collected January 1, 2021) was 114.445 deaths per 1000 live births on December 31, 1950, while the minimum value indicates that the lowest infant mortality rate recorded in Malaysia was 5.600 deaths per 1000 live births on December 31, 2020. The pattern of time series data was analyzed to identify the time series components in the data.
The original time series plot (Figure 4) shows that there is a presence of secular trend component with a steady decreasing pattern over time in the data.
FIGURE 4. Original Time Series Plot
Univariate Time Series Modelling
To illustrate, data partitioning Set 1 (Estimation part 95 percent and Evaluation part 5 percent) will be used.
Mean model:
Two R functions were required: (1) meanf() function to formulate the Mean model, and (2) accuracy() function to assess the model performance.
Naive model:
The naive() function was used to define the Naïve model.
ARIMA model:
The auto.arima() function was used to define the ARIMA model.
Exponential State Space model:
The ets() function was used for this model.
Neural Network model:
The nnetar() function was used for this model.
The outputs for these five models using Set 1 (95% - estimation part, 5% - evaluation part) were summarized as in Table 2.
TABLE 2. Outputs Summary of Set 1 (95% Estimation Part and 5% Evaluation Part) Estimation Part
Model ME RMSE MAE MPE MAPE MASE
Mean 2.0.2E-15 30.9901 25.521 -124.942 157.044 15.547
Naive -1.641 2.200 1.641 -4.553 4.553 1.000
ARIMA(0.2,0) 0.061 0.241 0.072 0.257 0.280 0.044
ETS (M, Ad, N) -0.059 0.245 0.155 -0.069 0.487 0.095
NNAR(1,1) 0.001 0.454 0.314 -0.389 1.201 0.1912
Evaluation Part
Model ME RMSE MAE MPE MAPE MASE
Mean -28.671 28.671 28.671 -493.626 493.626 17.466
Naive -0.295 0.332 0.295 -5.147 5.147 0.180
ARIMA(0.2,0) -0.035 0.051 0.035 -0.619 0.619 0.021
ETS (M, Ad, N) -0.-071 0.093 0.071 -1.241 1.241 0.043
NNAR(1,1) -0.519 0.572 0.519 0.519 9.039 0.316
ETS (M,Ad, N) = Additive Damped Trend with Multiplicative Errors (initial = 118.8286, alpha = 0.9999) NNAR (1,1) = Neural Network Auro-Regressive
Based on Table 2, ARIMA (0,2,0) is selected as the best
model for Set 1, since its produce the lowest error measures in evaluation part. The same procedures were used for the next 4 sets of data were summarized as in Table 3.
TABLE 3. Sets of Data Partitioning (Set 2-Set 5)
Data Partitioning Estimation Part Evaluation Part
Set 2 90% 10%
Set 3 85% 12%
Set 4 80% 20%
Set 5 75% 25%
Evaluation Procedures
TABLE 4. Model Comparison
Model Percentage ME RMSE MAE MPE MAPE MASE
Estimation Part
Mean 95% 2.0.2E-15 30.9901 25.521 -124.942 157.044 15.547
Mean 75% 3.13E-15 30.965 25.762 -82/383 112.601 12.595
Naive 95% -1.641 2.200 1.641 -4.553 4.553 1.000
ARIMA(0.2,0) 85% 0.068 0.255 0.081 0.285 0.309 0.044
ETS (M, Ad, N) 90% -0.062 0.251 0.163 -0.070 0.507 0.095
NNAR(1,1) 95% 0.001 0.454 0.314 -0.389 1.201 0.1912
Evaluation Part
Mean 5% -28.671 28.671 28.671 -493.626 493.626 17.466
Mean 25% -35.177 35.182 35.177 -537.435 537.435 17.197
Naive 5% -0.295 0.332 0.295 -5.147 5.147 0.180
ARIMA(0.2,0) 15% 0.011 0.022 0.016 0.185 0.272 0.009
ETS (M, Ad, N) 10% -0.068 0.101 0.069 -1.193 1.198 0.040
NNAR(1,1) 5% -0.519 0.572 0.519 -9.039 9.039 0.316
Lowest -35.177 0.022 0.016 -537.435 0.272 0.009
ETS (M,Ad, N) = Additive Damped Trend with Multiplicative Errors (initial = 118.8286, alpha = 0.9999) NNAR (1,1) = Neural Network Auro-Regressive
Table 4 summarizes the outputs of each model with their respective error measures. Based on the error measures in the evaluation part, model ARIMA (0,2,0) produced the lowest error measures (four out of six error measures except for ME and MPE). Therefore, it is concluded that model ARIMA (0,2,0) is the “win” model to forecast the IMR in Malaysia.
Forecasting Future Values
The “win” model in this study is ARIMA(0,2,0) used to generate three step-ahead forecast values of ARIMA(0,2,0) for 2021-2023 as in Figure 5.
FIGURE 5. Three step-ahead forecast values of “Win” Model The forecast values for the IMR in Malaysia from
December 31, 2021 to December 31, 2023 are tabulated in Table 6.
TABLE 6. Three-step Ahead Forecast Values of “Win” Model
Date Forecast Values 95% CI
December 31, 2021 5.510 (-7.431, 18.451) December 31, 2022 5.396 (-9.130, 19.922) December 31, 2023 5.282 (-10.889, 21.453)
The forecast values show a steady decreasing pattern in the predicted IMR in Malaysia. This depicts a similar secular trend as the original time series plot shown in Figure 4.1.
DISCUSSION
It can be inferred that all models used in this study depict different characteristics in fitting the data. For example,
Mean model, Naïve model, and ETS models are well fitted using the 75 percent, 95 percent, and 80 percent of estimation part respectively. However, ARIMA model shows a different performance in fitting the data as it is well fitted equally in 75 percent and 95 percent of the Estimation Part. Neural Network model also depicts the same performance as it is well fitted using the 85 percent and 95 percent of the estimation part.
Besides, the sets of partitioning data depict similar characteristics in terms of their performance in modelling the data and the forecast accuracy for each model. For example, both Naïve and Neural Network models are well assessed at Set 1 (5 percent). Besides, ARIMA and ETS models are well assessed at Set 3 (15 percent) and Set 2 (10 percent) respectively. However, Mean model shows different performance in assessing the data as it is well assessed at Set 1 (5 percent) and Set 5 (75 percent). It indicates that Set 1 (5 percent) gives a good performance in forecast accuracy compared to other sets (three models out of five models except ARIMA and ETS). ARIMA model
with the partitioning set of 85 percent (estimation part) and 15 percent (evaluation part) has outperformed other models and has the highest accuracy in predicting the short-term forecast of IMR in Malaysia. Hence, it has been selected as the “win” model for this study.
CONCLUSION
This study aimed to select the “win” model for predicting the infant mortality rate (IMR) in Malaysia using Univariate Time Series Models based on the collected data from December 31, 1950 until December 31, 2020.
This study analyzed and presents the results of modelling Univariate Time Series models on five different sets of partitioning data for forecasting purpose. Each partitioning set was modelled based on the five models, Mean model, Naïve model, Autoregressive Integrated Moving Average model, Exponential State Space models, and Neural Network model. The study further evaluated the accuracy of each model using six types of error measurements. The comparison was made to choose the
“win” model based on the lowest error values. The results show that the “win” model for this study is ARIMA (0,2,0) model with partitioning set of 85 percent (estimation part) and 15 percent (evaluation part).
The “win” model for this study produced forecast values that depict a similar decreasing pattern as shown by the original time series plot. Therefore, this model is reliable in forecasting the IMR in Malaysia.
RECOMMENDATION
This study used five univariate models namely Naive model, Mean model, Autoregressive Integrated Moving Average model, Exponential State Space models, and Neural Network models. The R codes for all the models are available online via GitHub repository (https://github.
com/MoganaD/Projection-of-Infant-Mortality-Rate-in- Malaysia-using-R).
The results produced may be the best solution as it depicts a similar decreasing pattern as shown in the original time series plot. However, researchers can use complex univariate methods to get more reliable results for future research on this study. The researchers can use complex univariate methods such as the Generalized ARCH Model (GARCH), Autoregressive Conditional Heteroscedasticity Model (ARCH), and the hybrid of ARIMA/GARCH model.
REFERENCES
Awad, A., & Yussof, I. 2017. Factors Affecting Fertility – New Evidence from Malaysia. Bulletin of Geography.
Socio–economic Series 36: 7–20.
Black, R. E., Laxminarayan, R., Temmerman, M., &
Walker, N. 2016. Reproductive, Maternal, Newborn, and Child Health Disease Control Priorities, Third Edition (Volume 2).
Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M.
2015. Time series analysis: forecasting and control.
John Wiley & Sons.
Braga, A., de Leon Ferreira, A. C. P., & Ludermir, T. B.
2007. Redes neurais artificiais: teoria e aplicações.
Rio de Janeiro, Brazil, LTC editora.
Cervellati, M., & Sunde, U. 2011. The Effect of Life Expectancy on Economic Growth Reconsidered.
Journal of Economic Growth 16(2): 99-133.
Chan, M. F. 2015. The impact of health care resources, socioeconomic status, and demographics on life expectancy: a cross-country study in three Southeast Asian countries. Asia Pacific Journal of Public Health 27(2): 972-983.
Chan, M. F., Ng, W. I., & Van, I.K. 2010. Socioeconomic instability and the availability of health resources:
their effects on infant mortality rates in Macau from 1957–2006. Journal of clinical nursing 19(5‐6): 884- Dhakal, C. P. 2018. A Naïve Approach for Comparing 891.
a Forecast Model. International Journal of Thesis Projects and Dissertation 5(1):1-3.
De, P., Sahu, D., Pandey, A., Gulati, B. K., Chandhiok, N., Shukla, A., et al. 2016. Post Millennium Development Goals Prospect on Child Mortality in India: An Analysis Using Autoregressive Integrated Moving Averages (ARIMA) Model. Health. 08:1845- 1872.
De, P. A. 2011. Temporal and Spatial Dimension of Under Five Mortality with Emphasis on Impact of National Health Programme: For the Period 1976-2004. PhD thesis, Jadhavpur University.
Ewere, F. & Eke, D. O. 2010. Time Series Analysis and Forecast of Infant Mortality Rate in Nigeria: An Arima Modeling Approach. Canadian Journal of Pure and Applied Sciences 14(2):5049-5059.
Feng, X. L., Theodoratou, E., Liu, L., Chan, K. Y., Hipgrave, D., Scherpbier, R., Brixi, H., Guo, S., Chunmei, W., Chopra, M., Black, R. E., et al. 2012.
Social, economic, political and health system and program determinants of child mortality reduction in China between 1990 and 2006: a systematic analysis.
Journal of global health 2(1):1-11.
Goodwin, P. (2014). Using naïve forecasts to assess limits to forecast accuracy and the quality of fit of forecasts to time series data (Working paper).
10.13140/2.1.1979.2324.
Hyndman, R. J., & Athanasopoulos, G. 2018. Forecasting:
principles and practice. OTexts.
Hyndman, R., Koehler, A. B., Ord, J. K., & Snyder, R. D.
2008. Forecasting with exponential smoothing: the state space approach. Springer Science & Business Media.
Kamaruddin, H. S., & Noriszura Ismail, N. 2018.
Forecasting selected specific age mortality rate of Malaysia by using Lee-Carter model. International Conference on Mathematics: Pure, Applied and Computation. Series 974: 012003.
Khan, M. S., Fatima, S., Zia, S. S., Hussain, E., Faraz, T. R., & Khalid, F. 2019. Modeling and Forecasting Infant Mortality Rates of Asian Countries in the Perspective of GDP (PPP). International Journal of Scientific & Engineering Research 10 (3): 18 – 23.
Lazim, A. 2011. Introductory business forecasting: A practical approach (3rd Ed.), University Publication Centre (UPENA), UiTM.
MacDorman, M. F., et al. 2010. International comparisons of infant mortality and related factors: United States and Europe.
Manan M., Hasan, N. I. A, Hasan, N., Jamal, N. F., Rahidin, N. D. A. M. 2019. Impact De-Tarrification in Modeling Motor Insurance Premium in Malaysia.
International Journal of Recent Technology and Engineering 8(3):7394-7400.
Mathews, T., MacDorman, M. F. & Thoma, M. E. 2015.
Infant mortality statistics from the 2013 period linked birth/infant death data set.
Mehwish S. K. et al. 2019. Modeling and Forecasting Infant Mortality Rates of Asian Countries in the Perspective of GDP (PPP). International Journal of Scientific & Engineering Research 10(3):18-23.
Mishra, A. K., Sahanaa, C., & Manikandan, M. 2019.
Forecasting Indian infant mortality rate: An application of autoregressive integrated moving average model. Journal of family & community medicine 26(2), 123–126.
Petrevska, B. 2012. Forecasting international tourism demand: The Evidence of Macedonia. UTMS. Journal of Economics 3: 45–55.
Petrevska, B. 2013. International tourism demand in Macedonia: current status and estimation. Зборник радова Економског факултета 7:51-60.
Rajaratnam, J. K., Marcus, J. R., Flaxman, A. D., Wang, H., Levin-Rector, A., Dwyer, L., Costa, M., Lopez, A. D., Murray, C. J. L. 2010. Neonatal, postneonatal, childhood, and under-5 mortality for 187 countries, 1970–2010: a systematic analysis of progress towards Millennium Development Goal 4. The Lancet 375(9730): 1988-2008.
Reidpath, D. D. and Allotey, P. 2003. Infant mortality rate as an indicator of population health. J Epidemiol Community Health 57(5):344–346.
Scott, G. (2019). Box-Jenkins Model. Retrieved from https://www.investopedia.com/terms/b/box-jenkins- model.asp
Siah, Audrey & Lee, Grace. (2015). Female Labour Force Participation, Infant Mortality and Fertility in Malaysia. Journal of the Asia Pacific Economy.
Forthcoming. 10.1080/13547860.2015.1045326.
UN (United Nations). 2000. United Nations Millennium Declaration: Resolution Adopted by the General Assembly. 55/2. New York: United Nations.
WHO. World health chart. In Karolinka Institute/ Lund University SIDA, 2001.
Zulkifle, H., Yusof, F., and Nor, S. R. M. (2019).
Comparison of Lee Carter Model and Cairns, Blake and Dowd Model in Forecasting Malaysian Higher Age Mortality. Matematika: MJIAM Special Issue (35): 65-77.
Nurhasniza Idham Abu Hasan*
Nur Faezah Jamal
Department of Statistics, Faculty of Computer &
Mathematical Sciences,
Universiti Teknologi MARA, Perak Branch, Tapah Campus, 35400 Tapah Road, Perak,
Malaysia.
Azlan Bin Abdul Aziz
Department of Mathematics and Statistics, Faculty of Computer & Mathematical Sciences,
Universiti Teknologi MARA, Perlis Branch, Arau Campus, 02600 Arau, Perlis,
Malaysia.
Mogana Darshini Ganggayah Centre of Applied Data Science,
Department, Institute of Biological Science,
Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia.
Nor Mariyah Binti Abdul Ghafar
Department of Actuarial Science, Faculty of Computer &
Mathematical Sciences,
Universiti Teknologi MARA, Perak Branch, Tapah Campus, 35400 Tapah Road, Perak,
Malaysia.
Corresponding author: [email protected]
APPENDIX A – R CODES 1.0 Data Cleaning
> DATASET_ts <- ts(DATASET, start = 1, frequency = 1)
> head(DATASET_ts) Time Series:
Start = 1 End = 6 Frequency = 1
Deaths per 1,000 Live Births [1,] 114.445 [2,] 110.138 [3,] 105.830 [4,] 101.523 [5,] 97.216 [6,] 92.908
> tail(DATASET_ts) Time Series:
Start = 66 End = 71 Frequency = 1
Deaths per 1000 Live Births [1,] 6.212 [2,] 6.108 [3,] 6.003 [4,] 5.899 [5,] 5.750
2.0 Descriptive analysis
> summary(DATASET_ts) Deaths per 1000 Live Births Min. : 5.600 1st Qu.: 7.851 Median : 18.949 Mean : 32.868 3rd Qu.: 47.648 Max. :114.445
3.0 Time series plot
> estimation95 <- window(DATASET_ts, start = 1, end
= 67)
> estimation95 Time Series:
Start = 1 End = 67 Frequency = 1
> evaluation1 <- window(DATASET_ts, start = 68)
> evaluation1 Time Series:
Start = 68 End = 71 Frequency = 1
Deaths per 1000 Live Births [1,] 6.003 [2,] 5.899 [3,] 5.750 [4,] 5.600
> autoplot(DATASET_ts, series = "Original Time Series Plot") + xlab ("Year") + ylab("Deaths per 1000 Live Births") + theme(legend.position = "bottom")
4.0 Univariate time series modelling 4.1 Mean model
> mean1 <- meanf(estimation95, h=length(evaluation1))
> summary(mean1) Forecast method: Mean Model Information:
$mu [1] 34.48369
$mu.se [1] 3.814652
$sd
[1] 31.22427
$bootstrap [1] FALSE
$call
meanf(y = estimation95, h = length(evaluation1)) attr(,"class")
[1] "meanf"
4.2 Naïve Model
> naive1 <- naive(estimation95, h = length(evaluation1))
> summary(naive1)
Forecast method: Naive method Model Information:
Call: naive(y = estimation95, h = length(evaluation1)) Residual sd: 2.2003
4.3 ARIMA model
> arima1 <- auto.arima(estimation95)
> summary(arima1) Series: estimation95 ARIMA(0,2,0)
sigma^2 estimated as 0.05985: log likelihood=-0.47 AIC=2.94 AICc=3 BIC=5.11
4.4 Exponential State Space model
> ets1 <- ets(estimation95)
> summary(ets1) ETS(M,Ad,N)
Call:
ets(y = estimation95) Smoothing parameters:
alpha = 0.9999 beta = 0.8341 phi = 0.9334 Initial states:
l = 118.837 b = -4.5504 sigma: 0.0084
AIC AICc BIC 66.75223 68.15223 79.98038
4.5 Neural Network Model
> neural1 <- nnetar(estimation95)
> neural1
Series: estimation95 Model: NNAR(1,1)
Call: nnetar(y = estimation95)
Average of 20 networks, each of which is a 1-1-1 network with 4 weights options were - linear output units
sigma^2 estimated as 0.211
> f o r e c a s t 1 _ n e u r a l < - f o r e c a s t ( n e u r a l 1 , h = length(evaluation1))
> forecast1_neural Point Forecast 68 6.219745 69 6.318079 70 6.404646 71 6.480882
5.0 Forecasting future values
The followings are examples of syntax and outputs from R programming and RStudio used for forecasting the 3 step-ahead forecast values.